- Title
- [BK21] Invited Seminar [9/23] DKM: Differentiable K-means clustering layer for Neural Network Compression
- Date posted
- 2022.09.19
- Author
- School of Electrical and Electronic Engineering
- Post content
-
We are hosting an invited seminar as detailed below; your participation is greatly appreciated.
◎ Date/Time: Friday, September 23, 2022, 14:00~
◎ Venue: Room B039, Engineering Hall II, Yonsei University
◎ Title: DKM: Differentiable K-means clustering layer for Neural Network Compression
◎ Speaker: Dr. Minsik Cho / Machine Learning R&D, Apple, USA
◎ Host: Professors Hanjun Kim and Joon-Sung Yang, School of Electrical and Electronic Engineering
◎ Bio:
Dr. Minsik Cho is a machine learning researcher at Apple Machine Intelligence. Before joining Apple, he was with IBM research working on Deep-Learning/Machine-Learning/BigData acceleration through HW/SW codesign, and Scalable System design for Large-scale Deep-Learning. He received a Ph.D. in ECE from UT-Austin in 2008, and a BS in EE from SNU in 1999.
◎ Abstract
Deep neural network (DNN) model compression for efficient on-device inference is becoming increasingly important to reduce memory requirements and keep user data on-device. To this end, we propose a novel differentiable k-means clustering layer (DKM) and its application to train-time weight clustering-based DNN model compression. DKM casts k-means clustering as an attention problem and enables joint optimization of the DNN parameters and clustering centroids. Unlike prior works that rely on additional regularizers and parameters, DKM-based compression keeps the original loss function and model architecture fixed. We evaluated DKM-based compression on various DNN models for computer vision and natural language processing (NLP) tasks. Our results demonstrate that DKM delivers a superior compression-accuracy trade-off on the ImageNet1k and GLUE benchmarks. For example, DKM-based compression can offer 74.5% top-1 ImageNet1k accuracy on the ResNet50 DNN model with a 3.3 MB model size (29.4x model compression factor). For MobileNet-v1, which is a challenging DNN to compress, DKM delivers 63.9% top-1 ImageNet1k accuracy with a 0.72 MB model size (22.4x model compression factor). This is 6.8% higher top-1 accuracy with a 33% relatively smaller model size than the current state-of-the-art DNN compression algorithms. Additionally, DKM enables compression of the DistilBERT model by 11.8x with minimal (1.1%) accuracy loss on the GLUE NLP benchmarks.
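The core idea in the abstract, casting k-means clustering as an attention problem so that weights and centroids can be optimized jointly, can be illustrated with a minimal NumPy sketch. This is an illustrative reconstruction, not the paper's actual implementation: the function name, the 1-D weight setup, and the temperature parameter are all assumptions made for clarity. Negative squared distances between weights and centroids are used as attention scores, a softmax over centroids produces soft assignments, and each weight is reconstructed as an attention-weighted mixture of centroids, keeping the clustering step differentiable end to end.

```python
import numpy as np

def dkm_soft_assign(weights, centroids, temperature=1.0):
    """Soft k-means assignment as attention (illustrative sketch only).

    weights:   (n,) flattened layer weights (1-D here for simplicity)
    centroids: (k,) cluster centers
    Returns the reconstructed weights and the (n, k) attention matrix.
    """
    # Attention scores: negative squared distance to each centroid, (n, k)
    scores = -((weights[:, None] - centroids[None, :]) ** 2) / temperature
    # Numerically stable softmax over the centroid axis
    a = np.exp(scores - scores.max(axis=1, keepdims=True))
    a /= a.sum(axis=1, keepdims=True)
    # Differentiable "clustered" weights: attention-weighted centroid mix
    return a @ centroids, a

weights = np.array([0.10, 0.12, 0.90, 0.95])
centroids = np.array([0.10, 0.90])
recon, attn = dkm_soft_assign(weights, centroids, temperature=0.01)
# With a small temperature the assignment is nearly hard, so each
# weight snaps close to its nearest centroid.
```

Lowering the temperature sharpens the softmax toward a hard k-means assignment, while larger values keep gradients flowing to all centroids during training; in an actual training loop both `weights` and `centroids` would receive gradients through the attention matrix.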