Department of Electronic Engineering, Shanghai Jiao Tong University

[Institute of Image Communication] Machine Learning for Cloud-Based Image Classification

About the Speaker

Dr. Zhi Ding (S'88-M'90-SM'95-F'03, IEEE) is a Professor of Electrical and Computer Engineering at the University of California, Davis. He received his Ph.D. degree in Electrical Engineering from Cornell University in 1990. From 1990 to 2000, he was a faculty member at Auburn University and later at the University of Iowa. Prof. Ding has held visiting positions at the Australian National University, the Hong Kong University of Science and Technology, the NASA Lewis Research Center, and the USAF Wright Laboratory. His major research interests lie in the general field of signal processing and communications. Prof. Ding actively collaborates with researchers from many universities, including those in Australia, China, Finland, Japan, Canada, Taiwan, Korea, and Singapore. He has coauthored over 400 technical papers and two books. Dr. Ding is a coauthor of the textbook Modern Digital and Analog Communication Systems, 5th edition, Oxford University Press, 2019.

Dr. Ding is a Fellow of the IEEE and has served on the technical program committees of a number of workshops and conferences. He served as both a Member and the Chair of the IEEE Transactions on Wireless Communications Steering Committee from 2007 to 2009. Dr. Ding was the Technical Program Chair of the 2006 IEEE Globecom and the General Chair of the 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). He served as an IEEE Distinguished Lecturer (Circuits and Systems Society, 2004-06; Communications Society, 2008-09). He received the 2012 Wireless Communications Recognition Award from the IEEE Communications Society. He currently also serves as the Chief Information Officer of the IEEE Communications Society.

Abstract

JPEG2000 (j2k) is a highly popular format for image and video compression. With the rapid growth of cloud-based image classification applications, most existing j2k-compatible schemes stream compressed color images from the source and reconstruct them at the processing center before feeding them to deep CNNs. We propose to remove the computationally costly reconstruction step by training a deep CNN image classifier directly on the CDF 9/7 Discrete Wavelet Transform (DWT) coefficients extracted from j2k-compressed images. We demonstrate additional computational savings by using a shallower CNN that still achieves accurate classification in the DWT domain. Furthermore, we show that traditional augmentation transforms such as flipping and shifting are ineffective in the DWT domain, and we present alternative augmentation transforms that yield more accurate classification at no additional cost. In this way, faster and more accurate classification is possible for j2k-encoded images without image reconstruction. Through experiments on the CIFAR-10 and Tiny ImageNet data sets, we show that the performance of the proposed solution remains consistent for image transmission over channels with limited bandwidth.
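For readers unfamiliar with the CDF 9/7 transform mentioned in the abstract, the following is a minimal one-dimensional sketch of one decomposition level using the well-known lifting factorization. It is not the speaker's pipeline: the function names (`dwt97_forward`, `dwt97_inverse`, `_lift`) are hypothetical, the symmetric boundary extension and the scaling (chosen here so that the low-pass branch has unit DC gain) are assumptions that differ between implementations, and a real j2k codec applies the transform in two dimensions over several levels, followed by quantization and entropy coding.

```python
# Minimal 1-D, single-level CDF 9/7 wavelet transform via lifting (illustrative sketch).
# Lifting constants are the commonly published values for the CDF 9/7 filter bank.
ALPHA = -1.586134342
BETA = -0.05298011854
GAMMA = 0.8829110762
DELTA = 0.4435068522
K = 1.230174105  # scaling constant; normalization conventions vary (assumption)


def _lift(s, start, coef):
    """Add coef * (left + right neighbour) to every second sample, in place.

    Neighbours outside the signal are mirrored (symmetric extension).
    Samples at one parity only read samples of the other parity,
    so updating in place is safe.
    """
    n = len(s)
    for i in range(start, n, 2):
        left = s[i - 1] if i - 1 >= 0 else s[i + 1]
        right = s[i + 1] if i + 1 < n else s[i - 1]
        s[i] += coef * (left + right)


def dwt97_forward(signal):
    """Return (approximation, detail) coefficients for an even-length signal."""
    n = len(signal)
    if n < 2 or n % 2 != 0:
        raise ValueError("signal length must be even and at least 2")
    s = [float(v) for v in signal]
    _lift(s, 1, ALPHA)   # first predict step (odd samples)
    _lift(s, 0, BETA)    # first update step (even samples)
    _lift(s, 1, GAMMA)   # second predict step
    _lift(s, 0, DELTA)   # second update step
    approx = [v / K for v in s[0::2]]        # low-pass band
    detail = [v * K / 2.0 for v in s[1::2]]  # high-pass band
    return approx, detail


def dwt97_inverse(approx, detail):
    """Invert dwt97_forward by undoing the scaling and lifting steps in reverse."""
    if len(approx) != len(detail):
        raise ValueError("approx and detail must have the same length")
    s = [0.0] * (2 * len(approx))
    s[0::2] = [v * K for v in approx]
    s[1::2] = [v * 2.0 / K for v in detail]
    _lift(s, 0, -DELTA)
    _lift(s, 1, -GAMMA)
    _lift(s, 0, -BETA)
    _lift(s, 1, -ALPHA)
    return s


if __name__ == "__main__":
    row = [3.0, 7.0, 1.0, 8.0, 2.0, 9.0, 4.0, 6.0]  # e.g. one row of pixel values
    low, high = dwt97_forward(row)
    print("approximation:", low)
    print("detail:", high)
    print("reconstructed:", dwt97_inverse(low, high))
```

The two returned bands are the kind of coefficients a DWT-domain classifier would consume in place of reconstructed pixels. Because every lifting step is exactly invertible, the inverse recovers the input up to floating-point rounding, which also makes the sketch easy to sanity-check.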