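Xiaoyun Zhang, Research Professor

Address: Room 303A East, Building 5, School of Electronic Information and Electrical Engineering, 800 Dongchuan Road, Shanghai

Phone: +86 21 34204510

Email: xiaoyun.zhang@sjtu.edu.cn

Research center: Institute of Image Communication

Homepage: https://mediabrain.sjtu.edu.cn/xiaoyun-zhang/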

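Biography

Currently a Research Professor and doctoral supervisor at the School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University.

Holds bachelor's and master's degrees in Applied Mathematics from Xi'an Jiaotong University and a Ph.D. in Pattern Recognition and Intelligent Systems from Shanghai Jiao Tong University. The doctoral dissertation received the Shanghai Excellent Doctoral Dissertation Award and a nomination for the National Top 100 Excellent Doctoral Dissertations Award. In 2017, spent one year at Harvard University as a visiting scholar, funded by the China Scholarship Council.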

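Research Areas
Main research interests are video and image processing, computer vision, and their integration with artificial intelligence, with an emphasis on developing technologies for related industry applications. Has undertaken and completed several national research projects, including the National Major Project on Core Electronic Devices, High-end Generic Chips, and Basic Software, 863 Program key R&D projects, and National Natural Science Foundation of China grants, and has led the development of key technologies and systems for ultra-high-definition compression coding, video quality enhancement, and video frame rate up-conversion. Has published more than 50 papers in top international journals in the field (IEEE PAMI, TIP, CSVT) and international conferences (CVPR, ECCV), and has filed more than 30 invention patents, of which more than 20 have been granted. The National Major Project "R&D and Small-Batch Application of a High-Quality TV Image Display Processing Chip" supplied key technologies such as video frame rate doubling for high-quality image display processing in Qingdao Hisense televisions, and its results received the First Prize of the Qingdao Science and Technology Award. The Ministry of Science and Technology key R&D program project "Research on Key Technologies for the Production and Distribution of 4K Ultra-High-Definition Media" provided key technical support for the 4K enhancement of historical footage at China Media Group. It was applied to the series of precious historical footage for the centenary of the founding of the Communist Party of China, the Beijing Winter Olympics, and other major events. The related results received the First Prize for Scientific and Technological Progress from the China Film and Television Society (2021) and the First Prize of the Shanghai Technological Invention Award (2022).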
Research Interests

1. Video and image processing, computer vision

2. Deep learning, including deep generative models and generative AI

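Awards

1.  Nomination for the National Top 100 Excellent Doctoral Dissertations Award

2.  Shanghai Excellent Doctoral Dissertation Award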

Teaching
1.   EE346: "Digital Image Processing" (undergraduate)
Research Projects


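  1. National Natural Science Foundation of China (General Program), Deep-prior-driven super-resolution for historical footage, 2023.1-2026.12
  2. Shanghai Science and Technology Commission Artificial Intelligence Special Project, Conditional image generation based on deep generative models, 2022.9-2024.9
  3. National Natural Science Foundation of China (General Program), Joint optimization of frame rate up-conversion and compression coding, 2018.1-2021.12
  4. National Natural Science Foundation of China (Young Scientists Fund), Low-complexity and parallel coding methods for HEVC, 2014.1-2016.12
  5. National Key R&D Program project, Key technologies for the production and distribution of 4K ultra-high-definition media, 2020.1-2022.10
  6. National Key R&D Program sub-project, Intelligent video perception technology, 2020.11-2023.10
  7. National Major Project on Core Electronic Devices, High-end Generic Chips, and Basic Software, R&D and small-batch application of a high-quality TV image display processing chip, 2013.1-2016.6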


Works and Patents



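1. A video frame rate up-conversion method and system based on motion region segmentation

2. Deep learning based region-of-interest image encoding and decoding system and method

3. Deep learning based variable-rate image encoding and decoding system and method

4. Video moving object detection system, method, and terminal based on a deep fusion network

5. An optical flow based video stitching method

6. H.265-based multi-channel encoding method

7. An off-chip cache compression system for ultra-high-definition frame rate up-conversion

8. A multi-information fusion motion estimation method and system for frame rate up-conversion

9. A method for complexity control using prediction modes in HEVC encoding

10. An HEVC complexity control method based on coding unit level

11. SIMD fast implementation method for sub-pixel interpolation in HEVC/H.265

12. Edge-based image salient region detection method

13. Fast SKIP mode selection method based on Bayesian minimum-risk decision

14. SIMD optimization method for the DCT and inverse transform in HEVC/H.265

15. Fast inter-frame coding mode selection method in the x265 encoder

16. An ultra-high-definition frame rate up-conversion system with reduced memory bandwidth requirements

17. A film mode detection method and apparatus based on irregular sampling

18. An on-chip cache system for ultra-high-definition video frame rate up-conversion

19. Early CU termination method for HEVC B-frames based on SKIP/Merge RD cost

20. A video frame rate up-conversion method and system that intelligently improves motion smoothness

21. HEVC complexity control method based on coding unit level and rate-distortion cost

22. Fast intra coding unit selection method based on a logistic regression classifier

23. An off-chip cache compression method for ultra-high-definition video processing systems

24. Kalman filter based motion estimation method and system for frame rate up-conversion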



Selected Publications


Journal papers:

[1] Guo Lu, Xiaoyun Zhang, Wanli Ouyang, Li Chen, Zhiyong Gao, Dong Xu, "An End-to-End Learning Framework for Video Compression," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 43, no. 10, pp. 3292-3308, Oct. 2021.

[2] Wenbo Bao, Wei-Sheng Lai, Xiaoyun Zhang, Zhiyong Gao, Ming-Hsuan Yang, "MEMC-Net: Motion Estimation and Motion Compensation Driven Neural Network for Video Interpolation and Enhancement," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 43, no. 3, pp. 933-948, March 2021.

[3] Shixiang Feng, Beibei Liu, Ya Zhang, Xiaoyun Zhang, Yuehua Li, "Two-Stream Compare and Contrast Network for Vertebral Compression Fracture Diagnosis," IEEE Transactions on Medical Imaging, 40(9):2496-2506, 2021.

[4] Chaofan Ma, Qisen Xu, Xiangfeng Wang, Bo Jin, Xiaoyun Zhang, Yanfeng Wang, Ya Zhang, "Boundary-aware Supervoxel-level Iteratively Refined Interactive 3D Image Segmentation with Multi-agent Reinforcement Learning," IEEE Transactions on Medical Imaging, 40(10):2563-2574, 2021.

[5] Guo Lu, Xiaoyun Zhang, Wanli Ouyang, Dong Xu, Li Chen, Zhiyong Gao, "Deep Non-local Kalman Network for Video Compression Artifact Reduction," IEEE Transactions on Image Processing, vol. 29, pp. 1725-1737, 2020.

[6] Chunlei Cai, Li Chen, Xiaoyun Zhang, Zhiyong Gao, "End-to-End Optimized ROI Image Compression," IEEE Transactions on Image Processing, vol. 29, pp. 3442-3457, 2020.

[7] Qiang Hu, Jun Zhou, Xiaoyun Zhang, Zhiyong Gao, Ming-ting Sun, "In-loop perceptual model-based rate-distortion optimization for HEVC real-time encoder," Journal of Real-Time Image Processing, 2020, 17(2): 293-311.

[8] Qiang Hu, Jun Zhou, Xiaoyun Zhang, Zhiru Shi, Zhiyong Gao, "Viewport-Adaptive 360-degree Video Coding," Multimedia Tools and Applications, Oct. 2019.

[9] Wenbo Bao, Xiaoyun Zhang, Li Chen, Zhiyong Gao, "KalmanFlow 2.0: Efficient Video Optical Flow Estimation via Context-Aware Kalman Filtering," IEEE Transactions on Image Processing, vol. 28, no. 9, pp. 4233-4246, Sept. 2019.

[10]  Chunlei Cai, Li Chen, Xiaoyun Zhang, Zhiyong Gao, "Efficient Variable Rate Image Compression with Multi-scale Decomposition Network," IEEE Transactions on Circuits and Systems for Video Technology, vol.29, no.12, pp. 3687-3700, Dec. 2019.

[11]  Wenbo Bao, Xiaoyun Zhang, Li Chen, Lianghui Ding, Zhiyong Gao, "High-Order Model and Dynamic Filtering for Frame Rate Up-Conversion," IEEE Transactions on Image Processing, vol. 27, no. 8, pp. 3813-3826, Aug. 2018.

[12]  Guo Lu, Xiaoyun Zhang, Li Chen, Zhiyong Gao, "Novel Integration of Frame Rate Up Conversion and HEVC Coding Based on Rate-Distortion Optimization," IEEE Transactions on Image Processing, vol. 27, no. 2, pp. 678-691, Feb. 2018.

[13]  Bing Yang, Xiaoyun Zhang, Li Chen, Zhiyong Gao, "Spatiotemporal salient object detection based on distance transform and energy optimization," Neurocomputing, vol. 266, pp. 165-175, 2017.

[14]  Bing Yang, Xiaoyun Zhang, Li Chen, Hua Yang, Zhiyong Gao, "Edge Guided Salient Object Detection," Neurocomputing, vol. 221, pp. 60-71, 2017.

[15]  Yong Guo, Li Chen, Zhiyong Gao, Xiaoyun Zhang, "Frame Rate Up-Conversion Using Linear Quadratic Motion Estimation and Trilateral Filtering Motion Smoothing," Journal of Display Technology, vol. 12, no.1, pp. 89-98, Jan. 2016.

[16]  Qiang Hu, Xiaoyun Zhang, Zhiru Shi, Zhiyong Gao, "Neyman-Pearson Based Early Mode Decision for HEVC Encoding," IEEE Transactions on Multimedia, vol. 18, no. 3, pp. 379-391, Mar. 2016.

[17]  Bing Yang, Xiaoyun Zhang, Li Chen, Zhiyong Gao, "Principal Component Analysis-based Visual Saliency Detection," IEEE Transactions on Broadcasting, vol. 62, no. 4, pp.842-854, Dec 2016.

[18]  Yong Guo, Li Chen, Zhiyong Gao, Xiaoyun Zhang, "Frame Rate Up-Conversion Method for Video Processing Applications," IEEE Transactions on Broadcasting, vol. 60, no. 4, pp. 659-669, Dec. 2014.

[19]  Gang Chen, Bing Yang, Xiaoyun Zhang, Zhiyong Gao, "Complexity control algorithm based on adaptive mode selection for interframe coding in high efficiency video coding," Journal of Electronic Imaging, 2017.


Conference papers (recent 5 years)

[1] Zhixin Wang, Xiaoyun Zhang, Ziying Zhang, Huangjie Zheng, Mingyuan Zhou, Ya Zhang, Yanfeng Wang, "DR2: Diffusion-based Robust Degradation Remover for Blind Face Restoration," accepted by CVPR 2023.

[2] Qinye Zhou, Ziyi Li, Weidi Xie, Xiaoyun Zhang, Ya Zhang, Yanfeng Wang, "A Simple Plugin for Transforming Images to Arbitrary Scales," British Machine Vision Virtual Conference 2022.

[3] Yixuan Huang, Xiaoyun Zhang, et al., Task Decoupled Framework for Reference-based Super-Resolution, CVPR 2022.

[4] Baisong Guo, Xiaoyun Zhang, et al., LAR-SR: A Local Autoregressive Model for Image Super-Resolution, CVPR 2022.

[5] Yangyi Dong, Xiaoyun Zhang, et al., Unpaired Face Restoration via Learnable Cross-Quality Shift, CVPR 2022 NTIRE.

[6] Tianyue Cao, Lianyu Du, Xiaoyun Zhang, Siheng Chen, Ya Zhang, Yan-Feng Wang, "CaT: Weakly Supervised Object Detection with Category Transfer", in ICCV 2021, pp. 3070-3079.

[7] Lianyu Du, Liwei Hu, Xiaoyun Zhang, Yumin Zhong, Ya Zhang, Yanfeng Wang, "Unsupervised Segmentation Framework with Active Contour Models for Cine Cardiac MRI", in ICIP 2021.

[8] Xingyue Pu, Tianyue Cao, Xiaoyun Zhang, Xiaowen Dong, Siheng Chen, "Learning to learn graph topologies", NeurIPS 2021.

[9] Guo Lu, Chunlei Cai, Xiaoyun Zhang, Li Chen, Wanli Ouyang, Dong Xu, Zhiyong Gao, “Content adaptive and error propagation aware deep video compression,” in European Conference on Computer Vision (ECCV), pp. 456-472, Springer, Cham, 2020.

[10]  Yingying Xue, Shixiang Feng, Ya Zhang, Xiaoyun Zhang, Yanfeng Wang, "Dual-task Self-supervision for Cross-Modality Domain Adaptation", MICCAI 2020, pp. 408-417.

[11]  Xuan Liao, Wenhao Li, Qisen Xu, Xiangfeng Wang, Bo Jin, Xiaoyun Zhang, Yanfeng Wang, Ya Zhang, "Iteratively-Refined Interactive 3D Medical Image Segmentation with Multi-Agent Reinforcement Learning", CVPR 2020, pp. 9394-9402.

[12]  Chunlei Cai, Li Chen, Xiaoyun Zhang, Zhiyong Gao, “A Novel Deep Progressive Image Compression Framework,” in Picture Coding Symposium (PCS), pp. 1–5, Ningbo, China, November 2019.

[13]  Yuan Tian, Xiongkuo Min, Guangtao Zhai, Zhiyong Gao, "Video-based early Autism Detection via Temporal Pyramid Networks," in IEEE International Conference on Multimedia and Expo (ICME), 2019.

[14]  Guo Lu, Wanli Ouyang, Dong Xu, Xiaoyun Zhang, Chunlei Cai, Zhiyong Gao, "DVC: An End-to-End Deep Video Compression Framework," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.

[15]  Wenbo Bao, Chao Ma, Wei-Sheng Lai, Xiaoyun Zhang, Zhiyong Gao, Ming-Hsuan Yang, "Depth-Aware Video Frame Interpolation," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.

[16]  Shangpeng Yan, Wenbo Bao, Xiaoyun Zhang, Zhiyong Gao, Li Chen, “Large Scale Near-duplicate Image Retrieval via Patch Embedding”, in 4th International Workshop on Compact and Efficient Feature Representation and Learning in Computer Vision 2019.

[17]  Shengyang Li, Xiaoyun Zhang, Xiaoxia Wang, Yumin Zhong, Xiaofen Yao, Ya Zhang, Yanfeng Wang. “Children’s Neuroblastoma Segmentation Using Morphological Features.” In International Workshop on Machine Learning in Medical Imaging, pp. 81-88. Springer, Cham, 2019.

[18]  Lai, Bolin, Shiqi Peng, Guangyu Yao, Ya Zhang, Xiaoyun Zhang, Yanfeng Wang, and Hui Zhao. "Spatial Regularized Classification Network for Spinal Dislocation Diagnosis." In International Workshop on Machine Learning in Medical Imaging, pp. 9-17. Springer, Cham, 2019.

[19]  Peng, Shiqi, Bolin Lai, Guangyu Yao, Xiaoyun Zhang, Ya Zhang, Yan-Feng Wang, and Hui Zhao. "Learning-Based Bone Quality Classification Method for Spinal Metastasis." In International Workshop on Machine Learning in Medical Imaging, pp. 426-434. Springer, Cham, 2019.

[20]  Peng, Shiqi, Bolin Lai, Guangyu Yao, Xiaoyun Zhang, Ya Zhang, Yan-Feng Wang, and Hui Zhao. "Weakly Supervised Segmentation of Vertebral Bodies with Iterative Slice-Propagation." In Domain Adaptation and Representation Transfer and Medical Image Learning with Less Labels and Imperfect Data, pp. 120-128. Springer, Cham, 2019.

[21]  Yuan Tian, Zhaohui Che, Guangtao Zhai, Zhiyong Gao, "BAN, A Barcode Accurate Detection Network," in IEEE International Conference on Visual Communications and Image Processing (VCIP), 2018.

[22]  Cong Geng, Li Chen, Xiaoyun Zhang, Peng Zhou, Zhiyong Gao, "A Wavelet-based Learning for Face Hallucination with Loop Architecture," in IEEE International Conference on Visual Communications and Image Processing (VCIP), 2018.

[23]  Wenbo Bao, Xiaoyun Zhang, Li Chen, Zhiyong Gao, "KalmanFlow: Efficient Kalman Filtering for Video Optical Flow," in IEEE International Conference on Image Processing (ICIP), 2018, pp. 3343-3347.

[24]  Chunlei Cai, Li Chen, Lei Zhou, Xiaoyun Zhang, Zhiyong Gao, "Rcdfnn: Robust Change Detection Based on Convolutional Fusion Neural Network," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018, pp. 1912-1916.

[25]  Guo Lu, Wanli Ouyang, Dong Xu, Xiaoyun Zhang, Zhiyong Gao, Ming-ting Sun, "Deep Kalman Filtering Network for Video Compression Artifact Reduction," in European Conference on Computer Vision (ECCV), 2018, pp. 568-584.

[26]  Chunmei Xie, Xiaoyun Zhang, Hua Yang, Li Chen, Zhiyong Gao, "Video Stitching Based on Optical Flow," in IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB), 2018.

[27]  Lin Chen, Hua Yang, Shuang Wu, Zhiyong Gao, "Data generation for improving person re-identification," in ACM on Multimedia Conference (MM), 2017, pp. 609-617.

[28]  Lin Chen, Hua Yang, Ji Zhu, Qin Zhou, Shuang Wu, Zhiyong Gao, "Deep spatial-temporal fusion network for video-based person re-identification," in IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2017, pp. 63-70.

[29]  Wenbo Bao, Xiaoyun Zhang, Shangpeng Yan, Zhiyong Gao, "Iterative convolutional neural network for noisy image super-resolution," in IEEE International Conference on Image Processing (ICIP), 2017, pp. 4038-4042.