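Li Song, Research Professor

Address: Room 5-407, SEIEE Building (电信群楼), 800 Dongchuan Road, Shanghai

Phone: +86 21 34204504

Email: song_li@sjtu.edu.cn

Research center: Institute of Image Communication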

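Biography

Li Song is a professor and doctoral supervisor. He is deputy director of the Institute of Image Communication and Network Engineering at Shanghai Jiao Tong University, a jointly appointed professor at the Artificial Intelligence Institute and the Cooperative Medianet Innovation Center, and deputy secretary-general and standards group leader of the China Video User Experience Alliance. His research focuses on intelligent media signal processing and on media communication and computing systems. He has led more than 10 national research projects, published more than 200 academic papers, and holds 40 granted invention patents and 5 software copyrights. His honors include the National Science and Technology Progress Award, Second Prize (2015); the Shanghai Science and Technology Progress Award, First Prize (2011); the Shanghai Technological Invention Award, First Prize (2011); the Okawa Foundation Research Grant, Japan (2013); international conference paper awards (VCIP 2016, WCSP 2010); international competition awards (ICME 2017, 2020); and Tencent Cloud Most Valuable Professional (TVP) (2019, 2020). He serves as a guest editorial board member of IEEE Trans on Broadcasting and an editorial board member of Springer Multidimensional Systems and Signal Processing. He is an IEEE Senior Member, chair of the LiveVideoStackCon Shanghai summit (2019, 2020), and a technical consulting expert for the China Smart Home Industry Alliance, the China Ultra HD Video Industry Alliance, the Shanghai Ultra HD Video Industry Alliance, the Shanghai Internet of Things Association, and the Shanghai Information Appliances Association. He also founded “媒矿工厂”, a well-known public account in the field. More information is available on his lab homepage: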
http://medialab.sjtu.edu.cn/

Research Areas

Video coding, image processing, multimedia systems, computer vision, and artificial intelligence

Research Directions

1.  Next Generation Video Coding

-- Visual Perceptual Coding Optimization

-- AI-Driven Video Coding

-- Low-Latency Elastic Coding

2.  Video Processing

-- Low-level Vision Problem

-- Video Enhancement & Experience Assessment 

-- Computational Camera and Display

3.  Video Generation (Video Making)

-- GAN for Vision Modeling

-- DL for Video Synthesis

-- AI for Video Rendering

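Awards

1. National Science and Technology Progress Award, Second Prize (ranked 5th), 2015

2. Shanghai Science and Technology Progress Award, First Prize (ranked 3rd), 2011

3. Shanghai Technological Invention Award, First Prize (ranked 5th), 2011

4. Okawa Foundation Research Grant, Japan, 2013

5. IEEE VCIP Top 10% Paper Award, 2016

6. WCSP Best Paper Award, 2010

7. IEEE ICME Grand Challenge Award, 2017 and 2020

8. Tencent Cloud TVP; Agora TVP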

Teaching
  • Spring 2011-2017 (SJTU), Graduate, F034606: Visual Computing Theory and Engineering
  • Spring 2013-2017 (SJTU), Undergraduate, EE346: Introduction to Image Processing (Digital Image Processing)
  • Spring 2020 (SJTU), Undergraduate, EE450: Multimedia Communication Systems and Applications

Research Projects

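1. National Key R&D Program of China, project leader, Media Convergence Architecture and Coded Transmission (2019YFB1802701), RMB 11.31 million, 2020.1-2022.12

2. National Natural Science Foundation of China, project leader, Research on Delay-Constrained Parallel Optimization of Video Coding (61671296), RMB 600,000, 2017.1-2020.12

3. Ministry of Education-China Mobile Joint Fund, project co-leader, 5G New Entertainment Digital Experience (MCM20180702), RMB 5.5 million, 2019.9-2022.9

4. National 863 Program (Ministry of Science and Technology), project leader, Research on High-End Video Coding and Network Adaptation Technologies (2012AA011703), RMB 6.34 million, 2012.12-2014.12

5. Industry-sponsored projects: Huawei, Google, SAIC Motor, ZTE, Tencent, China Telecom, China Mobile, ByteDance, and others, more than RMB 6 million, 2018.1-present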

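Academic Service

Guest editorial board member, IEEE Trans on Broadcasting

Editorial board member, Springer Multidimensional Systems and Signal Processing

Session/Review Chair for IEEE ICME, VCIP, ICIP, and other conferences

Chair, LiveVideoStackCon Shanghai summit (2019, 2020)

Technical consulting expert, China Smart Home Industry Alliance

Technical consulting expert, China Ultra HD Video Industry Alliance

Technical consulting expert, Shanghai Ultra HD Video Industry Alliance

Technical consulting expert, Shanghai Internet of Things Association

Technical consulting expert, Shanghai Information Appliances Association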

Publications and Patents

Book Chapter:

1.  Y. Xu, X. Yang, Li Song, L. Traversoni and W. Lu, “QWT: Retrospective and New Applications”, in Geometric Algebra Computing in Engineering and Computer Science, Springer Bayro-Corrochano, Eduardo Jose; Scheuermann, Gerik (Eds.), 2010, XVI, 524 p, ISBN: 978-1-84996-107-3.

2.  J. Hu, Y. Fang, N. Ling, and Li Song, “Cloud-based Large-scale Topic Modeling for Multimedia Analysis and Retrieval”. In Cloud Computing and Digital Media: Fundamentals, Techniques, and Applications, Chapman and Hall/CRC, 2015.

3.  J. Feng, X. Huo, Li Song, X. Yang and W. Zhang, “Image Nonnegative Factorization: Formulation and Numerical Strategies”, In Master Lectures on Mathematics, TSIMF Conference Book, International Press(USA) & Higher Education Press(China), 2014.


Patents: 

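1. Li Song, Yue Ma, Yan Huang, A rate control bit allocation method for video segments, ZL201810418031.7

2. Rong Xie, Rujun Wei, Li Song, An encoding method for improving intra prediction performance, ZL201610726981.7

3. Rong Xie, Lixun Bai, Li Song, Wenjun Zhang, A rate control bit allocation method for high dynamic range video, ZL201711207765.2

4. Li Song, Yan Huang, Xiaokang Yang, Ping An, An adaptive coding parameter adjustment method based on Markov chain Monte Carlo, ZL201810424004.0

5. Li Song, Jianfeng Xie, Rong Xie, A temporal-dependency-based rate control bit allocation method, ZL201510483514.1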

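6. Li Song, Fangshun Mu, Xiaokang Yang, A cascaded method for accelerating HEVC encoding, ZL201410719598.X

7. Li Song, Qi Cai, A parallel task partitioning method for HEVC decoders, ZL201410077419.7

8. Li Song, Yanan Zhao, Jia Wang, A fine-grained parallel prediction method for HEVC based on a first-in-first-out queue, ZL201410005285.8

9. Li Song, Qiang Zhou, Wenjun Zhang, A video coding system based on key-frame super-resolution reconstruction, ZL201010292294.1

10. Li Song, Zhengyi Luo, Shibao Zheng, A video coding system based on adaptive adjustment of prediction residuals, ZL201010203917.3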

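11. Li Song, Zhengyi Luo, Shibao Zheng, Jia Wang, An improved video coding method with prediction residual adjustment, ZL201210462123.8

12. Li Song, Zhengyi Luo, An unequal error protection method for multimedia data based on Raptor codes, ZL201310289535.0

13. Li Song, Zhengyi Luo, Pei Wang, A joint source-channel coding transmission method for multimedia based on rate switching, ZL201110215414.2

14. Li Song, Yutong Zhu, Rong Xie, Wenjun Zhang, A compression bitrate prediction method based on video content and cluster analysis, ZL201610378960.0

15. Li Song, Wei Zhang, Xiaokang Yang, A video quality and compression bitrate estimation method based on spatial and temporal video information, ZL201410029987.X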

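16. Li Song, Cheng Zhao, Rong Xie, A proxy-server-based DASH live streaming method, ZL201610697879.9

17. Rong Xie, Rujun Wei, Li Song, An encoding method for improving intra prediction performance, ZL201610726981.7

18. Li Song, Lixun Bai, Rong Xie, A saliency-based rate control bit allocation method, ZL201610778561.3

19. Li Song, Han Zhang, Xiaokang Yang, A method for constructing a convolutional neural network for fractional-pixel interpolation in video coding, ZL201711207766.7

20. Li Song, Zhifeng Zhang, Rong Xie, Li Chen, A video frame rate up-conversion method and system based on convolutional neural networks, ZL201811059317.7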

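21. Li Song, Zhifeng Zhang, Rong Xie, Li Chen, A video frame rate up-conversion method and system based on recurrent convolutional neural networks, ZL201811059369.4

22. Li Song, Jingwei Xu, Rong Xie, Wenjun Zhang, An unsupervised video segmentation method based on deep learning, ZL201711004135.5

23. Li Song, Xinyuan Chen, Xiaokang Yang, A video denoising method based on deep recurrent neural networks, ZL201610729038.1

24. Li Song, Yankai Liu, Rong Xie, Wenjun Zhang, A content-classification-based human visual correction method for video blur detection, ZL201710874527.0

25. Li Song, Chen Chen, A feature construction method for no-reference image quality assessment, ZL201410029622.7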

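26. Li Song, Jinpeng Lan, Hui Qu, An automatic video jitter detection method, ZL201410318324.X

27. Li Song, Wenjing Tong, Xiaokang Yang, A video shot detection method based on deep learning, ZL201510332345.1

28. Li Song, Hui Qu, A video stabilization method based on sparsity and fidelity constraints, ZL201310046191.0

29. Li Song, Gengjian Xue, Jun Sun, A moving object detection method based on sparsity and smoothness, ZL201310029803.5

30. Li Song, Bing Liu, A document skew correction method based on Haar-like features, ZL201210170270.8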

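31. Li Song, Gengjian Xue, Jun Sun, An object detection method based on a linear regression model, ZL201110223892.8

32. Li Song, Zhenchao Xu, Gengjian Xue, An object recognition method based on AdaBoost training and support vector machines, ZL201110099202.2

33. Li Song, Jia Wang, Yi Xu, You Li, A fully automatic calibration method for master-slave camera systems, ZL201010139976.9

34. Yi Xu, Li Song, Rong Xie, Wenjun Zhang, A low-complexity scale-adaptive video object tracking method, ZL200810036762.1

35. Hongkai Xiong, Li Song, Jun Sun, et al., A multi-dimensional scalable rate control method for network video encoders, ZL200510026390.0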

36. L. Liu, G. Li, N. Ling, J. Zheng, P. Zhang, Li Song, Reference pixel reduction for intra LM prediction, US patent, US9307237 B2, April 5, 2016.

37. Li Song, J. Xu and F. Wu, Adaptive Updates in Motion Compensated Temporal Filtering, US patent, US 8442108, May 14, 2014.


Selected Papers



  • H. Zhang, Li Song, L. Li, Z. Li, X. Yang, Compression Priors Assisted Convolutional Neural Network for Fractional Interpolation, IEEE Transactions on Circuits and Systems for Video Technology, accepted.
  • X. Wang, E. Yang, D. He, Li Song, X. Yu, Rate Distortion Optimization: A Joint Framework and Algorithms for Random Access Hierarchical Video Coding, IEEE Transactions on Image Processing, accepted.
  • Y. Dong, Li Song, R. Xie, W. Zhang, Real-time UHD video super-resolution and transcoding on heterogeneous hardware, Journal of Real-Time Image Processing, 2019, Vol.9 pp 1–17.
  •  X. Chen, Chang Xu, X. Yang, Li Song, and D. Tao, “Gated-GAN: Adversarial Gated Networks for Multi-Collection Style Transfer,” IEEE Trans on Image Processing, vol. 28, no. 2, pp.546-559, 2019.
  • B. Wang, G. Chen, L. Fu, Li Song, X. Wang, “DRIMUX: Dynamic Rumor Influence Minimization with User Experience in Social Networks,” IEEE Transactions on Knowledge and Data Engineering, vol.29 , no.10, pp.2168-2181, 2017.
  • X. Lu, Li Song, R. Xie, X. Yang and W. Zhang, “Deep Binary Representation For Efficient Image Retrieval,” Advances in Multimedia, 2017.
  • Y. Zhang, L. Li, Li Song, R. Xie, W. Zhang, FACT: Fused Attention for Clothing Transfer with Generative Adversarial Networks, 2020 AAAI Conference on Artificial Intelligence, Feb 7-12, 2020, New York, USA.
  • S. Yu, X. Tong, Y. Huang, R. Xie and Li Song, Learning-based Quality Enhancement for Scalable Coded Video over Packet Lossy Networks, 2020 IEEE International Conference on Multimedia and Expo (ICME2020), 6-10 July 2020, London, United Kingdom, Virtual
  • J. Ling, H. Xue, Li Song, Shuhui Yang, Rong Xie, Xiao Gu, Toward Fine-grained Facial Expression Manipulation, 16th European Conference on Computer Vision(ECCV2020), 23-28 August 2020, Online
  • S. Peng, Li Song, J. Ling, R. Xie, et al., A Deep Tracking and Segmentation Approach for Soccer Videos Visual Effects, The 3rd Chinese Conference on Pattern Recognition and Computer Vision (PRCV 2020), Oct. 16-18, 2020, Nanjing, China
  • X. Wang, Z. Luo, P. Li, Li Song, Learning Based Estimation of Video Coding Distortion, IEEE International Symposium on Circuits and Systems (ISCAS2020), virtual event, 11-14 October 2020
  • H. Xue, J. Ling, Li Song, R. Xie, W. Zhang, Realistic Talking Face Synthesis with Geometry-aware Feature Transformation, IEEE International Conference on Image Processing (ICIP2020), Oct.25-28, United Arab Emirates, online event.
  • Z. Yang, Li Song, Rong Xie, Wenjun Zhang, et al., TSGAN: A Two-Stream Generative Adversarial Network for Bit-depth Expansion, IEEE Symposium on Broadband Multimedia Systems and Broadcasting (BMSB 2020), 26-29 October, online
  • Z. Yang, Y. Dong, Li Song, R. Xie, et al., Native Resolution Detection for 4K-UHD Videos, IEEE Symposium on Broadband Multimedia Systems and Broadcasting (BMSB 2020), 26-29 October, online
  • J. Wu, R. Xie, Li Song, B. Liu, Deep Feature Guided Image Retargeting, 2019 IEEE International Conference on Visual Communications and Image Processing (VCIP 2019), Dec.1-4, 2019, Sydney, Australia.
  • Z. Yang, T. He, Li Song, R. Xie, X. Gu, An Improved QoE Evaluation Model for HTTP Adaptive Streaming, 2019 IEEE International Conference on Visual Communications and Image Processing (VCIP 2019), Dec.1-4, 2019, Sydney, Australia.
  • C. Zhu, Li Song, R. Xie, J. Han, Y. Xu, JND-based Perceptual Rate Distortion Optimization for AV1 Encoder, IEEE Picture Coding Symposium(PCS2019) , Ningbo, China, Nov.12-15, 2019.
  • Y. Huang, Li Song, E. Izquierdo, CNN Accelerated Intra Video Coding, Where Is the Upper Bound?, IEEE Picture Coding Symposium(PCS2019), Ningbo, China, Nov.12-15, 2019.
  • Y. Xu, S. Ning, R. Xie, Li Song, GAN Based Multi-exposure Inverse Tone Mapping, IEEE ICIP 2019, Taipei, Taiwan, Sep.22-25, 2019.
  • H. Zhang, L. Li, Li Song, X. Yang, Z. Li, Advanced CNN Based Motion Compensation Fractional Interpolation, IEEE ICIP 2019, Taipei, Taiwan, Sep.22-25, 2019.
  • Y. Xu, Li Song, R. Xie and W. Zhang, Deep Video Inverse Tone Mapping, IEEE International Conference on Multimedia Big Data (BigMM), Singapore, Sep. 11-13, 2019.
  • X. Li, S. Wang, C. Zhu, Li Song, R. Xie and W. Zhang, Viewport Prediction for Panoramic Video with Multi-CNN, IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB), Jeju, South Korea, June 5-7th, 2019.
  • X. Wu, X. Li, X. Tong, R. Xie, Li Song, Reinforcement Learning Based Adaptive Bitrate Algorithm for Transmitting Panoramic Videos, IEEE International Symposium on Circuits and Systems(ISCAS), Sapporo, Hokkaido, Japan, May 26-29, 2019.
  • Z. Luo, Y. Huang, X. Wang, R. Xie and Li Song, VMAF Oriented Perceptual Optimization for Video Coding, IEEE International Symposium on Circuits and Systems(ISCAS), Sapporo, Hokkaido, Japan, May 26-29, 2019.
  • Z. Ma, S. Yu, Y. Huang, R. Xie, Li Song, An improved Real-Time Video Communication System, IEEE International Conference on Visual Communications and Image Processing (VCIP), Taiwan, China, Dec 12-17, 2018
  • Y. Luo, X. Liu, C. Zhu, R. Xie, Li Song, Rate-mixed HEVC Tile based 360 Video Streaming System, IEEE International Conference on Visual Communications and Image Processing (VCIP), Taiwan, China, Dec 12-17, 2018
  • Z. Zhang, Li Song, R. Xie, L. Chen, Video Frame Interpolation Using Recurrent Convolutional Layers, IEEE International Conference on Multimedia Big Data (BigMM), Sep.13-16, 2018, Xian, China.
  • Z. Zhang, L. Chen, R. Xie, Li Song, Frame Interpolation via Refined Deep Voxel Flow, IEEE International Conference on Image Processing (ICIP), October 7-10, 2018, Athens, Greece.
  • J. Tang, Z. Luo, R. Xie, Y. Huang, Li Song, GPU Based Motion-Compensated Frame Interpolation Acceleration for Future Video Coding, IEEE International Conference on Image Processing(ICIP), October 7-10, 2018, Athens, Greece.
  • Y. Ma, Y. Huang, R. Xie, Li Song, A Segment Constraint ABR Algorithm for HEVC Encoder, The 13th IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB), June 6-8, 2018, Valencia, Spain.
  • Y. Dong, Li Song, R. Xie, W. Zhang, A Generic Distributed Scheduling Algorithm for Frame Rate Up Convert Video Transcoding, The 13th IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB), June 6-8, 2018, Valencia, Spain.
  • Y. Huang, Li Song, R. Xie, Z. Luo, X. Wang, An MCMC Based Efficient Parameter Selection Model for x265 Encoder, IEEE International Symposium on Circuits and Systems(ISCAS) , Florence, Italy, May 27-30, 2018
  • H. Wang, Li Song, R. Xie, Z. Luo, X. Wang, Masking Effects Based Rate Control Scheme for High Efficiency Video Coding, IEEE International Symposium on Circuits and Systems(ISCAS) , Florence, Italy, May 27-30, 2018
  • S. Ning, H. Xu, Li Song, R. Xie, W. Zhang, Learning an Inverse Tone Mapping Network with a Generative Adversarial Regularizer, IEEE International Conference on Acoustics, Speech and Signal Processing(ICASSP), April 15-20, 2018, Calgary, Alberta, Canada.
  • Y. Dong, X. Zhang, Y. Zhao, Li Song, A Containerized Media Cloud for Video Transcoding Service, IEEE International Conference on Consumer Electronics (ICCE), Las Vegas, USA, Jan.12-14, 2018.
  • B. Li, Li Song, R. Xie, W. Zhang, Weight-Based Bit Allocation Scheme for VR Videos in HEVC, IEEE International Conference on Visual Communications and Image Processing (VCIP), St. Petersburg, Florida, US, Dec 10-13, 2017
  • L. Bai, Li Song, R. Xie, L. Zhang, Z. Luo, Rate Control Model for High Dynamic Range Video, IEEE International Conference on Visual Communications and Image Processing (VCIP), St. Petersburg, Florida, US, Dec 10-13, 2017
  • H. Zhang, Li Song, Z. Luo, X. Yang, Learning a Convolutional Neural Network for Fractional Interpolation in HEVC Inter Coding, International Conference on Visual Communications and Image Processing (VCIP), St. Petersburg, Florida, US, Dec 10-13, 2017
  • C. Li, Li Song, R. Xie, W. Zhang, CNN Based Post-Processing to Improve HEVC,  IEEE International Conference on Image Processing(ICIP), Beijing, China, Sep.17-20, 2017.
  • X. Wang, Li Song, Z. Luo, R. Xie, Lagrangian Method Based Rate-Distortion Optimization Revisited For Dependent Video Coding, IEEE International Conference on Image Processing(ICIP), Beijing, China, Sep.17-20, 2017
  • Y. Liu, Li Song, X. Yang, R. Xie, W. Zhang, Review of recent ITU-T Parametric Models for Compressed Video Quality Estimation, Asia-Pacific Signal and Information Processing Association (APSIPA) ASC, Jeju, Korea, Dec.13-16, 2016.
  • J. Xie, Li Song, R. Xie, Z. Luo, M. Chen, A novel parallel friendly rate control scheme for HEVC, Asia-Pacific Signal and Information Processing Association (APSIPA) ASC, Jeju, Korea, Dec.13-16, 2016.
  • H. Zhang, Li Song, X. Yang, Z. Luo, Evaluation of Beyond-HEVC Entropy Coding Methods for DCT Transform Coefficients, IEEE International Conference on Visual Communications and Image Processing (VCIP), Chengdu, China, Nov.27-30, 2016.
  • J. Xu, Li Song, R. Xie, Shot Boundary Detection Using Convolutional Neural Networks, IEEE International Conference on Visual Communications and Image Processing (VCIP), Chengdu, China, Nov.27-30, 2016.
  • M. Wang, Li Song, X. Yang, C. Luo, A Parallel Fusion RNN-LSTM Architecture for Image Caption Generation, IEEE International Conference on Image Processing(ICIP), Phoenix, Arizona, USA, Sep.25-28,2016
  • L. Song, Y. Liu, X. Yang, G. Zhai, R. Xie, W. Zhang, The SJTU HDR Video Sequence Dataset, International Conference on Quality of Multimedia Experience (QoMEX2016), June 6-8, 2016, Lisbon, Portugal
  • X. Chen, Li Song, X. Yang, Deep recurrent neural networks for video denoising, Applications of Digital Image Processing, SPIE Conference, Aug.28-Sep.1, 2016, San Diego, CA, USA.