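1. School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074
2. National and Local Joint Engineering Research Center for Big Data Technology and System, Key Laboratory of Services Computing Technology and System (Ministry of Education), Huazhong University of Science and Technology, Wuhan, Hubei 430074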
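Weiliang MA (1996- ), male, is a master's student at the School of Computer Science and Technology, Huazhong University of Science and Technology. His main research interest is the optimization of deep learning systems on new architectures.
Xuan PENG (1995- ), male, is a PhD student at the School of Computer Science and Technology, Huazhong University of Science and Technology. His main research interest is distributed deep learning system platforms.
Qian XIONG (1997- ), female, is a master's student at the School of Computer Science and Technology, Huazhong University of Science and Technology. Her main research interest is federated learning.
Xuanhua SHI (1978- ), male, PhD, is a professor at the School of Computer Science and Technology, Huazhong University of Science and Technology, and deputy director of the National and Local Joint Engineering Research Center for Big Data Technology and System. His main research interests are parallel and distributed computing, multi-core architecture, and system software. His current work focuses on cloud computing and big data processing, and heterogeneous parallel computing.
Hai JIN (1966- ), male, PhD, is a professor at Huazhong University of Science and Technology, a Changjiang Scholar Distinguished Professor, a Fellow of the China Computer Federation, an IEEE Fellow, a life member of the ACM, president of the Wuhan Institute for Cybersecurity Strategy and Development, director of the National and Local Joint Engineering Research Center for Big Data Technology and System at Huazhong University of Science and Technology, and director of the Key Laboratory of Services Computing Technology and System (Ministry of Education). His main research interests are computer architecture, computing system virtualization, cluster and cloud computing, network security, peer-to-peer computing, and network storage and parallel I/O.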
Published online: 2020-07
Published in print: 2020-07-15
Weiliang MA, Xuan PENG, Qian XIONG, et al. Memory management in deep learning: a survey[J]. Big Data Research, 2020, 6(4): 2020033-1. DOI: 10.11959/j.issn.2096-0271.2020033.
In recent years, deep learning has achieved great success in many fields. As deep neural networks grow deeper and wider, both training and deploying them face enormous memory pressure. The limited memory capacity of accelerator devices has become an important factor restricting the rapid development of neural network models, and achieving efficient memory management has become a key problem for the development of deep learning. To this end, the basic characteristics of deep neural networks are introduced and the memory bottlenecks in deep learning training are analyzed. Representative research works are then classified, and their advantages and disadvantages are discussed. Finally, future trends in memory management techniques for deep learning are explored.