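[ "Zhengding HU (1997- ), male, master's student in the Department of Computer Science and Technology, Tsinghua University; main research interest: high-performance computing" ]
[ "Wei XUE (1974- ), male, Ph.D., associate professor in the Department of Computer Science and Technology, Tsinghua University, director of the Institute of High Performance Computing, senior member of the China Computer Federation; main research interests: large-scale scientific computing and uncertainty quantification" ]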
Published online: 2020-07
Published in print: 2020-07-15
Zhengding HU, Wei XUE. Research on performance optimization for large-scale sparse computation over many-core heterogenous supercomputer[J]. Big Data Research, 2020, 6(4): 2020032-1. DOI: 10.11959/j.issn.2096-0271.2020032.
Advances in supercomputing have made it possible to solve extremely large-scale sparse problems in big data applications. However, the irregular computation and memory access patterns of sparse problems make such applications difficult to implement and optimize. Heterogeneous many-core architectures are common in supercomputer design, and they place high demands on application developers; exploiting their computing power remains a hard problem. This paper analyzes the challenges of optimizing sparse computation and presents three case studies of the design and performance optimization of large-scale sparse applications on typical heterogeneous many-core systems, all of which achieved very high performance. The experience gained from these cases is summarized as a reference for solving large-scale sparse computing problems on the next generation of heterogeneous many-core systems.