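Zhenlong DENG (1995- ), male, is a master's student at the School of Computer Science, Sun Yat-sen University. His main research interest is distributed storage.
Zhiguang CHEN (1984- ), male, Ph.D., is an associate professor at the School of Computer Science, Sun Yat-sen University. His main research interests are big data storage and processing, parallel and distributed computing, and high-performance computing and supercomputers.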
Published online: 2021-03
Print publication: 2021-03-15
Zhenlong DENG, Zhiguang CHEN. An optimization of MPI-IO interface for non-volatile memory[J]. Big Data Research, 2021, 7(2): 2021020. DOI: 10.11959/j.issn.2096-0271.2021020.
In a high-performance computing (HPC) environment, when multiple compute nodes of an MPI application access files in the underlying storage system simultaneously, the I/O overhead depends on the access pattern and the performance of the external storage devices. Based on the file access characteristics of MPI applications, an MPI-IO interface optimization for non-volatile memory is proposed that exploits the high bandwidth, low latency, byte addressability, and data persistence of non-volatile memory. The scheme builds a distributed cache for file data, maintains persistent metadata, and optimizes the data transfer strategy among processes, so that applications can manage and use non-volatile memory devices effectively while the cached data stays consistent and valid. Experimental results show that the proposed system improves application read/write performance by tens of times. Future work will further optimize the parallelism of the scheme.
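The idea of a file-data cache whose metadata is persistent can be illustrated with a small single-process sketch. This is not the authors' implementation: it assumes a memory-mapped file as a stand-in for non-volatile memory, a direct-mapped layout, and hypothetical names (`PersistentBlockCache`, `BLOCK`, `SLOTS`), and it leaves out the MPI-level distribution across nodes entirely. It only shows one plausible write ordering (invalidate metadata, write data, then mark valid) that lets cached blocks be recognized as valid after a restart.

```python
import mmap
import os
import struct

BLOCK = 4096                  # cache block size in bytes (assumed value)
SLOTS = 4                     # number of cache slots (assumed value)
META = struct.Struct("<qB")   # per-slot metadata: file block number, valid flag
META_STRIDE = 64              # one 64-byte metadata record per slot
META_AREA = META_STRIDE * SLOTS


class PersistentBlockCache:
    """Toy direct-mapped block cache kept in a memory-mapped file.

    The mapped file is only a stand-in for an NVM region.
    """

    def __init__(self, path):
        size = META_AREA + BLOCK * SLOTS
        self.fd = os.open(path, os.O_RDWR | os.O_CREAT)
        if os.fstat(self.fd).st_size < size:
            os.ftruncate(self.fd, size)   # new region is zero-filled: all slots invalid
        self.mem = mmap.mmap(self.fd, size)

    def _slot(self, block_no):
        return block_no % SLOTS

    def put(self, block_no, data):
        assert len(data) == BLOCK
        slot = self._slot(block_no)
        start = META_AREA + slot * BLOCK
        # 1. Invalidate the slot so a crash mid-write never exposes torn data.
        META.pack_into(self.mem, slot * META_STRIDE, block_no, 0)
        self.mem.flush()
        # 2. Write the block data and make it durable.
        self.mem[start:start + BLOCK] = data
        self.mem.flush()
        # 3. Only now mark the slot valid.
        META.pack_into(self.mem, slot * META_STRIDE, block_no, 1)
        self.mem.flush()

    def get(self, block_no):
        slot = self._slot(block_no)
        stored, valid = META.unpack_from(self.mem, slot * META_STRIDE)
        if not valid or stored != block_no:
            return None                   # cache miss
        start = META_AREA + slot * BLOCK
        return self.mem[start:start + BLOCK]

    def close(self):
        self.mem.flush()
        self.mem.close()
        os.close(self.fd)
```

Reopening the same path with a new `PersistentBlockCache` object reads the metadata records back, so blocks written before `close()` are served again without touching the slower backing file. On real persistent memory the `flush()` calls would be replaced by the platform's cache-line flush and fence primitives.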