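1. Smart City College, Beijing Union University, Beijing 100101
2. School of Computer Science and Engineering, Beihang University, Beijing 100191
3. State Key Laboratory of Software Development Environment, Beijing 100191
4. Department of Computer Science and Technology, Tsinghua University, Beijing 100084
5. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190
6. University of Chinese Academy of Sciences, Beijing 100190
7. School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, Guangdong 510006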
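QIN Guangjun (1977- ), male, Ph.D., is a lecturer at the Smart City College, Beijing Union University, and a member of the China Computer Federation (CCF). His main research interests include high-performance computing, storage systems, big data, and machine learning. He has participated as a key member in projects of the National 863 Program, the National Key R&D Program, the National Natural Science Foundation of China, and the Beijing Natural Science Foundation.
XIAO Limin (1970- ), male, Ph.D., is a professor and doctoral supervisor at the School of Computer Science and Engineering, Beihang University, head of the Department of Computer Science and Technology, and deputy director of the Institute of Computer Architecture. He is a member of the CCF Big Data Expert Committee, a standing member of the CCF Technical Committee on High Performance Computing, a member of the CCF Technical Committee on Fault-Tolerant Computing, a member of the Cloud Computing Expert Committee of the Chinese Institute of Electronics, a member of the national committee for the approval of computer science and technology terms, a member of the expert group of the National Science and Technology Infrastructure Platform, a member of the Electronic Science and Technology Committee of the Ministry of Industry and Information Technology, and a specially appointed expert on the expert committee of the Chinese Academy of Engineering's research center for the development strategy of information and electronic engineering science and technology. His main research interests include computer architecture, computer software systems, high-performance computing, cloud computing, and virtualization technology. He has received national and provincial or ministerial science and technology awards, including the National Science and Technology Progress Award (Second Class), the Beijing Science and Technology Award (First Class), the Chinese Academy of Sciences Science and Technology Progress Award (First Class), the Major Technological Invention Award for the Information Industry from the former Ministry of Information Industry, and the National Key New Product Award from the Ministry of Science and Technology.
ZHANG Guangyan (1976- ), male, Ph.D., is a tenured associate professor and doctoral supervisor at the Department of Computer Science and Technology, Tsinghua University. His research focuses on the theory and methods of big data storage and analysis, including big data computing, storage systems, and distributed processing. His work has been supported by the National Science Fund for Distinguished Young Scholars, the National Key R&D Program, the National 973 Program, and the National 863 Program. In recent years he has proposed methods and key technologies for building and accessing large-scale storage systems, which effectively improve the performance, scalability, and availability of storage systems. He has published more than 40 academic papers, over 20 of them in high-level international conferences and journals in the computer systems field, such as FAST, USENIX ATC, ACM TOS, IEEE TC, and IEEE TPDS. In the past five years, as first inventor, he has been granted 1 US invention patent and 7 Chinese invention patents.
NIU Beifang (1978- ), male, Ph.D., is a research professor at the Computer Network Information Center, Chinese Academy of Sciences, and a professor and doctoral supervisor at the University of Chinese Academy of Sciences. He is a member of the CCF Technical Committee on High Performance Computing. His main research interests are high-performance computing, data analysis algorithms, and software technology.
CHEN Zhiguang (1984- ), male, Ph.D., is an associate professor at the School of Computer Science and Engineering, Sun Yat-sen University. His main research interests are big data storage and processing, parallel and distributed computing, and high-performance computing and supercomputers.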
Published online: 2021-03
Published in print: 2021-03-15
Guangjun QIN, Limin XIAO, Guangyan ZHANG, et al. Virtual data space system for national high-performance computing environment[J]. Big Data Research, 2021, 7(2): 2021016. DOI: 10.11959/j.issn.2096-0271.2021016.
The high-performance computing (HPC) environment is a core information infrastructure supporting national scientific and technological innovation, economic development, and national defense construction. Leading HPC nations have been building wide-area HPC environments based on the resources of multiple supercomputing centers. However, the resources in such environments are of many kinds and widely distributed geographically, so their aggregate effect cannot be exploited effectively, and it is difficult to meet the requirements of large-scale applications for unified management of, and efficient access to, wide-area distributed data. To this end, a complete set of technologies for building a wide-area global virtual data space is proposed, including a virtual data space model, cross-domain virtual data space construction, efficient data migration in wide-area environments, co-scheduling of storage and computing in wide-area environments, and cross-domain high-concurrency data aggregation processing. Based on these technologies, a virtual data space system that runs in the national high-performance computing environment (NHPCE) has been developed. It effectively supports unified and efficient access to wide-area, dispersed, heterogeneous storage resources, and enables cross-domain sharing and cooperative processing of data distributed across the wide-area environment. The system has been experimentally deployed in the NHPCE and verified with three typical large-scale applications: molecular docking, genome-wide association study, and weather forecasting models. The verification results show that the developed virtual data space construction method and system can effectively aggregate wide-area dispersed storage resources and meet the data space requirements of large-scale applications.