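1. School of Computer Science, Northwestern Polytechnical University, Xi'an, Shaanxi 710129
2. School of Software, Northwestern Polytechnical University, Xi'an, Shaanxi 710129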
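Wendi CHENG (程文迪, 1995- ), female, is a PhD candidate at the School of Computer Science, Northwestern Polytechnical University. Her main research interest is the design and optimization of distributed storage systems.
Xiao ZHANG (张晓, 1978- ), male, is a professor at the School of Computer Science, Northwestern Polytechnical University. His main research interests are the design, evaluation, simulation, and optimization of distributed storage systems.
Zhaohui PAN (潘兆辉, 2000- ), male, is a master's student at the School of Software, Northwestern Polytechnical University. His main research interest is the optimization of distributed storage systems.
Youjun ZHAO (赵友军, 1998- ), male, is a master's student at the School of Software, Northwestern Polytechnical University. His main research interests are big data storage and distributed file system design.
Chenguang SUN (孙晨光, 1999- ), male, is a master's student at the School of Computer Science, Northwestern Polytechnical University. His main research interest is distributed storage system design.
Xueqiang SHAN (单学强, 1999- ), male, is a master's student at the School of Computer Science, Northwestern Polytechnical University. His main research interest is distributed storage system design.
Yuzhan JIN (金雨展, 2000- ), female, is a master's student at the School of Computer Science, Northwestern Polytechnical University. Her main research interests are cloud computing resource scheduling and cloud resource prediction.
Xiaonan ZHAO (赵晓南, 1979- ), female, PhD, is an associate professor at the School of Computer Science, Northwestern Polytechnical University. Her main research interests are distributed file system management and optimization, storage system resource management and intelligent configuration, and performance modeling and optimization of cloud storage systems.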
Published online: 2024-07
Published in print: 2024-07-15
Wendi CHENG, Xiao ZHANG, Zhaohui PAN, et al. Research on key technologies for efficient storage and access of turbulent big data[J]. Big Data Research, 2024, 10(4): 3-20. DOI: 10.11959/j.issn.2096-0271.2024046.
With the development of measurement techniques and numerical simulation technologies, data-driven turbulence research has become a new approach in the field. China has established several wind tunnel laboratories and supercomputing centers for turbulence simulation, and this work has accumulated a large volume of turbulence data. However, there is no centralized turbulence data management platform in China, so experimental and simulation data that were costly to produce are difficult to exchange and share. Turbulence data are large in volume, high in dimensionality, high in precision, and multi-source and heterogeneous, which makes data integration difficult, data access inefficient, and storage efficiency low. This paper presents TDFS, a distributed storage system for turbulence big data that targets typical flow problems in aviation, aerospace, and marine applications. Based on the access characteristics of turbulence big data, a new metadata organization scheme and new data access interfaces were designed in TDFS. Experimental results show that TDFS improves interface response speed by 54.38% and 57.7% compared with HDFS and GlusterFS, respectively. In addition, to reduce the storage overhead of turbulence big data, a lazy (delayed) replica compression mechanism based on HDF5 was designed, which saves 34% of storage space compared with the original replica storage approach.
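The abstract does not describe how the lazy replica compression mechanism is implemented. As a rough illustration of the general idea only, the sketch below keeps every replica uncompressed at write time and compresses the secondary replicas in a later, deferred step. It is not the TDFS implementation: the class and method names are hypothetical, the replica count of three is an assumption, and `zlib` from the Python standard library stands in for HDF5 compression filters.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Replica:
    """One stored copy of a block (hypothetical structure, not from TDFS)."""
    payload: bytes
    compressed: bool = False


@dataclass
class LazyReplicatedBlock:
    """A data block with several replicas; replica 0 always stays raw."""
    replicas: list

    @classmethod
    def write(cls, data: bytes, n_replicas: int = 3) -> "LazyReplicatedBlock":
        # Write path: store every replica uncompressed so that the write
        # does not pay the compression cost up front.
        return cls([Replica(data) for _ in range(n_replicas)])

    def compress_secondary(self, level: int = 6) -> None:
        # Deferred step, for example run by a background task at a later time.
        for replica in self.replicas[1:]:
            if not replica.compressed:
                replica.payload = zlib.compress(replica.payload, level)
                replica.compressed = True

    def read(self, index: int = 0) -> bytes:
        # Reads from a compressed replica decompress transparently.
        replica = self.replicas[index]
        if replica.compressed:
            return zlib.decompress(replica.payload)
        return replica.payload

    def stored_bytes(self) -> int:
        return sum(len(r.payload) for r in self.replicas)


# Synthetic, highly repetitive sample data (not real turbulence data).
sample = bytes(range(256)) * 256
block = LazyReplicatedBlock.write(sample)
bytes_before = block.stored_bytes()
block.compress_secondary()
bytes_after = block.stored_bytes()
```

In this sketch the primary replica serves reads without any decompression cost, while the secondary replicas trade read latency for space. The actual saving depends entirely on how compressible the data is, so the 34% figure reported in the abstract cannot be reproduced or inferred from this toy example.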