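Tao CAI (蔡涛, 1976- ), male, Ph.D., is an associate professor at the School of Computer Science and Communication Engineering, Jiangsu University. His main research interests are network storage systems and NVM.
Tianle LEI (雷天乐, 1999- ), male, is a master's student at the School of Computer Science and Communication Engineering, Jiangsu University. His main research interests are storage systems and NVM.
Dejiao NIU (牛德姣, 1978- ), female, Ph.D., is an associate professor at the School of Computer Science and Communication Engineering, Jiangsu University. Her main research interests are storage systems and neural networks.
Jianfei DAI (戴健飞, 1999- ), male, is a master's student at the School of Computer Science and Communication Engineering, Jiangsu University. His main research interests are storage systems and NVM.
Zeyu HUANG (黄泽宇, 1998- ), male, is a master's student at the School of Computer Science and Communication Engineering, Jiangsu University. His main research interests are storage systems and NVM.
Qiangqiang NI (倪强强, 1998- ), male, is a master's student at the School of Computer Science and Communication Engineering, Jiangsu University. His main research interests are storage systems and NVM.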
Published online: 2024-07
Print publication: 2024-07-15
Tao CAI, Tianle LEI, Dejiao NIU, et al. A polymorphic cooperative compression strategy for IoT time series data based on NVM[J]. Big Data Research, 2024, 10(4): 34-50. DOI: 10.11959/j.issn.2096-0271.2024048.
The compression strategy is an important factor in the performance of IoT time series data storage systems, but existing compression strategies lack optimizations tailored to the characteristics of NVM and IoT time series data. This paper therefore proposes a polymorphic cooperative compression strategy for IoT time series data on NVM. First, the organizational structure of IoT time series data is given. Then, because IoT time series data remain relatively stable over a period of time and because the suitable granularity for reading and writing NVM differs greatly between user space and kernel space, a layered compression strategy is designed. When data are received in user space, a lightweight compression algorithm reduces the volume of data to be stored while limiting the impact on the storage efficiency of IoT time series data. Because IoT systems mainly query and analyze anomalous time series data, a deep compression algorithm is designed to deeply compress historical IoT time series data in kernel space. Next, to resolve the contention for NVM bandwidth between deep compression of historical data and storage of newly received data, a dynamic adjustment algorithm that guarantees write bandwidth is proposed. Finally, a prototype of the strategy, PCCTSMS, is built and evaluated with the YCSB-TS tool. Experimental results show that, compared with InfluxDB, OpenTSDB, KairosDB and TVStore, PCCTSMS improves write throughput by up to 161.3% and reduces storage space by up to 14.6%.
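The abstract does not specify the compression algorithms themselves, so the following is only a minimal sketch of the two-tier idea under stated assumptions. It is not the PCCTSMS implementation. The lightweight tier is assumed here to be delta encoding followed by run-length encoding, which benefits from values that stay stable over a period of time, and the deep tier is assumed to be a general-purpose zlib pass over historical blocks. All function names (`delta_rle_encode`, `delta_rle_decode`, `deep_compress`, `deep_decompress`) are hypothetical.

```python
import struct
import zlib
from typing import List, Tuple

Run = Tuple[int, int]  # (delta, run_length)


def delta_rle_encode(values: List[int]) -> List[Run]:
    """Lightweight tier (assumed): delta-encode, then run-length encode the deltas.

    The first delta is taken relative to 0, so it carries the first value.
    """
    runs: List[Run] = []
    prev = 0
    for v in values:
        d = v - prev
        prev = v
        if runs and runs[-1][0] == d:
            # Same delta as the previous sample: extend the current run.
            runs[-1] = (d, runs[-1][1] + 1)
        else:
            runs.append((d, 1))
    return runs


def delta_rle_decode(runs: List[Run]) -> List[int]:
    """Invert delta_rle_encode by replaying each delta run_length times."""
    out: List[int] = []
    cur = 0
    for d, n in runs:
        for _ in range(n):
            cur += d
            out.append(cur)
    return out


def deep_compress(runs: List[Run]) -> bytes:
    """Deep tier (assumed): serialize the runs and apply zlib at its highest level."""
    raw = b"".join(struct.pack("<qI", d, n) for d, n in runs)
    return zlib.compress(raw, 9)


def deep_decompress(blob: bytes) -> List[Run]:
    """Invert deep_compress; each record is 12 bytes (int64 delta, uint32 run length)."""
    raw = zlib.decompress(blob)
    return [struct.unpack_from("<qI", raw, off) for off in range(0, len(raw), 12)]
```

In this sketch a stable sensor reading such as `[20, 20, 20, 20, 21, 21, 21, 20]` collapses to five runs on ingest, and the zlib pass would be applied later to older blocks, which mirrors the split between cheap compression on the write path and heavier compression of historical data described in the abstract.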