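Qingyin LIN (1999- ), female, is a master's student at the School of Computer Science and Engineering, Sun Yat-sen University. Her main research interest is storage systems.
Zhiguang CHEN (1984- ), male, Ph.D., is an associate professor at the School of Computer Science and Engineering, Sun Yat-sen University. His main research interests are big data storage and processing, parallel and distributed computing, and high-performance computing and supercomputers.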
Published online: 2023-01
Print publication: 2023-01-15
Qingyin LIN, Zhiguang CHEN. A hot-update-aware optimization to the query of LSM-Tree[J]. Big Data Research, 2023, 9(1): 126-140. DOI: 10.11959/j.issn.2096-0271.2022049.
Key-value stores based on the LSM-Tree are widely used. The LSM-Tree achieves very high write performance by buffering updates in memory and then flushing them to disk in batches. However, in LSM-Tree-based key-value stores, the old versions of updated key-value pairs are not removed from the storage system immediately. A large amount of invalid data therefore accumulates across the store and eventually degrades read performance significantly. To address this problem, a more aggressive compaction method is proposed. It records the update history of key-value pairs to identify update hotspots, locates the SSTables in which invalid data is heavily concentrated, and compacts them as early as possible to clear the invalid data, which mitigates write amplification and improves read performance. Experiments show that the method reduces LevelDB's average read latency by 65.2%, its 99th-percentile read tail latency by 69.4%, and its write amplification by 71.4%.
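The idea in the abstract, tracking updates and compacting the SSTables that hold the most invalid data, can be illustrated with a small sketch. This is not the authors' implementation: the `SSTable` and `UpdateTracker` classes, the `pick_compaction_candidates` function, and the 0.5 threshold are all hypothetical names and values chosen for illustration, and an in-memory dictionary stands in for on-disk tables.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class SSTable:
    """Hypothetical stand-in for an on-disk table: key -> sequence number."""
    name: str
    entries: dict = field(default_factory=dict)


class UpdateTracker:
    """Records the latest sequence number and the overwrite count per key."""

    def __init__(self):
        self.latest_seq = {}
        self.update_count = defaultdict(int)

    def record_write(self, key, seq):
        if key in self.latest_seq:
            self.update_count[key] += 1  # an overwrite, not a first insert
        self.latest_seq[key] = seq

    def hot_keys(self, min_updates):
        """Keys overwritten at least `min_updates` times (assumed hotness rule)."""
        return {k for k, n in self.update_count.items() if n >= min_updates}

    def invalid_ratio(self, table):
        """Fraction of entries in `table` superseded by a newer write."""
        if not table.entries:
            return 0.0
        stale = sum(1 for k, seq in table.entries.items()
                    if self.latest_seq.get(k, seq) > seq)
        return stale / len(table.entries)


def pick_compaction_candidates(tables, tracker, threshold=0.5):
    """Return tables whose invalid ratio meets the threshold, worst first."""
    scored = [(tracker.invalid_ratio(t), t) for t in tables]
    hits = [(ratio, t) for ratio, t in scored if ratio >= threshold]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [t for _, t in hits]
```

In this sketch a table becomes a compaction candidate once at least half of its entries have been overwritten elsewhere. A real system would need to bound the memory used for update history and weigh the I/O cost of each early compaction, which the sketch does not model.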