[ "严赵峰(1993- ),男,复旦大学并行处理研究所硕士生,主要研究方向为异构计算。" ]
[ "张为华(1974- ),男,博士,复旦大学并行处理研究所教授,主要研究方向为编译器、系统软件、并行处理等。" ]
网络首发:2019-07,
纸质出版:2019-07-15
移动端阅览
严赵峰, 张为华. 面向大数据的索引结构研究进展[J]. 大数据, 2019,5(4):3-15.
Zhaofeng YAN, Weihua ZHANG. A survey of index structure in big data era[J]. Big Data Research, 2019, 5(4): 3-15.
严赵峰, 张为华. 面向大数据的索引结构研究进展[J]. 大数据, 2019,5(4):3-15. DOI: 10.11959/j.issn.2096-0271.2019028.
Zhaofeng YAN, Weihua ZHANG. A survey of index structure in big data era[J]. Big Data Research, 2019, 5(4): 3-15. DOI: 10.11959/j.issn.2096-0271.2019028.
为了保证各类大数据系统的性能,以并发索引结构为核心的高效检索技术变得越来越重要。然而,数据存储体量的增加和用户对性能要求的提高使得并发索引结构面临诸多挑战,设计高效且易用的并发控制策略和提升索引结构操作性能成为系统领域关心的重要问题。结合现有的研究成果,综合分析了大数据时代并发索引结构的研究进展。首先讨论了关于设计更优化且易用的并发控制策略的研究现状,接着针对各类新型高性能硬件,探讨了基于新硬件结构特点的优化加速研究成果,并对可能的发展方向进行了展望。
With the advent of the big data era
the volume of data storage has exploded.In order to ensure the performance of system
the concurrent index structure becomes more and more important.However
the increase of storage volume and the increase of user performance requirements have made the index structure face many challenges.Designing an efficient and easy-to-use index structure has become an important issue in the system field.Based on current research status
firstly
the research status of designing more optimized and easy-to-use concurrency control strategies were discussed
and then the researches based on the new hardware structure features and possible directions were discussed.
GRAEFE G , KUNO H . Modern B-tree techniques [J ] . Foundations & Trends in Databases , 2011 , 3 ( 4 ): 203 - 402 .
SHAHVARANI A , JACOBSEN H A . A hybrid B+-tree as solution for in-memory indexing on CPU-GPU heterogeneous computing platforms [C ] // The 2016 International Conference on Management of Data,June 26-July 1,2016,San Francisco,USA . New York:ACM Press , 2016 : 1523 - 1538 .
VAISMAN A , ZIMÁNYI E . Data warehouses:next challenges [J ] . Lecture Notes in Business Information Processing , 2011 , 96 : 1 - 26 .
VASSILIADIS P , SIMITSIS A . Near real time ETL [M ] // New trends in data warehousing and data analysis . Boston:Springer , 2009 : 1 - 31 .
BAYER R , SCHKOLNICK M . Concurrency of operations on B-trees [J ] . Acta Informatica , 1977 , 9 ( 1 ): 1 - 21 .
LEHMAN P L . Efficient locking for concurrent operations on B-trees [J ] . ACM Transactions on Database Systems (TODS) , 1981 , 6 ( 4 ): 650 - 670 .
M AO Y , KOHLER E , MORRIS R T . Cache craftiness for fast multicore key-value storage [C ] // The 7th ACM European Conference on Computer Systems,April 10-13,2012,Bern,Switzerland . New York:ACM Press , 2012 : 183 - 196 .
CHA S K , HWANG S , KIM K , et al . Cache-conscious concurrency control of main-memory indexes on shared-memory multiprocessor systems [C ] // The 27th International Conference on Very Large Data Bases,September 11-14,2001,Roma,Italy.San Francisco:Morgan Kaufmann Publishers Inc . , 2001 : 181 - 190 .
LEVANDOSKI J J , LOMET D B , SENGUPTA S . The Bw-Tree:a B-tree for new hardware platforms [C ] // 2013 IEEE 29th International Conference on Data Engineering (ICDE),April 8-12,2013,Brisbane,Australia . Piscataway:IEEE Press , 2013 : 302 - 313 .
SEWALL J , CHHUGANI J , KIM C , et al . PALM:parallel architecture-friendly latchfree modifications to B+ trees on manycore processors [J ] . Proceedings of the VLDB Endowment , 2011 , 4 ( 11 ): 795 - 806 .
CHEN Y , WEI X , SHI J , et al . Fast and general distributed transactions using RDMA and HTM [C ] // The 11th European Conference on Computer Systems,April 18-21,2016,London,UK . New York:ACM Press , 2016 .
WANG X , ZHANG W , WANG Z , et al . Eunomia:scaling concurrent search trees under contention using HTM [C ] // The 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming,February 4-8,2017,Austin,USA . New York:ACM Press , 2017 : 385 - 399 .
ZHANG W , WANG X , JI S , et al . Scaling concurrent index structures under contention using HTM [J ] . IEEE Transactions on Parallel and Distributed Systems , 2018 , 29 ( 8 ): 1837 - 1850 .
WANG Z , QIAN H , CHEN H , et al . Opportunities and pitfalls of multicore scaling using hardware transaction memory [C ] // The 4th Asia-Pacific Workshop on Systems,July 29-30,2013,Singapore . New York:ACM Press , 2013 .
WANG Z , QIAN H , LI J , et al . Using restricted transactional memory to build a scalable in-memory database [C ] // The 9th European Conference on Computer Systems,April 14-16,2014,Amsterdam,the Netherlands . New York:ACM Press , 2014 .
CHEN S , PENG L . Efficient GPU hardware transactional memory through early conflict resolution [C ] // 2016 IEEE International Symposium on High Performance Computer Architecture(HPCA),March 12-16,2016,Barcelona,Spain . Piscataway:IEEE Press , 2016 : 274 - 284 .
HERLIHY M , MOSS J E B . Transactional memory:architectural support for lockfree data structures [M ] . New York : ACM PressPress , 1993 .
XU Y , WANG R , GOSWAMI N , et al . Software transactional memory for GPU architectures [C ] // Annual IEEE/ACM International Symposium on Code Generation and Optimization,February 15-19,2014,Orlando,USA . New York:ACM Press , 2014 .
POLLARI-MALMI K , SOISALONSOININEN E , YLONEN T . Concurrency control in B-trees with batch updates [J ] . IEEE Transactions on Knowledge and Data Engineering , 1996 , 8 ( 6 ): 975 - 984 .
SHNEIDERMAN B . Batched searching of sequential and tree structured files [J ] . ACM Transactions on Database Systems (TODS) , 1976 , 1 ( 3 ): 268 - 275 .
ARGE L , HINRICHS K H , VAHRENHOLD J , et al . Efficient bulk operations on dynamic R-trees [J ] . Algorithmica , 2002 , 33 ( 1 ): 104 - 128 .
GRAEFE G . B-tree indexes for high update rates [J ] . ACM Sigmod Record , 2006 , 35 ( 1 ): 39 - 44 .
YAN Z , LIN Y , PENG L , et al . Harmonia:a high throughput B+ tree for GPUs [C ] // The 24th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming(PPoPP),February 16-20,2019,Washington,DC,USA . New York:ACM Press , 2019 : 133 - 144 .
AWAD M A , ASHKIANI S , JOHNSON R , et al . Engineering a high-performance GPU B-Tree [C ] // The 24th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming(PPoPP),February 16-20,2019,Washington,DC,USA . New York:ACM Press , 2019 .
RAO J , ROSS K A . Cache conscious indexing for decision-support in main memory [C ] // The 25th International Conference on Very Large Data Bases,September 7-10,1999,Edinburgh,Scotland.San Francisco:Morgan Kaufmann Publishers Inc . , 1999 : 78 - 89 .
RAO J , ROSS K A . Making B+-trees cache conscious in main memory [C ] // The 2000 ACM SIGMOD International Conference on Management of Data,May 15-18,2000,Dallas,USA . New York:ACM Press , 2000 : 475 - 486 .
CHEN S , GIBBONS P B , MOWRY T C . Improving index performance through prefetching [M ] . New York : ACM PressPress , 2001 .
CHEN S , GIBBONS P B , MOWRY T C , et al . Fractal prefetching B+trees:optimizing both cache and disk performance [C ] // The 2002 ACM SIGMOD International Conference on Management of Data,June 3-6,2002,Madison,USA . New York:ACM Press , 2002 : 157 - 168 .
HANKINS R A , PATEL J M . Effect of node size on the performance of cache-conscious B+-trees [C ] // The 2003 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems,June 11-14,2003,San Diego,USA . New York:ACM Press , 2003 : 283 - 294 .
KIM C , CHHUGANI J , SATISH N , et al . FAST:fast architecture sensitive tree search on modern CPUs and GPUs [C ] // The 2010 ACM SIGMOD International Conference on Management of Data,June 6-10,2010,Indianapolis,USA . New York:ACM Press , 2010 : 339 - 350 .
KACZMARSKI K . Experimental B+-tree for GPU [J ] . Advances in Databases and Information Systems , 2011 ( 2 ):11.
KACZMARSKI K , . B+-tree optimized for GPGPU [C ] // OTM Confederated International Conferences “On the Move to Meaningful Internet Systems”,September 10-14,2012,Rome,Italy . Heidelberg:Springer , 2012 : 843 - 854 .
DAGA M , NUTTER M . Exploiting coarsegrained parallelism in b+ tree searches on an apu [C ] // 2012 SC Companion:High Performance Computing,Networking,Storage and Analysis(SCC),November 10-16,2012,Salt Lake City,USA . Piscataway:IEEE Press , 2012 : 240 - 247 .
GRAEFE G . A survey of B-tree locking techniques [J ] . ACM Transactions on Database Systems (TODS) , 2010 , 35 ( 3 ):16.
YANG Y H E , PRASANNA V K . High throughput and large capacity pipelined dynamic search tree on FPGA [C ] // The 18th Annual ACM/SIGDA International Symposium on Field Programmable Gate Arrays,February 21-23,2010,Monterey,USA . New York:ACM Press , 2010 : 83 - 92 .
HEINRICH D , WERNER S , STELZNER M , et al . Hybrid FPGA approach for a B+ tree in a semantic web database system [C ] // 2015 10th International Symposium on Reconfigurable Communication-centric Systems-on-Chip (ReCoSoC),June 29-July 1,2015,Bremen,Germany . Piscataway:IEEE Press , 2015 : 1 - 8 .
QU Y , PRASANNA V . Scalable highthroughput architecture for large balancedtree structures on FPGA [C ] // The ACM/SIGDA International Symposium on Field Programmable Gate Arrays,February 11-13,2013,Monterey,USA . New York:ACM Press , 2013 :278.
WEI X , SHI J , CHEN Y , et al . Fast inmemory transaction processing using RDMA and HTM [C ] // The 25th Symposium on Operating Systems Principles,October 4-7,2015,Monterey,USA . New York:ACM Press , 2015 : 87 - 104 .
MITCHELL C , MONTGOMERY K , NELSON L , et al . Balancing CPU and network in the cell distributed B-tree store [C ] // 2016 USENIX Annual Technical Conference,June 22-24,2016,Denver,USA . Berkeley:USENIX Association , 2016 : 451 - 464 .
CHEN S M , QIN J . Persistent b+-trees in non-volatile main memory [J ] . Proceedings of the VLDB Endowment , 2015 , 8 ( 7 ): 786 - 797 .
CHEN S M , GIBBONS P B , NATH S . Rethinking database algorithms for phase change memory [C ] // The 5th Biennial Conference on Innovative Data Systems Research,January 9-12,2011,Asilomar,USA.[S.l.:s.n] . 2011 .
YANG J , WEI Q S , CHEN C , et al . NVTree:reducing consistency cost for NVMbased single level systems [C ] // The 13th USENIX Conference on File and Storage Technologies(FAST15),February 16-19,2015,Santa Clara,USA . Berkeley:USENIX Association , 2015 : 167 - 181 .
OUKID I , LASPERAS J , NICA A , et al . FPTree:a hybrid SCM-DRAM persistent and concurrent B-tree for storage class memory [C ] // The 2016 International Conference on Management of Data,June 26-July 1,2016,San Francisco,USA . New York:ACM Press , 2016 .
ARULRAJ J , . BzTree:a high-performance latch-free range index for non-volatile memory [C ] // The 44th International Conference on Very Large Data Bases,August 27-31,2018,Rio De Janeiro,Brazil . New York:ACM Press , 2018 : 553 - 565 .
0
浏览量
485
下载量
0
CSCD
关联资源
相关文章
相关作者
相关机构
京公网安备11010802024621