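Weining Qian, male, is a professor and doctoral supervisor at the Institute for Data Science and Engineering, East China Normal University. His research interests include data management in the Internet environment, benchmarks for big data management systems, social media data analysis, and knowledge graph construction and applications.
Fan Xia, male, is a doctoral student at the Institute for Data Science and Engineering, East China Normal University. His research interests include distributed query processing, social media data benchmarking, and social media data management.
Minqi Zhou, male, is an associate professor and master's supervisor at the Institute for Data Science and Engineering, East China Normal University. His research interests mainly include in-memory transaction processing systems, in-memory analytical processing systems, and computational advertising.
Cheqing Jin, male, is a professor and doctoral supervisor at the Institute for Data Science and Engineering, East China Normal University. His research interests mainly include location-based services, data stream management, uncertain data management, and data benchmarking.
Aoying Zhou, male, is a Changjiang Scholar Distinguished Professor at East China Normal University and dean of the Institute for Data Science and Engineering. His research interests mainly include Web data management, data-intensive computing, in-memory cluster computing, distributed transaction processing, and big data benchmarking and performance optimization.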
Published online: 2015-05
Print publication: 2015-05-20
Weining Qian, Fan Xia, Minqi Zhou, et al. Challenges and Progress of Big Data Management System Benchmarks[J]. BIG DATA RESEARCH, 2015, 1(1): 81-95. DOI: 10.11959/j.issn.2096-0271.2015.01.008.
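Database benchmarks have played an irreplaceable role in the history of database development, but traditional benchmarks are inadequate in big data environments. Therefore, research on benchmarks for big data management systems that offer high fidelity, adaptability, and measurability, starting from the three elements of a benchmark (data, workload, and metrics), is critical both to the research and development of big data management systems and to system selection for applications. On this basis, after a brief analysis of the basic elements of benchmarks and of the development of big data management systems, this paper focuses on the requirements and challenges of benchmarking big data management systems, and then uses BSMA, a benchmark for analytical queries over social media, to discuss the design and implementation of application-oriented benchmarks for big data management systems.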
Database benchmarking has stimulated the development of data management systems and technologies. In big data environments, benchmarking should be revisited. Therefore, research on benchmarks for big data management systems is a key problem for big data research and applications. Benchmark design can be achieved from three different perspectives, i.e. data, workload, and performance measurements. After the brief introduction to these three elements and the progress of big data management system research, the requirements and challenges to benchmarking big data management systems were analyzed. Through the introduction to a benchmark for analytical queries over social media data, named as BSMA, the issues of design and implementation of a benchmark for big data management systems were discussed.