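Xiaoyong Du, male, Ph.D., is a professor and doctoral supervisor at the School of Information, Renmin University of China. He is a Fellow of the China Computer Federation (CCF), a member of the CCF Technical Committee on Databases and of the CCF Task Force on Big Data, deputy director of the editorial board of the journal Big Data Research (Posts & Telecom Press), and a series editor of Springer's "Communications of Computer and Information Systems". His main research interests are intelligent information retrieval, high-performance databases, and knowledge engineering. He has led or participated in a number of national projects, including the Core Electronic Devices, High-end Generic Chips and Basic Software Products program, the "973" Program, the "863" Program, and National Natural Science Foundation of China projects. In recent years he has published more than one hundred papers in major international journals and conferences such as SIGMOD, VLDB, AAAI, and IEEE TKDE.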
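Yueguo Chen, male, Ph.D., is an associate professor and master's supervisor at the School of Information, Renmin University of China. He is a senior member of the China Computer Federation (CCF), a corresponding member of the CCF Task Force on Big Data, and a young editorial board member of Frontiers of Computer Science. His main research interests are big data analytics systems and semantic search. He has led two National Natural Science Foundation of China projects and participated in a number of national projects under the Core Electronic Devices, High-end Generic Chips and Basic Software Products program, the "973" Program, and the "863" Program. In recent years he has published more than 30 papers in major international journals and conferences such as ICDE, AAAI, and IEEE TKDE.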
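Xiongpai Qin, male, Ph.D., is a lecturer and master's supervisor at the School of Information, Renmin University of China. His current research focuses on high-performance databases, big data analytics, and information retrieval. He has led one General Program project of the National Natural Science Foundation of China and participated in a number of national "863" Program, "973" Program, and National Natural Science Foundation of China projects. He has published more than 20 papers in domestic and international journals and conferences.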
Published online: 2015-05
Print publication: 2015-05-20
Xiaoyong Du, Yueguo Chen, Xiongpai Qin. Big Data and OLAP Systems[J]. BIG DATA RESEARCH, 2015, 1(1): 46-58. DOI: 10.11959/j.issn.2096-0271.2015.01.005.
OLAP (online analytical processing) is a core technology for business intelligence built on relational data. In the big data era, there is strong demand for high-performance OLAP on large clusters of commodity machines, but achieving that performance remains a major challenge. Encouragingly, progress in recent years has been rapid: many so-called SQL on Hadoop systems, which perform OLAP over data stored in Hadoop, have emerged, and their performance has kept improving. This paper first surveys the development of OLAP technology, then tests and analyzes three representative SQL on Hadoop systems and presents their performance characteristics. Based on the results, such systems are expected to play an important role in the low-cost big data OLAP market.