[ "柯学翰(1996- ),男,上海交通大学软件学院并行与分布式系统研究所硕士生,主要研究方向为分布式图计算系统。" ]
[ "陈榕(1981- ),男,博士,上海交通大学软件学院并行与分布式系统研究所副教授,主要研究方向为并行与分布式系统、内存计算等。" ]
网络首发:2019-07,
纸质出版:2019-07-15
移动端阅览
柯学翰, 陈榕. 基于图查询系统的图计算引擎[J]. 大数据, 2019,5(4):16-26.
Xuehan KE, Rong CHEN. Graph processing engine based on graph query system[J]. Big Data Research, 2019, 5(4): 16-26.
柯学翰, 陈榕. 基于图查询系统的图计算引擎[J]. 大数据, 2019,5(4):16-26. DOI: 10.11959/j.issn.2096-0271.2019029.
Xuehan KE, Rong CHEN. Graph processing engine based on graph query system[J]. Big Data Research, 2019, 5(4): 16-26. DOI: 10.11959/j.issn.2096-0271.2019029.
在目前的研究中,图查询和图计算系统是相互独立的,但在实际应用中两者通常是同时存在的。为解决相互独立的系统带来的存储空间浪费、数据一致性维护等问题,基于图查询系统设计了一种图计算引擎,使得在单一系统中支持查询和计算操作。通过为键值对存储增加图计算索引、基于拉取模式的数据更新等方式,有效地提高系统中数据遍历的性能和减少数据传输的成本,同时针对数据更新和负载均衡等方面提出了相关优化。实验表明,该图计算引擎能够达到与传统图计算系统PowerLyra和Gemini相近或比其更优的性能,且具有较好的可扩展性。
Recently
graph query and graph processing are emphasis of graph-structured data research.However
independent graph system mismatched combining applications
which needed both query and processing.To avoid some issues brought by independent system
such as wasting resource and data inconsistency
a method that providing a graph processing engine based on graph query system was proposed
in order to support query and processing operation in a unified system.Through adding index for graph processing and applying pull-based graph propagation method to over locality issue
the performance of the computation and transmission was largely improved.Besides
some optimization approaches were put forward for message updating and work balanced.The experimental results show that the processing engine can provide close(reduced by no more than 1x) or even better (up to 20x) performance compared to specific graph processing systems (e.g.
Gemini and PowerLyra) by leveraging new designs and optimizations
and also has good scalability.
SHI J , YAO Y , CHEN R , et al . Fast and concurrent RDF queries with RDMAbased distributed graph exploration [C ] // The 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16),November 2-4,2016,Savannah,USA . Berkeley:USENIX Association , 2016 : 317 - 332 .
GURAJADA S , SEUFERT S , MILIARAKI I , et al . TriAD:a distributed shared-nothing RDF engine based on asynchronous message passing [C ] // The 2014 ACM SIGMOD International Conference on Management of Data,June 22-27,2014,Snowbird,USA . New York:ACM Press , 2014 : 289 - 300 .
ZENG K , YANG J , WANG H , et al . A distributed graph engine for web scale RDF data [J ] . Proceedings of the VLDB Endowment , 2013 , 6 ( 4 ): 265 - 276 .
MALEWICZ G , AUSTERN M H , BIK A J C , et al . Pregel:a system for large-scale graph processing [C ] // The 2010 ACM SIGMOD International Conference on Management of Data,August 11-13,2009,Indianapolis,USA . New York:ACM Press , 2010 : 135 - 146 .
GONZALEZ J E , LOW Y , GU H , et al . Powergraph:distributed graph-parallel computation on natural graphs [C ] // The 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI 12),October 8-10,2012,Hollywood,USA . Berkeley:USENIX Association , 2012 : 17 - 30 .
CHEN R , SHI J , CHEN Y , et al . Powerlyra:differentiated graph computation and partitioning on skewed graphs [J ] . ACM Transactions on Parallel Computing , 2019 , 5 ( 3 ):13.
ZHU X , CHEN W , ZHENG W , et al . Gemini:a computation-centric distributed graph processing system [C ] // The 12th USENIX Symposium on Operating Systems Design and Implementation,November 2-4,2016,Savannah,USA . Berkeley:USENIX Association , 2016 : 301 - 316 .
LOW Y , BICKSON D , GONZALEZ J , et al . Distributed GraphLab:a framework for machine learning and data mining in the cloud [J ] . Proceedings of the VLDB Endowment , 2012 , 5 ( 8 ): 716 - 727 .
TSOURAKAKIS C , GKANTSIDIS C , RADUNOVIC B , et al . Fennel:streaming graph partitioning for massive scale graphs [C ] // The 7th ACM International Conference on Web Search and Data Mining,February 24-28,2014,New York,USA . New York:ACM Press , 2014 : 333 - 342 .
WU M , YANG F , XUE J , et al . GraM:scaling graph computation to the trillions [C ] // The 6th ACM Symposium on Cloud Computing,August 27-29,2015,Kohala Coast,USA . New York:ACM Press , 2015 : 408 - 421 .
ROY A , MIHAILOVIC I , ZWAENEPOEL W . X-Stream:edge-centric graph processing using streaming partitions [C ] // The 24th ACM Symposium on Operating Systems Principles,November 3-6,2013,Farmington,USA . New York:ACM Press , 2013 : 472 - 488 .
WANG Y , DAVIDSON A , PAN Y , et al . Gunrock:a high-performance graph processing library on the GPU [J ] . ACM SIGPLAN Notices , 2016 , 51 ( 8 ):11.
SHUN J , BLELLOCH G E . Ligra:alightweight graph processing framework for shared memory [J ] . ACM SIGPLAN Notices , 2013 , 48 ( 8 ): 135 - 146 .
ZHANG K , CHEN R , CHEN H . NUMAaware graph-structured analytics [J ] . ACM SIGPLAN Notices , 2015 , 50 ( 8 ): 183 - 193 .
MCCUNE R R , WENINGER T , MADEY G . Thinking like a vertex:a survey of vertex-centric frameworks for largescale distributed graph processing [J ] . ACM Computing Surveys , 2015 , 48 ( 2 ):25.
GROSSMAN S , LITZ H , KOZYRAKIS C . Making pull-based graph processing performant [J ] . ACM SIGPLAN Notices , 2018 , 53 ( 1 ): 246 - 260 .
BLUMOFE R D , LEISERSON C E . Scheduling multithreaded computations by work stealing [J ] . Journal of the ACM , 1999 , 46 ( 5 ): 720 - 748 .
0
浏览量
576
下载量
0
CSCD
关联资源
相关文章
相关作者
相关机构
京公网安备11010802024621