清华大学计算机科学与技术系 北京 100084
[ "吴城文,男,清华大学计算机科学与技术系硕士生,主要研究领域为大数据图计算。" ]
[ "张广艳,男,博士,清华大学计算机科学与技术系副教授,中国计算机学会会员,主要研究领域为大数据计算、网络存储、分布式计算。" ]
[ "郑纬民,男,清华大学教授、博士生导师,中国计算机学会理事长,目前主要从事并行与分布式计算、存储系统的研究工作,主持和参与多项国家“973”计划、“863”计划、国家自然科学基金项目。近年来在IEEE TC/IEEE TPDS/ACM TOS/FAST等本领域顶级期刊与国际会议发表论文40余篇。" ]
网络首发:2015-06,
纸质出版:2015-06-20
移动端阅览
吴城文, 张广艳, 郑纬民. 从系统角度审视大图计算[J]. 大数据, 2015,1(3):41-54.
Chengwen Wu, Guangyan Zhang, Weimin Zheng. Reviewing Large Graph Computing from a System Perspective[J]. BIG DATA RESEARCH, 2015, 1(3): 41-54.
吴城文, 张广艳, 郑纬民. 从系统角度审视大图计算[J]. 大数据, 2015,1(3):41-54. DOI: 10.11959/j.issn.2096-0271.2015028.
Chengwen Wu, Guangyan Zhang, Weimin Zheng. Reviewing Large Graph Computing from a System Perspective[J]. BIG DATA RESEARCH, 2015, 1(3): 41-54. DOI: 10.11959/j.issn.2096-0271.2015028.
大图计算已经成为学术界和工业界的一种基本计算模式,并且已经被应用到许多实际的大数据计算问题上,如社交网络分析、网页搜索以及商品推荐等。对于这些问题,大图的规模约有10亿级的点以及1 000亿级的边,这样的规模给大图的高效处理带来了诸多挑战。为此,介绍了大图计算的基本特征和挑战、典型的计算模型以及具有代表性的分布式、单机处理系统,同时对图处理系统中的关键技术进行总结,最后从系统的角度给出大图计算可能的一些研究方向。
Large graphcomputing has been a fundamental computing pattern in both academic and industry field
and it was applied to a lot of practical big data applications
such as social network analysis
web page search
and goods recommendation.In general
most of large graphs scale to billions of vertices
and corresponding to hundreds billions of edges
which brings us challenges of efficient graph processing.Therefore
the basic feature and challenges of current large graph computing
typical computing models
and representative distributed
and single machine large graph processing systems were introduced.Then
some key technologies employed in large graph computing were summarized.Finally
some research directions in large graph computing from a system perspective were given.
Lumsdaine A , Gregor D , Hendrickson B , et al . Challenges in parallel graph processing . Parallel Processing Letters , 2007 , 17 ( 1 ): 5 ~ 20
Dean J , Ghemawat S . MapReduce:simplified data processing on large clusters . Communications of the ACM , 2008 , 51 ( 1 ): 107 ~ 113
Gregor D , Lumsdaine A . The parallel BGL: a generic library for distributed graph computations . Proceedings of Parallel Object-Oriented Scientic Computing (POOSC) , Glasgow,UK , 2005
Chan A , Dehne F , Taylor R . CGMGRAPH/CGMLIB: implementing and testing CGM graph algorithms on PC clusters and shared memory machines . International Journal of High Performance Computing Applications , 2005 , 19 ( 1 ): 81 ~ 97
Malewicz G , Austern M , Bik A J C , et al . Pregel: a system for large-scale graph processing . Proceedings of ACM Special Interest Group on Management of Data , Indianapolis,IN,USA , 2010 : 135 ~ 146
Low Y C , Bickson D , Gonzalez J , et al . Distributed GraphLab: a framework for machine learning in the cloud . Proceedings of the VLDB Endowment (PVLDB) , 2012 , 5 ( 8 ): 716 ~ 727
Gonzalez J E , Low Y C , Gu H J , et al . Power graph: distributed graph-parallel computation on natural graphs . Proceedings of the 10th USENIX Symposium on Operating Systems Design and Implementation , Hollywood,CA,USA , 2012 : 17 ~ 30
Gonzalez J E , Xin R S , Dave A , et al . Graphx: graph processing in a distributed dataflow framework . Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation , Broomfield,CO,USA , 2014 : 599 ~ 613
Chen R , Ding X , Wang P , et al . Computation and communication efficient graph processing with distributed immutable view . Proceedings of High-Performance Parallel and Distributed Computing , New York,USA , 2014 : 215 ~ 226
Yan D , Cheng J , Lu Y , et al . Blogel: a block-centric framework for distributed computation on real-world graphs . Proceedings of the VLDB Endowment (PVLDB) , 2014 , 7 ( 14 ): 1981 ~ 1992
Yuan P P , Zhang W Y , Xie C F , et al . Fast iterative graph computation: a path centric approach . Proceedings of the International Conference for High Performance Computing,Networking,Storage and Analysis , Piscataway,NJ,USA , 2014 : 401 ~ 412
Kyrola A , Blelloch G , Guestrin C , et al . GraphChi: large-scale graph computation on just a PC . Proceedings of the 10th USENIX Symposium on Operating Systems Design and Implementation , Hollywood,CA,USA , 2012 : 31 ~ 46
Roy A , Mihailovi I , Zwaenepoel W . X-stream: edge-centric graph processing using streaming partitions . Proceedings of ACM Symposium on Operating Systems Principles , Farmington,PA,USA , 2013 : 472 ~ 488
Cheng J F , Liu Q , Li Z G , et al . VENUS:vertex-centric streamlined graph computation on a single PC . Proceedings of the 31st IEEE International Conference on Data Engineering , Seoul,Korea , 2015 : 1131 ~ 1142
Zhu X W , Han W T , Chen W G . Grid graph: large-scale graph processing on a single machine using 2-level hierarchical partitioning . Proceedings of the 2015 USENIX Conference on Usenix Annual Technical Conference , Santa Clara,CA,USA , 2015 : 375 ~ 386
Valiant Leslie G . A bridging model for parallel computation . Communications of the ACM , 1990 , 33 ( 8 ): 103 ~ 111
Low Y C , Gonzalez J , Kyrola A , et al . GraphLab: a new framework for parallel machine learning . Proceedings of Conference on Uncertainty in Artificial Intelligence , Catalina Island,California,USA , 2010
Baraba′si A L , Albert R . Emergence of scaling in random networks . Science , 1999 , 286 ( 5439 ): 509 ~ 512
Gharaibeh A , Costa L B , Santos-Neto E , et al . On graphs,GPUs,and blind dating:a work load to processor matchmaking quest . Proceedings of IEEE the 27th International Symposium on Parallel and Distributed Processing , Washington DC,USA , 2013 : 851 ~ 862
Fu Z S , Personick M , Thompson B . MapGraph: a high level API for fast development of high performance graph analytics on GPUs . Proceedings of Graph Data-management Experiences &Systems , Utah,USA , 2014 : 1 ~ 6
Khorasani F , Vora K , Gupta R , et al . CuSha: vertex-centric graph processing on GPUs . Proceedings of the International ACM Symposium on High-Performance Parallel and Distributed Computing , Vancouver,Canada , 2014 : 239 ~ 252
Han W S , Lee S , Park K , et al . TurboGraph: a fast parallel graph engine handling billion-scale graphs in a single PC . Proceedings of the 19th ACM SIGKDD Conference on Knowledge Discovery and Data Mining , Chicago,USA , 2013 : 77 ~ 85
Zheng D , Mhembere D , Burns R , et al . FlashGraph: processing billion-node graphs on an array of commodity SSDs . Proceedings of the 13th USENIX Conference on File and Storage Technologies , Santa Clara,CA,USA , 2015 : 45 ~ 58
0
浏览量
580
下载量
0
CSCD
关联资源
相关文章
相关作者
相关机构
京公网安备11010802024621