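1. School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510006, Guangdong, China
2. School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, Guangxi, China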
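Weigang WU (1976- ), male, Ph.D., is a professor and doctoral supervisor at the School of Data and Computer Science, Sun Yat-sen University, deputy director of the Guangdong Engineering Technology Research Center for Medical Big Data, and deputy director of the Guangzhou Key Laboratory of Supercomputing and Big Data. His main research interests include cloud and edge computing, big data and deep learning, and distributed consensus and blockchain.
Liang CHANG (1980- ), male, Ph.D., is dean of the School of Computer Science and Information Security, Guilin University of Electronic Technology, and a senior member of the China Computer Federation. His main research interests include data and knowledge engineering, formal methods, and intelligent systems.
Jiangtao REN (1975- ), male, Ph.D., is an associate professor at the School of Data and Computer Science, Sun Yat-sen University, and a member of the China Computer Federation. His main research interests include data mining, machine learning, and natural language processing.
Tianlong GU (1964- ), male, Ph.D., is a professor and doctoral supervisor at the School of Computer Science and Information Security, Guilin University of Electronic Technology. He was selected for the National Hundred, Thousand and Ten Thousand Talents Project, and serves as vice chair of the Ministry of Education's Teaching Steering Committee for Computer Science Programs in Higher Education, chair of the Discrete Intelligent Computing Technical Committee and vice chair of the Artificial Intelligence Education Working Committee of the Chinese Association for Artificial Intelligence, and director of the National and Local Joint Engineering Research Center of Satellite Navigation Positioning and Location Service. His main research interests include knowledge engineering and big data, artificial intelligence ethics, and formal methods.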
Published online: 2020-03
Published in print: 2020-03-15
Weigang WU, Liang CHANG, Jiangtao REN, et al. High performance big data computing systems for government governance[J]. Big Data Research, 2020, 6(2): 2020013-1. DOI: 10.11959/j.issn.2096-0271.2020013.
Big data processing systems will be part of the basic infrastructure of future society. Big data processing tasks in government governance scenarios are characterized by multi-domain heterogeneous data and multiple participating entities, and therefore call for purpose-built research and design. Starting from application requirements, this paper analyzes the challenges that various government governance scenarios pose to big data processing technology and reviews the key techniques of distributed parallel big data processing, including data storage and management, computing platforms, and key algorithms. It surveys the state of the art of these techniques, proposes a technical framework for a high performance computing system for government governance big data, and discusses the pros and cons of different technical approaches. Finally, it outlines future development trends of the related technologies.