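Huayou SU (1985- ), male, Ph.D., is an assistant research fellow at the Science and Technology on Parallel and Distributed Processing Laboratory, College of Computer Science, National University of Defense Technology. His main research interests are high-performance computing and parallel optimization.
Songzhu MEI (1984- ), male, Ph.D., is an assistant research fellow at the Science and Technology on Parallel and Distributed Processing Laboratory, College of Computer Science, National University of Defense Technology. His main research interests are big data analytics and its performance optimization.
Rongchun LI (1985- ), male, Ph.D., is an associate research fellow at the Science and Technology on Parallel and Distributed Processing Laboratory, College of Computer Science, National University of Defense Technology. His main research interests are deep learning, reinforcement learning, and high-performance computing.
Yong DOU (1966- ), male, Ph.D., is a research fellow and executive deputy director of the Science and Technology on Parallel and Distributed Processing Laboratory, College of Computer Science, National University of Defense Technology. His main research interests include deep learning, high-performance computing, and reconfigurable computing.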
Published online: 2020-05
Published in print: 2020-05-15
Huayou SU, Songzhu MEI, Rongchun LI, et al. The usage of dataflow model in GPU and big data processing[J]. Big Data Research, 2020, 6(3): 2020028-1. DOI: 10.11959/j.issn.2096-0271.2020028.
The dataflow model is an efficient computing model. Because of its natural advantages in parallelism, dataflow technology has been widely applied in both software and hardware. In hardware architecture, the dataflow model has steered computer architecture, within the traditional von Neumann framework, toward support for higher concurrency; stream processors built on very long vector processing units and modern SIMT-based GPUs both draw heavily on dataflow ideas. In programming models, dataflow ideas have been widely adopted in big data programming models such as MapReduce and Spark. This paper analyzes the architecture of NVIDIA GPUs and the CUDA programming model at multiple levels from the perspective of the dataflow model, and explains how the dataflow model is applied in GPU hardware and software systems. It then analyzes the application and development trends of dataflow ideas and GPU massively parallel architectures in big data processing, providing ideas and methods for applying GPU-based systems to big data processing.