Cluster resource management techniques in dataflow computing environments

Xiaochun TANG; Ying FU; Zhao DING; Anqi MAO; Zhanhuai LI

doi:10.11959/j.issn.2096-0271.2020026


Special topic: Dataflow computing technology for big data processing | Updated: 2024-06-03

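- State-of-art research of cluster resource management in dataflow computing model
- Big Data Research, 2020, Vol. 6, No. 3, article 2020026-1
- About the authors:
  - Xiaochun TANG (1969- ), male, Ph.D., associate professor at the School of Computer Science, Northwestern Polytechnical University. His main research interests include big data computing, large-scale graph data mining, and cluster resource management.
  - Ying FU (1996- ), female, master's student at the School of Computer Science, Northwestern Polytechnical University. Her main research interests include big data computing and cluster resource management.
  - Zhao DING (1995- ), male, master's student at the School of Computer Science, Northwestern Polytechnical University. His main research interests include big data computing and cluster resource management.
  - Anqi MAO (1996- ), female, master's student at the School of Computer Science, Northwestern Polytechnical University. Her main research interests include big data computing and cluster resource management.
  - Zhanhuai LI (1961- ), male, Ph.D., professor at the School of Computer Science, Northwestern Polytechnical University, and director of the Key Laboratory of Big Data Storage and Management, Ministry of Industry and Information Technology. His main research interests include database theory and technology, data streams, data-intensive computing, in-memory computing, and data mining.
- Funding: The National Key Research and Development Program of China (2018YFB1003400)
- CLC number: TP31
- Published online: 2020-05
- Print publication: 2020-05-15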
Xiaochun TANG, Ying FU, Zhao DING, et al. State-of-art research of cluster resource management in dataflow computing model[J]. Big Data Research, 2020, 6(3): 2020026-1. DOI: 10.11959/j.issn.2096-0271.2020026.

Abstract

Cluster-based high-performance computing has evolved through three stages: the separation of the computing and storage subsystems, the integration of the computing and storage subsystems, and the dataflow programming model built on data parallelism. With the widespread use of dataflow programming models such as Spark and Flink in big data computing, job types have become highly diverse. Ensuring that these varied dataflow computing jobs can share cluster resources is the core task of cluster resource management and a main means of reducing infrastructure cost. As the drawbacks of traditional cluster resource management have become increasingly apparent under the dataflow computing model, many alternatives have been proposed in recent years, including HoD, centralized scheduling, two-level scheduling, distributed scheduling, and hybrid scheduling. This paper reviews the historical evolution of cluster resource management and, from the perspective of the dataflow programming model, examines these five approaches, presenting the advantages, disadvantages, and current application status of each. It aims to provide a reference for those using or developing cluster resource management and scheduling in dataflow computing environments.

Keywords
