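1. Department of Computer Science and Technology, Tsinghua University, Beijing 100084
2. Beijing National Research Center for Information Science and Technology, Beijing 100084
Hao WU (1992- ), male, is a master's student in the Department of Computer Science and Technology, Tsinghua University. His main research interest is distributed systems.
Kang CHEN (1976- ), male, is an associate professor in the Department of Computer Science and Technology, Tsinghua University, and a member of the China Computer Federation (CCF). His main research interests include distributed systems and storage systems.
Yongwei WU (1974- ), male, is a professor in the Department of Computer Science and Technology, Tsinghua University, and a senior member of CCF. His main research interests include parallel and distributed processing, cloud computing, and storage.
Weimin ZHENG (1946- ), male, is a professor and doctoral supervisor in the Department of Computer Science and Technology, Tsinghua University, and a CCF fellow. His main research interests include computer architecture, operating systems, storage, and distributed computing.
Published online: 2019-07
Published in print: 2019-07-15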
Hao WU, Kang CHEN, Yongwei WU, et al. Research on the consensus of big data systems based on RDMA and NVM[J]. Big Data Research (大数据), 2019, 5(4): 89-99. DOI: 10.11959/j.issn.2096-0271.2019034.
Distributed storage and computing systems are the foundation of big data processing systems. High availability is the cornerstone of any distributed system, and high-availability techniques generally rely on consensus protocols. This paper discusses classic non-Byzantine distributed consensus protocols together with two newer technologies, the RDMA (remote direct memory access) communication protocol and NVM (non-volatile memory) storage media, and combines RDMA and NVM to obtain a higher-performance, highly available system. The consensus protocol is modified so that it makes better use of the characteristics of RDMA and NVM. The implemented system improves the performance of the protocol implementation while keeping system data consistent and available. Experiments show that the implemented system achieves a 40% performance improvement over existing systems.