1. School of Computer Science, Fudan University, Shanghai 200438, China
2. Shanghai Key Laboratory of Data Science, Shanghai 200438, China
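REN Hongrun (1995- ), female, is a PhD candidate at the School of Computer Science, Fudan University, and the Shanghai Key Laboratory of Data Science. Her main research interests are data science and the digital economy, with a recent focus on data pricing and data production models.
ZHU Yangyong (1963- ), male, PhD, is a professor at the School of Computer Science, Fudan University, and deputy director of the Fudan University Data Industry Research Center. He is deputy director of the editorial board of the journal Big Data Research (《大数据》), vice chairman and chief scientist of the Agricultural Big Data Industry Technology Innovation Strategic Alliance, vice chairman of the National Engineering Laboratory for Big Data Collaborative Security Technology, and deputy director of the National Defense Big Data Professional Committee of the Chinese Association of Automation. An international advocate of data science, he proposed concepts and frameworks including datanature (数据界), dataology (数据学), data body (数据身), data autonomy (数据自治), and data finance (数据财政). He has published more than 200 academic papers and the monographs 《数据学》, 《旖旎数据》, 《特异群组挖掘》, and 《数据自治》, and serves as editor-in-chief of the book series 《大数据技术与应用丛书》 (22 volumes) and of 《大数据资源》. His main research interests are data science and the digital economy, with a recent focus on digital transformation, data finance, data assets, data autonomy, and cross-border data flows.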
Published online: 2023-05
Published in print: 2023-05-15
Hongrun REN, Yangyong ZHU. Data pipeline model: exploration of the over-the-counter form of streaming data[J]. Big Data Research, 2023, 9(3): 15-28. DOI: 10.11959/j.issn.2096-0271.2023031.
Current efforts to build the data element market focus mainly on data trading venues (on-exchange trading). The streaming data market, in which data suppliers continuously and rapidly supply specific data to data users, is not well suited to trading in such venues, so an over-the-counter trading model for streaming data needs to be explored. This paper studies the current operation of the streaming data market and identifies market disorder and insufficient regulatory tools as the main problems. It proposes a data pipeline model for the over-the-counter streaming data market, comprising pipeline circulation elements (data pipeline, data factory, data supply chain) and market regulation elements (data meter, quality sampling device, compliance auditing device), among others. The technical feasibility of the data pipeline model is demonstrated, with the aim of providing theoretical and technical support for the construction, regulation, and supervision of the over-the-counter data market.