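1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100083, China
2. National Science Library, Chinese Academy of Sciences, Beijing 100190, China
3. Bureau of Science Communication, Chinese Academy of Sciences, Beijing 100864, China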
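Zhihong SHEN (1977- ), male, Ph.D., is a professor-level senior engineer and doctoral supervisor at the Computer Network Information Center, Chinese Academy of Sciences. His main research interests include scientific big data management and processing, graph database management systems, and semantic web technologies.
Xiaolin ZHANG (1956- ), male, Ph.D., is a research professor and doctoral supervisor at the National Science Library, Chinese Academy of Sciences. His main research interests include digital knowledge systems, data mining and knowledge discovery, and science and technology policy.
Xiaohuan ZHENG (1981- ), female, is a senior engineer and deputy director of the Division of Cybersecurity and Informatization, Bureau of Science Communication, Chinese Academy of Sciences. Her main research interests include scientific data management and research informatization strategy.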
Published online: 2023-07
Published in print: 2023-07-15
Zhihong SHEN, Xiaolin ZHANG, Xiaohuan ZHENG. PARIS principle: improving the usability of scientific data in the open collaborative environment[J]. Big Data Research, 2023, 9(4): 172-188. DOI: 10.11959/j.issn.2096-0271.2023013.
The demand for using scientific data is increasingly urgent. In the open collaborative environment brought about by new research paradigms such as the "Fourth Paradigm" and "Convergence Science", data utilization is cross-boundary, end-to-end, dynamic, and collaborative. As products of the "era of data repositories", the FAIR and TRUST principles can no longer provide in-depth guidance for the efficient use of scientific data in this environment. This paper analyzes typical scenarios of scientific data utilization in detail and proposes the PARIS principles for promoting scientific data utilization in the open collaborative environment: processable, askable, reliable, incorporable, and suppliable. It then examines how the PARIS principles improve the usability of scientific data. Finally, it discusses technical paths that can be referenced to implement the PARIS principles. As a beneficial extension of the FAIR and TRUST principles, the PARIS principles are expected to effectively improve the usability of scientific data.