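1. College of Computer Science and Technology, Zhejiang University, Hangzhou 310007, Zhejiang, China
2. Alibaba-Zhejiang University Joint Research Institute of Frontier Technologies, Hangzhou 311121, Zhejiang, China
3. School of Software Technology, Zhejiang University, Hangzhou 310007, Zhejiang, China
4. Alibaba Group, Hangzhou 311121, Zhejiang, China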
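Huajun CHEN (1978- ), male, professor at the College of Computer Science and Technology, Zhejiang University. His main research interests are knowledge graphs, natural language processing, and big data systems.
Wen ZHANG (1992- ), female, Ph.D., assistant researcher at the School of Software Technology, Zhejiang University. Her main research interests are knowledge graphs, knowledge representation, and knowledge reasoning.
Chi-Man WONG (1993- ), male, algorithm engineer on the product knowledge graph team at Alibaba Group. His main research interests are deep learning and knowledge graphs.
Ganqiang YE (1996- ), male, master's student at the College of Computer Science and Technology, Zhejiang University. His main research interests are knowledge graph representation learning and pre-training.
Bo WEN (1994- ), male, master's student at the College of Computer Science and Technology, Zhejiang University. His main research interests are knowledge graphs and recommendation.
Wei ZHANG (1983- ), male, Ph.D., senior algorithm expert at Alibaba Group. His main research interests are natural language processing and knowledge graphs.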
Published online: 2021-05
Published in print: 2021-05-15
Huajun CHEN, Wen ZHANG, Chi-Man WONG, et al. Large scale pre-trained knowledge graph model and e-commerce application[J]. Big Data Research, 2021, 7(3): 2021028. DOI: 10.11959/j.issn.2096-0271.2021028.
In recent years, knowledge graphs have been widely applied to tasks that require knowledge, including many in e-commerce, because they organize data in a uniform way. However, knowledge services built on them usually involve tedious data selection and the design of knowledge-infusion models, which can adversely affect the business that depends on them. To address this problem, a "pre-training + knowledge vector service" paradigm is proposed, and a pre-trained knowledge graph model (PKGM) is designed for a billion-scale e-commerce product knowledge graph. PKGM provides item knowledge services to embedding-based downstream models in a uniform way, as knowledge vectors, without direct access to the triple data in the product knowledge graph. PKGM was tested on three knowledge-related downstream tasks: item classification, same-item identification, and item recommendation. The experimental results show that PKGM improves performance on each task.