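1. School of Computer Science and Engineering, Northeastern University, Shenyang 110000, Liaoning, China
2. School of Foreign Languages, Northeastern University, Shenyang 110000, Liaoning, China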
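Aili LI (李爱黎, 1995- ), female, master's student at the School of Computer Science and Engineering, Northeastern University. Research interests: sentiment analysis and data mining.
Zishuai ZHANG (张子帅, 2000- ), male, undergraduate student at the School of Computer Science and Engineering, Northeastern University. Research interests: data mining and machine learning.
Yin LIN (林荫, 1984- ), female, lecturer at the School of Foreign Languages, Northeastern University. Research interests: comparative study of Chinese and Japanese culture.
Qiuju WANG (王秋菊, 1962- ), female, professor at the School of Foreign Languages, Northeastern University. Research interests: comparative study of Chinese and Japanese culture, and science, technology and culture studies.
Jian'an YANG (杨建安, 2002- ), male, undergraduate student at the School of Computer Science and Engineering, Northeastern University. Research interests: data mining and machine learning.
Weicheng MENG (孟炜程, 2002- ), male, undergraduate student at the School of Computer Science and Engineering, Northeastern University. Research interests: data mining and machine learning.
Yanfeng ZHANG (张岩峰, 1982- ), male, professor at the School of Computer Science and Engineering, Northeastern University, and senior member of the China Computer Federation. Research interests: big data mining, large-scale machine learning, and distributed systems.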
Published online: 2022-11
Published in print: 2022-11-15
Aili LI, Zishuai ZHANG, Yin LIN, et al. Research on emotion monitoring of public based on social network big data[J]. Big Data Research, 2022, 8(6): 105-126. DOI: 10.11959/j.issn.2096-0271.2022054.
In recent years, social networking platforms such as Sina Weibo and Twitter have become one of the main carriers of public opinion, giving netizens a convenient way to express their views and emotions. Public opinion monitoring based on social network big data has become a new research hotspot. Monitoring public emotion with social network big data from different countries helps to directly grasp public sentiment in international relations, which matters for China's diplomacy, foreign trade, and other areas. On this basis, a public sentiment monitoring system for Chinese and Japanese corpora is proposed. The system analyzes the sentiment expressed in Chinese and Japanese text from social platforms such as Sina Weibo and Twitter and presents the results to users in visual form. For the sentiment analysis algorithm, a new model, EmoBERT, is proposed, which combines the BERT model with self-expanded Chinese and Japanese sentiment lexicons. Experimental results show that EmoBERT outperforms the original BERT model on both the Chinese and the Japanese sentiment classification tasks: the Chinese model EmoBERT-C raises the accuracy of Chinese BERT from 89.68% to 92.15%, and the Japanese model EmoBERT-J raises the accuracy of Japanese BERT from 74.73% to 78.26%.
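To put the accuracy figures quoted in the abstract side by side, the short sketch below computes the absolute gain in percentage points and the corresponding relative reduction in error rate for each language. The helper name `accuracy_gain` and the choice of error-rate reduction as a summary are illustrative assumptions, not something the paper reports; only the four accuracy values come from the abstract.

```python
# Accuracy values (percent) as quoted in the abstract: (baseline BERT, EmoBERT).
REPORTED = {
    "EmoBERT-C (Chinese)": (89.68, 92.15),
    "EmoBERT-J (Japanese)": (74.73, 78.26),
}


def accuracy_gain(baseline, improved):
    """Return (absolute gain in points, relative error-rate reduction in percent).

    Hypothetical helper for illustration; not part of the paper.
    """
    absolute = improved - baseline
    # Error rate is 100 - accuracy, so the same gain is a larger relative
    # improvement when the baseline is already close to 100%.
    error_reduction = absolute / (100.0 - baseline) * 100.0
    return round(absolute, 2), round(error_reduction, 1)


if __name__ == "__main__":
    for name, (base, emo) in REPORTED.items():
        gain, err_red = accuracy_gain(base, emo)
        print(f"{name}: +{gain} points, {err_red}% relative error reduction")
```

Read this way, the Japanese model shows the larger absolute gain, while the Chinese model removes a larger share of the remaining errors because its baseline is higher.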