Deep reinforcement learning news recommendation based on dynamic action coverage

Xianghong DONG (董相宏); Junxiu AN (安俊秀)

doi:10.11959/j.issn.2096-0271.2023069

Research | Updated: 2024-06-03

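- Deep reinforcement learning news recommendation based on dynamic action coverage
- Big Data Research, 2024, Vol. 10, No. 3, pp. 109-118
- Author biographies:
  
  Xianghong DONG (b. 1993), male, master's student at the School of Software Engineering, Chengdu University of Information Technology. His main research interests are cloud computing technology and recommender systems.
  Junxiu AN (b. 1970), female, professor at the School of Software Engineering, Chengdu University of Information Technology. Her main research interests include cloud computing and big data technology, big data analysis and services, and cloud computing technology and its applications.
- Funding:
  
  The National Social Science Foundation of China (22BXW048)
- DOI: 10.11959/j.issn.2096-0271.2023069
  Chinese Library Classification (CLC) number: TP311.5
- Published online: 2024-05
  
  Print publication: 2024-05-15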

Xianghong DONG, Junxiu AN. Deep reinforcement learning news recommendation based on dynamic action coverage[J]. Big Data Research, 2024, 10(3): 109-118. DOI: 10.11959/j.issn.2096-0271.2023069.

Abstract

News recommender systems play an important role in news dissemination on new media. This paper proposes a recommender system based on deep reinforcement learning that combines the representation ability of neural networks with the policy-selection ability of reinforcement learning to improve news recommendation. Dynamic action masks strengthen the model's ability to judge users' short-term interests, an optimized cache mechanism improves the efficiency with which the experience cache is used, and a reward design with a region-masking property accelerates model training; together these improve the recommender system's performance on news recommendation. Experiments show that the recommendation accuracy of the proposed model on news datasets is comparable to that of mainstream neural-network recommendation methods, and that its ranking performance is better than that of current state-of-the-art recommendation algorithms.
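
To illustrate the general idea behind action masking mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a value-based agent that scores every candidate news item with a Q-value, and a mask that is rebuilt at each step. The function names `build_action_mask` and `masked_greedy_action`, the masking rule (hide items already shown in the current session), and the Q-values are all hypothetical and chosen only for illustration.

```python
import math
from typing import Sequence


def build_action_mask(num_actions: int, recently_shown: Sequence[int]) -> list[bool]:
    """Return True for actions (news items) that may still be recommended.

    Hypothetical rule: items already shown in the current session are
    masked out. The mask is "dynamic" because it is rebuilt at every step.
    """
    shown = set(recently_shown)
    return [a not in shown for a in range(num_actions)]


def masked_greedy_action(q_values: Sequence[float], mask: Sequence[bool]) -> int:
    """Pick the highest-valued action among those the mask allows."""
    if len(q_values) != len(mask):
        raise ValueError("q_values and mask must have the same length")
    if not any(mask):
        raise ValueError("mask excludes every action")
    # Masked actions get -inf so they can never win the argmax.
    masked = [q if ok else -math.inf for q, ok in zip(q_values, mask)]
    return max(range(len(masked)), key=masked.__getitem__)


# Illustrative values only: item 1 has the highest Q-value but was already shown.
q = [0.2, 0.9, 0.5, 0.7]
mask = build_action_mask(len(q), recently_shown=[1])
best = masked_greedy_action(q, mask)
print(best)  # 3
```

Setting masked Q-values to negative infinity keeps the action space fixed while excluding items from selection, which is one common way to apply a mask. How the paper actually constructs and updates its mask is not stated in the abstract.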
