Loss function and application research in supervised learning

Jianguo DENG; Sulan ZHANG; Jifu ZHANG; Yaling XUN; Aiqin LIU

doi:10.11959/j.issn.2096-0271.2020006

Research | Updated: 2024-06-03

- Loss function and application research in supervised learning
- Big Data Research, 2020, Vol. 6, No. 1, page: 2020006-1
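- About the authors:
  
  Jianguo DENG (1977- ), male, master's student at the School of Computer Science and Technology, Taiyuan University of Science and Technology. Main research interests: data mining and image understanding.
  Sulan ZHANG (1971- ), female, Ph.D., professor at the School of Computer Science and Technology, Taiyuan University of Science and Technology, and member of the China Computer Federation (CCF). Main research interests: granular computing, data mining, and image understanding.
  Jifu ZHANG (1963- ), male, professor and doctoral supervisor at the School of Computer Science and Technology, Taiyuan University of Science and Technology, and senior member of CCF. Main research interests: data mining and high-performance computing.
  Yaling XUN (1980- ), female, Ph.D., associate professor at the School of Computer Science and Technology, Taiyuan University of Science and Technology. Main research interests: data mining and parallel computing.
  Aiqin LIU (1975- ), female, associate professor at the School of Computer Science and Technology, Taiyuan University of Science and Technology. Main research interests: data mining, parallel and distributed computing.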
- Funding:
  
  The National Natural Science Foundation of China (61373099); The National Natural Science Foundation of China (61602335)
- DOI: 10.11959/j.issn.2096-0271.2020006
  Chinese Library Classification (CLC) number: TP181
- Published online: 2020-01
  
  Print publication: 2020-01-15

Jianguo DENG, Sulan ZHANG, Jifu ZHANG, et al. Loss function and application research in supervised learning[J]. Big Data Research, 2020, 6(1): 2020006-1. DOI: 10.11959/j.issn.2096-0271.2020006.

Abstract

In supervised learning, a loss function measures the degree of inconsistency between the true value of a sample and the value predicted by the model, and is generally used to estimate the model's parameters. Constrained by application scenarios, data sets, and the problems to be solved, existing supervised learning algorithms use many different kinds of loss functions, each with its own characteristics. It is therefore quite difficult to choose, from among the many candidates, the loss function best suited to finding the optimal model for a given problem. This paper studies the standard forms, basic ideas, advantages and disadvantages, main applications, and corresponding evolved forms of the loss functions commonly used in supervised learning algorithms, and explores the application scenarios they suit and possible optimization strategies. The study not only helps improve the accuracy of model predictions, but also offers a new line of thought for applied research on constructing new loss functions or improving existing ones.
