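1. School of Information Science and Engineering, University of Jinan, Jinan 250024, Shandong, China
2. Shandong Provincial Key Laboratory of Network Based Intelligent Computing, Jinan 250024, Shandong, China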
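Yunzheng WU (仵匀政, 1998- ), male, is a master's student at the School of Information Science and Engineering, University of Jinan. His main research interests are data mining and data clustering.
Tao DU (杜韬, 1979- ), male, Ph.D., is an associate professor at the School of Information Science and Engineering, University of Jinan. His main research interests are data mining and data clustering.
Jin ZHOU (周劲, 1976- ), male, Ph.D., is a professor at the School of Information Science and Engineering, University of Jinan, and a council member of the Shandong Artificial Intelligence Society. His main research interests are data mining and data clustering.
Di CHEN (陈迪, 1998- ), male, is a master's student at the School of Information Science and Engineering, University of Jinan. His main research interests are data mining and data clustering.
Xingeng WANG (王心耕, 1999- ), male, is a master's student at the School of Information Science and Engineering, University of Jinan. His main research interests are data mining and data clustering.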
Published online: 2024-05
Published in print: 2024-05-15
Yunzheng WU, Tao DU, Jin ZHOU, et al. Spectral clustering ensemble algorithm based on three-order tensor for large-scale data[J]. Big Data Research, 2024, 10(3): 133-148. DOI: 10.11959/j.issn.2096-0271.2024007.
To reduce the computational burden of spectral clustering on large-scale data and to further improve clustering accuracy and robustness, a spectral clustering ensemble algorithm based on a three-order tensor was proposed. First, a hybrid representative nearest-neighbor approximation method was proposed to construct a sparse affinity sub-matrix between data points. Then, the sparse affinity sub-matrix was represented as a bipartite graph, and preliminary clustering results were obtained by graph partitioning. Finally, a three-order tensor ensemble method was proposed to fuse the multiple clustering results into the final clustering result. Experiments on large-scale real and synthetic datasets showed that the proposed algorithm outperformed classical spectral clustering algorithms, clustering ensemble algorithms, and recent improvements of both.
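The three stages named in the abstract can be illustrated with a minimal NumPy sketch. The abstract gives no implementation details, so everything concrete here is an assumption made for illustration: representatives are chosen by random sub-sampling followed by k-means, affinities use a Gaussian kernel, the bipartite graph is partitioned through an SVD of the normalized affinity sub-matrix, and the ensemble step simply stacks co-association matrices into a third-order array and averages them, as a stand-in for the authors' tensor ensemble method. All function names and parameters are hypothetical.

```python
import numpy as np

def kmeans(X, k, rng, n_iter=30):
    """Lloyd's k-means with farthest-point initialisation; returns (labels, centers)."""
    centers = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):          # keep the old center if a cluster empties
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

def sparse_affinity(X, reps, n_neighbors):
    """n x p sub-matrix: Gaussian weight to each point's nearest representatives, 0 elsewhere."""
    d2 = ((X[:, None, :] - reps[None, :, :]) ** 2).sum(axis=2)
    idx = np.argsort(d2, axis=1)[:, :n_neighbors]
    rows = np.arange(len(X))[:, None]
    near = d2[rows, idx]
    B = np.zeros_like(d2)
    B[rows, idx] = np.exp(-near / (2 * near.mean() + 1e-12))  # assumed kernel width
    return B

def bipartite_partition(B, k, rng):
    """Cluster the rows of B by spectral partitioning of the bipartite graph it defines."""
    d1 = B.sum(axis=1)
    d2 = np.maximum(B.sum(axis=0), 1e-12)    # guard against unused representatives
    Bn = B / np.sqrt(d1)[:, None] / np.sqrt(d2)[None, :]
    U, _, _ = np.linalg.svd(Bn, full_matrices=False)
    embedding = U[:, :k] / np.sqrt(d1)[:, None]
    return kmeans(embedding, k, rng)[0]

def ensemble_cluster(X, k, n_base=5, n_reps=30, n_neighbors=5, seed=0):
    """Fuse several base clusterings; the averaging step is a simple stand-in, not the paper's method."""
    rng = np.random.default_rng(seed)
    n = len(X)
    T = np.zeros((n, n, n_base))             # dense n x n slices: only viable for small n
    for m in range(n_base):
        cand = X[rng.choice(n, size=min(n, 4 * n_reps), replace=False)]
        reps = kmeans(cand, n_reps, rng)[1]  # assumed hybrid selection: sample, then k-means
        labels = bipartite_partition(sparse_affinity(X, reps, n_neighbors), k, rng)
        T[:, :, m] = labels[:, None] == labels[None, :]
    consensus = T.mean(axis=2)               # average co-association over the third mode
    return bipartite_partition(consensus, k, rng)

# Toy data: three well-separated Gaussian blobs of 100 points each.
data_rng = np.random.default_rng(42)
blob_centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + 0.5 * data_rng.standard_normal((100, 2)) for c in blob_centers])
labels = ensemble_cluster(X, k=3)
```

The cost saving comes from the first two stages: each base clustering works on an n x p sub-matrix with p much smaller than n, rather than on a full n x n affinity matrix. The dense co-association tensor used in the last stage of this sketch does not scale in the same way and is kept only to make the fusion idea concrete.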