Journal of Qujing Normal University ›› 2022, Vol. 41 ›› Issue (3): 1-7.

• Mathematics Research •

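A Weighted K-means Clustering Algorithm and Its Application

SHEN Xiujuan (沈秀娟), XUE Shuo (薛烁)

  1. School of Mathematics and Statistics, Qujing Normal University, Qujing, Yunnan 655011, China
  • Received: 2021-12-22 Published: 2022-05-26 Online: 2022-06-02
  • About the author: SHEN Xiujuan, teaching assistant at the School of Mathematics and Statistics, Qujing Normal University, mainly engaged in research on mathematical statistics.
  • Funding:
    Yunnan Provincial Department of Science and Technology projects "Bayesian Inference for Hidden Markov Structural Equation Models" (2018FH001-109) and "Research on Searchable Encryption of Government Data for Public Services in Cloud Environments" (2019FH001-108); Yunnan Provincial Department of Education projects "Bayesian Inference for Hidden Non-homogeneous Markov Models" (2021J0499) and "Statistical Modeling and Analysis of Complex Data" (2021J0503).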

Abstract: K-means clustering is one of the classic clustering analysis methods and is widely used in many fields because it is simple and computationally efficient. The traditional K-means algorithm, however, does not account for the importance of each index; in practice, indexes differ in importance and should be treated differently. This paper therefore improves the traditional K-means method and proposes a weighted K-means clustering algorithm. To verify the superiority of the improved algorithm, the iris dataset bundled with R is used as experimental data, and three experiments are set up for comparative analysis. The results show that the weighted K-means clustering algorithm converges faster, produces a classification closer to the original one, and performs better overall. To further illustrate its practicality, the algorithm is applied to cluster 103 colleges and universities using a selection of their indexes.

Key words: K-means clustering, weighted K-means clustering, standard deviation coefficient method, university clustering
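
The abstract names the standard deviation coefficient method but does not give the weighting formula or the distance used. The following Python sketch is therefore only an illustration under stated assumptions, not the authors' implementation: it assumes each index weight is its coefficient of variation (standard deviation divided by mean) normalized to sum to 1, and that the weights enter a weighted Euclidean distance inside otherwise standard K-means iterations. The function names and the toy data are hypothetical.

```python
import math
import statistics


def cv_weights(rows):
    """Assumed weighting: coefficient of variation per column, normalized to sum to 1."""
    cols = list(zip(*rows))
    cvs = [statistics.stdev(c) / abs(statistics.mean(c)) for c in cols]
    total = sum(cvs)
    return [cv / total for cv in cvs]


def weighted_dist(x, c, w):
    """Weighted Euclidean distance between a point x and a centre c."""
    return math.sqrt(sum(wj * (xj - cj) ** 2 for wj, xj, cj in zip(w, x, c)))


def weighted_kmeans(rows, init_centres, w, max_iter=100):
    """Standard assign/update iterations, but with the weighted distance."""
    centres = [list(c) for c in init_centres]  # copy so the caller's list is untouched
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centres)), key=lambda k: weighted_dist(x, centres[k], w))
            for x in rows
        ]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        for k in range(len(centres)):
            members = [x for x, lab in zip(rows, labels) if lab == k]
            if members:  # keep the old centre if a cluster becomes empty
                centres[k] = [statistics.mean(col) for col in zip(*members)]
    return labels, centres


# Hypothetical toy data: column 0 varies a lot, column 1 barely varies.
rows = [[1.0, 10.0], [1.2, 10.1], [0.9, 9.9],
        [5.0, 10.0], [5.2, 10.2], [4.9, 9.8]]
w = cv_weights(rows)
labels, centres = weighted_kmeans(rows, [rows[0], rows[3]], w)
print(w, labels)
```

With these assumptions, an index with larger relative dispersion receives a larger weight and so contributes more to the distance; whether the paper uses exactly this form cannot be confirmed from the abstract alone.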

CLC number: