Journal of Qujing Normal University, 2020, Vol. 39, Issue (3): 1-5.

• Mathematical Research •

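Sparse Approximation for Principal Component Analysis via Elastic Net

Zhang Wenming1, Fu Guanghui2, Zhang Xiaohua1

  1. Department of Information Technology, Kunming Changshui International Airport, Kunming, Yunnan 650211, China;
  2. Department of Mathematics, School of Science, Kunming University of Science and Technology, Kunming, Yunnan 650500, China
  • Received: 2020-01-06 Published: 2020-05-26
  • Corresponding author: Fu Guanghui, associate professor in the Department of Mathematics, School of Science, Kunming University of Science and Technology; Ph.D. in statistics; research interests: applied statistics and complex data analysis.
  • About the author: Zhang Wenming, associate senior engineer in the Department of Information Technology, Kunming Changshui International Airport; research interests: data mining, big data, and cloud computing.
  • Funding:
    National Natural Science Foundation of China, Regional Science Fund project "Statistical modeling of class-imbalanced high-dimensional data and its application in metabolomics" (11761041).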

Abstract: Principal component analysis (PCA) is widely used because it greatly reduces the dimension of the data while losing very little information, and because its principal components are mutually orthogonal. However, each principal component is a linear combination of all the original predictors, which makes the model hard to interpret. In this paper, we use the elastic net to obtain a sparse approximation of the loadings of each principal component, which yields the sparse approximation principal component analysis (sPCA) algorithm. sPCA retains the advantages of the original PCA and, because its loadings are sparse, greatly improves model interpretability.

Key words: principal component analysis, sparse principal component analysis, elastic net, model interpretability
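
The abstract does not spell out the steps of the algorithm, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes one common regression formulation: compute an ordinary principal component, then regress its scores on the centered variables under an elastic net penalty so that small loadings are driven exactly to zero. The function names, the penalty parameters `lam1` and `lam2`, and the synthetic data in `demo_data` are all hypothetical and chosen for illustration.

```python
import math
import random


def center(X):
    """Subtract each column's mean so that PCA works on centered data."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in X]


def first_loading(Xc, iters=100):
    """Leading PCA loading vector via power iteration on Xc^T Xc."""
    p = len(Xc[0])
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        scores = [sum(r[j] * v[j] for j in range(p)) for r in Xc]
        w = [sum(r[j] * s for r, s in zip(Xc, scores)) for j in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def soft_threshold(z, t):
    """Shrink z toward zero by t; values with |z| <= t become exactly zero."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def elastic_net(X, y, lam1, lam2, sweeps=200):
    """Coordinate descent for the (assumed) objective
    (1/2n)||y - Xb||^2 + lam1*||b||_1 + (lam2/2)*||b||_2^2, with lam2 > 0."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    resid = list(y)  # residual y - Xb, kept up to date after every update
    z = [sum(r[j] ** 2 for r in X) / n for j in range(p)]
    for _ in range(sweeps):
        for j in range(p):
            # Covariance of column j with the residual that excludes b[j].
            rho = sum(r[j] * e for r, e in zip(X, resid)) / n + z[j] * b[j]
            new = soft_threshold(rho, lam1) / (z[j] + lam2)
            delta = new - b[j]
            if delta != 0.0:
                resid = [e - r[j] * delta for r, e in zip(X, resid)]
                b[j] = new
    return b


def sparse_first_loading(X, lam1=0.1, lam2=0.01):
    """Return (dense PCA loading, sparse unit-norm approximation of it)."""
    Xc = center(X)
    dense = first_loading(Xc)
    scores = [sum(r[j] * dense[j] for j in range(len(dense))) for r in Xc]
    b = elastic_net(Xc, scores, lam1, lam2)
    norm = math.sqrt(sum(c * c for c in b))
    sparse = [c / norm for c in b] if norm > 0 else b
    return dense, sparse


def demo_data(n=200, seed=0):
    """Synthetic data: two variables share one factor, four are pure noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        f = rng.gauss(0, 1)
        signal = [f + 0.1 * rng.gauss(0, 1) for _ in range(2)]
        noise = [0.1 * rng.gauss(0, 1) for _ in range(4)]
        rows.append(signal + noise)
    return rows


dense, sparse = sparse_first_loading(demo_data())
print("dense loading: ", [round(c, 3) for c in dense])
print("sparse loading:", [round(c, 3) for c in sparse])
```

In this sketch, a larger `lam1` zeroes more loadings, while `lam2` stabilizes the fit when variables are highly correlated. Only the first component is handled; extending the sketch to further components, and choosing the penalties, would follow whatever procedure the full paper specifies.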

CLC Number: