运筹与管理 (Operations Research and Management Science) ›› 2011, Vol. 20 ›› Issue (6): 66-72.

• Theoretical Analysis and Methodology •

Univariate Time Series Clustering Method Based on Principal Component Analysis

SU Mu-ya (苏木亚), GUO Chong-hui (郭崇慧)

  1. Institute of Systems Engineering, Dalian University of Technology, Dalian 116024, Liaoning, China
  • Received: 2010-05-18 Published: 2011-12-25
  • About the authors: SU Mu-ya (b. 1983), male, Mongolian ethnicity, PhD candidate; research interests: data mining and business intelligence. GUO Chong-hui (b. 1973), male, PhD, professor and doctoral supervisor; research interests: system optimization methods, data mining, and machine learning.
  • Funding:
    National Natural Science Foundation of China (10571018, 70871015); National High Technology Research and Development Program of China (863 Program) (2008AA04Z107)

Abstract: To address the high dimensionality of time series data, we use principal component analysis (PCA) to derive, on the basis of a theoretical analysis, a new dimension-reduction method for univariate time series, and from it a PCA-based clustering method for univariate time series. The main idea is to express all series under the same basis of a linear space and to measure the similarity between series by the similarity between their corresponding coefficients. In the theoretical analysis, we first apply PCA to a univariate time series data set, then analyze the relationship among the data set, the eigenvectors of the sample covariance matrix, and the principal components, and prove that the set of vectors formed by the principal components is linearly independent. To verify the theoretical results and the effectiveness of the proposed algorithm, we carry out numerical experiments on simulated data and real stock data.

Key words: multivariate statistical analysis, univariate time series, principal component analysis, cluster analysis
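
The main idea in the abstract can be illustrated with a minimal sketch. It is one possible reading, not the authors' implementation, and it rests on several assumptions: each series is treated as one variable (a column of the data matrix), the coefficients of a series are taken to be its entries in the leading eigenvectors of the sample covariance matrix, and a plain k-means stands in for the clustering step, which the abstract does not specify. The function names `pca_coefficients` and `kmeans` and the synthetic data are hypothetical.

```python
import numpy as np


def pca_coefficients(X, k):
    """X has shape (m, n): column j is series j observed at m time points.

    Returns (U, Y): U is (n, k), row j holds the coefficients of series j;
    Y is (m, k), its columns are the first k principal components.
    The centered data satisfy Xc ~= Y @ U.T (exactly when k == n).
    """
    Xc = X - X.mean(axis=0)                  # center each series
    S = np.cov(Xc, rowvar=False)             # n x n sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(S)     # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]    # indices of the k largest
    U = eigvecs[:, order]
    Y = Xc @ U                               # shared basis for all series
    return U, Y


def kmeans(points, n_clusters, n_iter=50):
    """Small k-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    for _ in range(1, n_clusters):
        d = np.min([np.sum((points - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            points[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(n_clusters)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Synthetic example (assumed data): five noisy sine waves and five noisy trends.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 100)
group_a = [np.sin(2 * np.pi * t) + 0.05 * rng.standard_normal(t.size) for _ in range(5)]
group_b = [2.0 * t - 1.0 + 0.05 * rng.standard_normal(t.size) for _ in range(5)]
X = np.column_stack(group_a + group_b)       # shape (100, 10)

coeffs, components = pca_coefficients(X, k=2)  # 100 dimensions reduced to 2
labels = kmeans(coeffs, n_clusters=2)          # cluster the coefficient rows
print(labels)
```

Because every series is expanded over the same principal components, two series with similar shapes receive similar coefficient rows, so clustering the rows of `coeffs` acts as a low-dimensional proxy for clustering the original 100-point series. The sign of each eigenvector is arbitrary, but flipping it changes every row in the same way and leaves the distances between rows unchanged.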
