Two novel adaptive symbolic representations for similarity search in time series databases

Pham N.D. (Faculty of Computer Science and Engineering, HCM University of Technology, Vietnam National University, Ho Chi Minh City, Viet Nam); Dang T.K.; Le Q.L.

Advances in Web Technologies and Applications - Proceedings of the 12th Asia-Pacific Web Conference, APWeb 2010, 2010, pp. 181-187

DOI: 10.1109/APWeb.2010.23

Indexed in: Scopus

Conference Paper

English

Keywords: Data sets; Dimensionality reduction; Distance measure; Gaussian distributed; k-Means algorithm; Level Of Interest; Lower bounds; Multi-resolutions; Piecewise polynomial models; Random disk access; Real-world; Real-world application; Representation model; Similarity search; Spectral models; Symbolic model; Symbolic representation; Time series data mining; Time Series Database; Algorithms; Data mining; Gaussian distribution; Image recording; Polynomial approximation; Time series; Mathematical models
English abstract
Over the last decade, interest in time series data mining has grown steadily because of its variety of real-world applications. Numerous representation models of time series have been proposed for data mining, including piecewise polynomial models, spectral models, and the more recent symbolic models, such as Symbolic Aggregate approXimation (SAX) and its multiresolution extension, indexable Symbolic Aggregate approXimation (iSAX). Despite the advantages of dimensionality/numerosity reduction and lower-bounding distance measures, the quality of the SAX approximation depends heavily on the time series being Gaussian distributed, especially at reduced dimensionality. In this paper, we introduce a novel adaptive symbolic approach that combines SAX with the k-means algorithm, which we call adaptive SAX (aSAX). The proposed representation greatly outperforms classic SAX not only on highly Gaussian-distributed datasets but also on datasets that are not Gaussian distributed, across a variety of dimensionality reductions. In addition to showing that aSAX is competitive with, or superior to, classic SAX, we extend aSAX to a multiresolution symbolic representation called indexable adaptive SAX (iaSAX). Our empirical experiments with real-world time series datasets confirm the theoretical analyses as well as the efficiency of the two proposed algorithms in terms of tightness of lower bound, pruning power, and number of random disk accesses. © 2010 IEEE.
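
The abstract describes aSAX only as a combination of SAX and k-means, so the following Python sketch is an illustration under stated assumptions, not the authors' procedure. It assumes that classic SAX discretizes Piecewise Aggregate Approximation (PAA) values with equiprobable standard-normal breakpoints, and that the adaptive variant refines those breakpoints with a one-dimensional k-means (Lloyd) iteration over training PAA values. The function names (`paa`, `gaussian_breakpoints`, `adaptive_breakpoints`, `symbolize`), the empty-interval fallback, and the requirement that the alphabet size be at least 2 are hypothetical choices made for this example.

```python
from bisect import bisect_right
from statistics import NormalDist, mean


def paa(series, segments):
    """Piecewise Aggregate Approximation: mean of each equal-width segment.

    Simplifying assumption: len(series) is a multiple of segments.
    """
    width = len(series) // segments
    return [mean(series[i * width:(i + 1) * width]) for i in range(segments)]


def gaussian_breakpoints(alphabet_size):
    """Classic SAX breakpoints: equiprobable cuts of the standard normal."""
    normal = NormalDist()
    return [normal.inv_cdf(i / alphabet_size) for i in range(1, alphabet_size)]


def adaptive_breakpoints(values, alphabet_size, max_iterations=50):
    """Assumed aSAX-style training: 1-D k-means (Lloyd) over PAA values.

    Starts from the Gaussian breakpoints and alternates between assigning
    values to intervals and moving each breakpoint to the midpoint of the
    two neighbouring interval means. Requires alphabet_size >= 2.
    """
    breakpoints = gaussian_breakpoints(alphabet_size)
    for _ in range(max_iterations):
        buckets = [[] for _ in range(alphabet_size)]
        for value in values:
            buckets[bisect_right(breakpoints, value)].append(value)
        centroids = []
        for i, bucket in enumerate(buckets):
            if bucket:
                centroids.append(mean(bucket))
            else:
                # Empty interval: reuse an adjacent breakpoint so the
                # centroids stay in non-decreasing order (hypothetical choice).
                centroids.append(breakpoints[min(i, alphabet_size - 2)])
        updated = [(a + b) / 2 for a, b in zip(centroids, centroids[1:])]
        if updated == breakpoints:
            break  # converged: assignments can no longer change
        breakpoints = updated
    return breakpoints


def symbolize(values, breakpoints):
    """Map each value to the index of the interval it falls into."""
    return [bisect_right(breakpoints, v) for v in values]
```

In use, a z-normalized series would first be reduced with `paa`, and the resulting values passed to `symbolize` with either set of breakpoints. On data that is far from Gaussian, the adapted breakpoints move toward where the values actually concentrate, which is the intuition behind the abstract's claim; how the paper trains the breakpoints and derives its lower-bounding distance is not stated in this record.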
