Functional Data Clustering Method Based on Shape Information and Functional Mahalanobis Distance

Authors

  • Zibing Wang

DOI:

https://doi.org/10.54097/2a2gam15

Keywords:

Functional data, Clustering algorithm, Nonparametric clustering, Derivative function information

Abstract

Functional data can provide insight into the internal structure of the data, and their derivative functions make it possible to extract additional features. This paper proposes a nonparametric clustering method for functional data that incorporates first-order and second-order derivative information into the functional Mahalanobis distance. The method builds on the traditional K-means algorithm and is designed for functional data that carry rich shape information. The proposed algorithm is evaluated by comparing its purity and adjusted Rand index against six other clustering algorithms on three different datasets. The results show that the proposed algorithm outperforms the other algorithms, which suggests that it is broadly applicable and of practical value for real-world problems.
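The general idea can be illustrated with a minimal Python sketch. It is not the paper's exact formulation, which the abstract does not specify. The sketch assumes curves sampled on a common grid, approximates derivatives by finite differences, and uses a truncated principal-component version of a Mahalanobis-type distance. The function names, the weights `w1` and `w2`, and the truncation level `q` are hypothetical choices made for illustration.

```python
import numpy as np


def stack_shape_features(curves, t, w1=1.0, w2=1.0):
    """Stack each curve with its first and second derivatives.

    curves: array of shape (n_curves, n_points) sampled on the common grid t.
    w1, w2: hypothetical weights for the derivative blocks (assumption).
    """
    d1 = np.gradient(curves, t, axis=1)  # finite-difference first derivative
    d2 = np.gradient(d1, t, axis=1)      # finite-difference second derivative
    return np.hstack([curves, w1 * d1, w2 * d2])


def mahalanobis_scores(features, q):
    """Project onto the top q principal components and scale to unit variance.

    Euclidean distance between the returned rows equals a truncated
    Mahalanobis-type distance between the original feature vectors.
    """
    centered = features - features.mean(axis=0)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    eigvals = s[:q] ** 2 / (len(features) - 1)  # sample covariance eigenvalues
    return (centered @ vt[:q].T) / np.sqrt(eigvals)


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's K-means with random initial centers; returns labels."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Keep the old center if a cluster becomes empty.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def purity(labels_true, labels_pred):
    """Fraction of samples that belong to the majority class of their cluster."""
    labels_true = np.asarray(labels_true)
    labels_pred = np.asarray(labels_pred)
    total = 0
    for c in np.unique(labels_pred):
        _, counts = np.unique(labels_true[labels_pred == c], return_counts=True)
        total += counts.max()
    return total / len(labels_true)
```

A typical call chain under these assumptions would be `kmeans(mahalanobis_scores(stack_shape_features(curves, t), q=2), k=2)`, followed by `purity(true_labels, labels)` when ground-truth labels are available. Because finite differences amplify noise, noisy curves would normally be smoothed (for example with a spline basis) before differentiation.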

Published

26-01-2024

How to Cite

Wang, Z. (2024). Functional Data Clustering Method Based on Shape Information and Functional Mahalanobis Distance. Highlights in Science, Engineering and Technology, 82, 223-229. https://doi.org/10.54097/2a2gam15