Flexible Parsimonious Mixture of Skew Factor Analysis Based on Normal Mean-Variance Birnbaum-Saunders

Document Type: Original Scientific Paper

Authors

1 Department of Statistics, University of Kashan, Kashan, I. R. Iran

2 Department of Applied Mathematics, University of Kashan, Kashan, I. R. Iran

3 Farhangian University of Kerman, Kerman, I. R. Iran

Abstract

The purpose of this paper is to extend the mixture of factor analyzers (MFA) model to handle missing and heavy-tailed data. In this model, the factors and errors are assumed to follow the multivariate normal mean-variance mixture of the Birnbaum-Saunders (NMVBS) distribution. By imposing structures on the covariance matrix, we introduce parsimonious MFA models based on the NMVBS distribution. An Expectation-Maximization (EM)-type algorithm is developed for parameter estimation. A simulation study and real data sets illustrate the efficiency and performance of the proposed model.
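As a rough illustration of the NMVBS building block mentioned above, the following minimal sketch draws samples using the usual stochastic representation Y = mu + W*lambda + sqrt(W)*Sigma^{1/2}*Z, where Z is standard normal and W follows a Birnbaum-Saunders distribution. The parameterization W ~ BS(alpha, 1), the function names, and the parameter values are assumptions made for illustration only; they are not taken from the paper, and the sketch does not implement the mixture, the factor structure, or the EM-type algorithm.

```python
import numpy as np


def sample_birnbaum_saunders(alpha, beta, size, rng):
    """Draw from BS(alpha, beta) using its standard normal representation."""
    z = rng.standard_normal(size)
    half = 0.5 * alpha * z
    # W = beta * (alpha*Z/2 + sqrt((alpha*Z/2)^2 + 1))^2 is always positive.
    return beta * (half + np.sqrt(half**2 + 1.0)) ** 2


def sample_nmvbs(mu, lam, sigma, alpha, n, rng):
    """Draw n vectors Y = mu + W*lam + sqrt(W) * L z, with L L^T = sigma.

    Assumption (not from the source): the mixing variable is W ~ BS(alpha, 1).
    """
    mu = np.asarray(mu, dtype=float)
    lam = np.asarray(lam, dtype=float)
    chol = np.linalg.cholesky(np.asarray(sigma, dtype=float))
    w = sample_birnbaum_saunders(alpha, 1.0, n, rng)
    z = rng.standard_normal((n, mu.size))
    # lam shifts the mean and induces skewness; sqrt(w) rescales the Gaussian part.
    return mu + w[:, None] * lam + np.sqrt(w)[:, None] * (z @ chol.T)


# Hypothetical parameter values chosen only for the demonstration.
rng = np.random.default_rng(0)
y = sample_nmvbs(mu=[0.0, 0.0], lam=[1.0, -0.5], sigma=np.eye(2),
                 alpha=0.8, n=5000, rng=rng)
```

Under these assumptions E[W] = 1 + alpha^2/2, so the sample mean of `y` should lie near mu + (1 + alpha^2/2)*lambda, which gives a quick sanity check on the sampler.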

