Comparative Study of an Ensemble Machine Learning Model Versus Maximum Likelihood Model to Assess Reliability Measures in Right Censored Data Analysis

Document Type: Original Scientific Paper

Authors

1 Department of Statistics, Faculty of Mathematical Science, University of Kashan, Kashan, I. R. Iran

2 Department of Computer Science, Faculty of Mathematical Science, University of Kashan, Kashan, I. R. Iran

DOI: 10.22052/mir.2024.254968.1465

Abstract

This paper explores the estimation of a new power function distribution under Type-II right censoring using two methods: maximum likelihood estimation (MLE) and an ensemble machine learning model based on stacking. The study assesses the effectiveness of both methods in estimating various reliability measures, such as the hazard rate, mean residual life, variance residual life, mean inactivity time, and variance inactivity time. The stacking model integrates five base models (a radial basis function neural network, random forest, support vector regression (SVR), a multilayer perceptron (MLP), and gradient boosting regression trees), with a radial basis function neural network serving as the meta-learner for final predictions. Numerical experiments compare the performance of the stacking model against MLE for Type-II censored data. Results indicate that the stacking model significantly improves the accuracy of reliability measure predictions, showcasing its potential as a robust tool for reliability analysis under Type-II censoring.
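
To make the two estimation routes concrete, the sketches below illustrate each under stated assumptions; they are not the authors' implementation.

First, maximum likelihood under Type-II right censoring. Since the paper's new power function distribution is not reproduced on this page, the classical power-function law F(x) = (x/theta)^alpha on (0, theta) stands in, with theta treated as known. The censored log-likelihood keeps the first r of n ordered failure times plus a survival term for the n - r unfailed units.

```python
# Hedged sketch: MLE from a Type-II right-censored sample. The classical
# power-function distribution (F(x) = (x/theta)**alpha, 0 < x < theta) is an
# illustrative stand-in for the paper's new model; theta is assumed known.
import numpy as np
from scipy.optimize import minimize_scalar

theta = 1.0                  # assumed known scale parameter
n, r = 50, 40                # n units on test; the experiment stops at failure r
rng = np.random.default_rng(1)
sample = theta * rng.uniform(size=n) ** (1 / 2.5)  # simulate with true alpha = 2.5
x = np.sort(sample)[:r]      # only the r smallest order statistics are observed

def neg_loglik(alpha):
    # Type-II censored log-likelihood (multiplicative constant dropped):
    #   sum_{i<=r} log f(x_(i)) + (n - r) * log[1 - F(x_(r))]
    log_f = np.log(alpha) + (alpha - 1) * np.log(x) - alpha * np.log(theta)
    log_surv = np.log1p(-(x[-1] / theta) ** alpha)
    return -(log_f.sum() + (n - r) * log_surv)

mle = minimize_scalar(neg_loglik, bounds=(1e-3, 20.0), method="bounded")
print("MLE of alpha:", mle.x)
```

Second, the stacking ensemble. scikit-learn ships no dedicated radial basis function network, so a Gaussian process regressor with an RBF kernel serves as a stand-in for both the RBF base learner and the meta-learner; all hyperparameters and the synthetic training data are illustrative assumptions rather than the paper's configuration.

```python
# Hedged sketch of the five-base-model stack described in the abstract.
import numpy as np
from sklearn.ensemble import (GradientBoostingRegressor, RandomForestRegressor,
                              StackingRegressor)
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR

# Hypothetical training set: features summarizing a censored sample, target a
# reliability measure such as the hazard rate at a fixed time.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 3))
y = np.exp(-X.sum(axis=1)) + rng.normal(scale=0.01, size=200)

base_learners = [
    ("rbf_net", GaussianProcessRegressor(kernel=RBF())),  # RBF-network stand-in
    ("rf", RandomForestRegressor(n_estimators=200, random_state=0)),
    ("svr", SVR(kernel="rbf")),
    ("mlp", MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)),
    ("gbrt", GradientBoostingRegressor(random_state=0)),
]

# The meta-learner is fit on cross-validated predictions of the base models,
# mirroring the stacked-generalization scheme the abstract describes.
stack = StackingRegressor(
    estimators=base_learners,
    final_estimator=GaussianProcessRegressor(kernel=RBF()),
    cv=5,
)
stack.fit(X, y)
print("In-sample R^2 of the stack:", round(stack.score(X, y), 3))
```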
