Statistics in Biosciences, vol. 16, no. 3, pp. 634-667, 2024 (ESCI)
Count data, represented as integers, are commonly encountered in many scientific fields, for example the number of species in a habitat, the number of accidents at a junction, or the number of infected cells. Such data often contain zero counts, which can be notably prevalent within the dataset. Recently, the zero-inflated Bell distribution family has been introduced to address the substantial occurrence of zeros in count datasets. Model diagnosis is a crucial step in ensuring that a fitted model is appropriate for the given data. Although Pearson and deviance residuals are commonly used to diagnose count models in practice, it is widely acknowledged that these residuals are not normally distributed when applied to count data. In the present study, we evaluate the effectiveness of conventional diagnostic tools, namely Pearson and deviance residuals, as well as randomized quantile residuals (RQRs), for the novel Bell and zero-inflated Bell models, which have been proposed to address overdispersion and zero inflation, respectively. Through this investigation, we aim to clarify how these residuals perform for these newly proposed models. In the simulation study, the normality of the RQRs, assessed through Shapiro-Wilk test p-values, is investigated as a means of detecting overdispersion and zero inflation in Bell-type regression models. The findings indicate that RQRs are superior in detecting violations of distributional assumptions and that they can detect overdispersion and zero inflation under Bell-type models. In the application part of the study, the number of infected blood cells is used to illustrate residual diagnostics for Bell-type regression models. Poisson, Bell, negative binomial, and their zero-inflated versions are used to analyze the infected blood cells dataset.
Model fit criteria are employed to compare the analysis results of these count models, both in terms of goodness of fit and residual diagnostics.
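The diagnostic idea described in the abstract, computing randomized quantile residuals and checking their normality with the Shapiro-Wilk test, can be illustrated with a minimal sketch. This is not the authors' code. Because SciPy does not provide a Bell distribution, the sketch substitutes a Poisson model as a stand-in; the function name `poisson_rqr`, the sample size, and the simulation parameters are all hypothetical choices made for illustration. For a discrete response, an RQR is obtained by drawing a uniform value between F(y - 1) and F(y) under the fitted model and mapping it through the standard normal quantile function.

```python
import numpy as np
from scipy import stats


def poisson_rqr(y, mu, rng):
    """Randomized quantile residuals under a Poisson(mu) model (stand-in for Bell)."""
    y = np.asarray(y)
    lower = stats.poisson.cdf(y - 1, mu)  # F(y - 1); evaluates to 0 when y = 0
    upper = stats.poisson.cdf(y, mu)      # F(y)
    u = rng.uniform(lower, upper)         # randomize within the jump of the CDF
    u = np.clip(u, 1e-12, 1 - 1e-12)      # keep the normal quantile finite
    return stats.norm.ppf(u)


rng = np.random.default_rng(2024)
n = 500  # hypothetical sample size

# Case 1: data truly Poisson, so the fitted Poisson model is adequate.
y_ok = rng.poisson(3.0, size=n)
r_ok = poisson_rqr(y_ok, y_ok.mean(), rng)
p_ok = stats.shapiro(r_ok).pvalue

# Case 2: zero-inflated data (40% structural zeros) fitted with a plain Poisson.
structural_zero = rng.random(n) < 0.4
y_zi = np.where(structural_zero, 0, rng.poisson(5.0, size=n))
r_zi = poisson_rqr(y_zi, y_zi.mean(), rng)
p_zi = stats.shapiro(r_zi).pvalue

print(f"Shapiro-Wilk p-value, adequate model:      {p_ok:.4f}")
print(f"Shapiro-Wilk p-value, ignored zero excess: {p_zi:.2e}")
```

When the model is adequate, the RQRs should behave approximately like a standard normal sample, whereas ignoring zero inflation piles residuals into the lower tail and the Shapiro-Wilk test rejects normality. Applying the same check to a Bell or zero-inflated Bell fit would require replacing the Poisson CDF with the CDF of the fitted model.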