Degree Granting Department

Epidemiology and Biostatistics

Major Professor

Getachew A. Dagne


Keywords

Bayesian approach, bivariate ordered probit, MCMC, WinBUGS, zero-inflation


Multivariate ordinal response data, such as severity of pain, degree of disability, and satisfaction with a healthcare provider, are prevalent in many areas of research, including public health, biomedical, and social science research. Ignoring the multivariate features of the response variables, that is, not taking the correlation between the errors across models into account, may lead to substantially biased estimates and inference. In addition, such multivariate ordinal outcomes frequently exhibit a high percentage of zeros (zero inflation) at the lower end of the ordinal scales, compared with what is expected under a multivariate ordinal distribution. Thus, zero inflation coupled with the multivariate structure makes it difficult to analyze such data and properly interpret the results. Methods that have been developed to address zero-inflated data are limited to univariate logit or univariate probit models, and extension to bivariate (or multivariate) probit models has been very limited to date.

In this research, a latent variable approach was used to develop a Mixture Bivariate Zero-Inflated Ordered Probit (MBZIOP) model. A Bayesian MCMC technique was used for parameter estimation. A simulation study was then conducted to compare the performance of the estimators of the proposed model with that of two existing models. The simulation study suggested that, for data with at least a moderate proportion of zeros in the bivariate responses, the proposed model performed better than the comparison models in terms of both lower bias and greater accuracy (lower RMSE).

Finally, the proposed method was illustrated with a publicly available drug-abuse dataset to identify highly probable predictors of (i) being a user or nonuser of marijuana, cocaine, or both; and (ii) conditional on user status, the level of consumption of these drugs. The results from the analysis suggested that older individuals, smokers, and people with a prior criminal background have a higher risk of being a marijuana-only user or a user of both drugs. In contrast, being a cocaine-only user was predicted by younger age and prior involvement with the criminal-justice system. Given that an individual is a marijuana-only user or a user of both drugs, age appears to have an inverse effect on the latent level of consumption of marijuana as well as cocaine. Similarly, given that a respondent is a cocaine-only user, all covariates (age, involvement in criminal activities, and being of black race) are strong predictors of the level of cocaine consumption. The finding of older age being associated with higher drug consumption may represent a survival bias whereby previous younger users with high consumption may have been at elevated risk of premature mortality. The analysis also indicated that, among users of both drugs, Black respondents are likely to use less marijuana but have a higher latent level of cocaine consumption.
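The abstract does not give the MBZIOP specification, so the following is only a minimal sketch of the general idea behind a zero-inflated bivariate ordered probit, not the author's model. It assumes four latent user regimes (nonuser, outcome-1 only, outcome-2 only, both), correlated standard normal errors, shared cutpoints, and no covariates. The function names, regime probabilities, and cutpoints are all hypothetical choices made for illustration.

```python
import math
import random


def to_ordinal(latent, cutpoints):
    """Map a latent value to an ordinal category: the number of cutpoints it exceeds."""
    return sum(latent > c for c in cutpoints)


def simulate_zi_bivariate_ordinal(n, rho=0.5,
                                  regime_probs=(0.5, 0.2, 0.1, 0.2),
                                  cutpoints=(0.0, 1.0), seed=0):
    """Simulate n pairs (y1, y2) of zero-inflated bivariate ordinal responses.

    regime_probs are the assumed probabilities of the four regimes
    (nonuser, outcome-1 only, outcome-2 only, both). These values are
    illustrative assumptions, not estimates from the dissertation.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        regime = rng.choices(range(4), weights=regime_probs)[0]
        # Correlated standard normal errors with correlation rho.
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        e1 = z1
        e2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * z2
        # A structural zero is recorded unless the regime "switches on" the outcome;
        # users can still report category 0 through the ordered probit part.
        y1 = to_ordinal(e1, cutpoints) if regime in (1, 3) else 0
        y2 = to_ordinal(e2, cutpoints) if regime in (2, 3) else 0
        data.append((y1, y2))
    return data


sample = simulate_zi_bivariate_ordinal(2000, seed=1)
zero_share_y1 = sum(y1 == 0 for y1, _ in sample) / len(sample)
```

Under these assumed settings, zeros in the first outcome come from two sources: respondents in a regime that excludes that outcome, and users whose latent consumption falls below the first cutpoint. This is why the observed share of zeros exceeds what a plain ordered probit with the same cutpoints would produce.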

Included in

Biostatistics Commons