Graduation Year


Document Type




Degree Name

Doctor of Philosophy (Ph.D.)

Degree Granting Department

Mathematics and Statistics

Major Professor

Chris P. Tsokos, Ph.D.

Committee Member

Getachew A. Dagne, Ph.D.

Committee Member

Kandethody M. Ramachandran, Ph.D.

Committee Member

Lu Lu, Ph.D.


Beta-Amyloid, Carbon Dioxide, Copula, Parametric Analysis, Tau Protein, Time Series


The importance and applicability of data-driven statistical models have increased significantly. In the current study, we utilize statistical techniques in interdisciplinary research spanning environmental and health sciences.

Environmentally, global warming is considered one of the critical issues facing our planet. It is the increase in average global temperatures caused mostly by increases in carbon dioxide (CO2). The excessive rise of carbon dioxide above its average level, a side effect of the industrial revolution, has a significant impact on trapping heat and increasing the temperature within the Earth’s atmosphere. Based on the record of total CO2 emissions from fossil fuel burning and cement production in 2014, Saudi Arabia ranked as the 8th largest carbon dioxide emitter in the world, and some other Middle Eastern countries are in the top 50.

In the first part of the study, we developed a data-driven nonlinear statistical model to identify the significant contributors among the types of fossil fuel (gas fuel, liquid fuel, and solid fuel), cement manufacture, and gas flaring, together with their possible interactions, and ranked them by their percentage contribution to atmospheric CO2 concentrations in the Middle East. We then compared the findings with those of the United States, the European Union, and South Korea.

Second, the multiplicative seasonal autoregressive integrated moving average (seasonal ARIMA) model is used to develop statistical time series forecasting models that predict atmospheric carbon dioxide in the Middle East and atmospheric temperature in Saudi Arabia. The resulting predictive models are useful for forecasting and monitoring future levels of carbon dioxide emissions and for extracting meaningful statistics and characteristics about carbon dioxide emissions in the Middle East.
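
For reference, the general multiplicative seasonal ARIMA(p, d, q) x (P, D, Q)s model can be written in backshift notation as shown below. The orders and the seasonal period are not stated in this abstract, so every symbol here is a generic placeholder rather than a fitted value, and the signs follow the Box-Jenkins convention.

```latex
% General multiplicative seasonal ARIMA(p,d,q) x (P,D,Q)_s model.
% x_t: observed series, B: backshift operator (B x_t = x_{t-1}),
% s: seasonal period, \varepsilon_t: white-noise error term.
\[
  \phi_p(B)\,\Phi_P(B^{s})\,(1-B)^{d}\,(1-B^{s})^{D}\,x_t
  = \theta_q(B)\,\Theta_Q(B^{s})\,\varepsilon_t
\]
% Nonseasonal and seasonal autoregressive polynomials:
\[
  \phi_p(B) = 1 - \phi_1 B - \cdots - \phi_p B^{p}, \qquad
  \Phi_P(B^{s}) = 1 - \Phi_1 B^{s} - \cdots - \Phi_P B^{Ps}
\]
% Nonseasonal and seasonal moving-average polynomials:
\[
  \theta_q(B) = 1 - \theta_1 B - \cdots - \theta_q B^{q}, \qquad
  \Theta_Q(B^{s}) = 1 - \Theta_1 B^{s} - \cdots - \Theta_Q B^{Qs}
\]
```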

In health science, Alzheimer’s disease is one of the most critical diseases facing our planet: its prevalence is increasing rapidly as the population ages, and its diagnosis is still poorly understood. Thus, there is a tremendous need for biomarkers that support reliable diagnosis and help in finding a treatment for this severe disease. Hence, the main aim of this study is to use information from baseline measurements to develop a statistical prediction model, based on multiple logistic regression, that distinguishes patients with Alzheimer’s disease from cognitively normal individuals. Our optimal predictive model includes five risk factors and two interaction terms and has been evaluated using classification accuracy, sensitivity, specificity, and area under the curve.
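
The four evaluation criteria named above can be computed from a classifier's predicted probabilities as in the following minimal sketch. It is an illustration only, not the evaluation code of this study: the function name `classification_metrics`, the 0.5 threshold, and the six example cases are hypothetical, and labels are assumed to be coded 1 for Alzheimer’s disease and 0 for cognitively normal.

```python
def classification_metrics(y_true, y_prob, threshold=0.5):
    """Return accuracy, sensitivity, specificity at a threshold, plus AUC.

    y_true: 0/1 labels (assumed coding: 1 = disease, 0 = cognitively normal).
    y_prob: predicted probabilities of class 1, e.g. from a logistic model.
    """
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1 and pred == 0:
            fn += 1
        elif y == 0 and pred == 1:
            fp += 1
        else:
            tn += 1

    # AUC as the probability that a randomly chosen positive case receives
    # a higher score than a randomly chosen negative case (ties count half).
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(
        1.0 if pp > pn else 0.5 if pp == pn else 0.0
        for pp in pos
        for pn in neg
    )

    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "auc": wins / (len(pos) * len(neg)),
    }


# Hypothetical toy data, not taken from the study.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1]
metrics = classification_metrics(labels, scores)
```

Accuracy, sensitivity, and specificity depend on the chosen threshold, whereas the area under the curve summarizes performance over all thresholds, which is one reason the measures are usually reported together.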

Finally, because researchers and scientists have suggested that abnormal levels of beta-amyloid and phosphorylated tau (P-tau) proteins are among the possible causes of Alzheimer’s disease, we performed a parametric statistical analysis of the beta-amyloid and P-tau protein levels of Alzheimer’s patients to understand their probabilistic behavior independently. This analysis involves identifying the probability distribution function that characterizes the behavior of each variable of interest. Having identified such a probability function, we can obtain useful information about the two proteins, such as the expected numerical value and confidence level of the beta-amyloid and P-tau proteins. The second main aim of this study is to explore their probabilistic behavior as correlated variables by establishing their bivariate probability distribution function. A copula method is proposed to model the joint probability density function of both proteins with the given marginals and correlation coefficient. Researchers working on Alzheimer’s data usually characterize the probability distribution function (pdf) of beta-amyloid and P-tau protein levels with the popular Gaussian pdf. However, the symmetry that the Gaussian pdf requires does not hold for these data, so results based on it will be misleading. Instead, the distributions that best fit the levels of beta-amyloid and P-tau proteins are the three-parameter log-logistic probability distribution and the three-parameter Weibull probability distribution, respectively.
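
The following sketch illustrates how a copula can couple a three-parameter log-logistic marginal with a three-parameter Weibull marginal. It rests on several assumptions that do not come from this abstract: a Gaussian copula is used purely for illustration (the copula family is not named here), the (shape, scale, location) parameterization is one common convention, and the function names, parameter values, and dependence parameter `rho` are hypothetical.

```python
import math
import random
from statistics import NormalDist


def loglogistic3_cdf(x, shape, scale, loc):
    """CDF of a three-parameter log-logistic distribution (support x > loc)."""
    if x <= loc:
        return 0.0
    return 1.0 / (1.0 + ((x - loc) / scale) ** (-shape))


def loglogistic3_quantile(u, shape, scale, loc):
    """Inverse CDF of the three-parameter log-logistic distribution."""
    return loc + scale * (u / (1.0 - u)) ** (1.0 / shape)


def weibull3_cdf(x, shape, scale, loc):
    """CDF of a three-parameter Weibull distribution (support x > loc)."""
    if x <= loc:
        return 0.0
    return 1.0 - math.exp(-(((x - loc) / scale) ** shape))


def weibull3_quantile(u, shape, scale, loc):
    """Inverse CDF of the three-parameter Weibull distribution."""
    return loc + scale * (-math.log(1.0 - u)) ** (1.0 / shape)


def sample_gaussian_copula(n, rho, abeta_params, ptau_params, seed=0):
    """Draw n (beta-amyloid, P-tau) pairs joined by a Gaussian copula.

    rho is the correlation of the latent normal pair (an assumed
    dependence parameter); each *_params is (shape, scale, loc).
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    eps = 1e-12  # keep uniforms strictly inside (0, 1)
    pairs = []
    for _ in range(n):
        # Correlated standard normals via a 2x2 Cholesky factor.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # The normal CDF turns them into dependent uniforms (the copula).
        u1 = min(max(std_normal.cdf(z1), eps), 1.0 - eps)
        u2 = min(max(std_normal.cdf(z2), eps), 1.0 - eps)
        # Inverse marginal CDFs impose the chosen marginals.
        pairs.append((
            loglogistic3_quantile(u1, *abeta_params),
            weibull3_quantile(u2, *ptau_params),
        ))
    return pairs


# Hypothetical parameters chosen only to make the sketch runnable.
ABETA = (4.0, 500.0, 100.0)   # log-logistic: shape, scale, location
PTAU = (2.0, 30.0, 5.0)       # Weibull: shape, scale, location
samples = sample_gaussian_copula(2000, -0.4, ABETA, PTAU, seed=1)
```

Because the dependence enters only through the uniforms, either marginal can be replaced without changing the copula, which is what allows skewed marginals such as these to be modeled jointly without a Gaussian assumption on the data themselves.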