Graduation Year

2017

Document Type

Dissertation

Degree

Ph.D.

Degree Name

Doctor of Philosophy (Ph.D.)

Degree Granting Department

Mathematics and Statistics

Major Professor

Chris P. Tsokos, Ph.D.

Committee Member

Kandethody Ramachandran, Ph.D.

Committee Member

Lu Lu, Ph.D.

Committee Member

Dan Shen, Ph.D.

Keywords

Artificial Neural Networks, Bayesian Analysis, Diagnostic Models in Health, Survival Predictions, Vulnerability prediction models

Abstract

In the era of big data, the applicability and importance of data-driven models such as artificial neural networks (ANNs) in modern statistics have increased substantially. In this dissertation, our main goal is to contribute to the development and expansion of these ANN models by incorporating Bayesian learning techniques. We demonstrate the applicability of these Bayesian ANN models in interdisciplinary research, including health and cybersecurity.

Breast cancer is one of the leading causes of death among women. Early and accurate diagnosis is critical to patient survival. Numerous efforts, including the well-known "Gail Model", have been made to quantify the risk of being diagnosed with malignant breast cancer. However, these models have limitations when used for risk prediction. In this dissertation, we have developed an ANN-based diagnosis model that identifies potential breast cancer patients from their demographic factors and previous mammogram results. While developing the model, we applied Bayesian regularization (the evidence procedure), along with the automatic relevance determination (ARD) prior, to minimize network over-fitting. The optimal Bayesian network has 81% overall accuracy in correctly classifying the actual status of breast cancer patients, 59% sensitivity in detecting malignancy, and 83% specificity in detecting non-malignancy. The area under the receiver operating characteristic curve (0.7940) indicates that this is a moderate classification model.
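
To make the three reported metrics concrete, the following minimal sketch shows how accuracy, sensitivity, and specificity are computed from confusion-matrix counts. The function name `classification_metrics` and the counts are hypothetical and chosen only for illustration; they are not the dissertation's data.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity, and specificity from confusion-matrix counts."""
    positives = tp + fn  # truly malignant cases
    negatives = tn + fp  # truly non-malignant cases
    return {
        "accuracy": (tp + tn) / (positives + negatives),
        "sensitivity": tp / positives,   # share of malignant cases detected
        "specificity": tn / negatives,   # share of non-malignant cases detected
    }


# Hypothetical counts for illustration only (not the dissertation's data).
m = classification_metrics(tp=30, fn=20, tn=400, fp=50)
prevalence = (30 + 20) / 500

# Accuracy equals the prevalence-weighted average of sensitivity and specificity.
weighted = prevalence * m["sensitivity"] + (1 - prevalence) * m["specificity"]
print(m, weighted)
```

Because accuracy is a prevalence-weighted average, it tends to sit close to specificity when malignant cases are a minority of the sample, which is one way overall accuracy can be high while sensitivity is noticeably lower.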

We then present a new Bayesian ANN model for nonlinear Poisson regression, which can be used to model count data. Here, we summarize all the important steps involved in developing the ANN model, including the forward-propagation, backward-propagation, and error gradient calculations of the newly developed network. As part of this, we have introduced a new activation function in the output layer of the ANN, together with an error-minimization criterion, for count data. Moreover, we have expanded our model to incorporate Bayesian learning techniques. The performance of our model is tested using simulated data.
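
The forward-propagation, backward-propagation, and gradient steps named above can be sketched for a small network as follows. This is an illustration under stated assumptions, not the dissertation's network: the abstract does not specify the output activation or the error criterion, so the sketch assumes an exponential output unit (which keeps the predicted mean positive), a tanh hidden layer, and the Poisson negative log-likelihood. The names `forward`, `poisson_nll`, and `gradients` are hypothetical.

```python
import math


def forward(params, x):
    """Forward pass: tanh hidden layer, then an exponential output unit (assumed)."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(params["W1"], params["b1"])]
    a = sum(w * hj for w, hj in zip(params["w2"], h)) + params["b2"]
    return h, math.exp(a)  # exp guarantees a positive predicted mean


def poisson_nll(params, x, y):
    """Poisson negative log-likelihood of one count y, dropping the log(y!) constant."""
    _, mu = forward(params, x)
    return mu - y * math.log(mu)


def gradients(params, x, y):
    """Backward pass: gradient of poisson_nll with respect to every parameter."""
    h, mu = forward(params, x)
    delta_out = mu - y  # derivative of the loss w.r.t. the output pre-activation
    delta_hid = [delta_out * w * (1.0 - hj * hj)  # tanh'(z) = 1 - tanh(z)^2
                 for w, hj in zip(params["w2"], h)]
    return {
        "W1": [[d * xi for xi in x] for d in delta_hid],
        "b1": delta_hid,
        "w2": [delta_out * hj for hj in h],
        "b2": delta_out,
    }


# Hypothetical parameters and a single observation, for illustration only.
params = {"W1": [[0.1, -0.2], [0.3, 0.4]], "b1": [0.0, 0.1],
          "w2": [0.5, -0.3], "b2": 0.2}
grads = gradients(params, x=[1.0, 2.0], y=3)
print(grads["b2"], grads["w2"])
```

With an exponential output and the Poisson likelihood, the output-layer error term reduces to the predicted mean minus the observed count, mirroring the residual form familiar from squared-error networks. A Bayesian treatment with a Gaussian weight prior would add a weight-decay term to this loss; that term is omitted here.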

In addition, a piecewise constant hazard model is developed by extending the above nonlinear Poisson regression model under the Bayesian setting. This model can be used in place of conventional methods for accurate survival time prediction. With this, we were able to significantly improve prediction accuracy. We captured the uncertainty of our predictions by incorporating error bars, which could not be achieved with a linear Poisson model because of overdispersion in the data. We have also proposed a new hybrid learning technique, and we evaluated the performance of these learning techniques with varying numbers of hidden nodes and data sizes.
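
The link between a piecewise constant hazard model and Poisson regression rests on a standard data expansion: each subject's follow-up is split across time intervals, and each interval contributes a Poisson-type likelihood term with mean equal to hazard times exposure. The sketch below illustrates that expansion only. The function names, cut points, and hazard values are hypothetical, and it assumes the last cut point covers the full follow-up period.

```python
import math


def expand_record(time, event, cuts):
    """Split one survival record into (interval, exposure, event) rows.

    time  : observed follow-up time
    event : 1 if the event occurred at `time`, 0 if censored
    cuts  : increasing interval boundaries, e.g. [0, 1, 2, 4]
    """
    rows = []
    for k in range(len(cuts) - 1):
        start, end = cuts[k], cuts[k + 1]
        if time <= start:
            break  # subject left the risk set before this interval
        exposure = min(time, end) - start
        occurred_here = int(event == 1 and time <= end)
        rows.append((k, exposure, occurred_here))
    return rows


def piecewise_loglik(rows, hazards):
    """Log-likelihood: sum of d * log(hazard_k) - hazard_k * exposure over rows."""
    return sum(d * math.log(hazards[k]) - hazards[k] * exposure
               for k, exposure, d in rows)


# Hypothetical record: event at t = 2.5 with cut points 0, 1, 2, 4.
rows = expand_record(time=2.5, event=1, cuts=[0, 1, 2, 4])
print(rows)
print(piecewise_loglik(rows, hazards=[0.1, 0.2, 0.4]))
```

Each row's term matches a Poisson log-likelihood with mean hazard times exposure, up to a term that does not depend on the hazard. This is why a Poisson regression network with log-exposure as an offset can, in principle, be reused for survival data.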

Finally, we demonstrate the suitability of Bayesian ANN models for time series forecasting by using an online training algorithm. We have developed a vulnerability forecast model for the Linux operating system by using this approach.
