Graduation Year

2015

Document Type

Dissertation

Degree

Ph.D.

Degree Name

Doctor of Philosophy (Ph.D.)

Degree Granting Department

Computer Science and Engineering

Major Professor

Xiaoning Qian, Ph.D.

Co-Major Professor

Lawrence O. Hall, Ph.D.

Committee Member

Dmitry Goldgof, Ph.D.

Committee Member

Bo Zeng, Ph.D.

Committee Member

Hye-Seung Lee, Ph.D.

Keywords

computational biology, disease stratification, heterogeneity, interaction, networks

Abstract

In a modern systematic view of biology, cell functions arise from interactions among molecular components. One of the challenging problems in systems biology with high-throughput measurements is discovering the important components involved in the development and progression of complex diseases, which may serve as biomarkers for accurate predictive modeling and as targets for therapeutic purposes. Due to the non-linearity and heterogeneity of these complex diseases, traditional biomarker identification approaches have had limited success at finding clinically useful biomarkers. In this dissertation, we propose novel methods for biomarker identification that explicitly take into account the non-linearity and heterogeneity of complex diseases. We first focus on methods that address non-linearity by taking into account the interactions among features with respect to the disease outcome of interest. We then focus on methods for finding disease subtypes and their subtype-specific biomarkers in heterogeneous diseases, where we show how prior biological knowledge and simultaneous disease stratification and personalized biomarker identification can help achieve better performance. We develop novel computational methods for more accurate and robust biomarker identification, including methods for estimating interactive effects, a network-based feature ranking algorithm that takes into account the interactive effects between biomarkers, different approaches for finding distances between somatic mutation profiles for better disease stratification using prior knowledge, and a network-regularized bi-clique finding algorithm for simultaneous subtype and biomarker identification. Our experimental results show that the proposed methods outperform state-of-the-art methods on both problems.
