Graduation Year


Document Type




Degree Granting Department

Public Health

Major Professor

Thomas J. Mason, Ph.D.


Survival time, Prognostic factors, Molecular markers, SAM, mRNA


Prediction of outcome in colorectal cancer (CRC) is currently based on the TNM staging classification; however, histopathological classification alone is insufficient for accurately predicting survival in stage II and III patients. Studies indicate that microarray gene expression profiles can predict survival in CRC. We hypothesize that tumor gene expression, in combination with clinical parameters, is a better predictor of outcome in stage II and III colorectal cancers than the TNM stage classification alone. Clinical records and follow-up data were retrospectively reviewed for 58 stage II and stage III patients with primary colorectal cancer who did not receive neoadjuvant therapy and whose samples had previously been analyzed for gene expression profiles using the Affymetrix U133A GeneChip.

To classify patients molecularly as being at high or low risk for poor survival, samples were divided into two clusters by hierarchical cluster analysis of genes selected by Significance Analysis of Microarrays (SAM). Univariate and multivariate analyses using Cox proportional hazards models were performed to identify significant prognostic factors. The 3-year and 5-year survival estimates were 72.41% (SE = 5.8%) and 55.17% (SE = 6.7%), respectively, for all 58 patients. Univariate analysis showed that advanced stage, older age, high-risk molecular classification, and positive lymph nodes were statistically significant prognostic factors for poor survival (p < 0.05), while gender, preoperative CEA level, and family history of CRC in first-degree relatives were not statistically significant. In multivariate analysis, molecular classification, age, and body mass index were independent significant prognostic factors. In the Cox proportional hazards model, the estimated hazard ratios were 2.45 (95% CI: 0.85-7.04) for stage III vs. II, 3.83 (95% CI: 1.22-12.06) for high vs. low molecular risk, and 3.72 (95% CI: 1.2-11.49) for old vs. young age.
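The reported hazard ratios and confidence intervals can be read back onto the log scale, where Cox model estimates are approximately normal. The following is a minimal sketch, not the analysis used in this study: it assumes the intervals are standard Wald intervals (estimate ± 1.96 standard errors on the log hazard ratio), and the function name `wald_summary` is hypothetical. Under that assumption it recovers an approximate standard error, z statistic, and two-sided p-value for each reported hazard ratio.

```python
import math

def wald_summary(hr, ci_low, ci_high, z_crit=1.959964):
    """Approximate SE, Wald z, and two-sided p from a hazard ratio and its 95% CI.

    Assumes the interval is a Wald interval, i.e. symmetric on the log scale.
    """
    log_hr = math.log(hr)
    # The width of the interval on the log scale equals 2 * z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_hr / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Hazard ratios and 95% confidence intervals as reported in the abstract.
reported = {
    "Stage III vs II": (2.45, 0.85, 7.04),
    "High vs low molecular risk": (3.83, 1.22, 12.06),
    "Old vs young age": (3.72, 1.2, 11.49),
}

for label, (hr, low, high) in reported.items():
    se, z, p = wald_summary(hr, low, high)
    print(f"{label}: SE(log HR) = {se:.3f}, z = {z:.2f}, approximate p = {p:.3f}")
```

Under these assumptions, the interval for stage includes 1, so its approximate p-value exceeds 0.05, while the intervals for molecular risk and age exclude 1 and give approximate p-values below 0.05. The values are back-calculated from rounded figures and will not exactly match the fitted model.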

A model containing clinical stage in conjunction with molecular risk, body mass index, and age was a stronger indicator of clinical outcome (p = 0.0056) than a model with clinical stage alone. Gene expression profiles predict survival independently of clinical parameters, and adding gene expression profiles to stage is more predictive of survival than stage alone. Further analysis is needed to validate the molecular classification on an independent dataset.