Regression models are widely used in epidemiological studies to understand the link between specific outcomes of interest and their risk factors. However, regression models generally examine the average effects of the risk factors and ignore subgroups with different risk profiles. As a result, interventions are often geared towards the average member of the population, without consideration of the special health needs of different subgroups within the population. This paper demonstrates the value of using rule-based analysis methods that can identify subgroups with heterogeneous risk profiles in a population without imposing assumptions on the subgroups or method. The rules define the risk pattern of subsets of individuals by considering not only the interactions between the risk factors but also their ranges. We compared the rule-based analysis results with the results from a logistic regression model in The Environmental Determinants of Diabetes in the Young (TEDDY) study. Both methods detected a similar suite of risk factors, but the rule-based analysis was superior at detecting multiple interactions between the risk factors that characterize the subgroups. Further investigation of the particular characteristics of each subgroup may reveal its special health needs and lead to tailored interventions.
This work is licensed under a Creative Commons Attribution 4.0 License.
Citation / Publisher Attribution
Scientific Reports, v. 6, art. 30828
Scholar Commons Citation
Haghighi, Mona; Johnson, Suzanne B.; Qian, Xiaoning; Lynch, Kristian F.; Vehik, Kendra; Huang, Shuai; and The TEDDY Study Group, "A Comparison of Rule-based Analysis with Regression Methods in Understanding the Risk Factors for Study Withdrawal in a Pediatric Study" (2016). Industrial and Management Systems Engineering Faculty Publications. 51.