Graduation Year


Document Type




Degree Granting Department

Epidemiology and Biostatistics

Major Professor

Getachew Dagne

Co-Major Professor

Wei Wang


adolescent substance use, commensurate measuers, latent variables, longitudinal data analysis, moderated non linear factor analysis, pooled data analysis


A recent advancement in statistical methodology, Integrative Data Analyses (IDA Curran & Hussong, 2009) has led researchers to employ a calibration technique as to not violate an independence assumption. This technique uses a randomly selected, simplified correlational structured subset, or calibration, of a whole data set in a preliminary stage of analysis. However, a single calibration estimator suffers from instability, low precision and loss of power. To overcome this limitation, a multiple calibration (MC; Greenbaum et al., 2013; Wang et al., 2013) approach has been developed to produce better estimators, while still removing a level of dependency in the data as to not violate independence assumption. The MC method is conceptually similar to multiple imputation (MI; Rubin, 1987; Schafer, 1997), so MI estimators were borrowed for comparison.

A simulation study was conducted to compare the MC and MI estimators, as well as to evaluate the performance of the operating characteristics of the methods in a cross classified data characteristic design. The estimators were tested in the context of assessing change over time in a longitudinal data set. Multiple calibrations consisting of a single measurement occasion per subject were drawn from a repeated measures data set, analyzed separately, and then combined by the rules set forth by each method to produce the final results. The data characteristics investigated were effect size, sample size, and the number of repeated measures per subject. Additionally, a real data application of an MC approach in an IDA framework was conducted on data from three completed, randomized controlled trials studying the treatment effects of Multidimensional Family Therapy (MDFT; Liddle et al., 2002) on substance use trajectories for adolescents at a one year follow-up.

The simulation study provided empirical evidence of how the MC method preforms, as well as how it compares to the MI method in a total of 27 hypothetical scenarios. There were strong asymptotic tendencies observed for the bias, standard error, mean square error and relative efficiency of an MC estimator to approach the whole set estimators as the number of calibrations approached 100. The MI combination rules proved not appropriate to borrow for the MC case because the standard error formulas were too conservative and performance with respect to power was not robust. As a general suggestion, 5 calibrations are sufficient to produce an estimator with about half the bias of a single calibration estimator and at least some indication of significance, while 20 calibrations are ideal. After 20 calibrations, the contribution of an additional calibration to the combined estimator greatly diminished.

The MDFT application demonstrated a successful implementation of 5 calibration approach in an IDA on real data, as well as the risk of missing treatment effects when analysis is limited to a single calibration's results. Additionally, results from the application provided evidence that MDFT interventions reduced the trajectories of substance use involvement at a 1-year follow-up to a greater extent than any of the active control treatment groups, overall and across all gender and ethnicity subgroups. This paper will aid researchers interested in employing a MC approach in an IDA framework or whenever a level of dependency in a data set needs to be removed for an independence assumption to hold.

Included in

Biostatistics Commons