Regression Analysis, Residual Plots, Life Expectancy


A period life table provides an estimate of the probability that a person will die at a particular age. Using data available online, we examine tables of expected years to live for males and females against age for three populations: the United States in 2007, the U.S. at the turn of the twentieth century, and the Roman Empire. Scatter plots of males and females for each population show how life expectancy increases with age (e.g., U.S. 2007: 50-year-old female > 40-year-old female > 45-year-old male). The three data sets allow historical comparisons (e.g., of gender disparity, larger now; of infant mortality, smaller now). Regression lines for the linear portion of the plots (ages 5 to 70) show the annual increase in the years to live (e.g., U.S. 2007: 0.11 years for men, 0.07 years for women). Residual plots show that, even though the coefficients of determination of the lines exceed 0.99, a concave-up, decreasing function would be a better model. The residual plots also reveal a curious inflection for the males that is not evident for the females. Such examples from period life tables might be presented in a discussion of life expectancy; alternatively, one or more could add to an introduction to regression, particularly illustrating the value of residual plots in understanding a data set.
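The point about residual plots can be illustrated with a minimal sketch. It does not use the life-table data discussed above: the curve `years_to_live` is a synthetic, concave-up, decreasing function invented for illustration, and the helper name `fit_line_with_residuals` is likewise hypothetical. The sketch fits a least-squares line over ages 5 to 70 and shows that a coefficient of determination above 0.99 can coexist with residuals that follow a systematic pattern.

```python
import numpy as np


def fit_line_with_residuals(x, y):
    """Fit y = slope * x + intercept by least squares.

    Returns the slope, the intercept, the coefficient of
    determination (R^2), and the residuals (observed minus fitted).
    """
    slope, intercept = np.polyfit(x, y, deg=1)
    fitted = slope * x + intercept
    residuals = y - fitted
    ss_res = np.sum(residuals ** 2)          # unexplained variation
    ss_tot = np.sum((y - y.mean()) ** 2)     # total variation
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared, residuals


# Ages 5 through 70, the range the abstract calls the linear portion.
ages = np.arange(5, 71, dtype=float)

# SYNTHETIC data (an assumption, not taken from any life table):
# a decreasing curve with a small positive quadratic term, so it is
# concave up.
years_to_live = 90.0 - ages + 0.002 * ages ** 2

slope, intercept, r_squared, residuals = fit_line_with_residuals(
    ages, years_to_live
)

print(f"slope     = {slope:.3f} years per year of age")
print(f"R^2       = {r_squared:.4f}")
print(f"residual at age 5  = {residuals[0]:+.3f}")
print(f"residual at age 37 = {residuals[len(residuals) // 2 - 1]:+.3f}")
print(f"residual at age 70 = {residuals[-1]:+.3f}")
```

For this synthetic curve the residuals are positive at both ends of the age range and negative in the middle. That U-shaped pattern is the signature of a concave-up relationship that a straight line cannot capture, even when the fit looks nearly perfect by R^2 alone.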



Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial 4.0 License