Calculating Median Age at Death and Modal Age at Death with Decimals

Background

Longevity is often measured through time trends in the mean age at death (MAD). In addition to MAD, there are two other interesting metrics, which can be calculated from life table functions. Namely, the median age at death and the modal age at death. The former refers to the age when half of the population has died, while the latter defines the age where most deaths are observed. Demographers are often interested in analyzing these two figures with decimals precision. In the previous literature, several methods for this purpose have been proposed. I selected some of them and provide R functions, enabling researchers to calculate the median age at death and the modal age at death with decimals points. Further, I compare the methods with data for US women from 1990 to 2019. To the best of my knowledge, there are currently no R functions on this topic. So, please feel free to improve and extent this work.

Calculating the Median Age at Death with Decimals

One method for calculating the median age at death has been described by Canudas-Romo (2010). My Rcode focuses on equation A1 on page 309, \[ Median(t) = x + \frac{[0.5 - l(x,t)]}{[l(x+1, t) - l(x,t)]}, \] where \(l(x,t)\) denotes the life table survivorship function in time \(t\) and age \(x\) and \(x+1\) correspond to the ages surrounding the median age at death, i.e., the exact age where \(l(x,t)=0.5\). In the following, I will apply this equation to data for the US in R. The period life tables for the USA are downloaded from the HMD.

setwd("d:/Rcode/Data/USA")
LT.women <- read.table("fltper_1x1.txt", header=TRUE, skip=2)


### getting the lx function for women in 2019
lx = LT.women$lx[LT.women$Year==2019]
### getting our ages, 0-110
last.age = length(lx)-1
age = 0:last.age
###normalize to 1
Sx = lx/lx[1]
### finding the last age before half of the population has died
x = Sx - 0.5
the.x = which(x==min(x[x>0]))
###applying the formula
Median = age[the.x] + ((0.5 - Sx[the.x]) / (Sx[the.x+1] - Sx[the.x]))

###corresponding function
Median1.fun <- function(lx) {
  
  Sx = lx/lx[1]
  x = Sx - 0.5
  the.x = which(x==min(x[x>0]))
  Median = age[the.x] + ((0.5 - Sx[the.x]) / (Sx[the.x+1] - Sx[the.x]))
  
  return(Median)
}

In addition, I test the performance of modeling the \(l(x,t)\) function and predicting it for ages with decimal points. This idea is based on work by Ouellette & Bourbeau (2011) and Horiuchi et al. (2013).

### using the splinefun
smooth.lx = splinefun(lx ~ age)
###you can modify the "by argument" for smaller age intervals
decimal.lx = smooth.lx(seq(0,age[length(age)], by=0.001))
smooth.Sx = decimal.lx/decimal.lx[1]
the.x = which.min(abs(smooth.Sx - 0.5))
Median = seq(0,110, by=0.001)[the.x]

###corresponding function
Median2.fun <- function(lx) {
  
  last.age = length(lx)-1
  age = 0:last.age
  smooth.lx = splinefun(lx ~ age)
  decimal.lx = smooth.lx(seq(0, age[length(age)], by=0.001))
  smooth.Sx = decimal.lx/decimal.lx[1]
  the.x = which.min(abs(smooth.Sx - 0.5))
  Median = seq(0, age[length(age)], by=0.001)[the.x]
  
  return(Median)
}


###using loess function
smooth.lx = loess(lx ~ age, span=0.1, degree=2)
decimal.lx = predict(smooth.lx, newdata=seq(0, age[length(age)], by=0.001))
smooth.Sx = decimal.lx/decimal.lx[1]
the.x = which.min(abs(smooth.Sx - 0.5))
Median = seq(0, age[length(age)], by=0.001)[the.x]

###corresponding function
Median3.fun <- function(lx) {

  last.age = length(lx)-1
  age = 0:last.age
  smooth.lx = loess(lx ~ age, span=0.1, degree=2)
  decimal.lx = predict(smooth.lx, newdata=seq(0, age[length(age)], by=0.001))
  smooth.Sx = decimal.lx/decimal.lx[1]
  the.x = which.min(abs(smooth.Sx - 0.5))
  Median = seq(0, age[length(age)], by=0.001)[the.x]
  
  return(Median)
  
}

###Lets test them out
Median1.fun(lx = LT.women$lx[LT.women$Year==2019])
## [1] 85.29959
Median2.fun(lx = LT.women$lx[LT.women$Year==2019])
## [1] 85.305
Median3.fun(lx = LT.women$lx[LT.women$Year==2019])
## [1] 85.324

The results are very similar.

Calculating the Modal Age at Death with Decimals

The first method for calculating the modal age at death with decimals is taken from Kannisto (2001). The equation can be found on page 163 in the paper, \[ Modal(t) = x + \frac{[d(x,t)-d(x-1,t)]}{[d(x,t)-d(x-1,t)]+[d(x,t)-d(x+1,t)]}, \] with \(d(x,t)\) being the age-distribution of life tables deaths in time \(t\) and age \(x\) corresponds to the age where most deaths occur.

###getting data for US women in 2019
dx = LT.women$dx[LT.women$Year==2019]
### finding the age with highest number of deaths
the.x = which.max(dx)
###applying the formula
Mode = age[the.x] - 0.5 + (dx[the.x] - dx[the.x-1]) / 
  ( (dx[the.x] - dx[the.x-1]) + (dx[the.x] - dx[the.x+1]) )

###corresponding function
Mode1.fun <- function(dx) {

  last.age = length(dx)-1
  age = 0:last.age
  the.x = which.max(dx)
  Mode = age[the.x] - 0.5 + (dx[the.x] - dx[the.x-1]) / 
    ( (dx[the.x] - dx[the.x-1]) + (dx[the.x] - dx[the.x+1]) )

  return(Mode)  
}

Again, the other two methods are based on smoothing age-specific death distributions and predicting deaths for small age intervals. While the first approach uses the optimization function in order to obtain the modal age at death with decimal precision (Ouellette & Bourbeau 2011; Ebeling, Rau, and Baudisch 2018), the second uses R’s loess() function.

### I do not consider infant mortality
smooth.dx = splinefun(dx[-1] ~ age[-1])
max.opt = optimize(smooth.dx, interval = c(30, 110), maximum = T)
Mode = max.opt$maximum

###corresponding function
Mode2.fun <- function(dx) {

  last.age = length(dx)-1
  age = 0:last.age
  smooth.dx = splinefun(dx[-1] ~ age[-1])
  max.opt = optimize(smooth.dx, interval = c(30, 110), maximum = T)
  Mode = max.opt$maximum
  
  return(Mode)
}

### The loess approach, again no infant mortality considered
loess.smooth.dx = loess(dx[-1] ~ age[-1], span=0.1, degree = 2)
y = predict(loess.smooth.dx, newdata = seq(1, age[length(age)], by=0.001))
Mode = seq(1, age[length(age)], by=0.001)[which.max(y)]

###the corresponding function
Mode3.fun <- function(dx) {

  loess.smooth.dx = loess(dx[-1] ~ age[-1], span=0.1, degree = 2)
  y = predict(loess.smooth.dx, newdata = seq(1, age[length(age)], by=0.001))
  Mode = seq(1, age[length(age)], by=0.001)[which.max(y)]

  return(Mode)  
}

###Lets test it out
Mode1.fun(dx = LT.women$dx[LT.women$Year==2019])
## [1] 88.65847
Mode2.fun(dx = LT.women$dx[LT.women$Year==2019])
## [1] 88.74962
Mode3.fun(dx = LT.women$dx[LT.women$Year==2019])
## [1] 88.554

Using Camarda’s MortalitySmooth package for deriving death counts with decimals

The following method uses Camarda’s MortalitySmooth package for deriving age-specific mortality rates in tiny age intervals. These rates are then used to obtain the age-specific survivorship function and the corresponding age-specific death distribution. The maximum of the death distribution (in tiny age intervals) gives the modal age at death with decimals. The method has been proposed by Horiuchi et.al (2013) and the R code is taken from their paper (page 60). Please note that the smoothing of death rates considers the ages 10-110. I run into problems when I want to fit the model to data with age ranging from 1 to 110.

####MortalitySmooth is currently only available through the cran archive
###require(devtools)
###install_version("MortalitySmooth", version = "2.3.4", repos = ###"https://cran.r-project.org")
### The package requires the "svcm" package which can also be downloaded from
### the cran archives, make sure that Rtools is installed on your computer

library(MortalitySmooth)
## Lade nötiges Paket: svcm
## Lade nötiges Paket: Matrix
## Lade nötiges Paket: splines
## Lade nötiges Paket: lattice
###reading in death counts and exposure
setwd("d:/Rcode/Data/USA")
Dx.LT <- read.table("Deaths_1x1.txt", header=TRUE, skip=2)
Ex.LT <- read.table("Exposures_1x1.txt", header=TRUE, skip=2)

Horiuchi.et.al.2013.fun <- function(age, Dx, Ex, delta) {
  
  D <- Dx[age+1]
  E <- Ex[age+1]
  fit <- Mort1Dsmooth(x = age, y = D,
                      offset = log(E))
  age.narrow <- seq(from = min(age), to = max(age), by = delta)
  log.mu.hat <- predict(object = fit, newdata = age.narrow)
  mu.hat <- exp(log.mu.hat)
  l.hat <- exp(-cumsum(mu.hat ∗ delta))
  d.hat <- mu.hat ∗ l.hat
  M.hat <- age.narrow[which.max(d.hat)]
 
  return(M.hat)  
}



MortSmooth.Mode <- c()

for (i in 1:length(1990:2019)) {

  Dx <- Dx.LT$Female[Dx.LT$Year==1989+i]
  Ex <- Ex.LT$Female[Ex.LT$Year==1989+i]
  
  MortSmooth.Mode[i] <- Horiuchi.et.al.2013.fun(age=10:110,
                                           delta=0.001,
                                           Dx=Dx,
                                           Ex=Ex)
  
}

Two General Function for the Median Age at Death and the Modal Age at Death

Finally, I provide two general functions where the method can be selected through the function argument “method”. Since the method using “MortalitySmooth” requires different function inputs, i.e., death counts and exposures as compared to life table functions, I do not include this method in the general function.

MedianAge <- function(lx, method = c("formula", "spline", "loess")) {
    
  if (method == "formula") {
    out <- Median1.fun(lx)
  }
  
  if (method == "spline") {
    out <- Median2.fun(lx)
  }
  
  if (method == "loess") {
    out <- Median3.fun(lx)
  }
  
  return(out)
}

ModeAge <- function(dx, method = c("formula", "spline", "loess")) {
  
  if (method == "formula") {
    out <- Mode1.fun(dx)
  }
  
  if (method == "spline") {
    out <- Mode2.fun(dx)
  }
  
  if (method == "loess") {
    out <- Mode3.fun(dx)
  }
  
  return(out)
}

###Lets test them

Median.1 <- c()
Median.2 <- c()
Median.3 <- c()

Modal.1 <- c()
Modal.2 <- c()
Modal.3 <- c()

for (i in 1:length(1990:2019)) {
  
Median.1[i] <- MedianAge(lx = LT.women$lx[LT.women$Year==1989+i], method="formula")
Median.2[i] <- MedianAge(lx = LT.women$lx[LT.women$Year==1989+i], method="spline")
Median.3[i] <- MedianAge(lx = LT.women$lx[LT.women$Year==1989+i], method="loess")

Modal.1[i] <- ModeAge(dx = LT.women$dx[LT.women$Year==1989+i], method="formula")
Modal.2[i] <- ModeAge(dx = LT.women$dx[LT.women$Year==1989+i], method="spline")
Modal.3[i] <- ModeAge(dx = LT.women$dx[LT.women$Year==1989+i], method="loess")

}

The results suggest that the three methods, “formula”, “spline”, and “loess”, produce very similar results for the median age at death. For the modal age at death, method “formula” corresponds to the “spline” approach. The “loess” method as well as the “MortSmooth” approach can considerably differ from these values.

References

  • Canudas-Romo, V. (2010). Three Measures of Longevity: Time Trends and Record Values. Demography 47(2):299–312.

  • Ebeling, M.; Rau, R.; Baudisch, A. (2018). Rectangularization of the survival curve reconsidered: The maximum inner rectangle approach. Population Studies 72(3):369-379.

  • Horiuchi, S.; Ouellette, N.; Cheung, S.L.K, and Robine, J.-M. Modal age at death: lifespan indicator in the era of longevity extension. Vienna Yearbook of Population Research 2013, pp. 37-69.

  • Kannisto, V. (2001). Mode and Dispersion of the Length of Life. Population: An English Selection13:159–71.

  • Ouellette, N. and Bourbeau, R. (2011). Changes in the age-at-death distribution in four low mortality countries: A nonparametric approach. Demographic Research 25(19):95-628.

Markus Sauerberg
Markus Sauerberg
Research Scientist