Regression with compositional response having unobserved components or below detection limit values


Regression with compositional response having unobserved components or below detection limit values

van den Boogaart, K. G.; Tolosana-Delgado, R.; Templ, M.

The typical way to deal with zeros and missing values in compositional data sets is to impute them with a reasonable value, and then the desired statistical model is estimated with the imputed data set, e.g., a regression model. This contribution aims at presenting alternative approaches to this problem within the framework of Bayesian regression with a compositional response. In the first step, a compositional data set with missing data is considered to follow a normal distribution on the simplex, which mean value is given as an Aitchison affine linear combination of some fully observed explanatory variables. Both the coefficients of this linear combination and the missing values can be estimated with standard Gibbs sampling techniques. In the second step, a normally distributed additive error is considered superimposed on the compositional response, and values are taken as ‘below the detection limit’ (BDL) if they are ‘too small’ in comparison with the additive standard deviation of each variable. Within this framework, the regression parameters and all missing values (including BDL) can be estimated with a Metropolis-Hastings algorithm. Both methods estimate the regression coefficients without need of any preliminary imputation step, and adequately propagate the uncertainty derived from the fact that the missing values and BDL are not actually observed, something imputation methods cannot achieve.

Keywords: Bayesian regression; compositional regression; missing values; nondetects; MCMC

Permalink: https://www.hzdr.de/publications/Publ-20909