Robust Regression with Compositional Response: Applications to Geosciences


Hron, K.; Filzmoser, P.; Templ, M.; van den Boogaart, K. G.; Tolosana-Delgado, R.

Abstract

Compositional data are multivariate observations that quantitatively describe the relative importance or weight of the parts of a whole. Compositions occur frequently in geochemistry and are commonly expressed in relative units such as proportions or percentages (i.e. as data with a constant-sum constraint of 1 or 100, respectively).
The aim of multivariate regression is to quantify relations between a multivariate response and one or more explanatory variables, and to use these identified relations for prediction. The standard theory of linear regression models - the least squares methodology - is appropriate if the data do not include outlying observations that deviate from the main linear trend. Although robust regression tolerates a certain amount of deviating data points, it may lead to distorted results if it is applied directly to compositional data.
The isometric logratio (ilr) transformation is used to develop classical least-squares regression, where a compositional response depends on (non-compositional) explanatory variables. For several reasons, there is no straightforward solution to the robust regression problem with compositional response. As in the classical case, the step from the multivariate to the multiple model is not possible if the response ilr coordinates are not independent. Moreover, in the robust case, regressing the response variables separately would ignore multivariate outliers. An additional challenge is the proper choice of the ilr transformation, which is crucial for an appropriate interpretation of the results. Finally, a simplified approach that applies robust methods to ilr-transformed data may produce transformation-dependent results, an undesirable characteristic.
A solution is provided by the multivariate least trimmed squares (MLTS) method, which fulfills all required concepts of robustness for regression with compositional data. The robust regression model with compositional response can also be used to test for subcompositional independence. Theoretical results are applied to a real-world problem from geosciences.
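
As an illustration of the classical step described in the abstract, the following is a minimal Python sketch, not the authors' implementation. It maps a compositional response to ilr coordinates, fits ordinary least squares in coordinate space, and maps predictions back to compositions. The function names (pivot_basis, ilr, ilr_inverse, fit_ls_compositional, predict_composition) and the particular orthonormal basis are assumptions made for this example. The robust MLTS estimator itself is not implemented here.

```python
import numpy as np


def pivot_basis(D):
    """Return a D x (D-1) matrix whose columns are orthonormal log-contrasts.

    This particular basis (first part against the remaining ones, and so on)
    is one possible choice, assumed here for illustration.
    """
    V = np.zeros((D, D - 1))
    for i in range(D - 1):
        k = D - i - 1  # number of parts after part i
        V[i, i] = np.sqrt(k / (k + 1.0))
        V[i + 1:, i] = -1.0 / np.sqrt(k * (k + 1.0))
    return V


def ilr(x):
    """ilr coordinates of strictly positive compositions (rows of x)."""
    x = np.asarray(x, dtype=float)
    if np.any(x <= 0):
        raise ValueError("all parts must be strictly positive")
    # Columns of the basis sum to zero, so the scale of x cancels out.
    return np.log(x) @ pivot_basis(x.shape[-1])


def ilr_inverse(z):
    """Map ilr coordinates back to compositions that sum to 1."""
    z = np.asarray(z, dtype=float)
    V = pivot_basis(z.shape[-1] + 1)
    y = np.exp(z @ V.T)
    return y / y.sum(axis=-1, keepdims=True)


def fit_ls_compositional(X, Y):
    """Classical least squares of ilr(Y) on X, with an intercept column."""
    X1 = np.column_stack([np.ones(len(X)), X])
    B, *_ = np.linalg.lstsq(X1, ilr(Y), rcond=None)
    return B


def predict_composition(B, X):
    """Predict compositions for new explanatory values X."""
    X1 = np.column_stack([np.ones(len(X)), X])
    return ilr_inverse(X1 @ B)
```

Because each column of the basis sums to zero, proportions and percentages of the same composition yield identical coordinates, so the constant-sum constraint (1 or 100) does not affect the fit in this sketch.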

Keywords: compositional data; robust regression; least trimmed squares

  • Contribution to proceedings
    15th Annual Conference of the International Association for Mathematical Geosciences, IAMG 2013, 02.-06.09.2013, Madrid, España
    Mathematics of Planet Earth, Proceedings of the 15th Annual Conference of the International Association for Mathematical Geosciences, Lecture Notes in Earth System Science, Heidelberg: Springer, 978-3-642-32407-9, 87-90
    DOI: 10.1007/978-3-642-32408-6

Permalink: https://www.hzdr.de/publications/Publ-19187