Multivariate Bayes Spaces and Compositions


van den Boogaart, K. G.; Tolosana Delgado, R.

The aim of this contribution is to present the vector space structures and transforms
needed for the statistical analysis of multi-way compositions and multivariate distributions. Both
theoretical developments and examples will be provided.
Several contributions to past CoDaWorks and subsequent articles have dealt with two-way
compositions, covering the space structure, the interpretation of its subspaces and ilr coordinates
(e.g., Egozcue et al., 2008; Fačevicová et al., 2014; de Sousa et al., 2021). This contribution
reviews these results within a common framework and extends them to multivariate distributions.
Multivariate compositions represent joint distributions of multiple categorical variables. They
form vector spaces and are a special case of multivariate Bayes spaces, which contain arbitrary
multivariate distributions. In all these spaces, conditional distributions, independent distributions,
and graphical models can be represented by certain subspaces. Appropriate isometric log-ratio (ilr)
representations are constructed from univariate representations. They explicitly separate the
relevant subspaces related to the dependence structure, as described by Markov graphs and the
Hammersley-Clifford theorem.
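As an illustration of the paragraph above, the following is a minimal sketch for the two-way case only. It assumes a strictly positive I x J table and builds orthonormal log-ratio coordinates from univariate contrast bases via tensor products. The function names `helmert_basis` and `two_way_ilr` and the Helmert-type choice of basis are assumptions made for this example, not necessarily the construction used by the authors.

```python
import numpy as np

def helmert_basis(d):
    """Orthonormal d x (d-1) contrast matrix; every column sums to zero."""
    V = np.zeros((d, d - 1))
    for k in range(1, d):
        V[:k, k - 1] = 1.0 / k          # first k parts share a positive weight
        V[k, k - 1] = -1.0              # part k is contrasted against them
        V[:, k - 1] *= np.sqrt(k / (k + 1.0))  # scale the column to unit length
    return V

def two_way_ilr(P):
    """Split the clr of a positive two-way table into three coordinate blocks.

    Returns (row, col, inter): coordinates of the row effect, the column
    effect, and the interaction (dependence) part. The basis vectors are
    tensor products of univariate contrasts and constant vectors.
    """
    P = np.asarray(P, dtype=float)
    I, J = P.shape
    L = np.log(P)
    clr = L - L.mean()                  # centred log-ratio of the whole table
    Vr, Vc = helmert_basis(I), helmert_basis(J)
    row = Vr.T @ clr @ np.ones(J) / np.sqrt(J)   # I-1 coordinates
    col = np.ones(I) @ clr @ Vc / np.sqrt(I)     # J-1 coordinates
    inter = Vr.T @ clr @ Vc                      # (I-1) x (J-1) coordinates
    return row, col, inter

# Example: a table with independent rows and columns has no interaction part.
P_indep = np.outer([0.2, 0.3, 0.5], [0.6, 0.4])
row, col, inter = two_way_ilr(P_indep)
print(np.allclose(inter, 0.0))
```

Under these assumptions the three blocks together have (I-1) + (J-1) + (I-1)(J-1) = IJ-1 coordinates, and the interaction block vanishes exactly when the table is a product of a row and a column composition.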
The contribution shows with three examples how this structural understanding can be used
to apply classical statistical methods to ilr-transformed multi-way compositions and multivariate
distributions, and to interpret their results:
1. What are the mean and the variance of (observed) conditional distributions? As conditional
distributions are projections in these subspaces, their mean and variance are already well
defined in the projected space.
2. What are relevant hypotheses in linear models with a multi-way compositional response?
Classical multivariate linear models can test for the various kinds of dependence representable
by Markov graphs.
3. How to interpret the principal components from datasets of multivariate distributions? The
theory allows the influence of each PC to be attributed to perturbations of the marginal
distributions and to clique interactions.
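The third example can be sketched as follows. Assume the data are already given in orthonormal coordinates whose columns are grouped into blocks, for instance one block per marginal and one per interaction. The squared entries of each unit-length loading vector then split additively over the blocks. The function name `pc_block_shares` and the block layout are hypothetical, introduced only for this illustration.

```python
import numpy as np

def pc_block_shares(Z, block_sizes):
    """Share of each PC's squared loading that falls into each coordinate block.

    Z           : (n, p) array of orthonormal log-ratio coordinates,
                  columns ordered block by block (assumed layout).
    block_sizes : lengths of the consecutive blocks, summing to p.
    Returns an array with one row per PC and one column per block.
    """
    Z = np.asarray(Z, dtype=float)
    Zc = Z - Z.mean(axis=0)                       # centre each coordinate
    _, _, Vt = np.linalg.svd(Zc, full_matrices=False)  # rows of Vt are loadings
    edges = np.cumsum([0] + list(block_sizes))
    shares = np.array([
        [(v[a:b] ** 2).sum() for a, b in zip(edges[:-1], edges[1:])]
        for v in Vt
    ])
    return shares                                 # each row sums to 1

# Usage with synthetic coordinates: 2 row, 1 column, 2 interaction columns.
rng = np.random.default_rng(0)
Z = rng.normal(size=(20, 5))
print(pc_block_shares(Z, [2, 1, 2]).round(2))
```

Because the loading vectors have unit length and the blocks are disjoint, each row of the result sums to one and can be read as the proportion of that PC attributable to each block.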

Keywords: Multiway compositions; Multivariate Bayes Spaces; Graphical Models

  • Lecture (Conference)
    CoDaWork2022, 28.06.-01.07.2022, Toulouse, France
  • Contribution to proceedings
    CoDaWork2022, 28.06.-01.07.2022, Toulouse, France

Permalink: https://www.hzdr.de/publications/Publ-33984