In this vignette, we discuss how to use multilevelcoda to specify multilevel models where compositional data are used as predictors.

The following table outlines the packages used and a brief description of their purpose.

Package         Purpose
multilevelcoda  calculate between- and within-level composition variables, compute substitutions, and produce plots
brms            fit Bayesian multilevel models using Stan as a backend
bayestestR      compute Bayes factors used to compare models
doFuture        parallel processing to speed up run times

library(multilevelcoda)
#> 
#> Attaching package: 'multilevelcoda'
#> The following objects are masked _by_ '.GlobalEnv':
#> 
#>     psub, sbp
library(brms)
#> Loading required package: Rcpp
#> Loading 'brms' package (version 2.21.0). Useful instructions
#> can be found by typing help('brms'). A more detailed introduction
#> to the package is available through vignette('brms_overview').
#> 
#> Attaching package: 'brms'
#> The following object is masked from 'package:stats':
#> 
#>     ar
library(bayestestR)
library(doFuture)
#> Loading required package: foreach
#> Loading required package: future

options(digits = 3) # reduce number of digits shown

For the examples, we use three built-in datasets:

Dataset  Purpose
mcompd   compositional sleep and wake variables and additional predictors/outcomes (simulated)
sbp      a pre-specified sequential binary partition, used in calculating compositional predictors
psub     all possible pairwise substitutions between compositional variables, used for substitution analyses

data("mcompd") 
data("sbp")
data("psub")

The following table shows a few rows of data from mcompd.

ID Time Stress TST WAKE MVPA LPA SB Age Female
185 1 4 542 99 297 460 41 30 0
185 2 7 458 49 117 653 162 30 0
185 3 3 271 41 489 625 15 30 0
184 12 2 286 53 107 906 89 22 1
184 13 1 281 19 403 611 126 22 1
184 14 0 397 26 40 587 390 22 1

The following table shows the sequential binary partition being used in sbp. Columns correspond to the composition variables (TST, WAKE, MVPA, LPA, SB). Rows correspond to distinct ILR coordinates.

TST WAKE MVPA LPA SB
1 1 -1 -1 -1
1 -1 0 0 0
0 0 1 -1 -1
0 0 0 1 -1
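
To make the link between the partition and the coordinates concrete, here is a minimal base R sketch of how one ILR coordinate per row can be derived from a sequential binary partition. It uses the standard balance formula (a scaled log ratio of geometric means); the names `sbp_demo`, `ilr_from_sbp`, and `ilr_demo` are hypothetical and not part of multilevelcoda, and whether complr() computes the coordinates in exactly this way is an assumption.

```r
# Sequential binary partition from the table above (rows = ILR coordinates).
parts <- c("TST", "WAKE", "MVPA", "LPA", "SB")
sbp_demo <- matrix(c(
  1,  1, -1, -1, -1,
  1, -1,  0,  0,  0,
  0,  0,  1, -1, -1,
  0,  0,  0,  1, -1
), ncol = 5, byrow = TRUE, dimnames = list(NULL, parts))

# Hypothetical helper: one ILR coordinate per SBP row, computed as a scaled
# log ratio of the geometric mean of the +1 parts to that of the -1 parts.
ilr_from_sbp <- function(x, sbp) {
  apply(sbp, 1, function(s) {
    r <- sum(s == 1)   # number of parts coded +1
    q <- sum(s == -1)  # number of parts coded -1
    sqrt(r * q / (r + q)) *
      (mean(log(x[s == 1])) - mean(log(x[s == -1])))
  })
}

# First row of mcompd shown earlier (ID 185, Time 1), in minutes.
x <- c(TST = 542, WAKE = 99, MVPA = 297, LPA = 460, SB = 41)
ilr_demo <- ilr_from_sbp(x, sbp_demo)
ilr_demo
```

Under this formula, the first coordinate contrasts sleep (TST, WAKE) with wake behaviours (MVPA, LPA, SB), and the second contrasts TST with WAKE. The coordinates depend only on ratios between parts, so rescaling `x` (for example to proportions) leaves them unchanged.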

The following table shows how all possible pairwise substitution contrasts are set up. A time substitution takes time from the variable coded -1 and adds the same amount of time to the variable coded +1.

TST WAKE MVPA LPA SB
1 -1 0 0 0
1 0 -1 0 0
1 0 0 -1 0
1 0 0 0 -1
-1 1 0 0 0
0 1 -1 0 0
0 1 0 -1 0
0 1 0 0 -1
-1 0 1 0 0
0 -1 1 0 0
0 0 1 -1 0
0 0 1 0 -1
-1 0 0 1 0
0 -1 0 1 0
0 0 -1 1 0
0 0 0 1 -1
-1 0 0 0 1
0 -1 0 0 1
0 0 -1 0 1
0 0 0 -1 1
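
As an illustration of the structure above, the following base R sketch rebuilds the same 20 contrasts and applies one of them to a single observation. The names `psub_demo`, `delta`, and `x_new` are hypothetical, and the 30-minute reallocation is an arbitrary choice for the example; this is not how multilevelcoda constructs or applies `psub` internally.

```r
parts <- c("TST", "WAKE", "MVPA", "LPA", "SB")

# Build every ordered pair (to, from) with to != from:
# the "to" part is coded +1 (gains time), the "from" part -1 (loses time).
psub_demo <- do.call(rbind, lapply(seq_along(parts), function(to) {
  t(sapply(setdiff(seq_along(parts), to), function(from) {
    row <- setNames(numeric(length(parts)), parts)
    row[to] <- 1
    row[from] <- -1
    row
  }))
}))

# Apply the first contrast (+1 TST, -1 WAKE) to the first row of mcompd
# shown earlier, moving an assumed 30 minutes from WAKE to TST.
x <- c(TST = 542, WAKE = 99, MVPA = 297, LPA = 460, SB = 41)
delta <- 30
x_new <- x + delta * psub_demo[1, ]
x_new
```

Each row sums to zero, so a substitution changes how the total time is divided between two parts without changing the total itself.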

1 Multilevel model with compositional predictors

1.1 Compositions and isometric log ratio (ILR) coordinates

Compositional data are often expressed as a set of isometric log ratio (ILR) coordinates in regression models. We can use the complr() function to calculate both between- and within-level ILR coordinates for use in subsequent models as predictors.

Note: complr() also calculates total ILR coordinates, which can be used as outcomes (or predictors) in models when the decomposition into between- and within-level ILR coordinates is not desired.
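
To give a sense of what the total, between-, and within-level coordinates represent, here is a minimal base R sketch using the three days shown earlier for ID 185. It assumes that the between-level coordinates are a person's average ILR coordinates and that the within-level coordinates are the daily deviations from that average; the helper `ilr_from_sbp` and all object names are hypothetical, and complr() may compute these quantities differently (and uses all of a person's observations, not only three).

```r
parts <- c("TST", "WAKE", "MVPA", "LPA", "SB")
sbp_demo <- matrix(c(
  1,  1, -1, -1, -1,
  1, -1,  0,  0,  0,
  0,  0,  1, -1, -1,
  0,  0,  0,  1, -1
), ncol = 5, byrow = TRUE, dimnames = list(NULL, parts))

# Hypothetical helper: ILR coordinates of one composition under an SBP.
ilr_from_sbp <- function(x, sbp) {
  apply(sbp, 1, function(s) {
    r <- sum(s == 1)
    q <- sum(s == -1)
    sqrt(r * q / (r + q)) *
      (mean(log(x[s == 1])) - mean(log(x[s == -1])))
  })
}

# Three days for ID 185 from the mcompd excerpt (minutes per behaviour).
comp <- matrix(c(
  542, 99, 297, 460,  41,
  458, 49, 117, 653, 162,
  271, 41, 489, 625,  15
), ncol = 5, byrow = TRUE, dimnames = list(NULL, parts))

# Total ILR: one row per day, one column per coordinate.
total_ilr <- t(apply(comp, 1, ilr_from_sbp, sbp = sbp_demo))

# Between-level ILR: the person's mean coordinates (assumed definition).
between_ilr <- colMeans(total_ilr)

# Within-level ILR: each day's deviation from the person's mean.
within_ilr <- sweep(total_ilr, 2, between_ilr)
```

Because ILR coordinates are linear in the log of the parts, `between_ilr` here equals the ILR of the person's geometric-mean composition, and each row of `total_ilr` is the sum of `between_ilr` and the matching row of `within_ilr`.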

The complr() function for multilevel data requires four arguments:

Argument Description