Comparing causal models of binary traits using phylopath

Wouter van der Bijl

2024-06-11

Introduction

This vignette gives a short example of how phylogenetic path analysis (PPA) can be applied to binary data sets using phylopath. A longer example with more explanation of the code can be found in the other vignette, “intro to phylopath”.

Important notes:

There has been some discussion concerning how to best perform logistic regression with phylogenetic correction. I take no position on this matter. This package uses phylolm::phyloglm, written by Lam Si Tung Ho, Robert Lachlan, Rachel Feldman and Cécile Ané. phylopath’s accuracy is directly dependent on the accuracy of that function, and if you don’t trust phyloglm you should not trust binary models used in phylo_path.

phylolm::phyloglm performs checks for model convergence. In practice, these often fail. Especially in phylopath, where we often fit many models, it is likely that at least one model fails these checks and generates a warning. You can see the warnings using show_warnings(). You can try changing to the second method (method = 'logistic_IG10'), or passing parameters like btol and log.alpha.bound (see ?phylolm::phyloglm). These parameters are then applied to all models. Another option is to use the phylosem package, which supports both Binomial and Poisson errors with a different implementation.
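As a rough sketch of how these options can be passed along, the snippet below fits a reduced, purely illustrative two-model set (models A and L from the example analysis), prints any warnings that were caught, and then refits with a looser btol. The name models_small and the value btol = 20 are assumptions made for illustration, not recommendations. Whether the change removes the warnings will depend on the data.

```r
library(phylopath)

# Reduced model set, for illustration only (models A and L of the full set)
models_small <- define_model_set(
  A = c(C~M+D),
  L = c(C~M+D, P~M+G),
  .common = c(C~P+G)
)

# Fit with the default settings
result <- phylo_path(models_small, data = cichlids, tree = cichlids_tree)

# Print the warnings that were caught while fitting, if any
show_warnings(result)

# Refit with a wider bound on the linear predictor. btol is passed on to
# phylolm::phyloglm for every model; the value 20 is an arbitrary choice.
result_btol <- phylo_path(models_small, data = cichlids, tree = cichlids_tree,
                          btol = 20)
show_warnings(result_btol)

# The alternative fitting method can be requested in the same way:
# phylo_path(models_small, data = cichlids, tree = cichlids_tree,
#            method = 'logistic_IG10')
```

Comparing the two sets of warnings gives a first impression of how sensitive the fits are to these settings.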

The example below currently generates such warnings, should not be trusted, and is only presented as an example.

If you have useful opinions or information on these points, feel free to contact me.

Example analysis

Data and hypotheses

This recreates the analysis from the following paper:

Dey CJ, O’Connor CM, Wilkinson H, Shultz S, Balshine S & Fitzpatrick JL. 2017. Direct benefits and evolutionary transitions to complex societies. Nature Ecology & Evolution. 0137.

This is, to my knowledge, the first study to employ PPA on binary traits.

The study investigates the evolution of cooperative breeding in cichlids. In short (my summary), there has been intense debate about what factors drive species towards evolving systems of cooperative breeding. Many have argued (and provided evidence in birds and mammals) that cooperative breeding chiefly evolves from monogamous mating systems because helpers can gain indirect fitness benefits through kin selection. However, a non-exclusive alternative hypothesis is that ecological factors may provide important direct benefits. Therefore, both hypotheses should be considered at the same time.

The data is included in this package as cichlids and cichlids_tree.

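It contains five binary variables, named by their initials in the models below: cooperative breeding (C), monogamy (M), biparental care (P), group living (G) and diet type (D).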

Under the indirect fitness hypothesis, monogamy is expected to be a major driver of cooperative breeding, while group living, biparental care and diet type may be important contributors towards a direct benefits scenario.
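Before defining any models, it can help to look at how the traits are coded and whether the data and tree refer to the same species. The following is a minimal sketch that assumes cichlids is a data frame with species names as row names and columns named C, M, P, G and D, as the model formulas in this vignette suggest.

```r
library(phylopath)

# Column types and levels of the five traits
str(cichlids)

# Number of species in each state, per trait
lapply(cichlids, table)

# Species present in the data but not in the tree, and the reverse
setdiff(rownames(cichlids), cichlids_tree$tip.label)
setdiff(cichlids_tree$tip.label, rownames(cichlids))
```

Any species listed by the two setdiff() calls appear in only one of the two objects and are worth checking before fitting.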

Defining the causal models

Following the paper in question, we define 12 putative causal models.

library(phylopath)

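# Each model lists its paths as formulas of the form child ~ parent.
models <- define_model_set(
  A = c(C~M+D),
  B = c(C~D),
  C = c(C~D, P~M),
  D = c(C~D, M~P, G~P),
  E = c(C~D, P~M, G~P),
  F = c(C~D, P~M+G),
  G = c(C~D, M~P, P~G),
  H = c(C~D, M~P),
  I = c(C~D, M~M, G~P),  # M~M includes M as an isolate, with no paths to or from it
  J = c(M~P, G~D),
  K = c(P~M, G~D),
  L = c(C~M+D, P~M+G),
  # Paths listed in .common are added to every model above.
  .common = c(C~P+G)
)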

plot_model_set(models, algorithm = 'kk')
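To make the formula notation more concrete, the sketch below builds a two-model toy set and prints the resulting objects. The names toy, isolated and linked are made up for this illustration. It relies on each element of a model set being an adjacency matrix with variable names as row and column names.

```r
library(phylopath)

# Two small models that differ only in how M is treated:
# M ~ M keeps M in the model without any paths, M ~ P adds a path from P to M
toy <- define_model_set(
  isolated = c(C~D, M~M, G~P),
  linked   = c(C~D, M~P, G~P),
  .common  = c(C~P+G)
)

names(toy)

# In the adjacency matrix of the first model, the row and column for M
# contain only zeros
toy$isolated

# In the second model, M has a single path
toy$linked
```

Printing a model this way can be a quick check that the paths listed in .common ended up in every model and that no path was mistyped.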