---
title: "Marked Point Process"
author: "Philip Mostert"
date: "`r Sys.Date()`"
bibliography: '`r system.file("References.bib", package="PointedSDMs")`'
biblio-style: authoryear
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{Marked Point Process}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
eval = FALSE,
warning = FALSE,
message = FALSE
)
```
Real datasets are often fairly complex, and information on the species other than their location data may also be collected. Additional information could be the length of a specific plant, or the weight of a group of mammals, something which describes the underlying characteristics of the studied species'. Using such information results in a *marked point process (*see @illian2012toolbox for an overview); and this framework may be incorporated in the *PointedSDMs* R package*.*
This vignette gives an illustration on how include marks in the model framework, by using data on Eucalyptus globulus (common name: blue gum) on the *Koala Conservation Center on Philip Island* (Australia), collected by a community group between 1993 and 2004 [@moore2010palatability]. The dataset contains a multitude of different marks, however for this study we will be focusing on two: *food*, which is some index of the food value of the tree (calculated as dry matter intake multiplied by foliage palatability), and *koala*, describing the number of koala visits to each tree. No inference of the model is completed in this vignette due to computational intensity of the model. However the *R* script and data are provided below so that the user may carry out inference.
```{r, setup}
library(spatstat)
library(PointedSDMs)
library(sf)
library(sp)
library(ggplot2)
library(inlabru)
library(INLA)
```
```{r load_data}
data(Koala)
eucTrees <- Koala$eucTrees
boundary <- Koala$boundary
```
```{r, clean_data, echo = FALSE,fig.width=7, fig.height=5}
proj <- "+init=epsg:27700"
boundary <- as(boundary, 'sf')
st_crs(boundary) <- proj
euc <- st_as_sf(x = eucTrees,
coords = c('E', 'N'),
crs = proj)
euc$food <- euc$FOOD/1000
euc <- euc[unlist(st_intersects(boundary, euc)),]
class(trees)
mesh = inla.mesh.2d(boundary = boundary,
max.edge = c(10, 20),
offset = c(15,20),
cutoff = 2)
mesh$crs <- st_crs(proj)
ggplot() +
geom_sf(data = st_boundary(boundary)) +
geom_sf(data = euc, aes(color = koala)) +
ggtitle('Plot showing number of koalas at each site')
ggplot() +
geom_sf(data = st_boundary(boundary)) +
geom_sf(data = euc, aes(color = food)) +
ggtitle('Plot showing the food value index at each site')
```
Lets first create a model for the tree locations only: in this example, we will assume that these locations are treated as *present-only* data.
```{r, points_only,fig.width=7, fig.height=5}
points <- startISDM(euc, Boundary = boundary,
Projection = proj,
Mesh = mesh)
points$specifySpatial(sharedSpatial = TRUE,
prior.range = c(120, 0.1),
prior.sigma = c(1, 0.1))
pointsModel <- fitISDM(points, options = list(control.inla = list(int.strategy = 'eb')))
pointsPredictions <- predict(pointsModel, mask = boundary,
mesh = mesh, predictor = TRUE)
pointsPlot <- plot(pointsPredictions, variable = 'mean',
plot = FALSE)
pointsPlot$predictions$predictions$mean +
gg(euc)
```
Now lets add the marks to this framework by specifying the name of the response variable of the marks with the argument `markNames`, and the family of the marks with the argument `markFamily`. Doing so will model each marks as a separate observation process, based on the family specified.
```{r, include_marks,fig.width=7, fig.height=5}
marks <- startMarks(euc, Boundary = boundary,
Projection = proj,
markNames = c('food', 'koala'),
markFamily = c('gamma', 'poisson'),
Mesh = mesh)
marks$specifySpatial(sharedSpatial = TRUE,
prior.range = c(120, 0.1),
prior.sigma = c(1, 0.1))
marks$specifySpatial(Mark = 'koala',
prior.range = c(120, 0.1),
prior.sigma = c(1, 0.1))
marks$specifySpatial(Mark = 'food',
prior.range = c(60, 0.1),
prior.sigma = c(1, 0.1))
marksModel <- fitISDM(marks, options = list(control.inla = list(int.strategy = 'eb'),
safe = TRUE))
foodPredictions <- predict(marksModel, mask = boundary,
mesh = mesh, marks = 'food', spatial = TRUE,
fun = 'exp')
koalaPredictions <- predict(marksModel, mask = boundary,
mesh = mesh, marks = 'koala', spatial = TRUE)
plot(foodPredictions, variable = c('mean', 'sd'))
plot(koalaPredictions, variable = c('mean', 'sd'))
```
For the second mark model we only include the *food* mark, but this time we use a *log-linear* model with additive Gaussian noise. This is specified using the `.$updateFormula` slot function, and then by adding the scaling component to the *inlabru* components with the `.$addComponents` slot function to ensure that it is actually estimated. Moreover we assume *penalizing complexity* priors for the two spatial effects, as well as specify *bru_max_iter* in the *options* argument to keep the time to estimate down.
```{r, marks_add_scaling,fig.width=7, fig.height=5}
marks2 <- startMarks(euc, Boundary = boundary,
Projection = proj,
markNames = 'food',
markFamily = 'gaussian',
Mesh = mesh)
marks2$updateFormula(Mark = 'food',
newFormula = ~ exp(food_intercept + (shared_spatial + 1e-6)*scaling + food_spatial))
marks2$changeComponents(addComponent = 'scaling(1)')
marks2$specifySpatial(sharedSpatial = TRUE,
prior.sigma = c(1, 0.01),
prior.range = c(120, 0.01))
marks2$specifySpatial(Mark = 'food',
prior.sigma = c(1, 0.01),
prior.range = c(120, 0.01))
marksModel2 <- fitISDM(marks2, options = list(control.inla = list(int.strategy = 'eb'),
bru_max_iter = 2, safe = TRUE))
predsMarks2 <- predict(marksModel2, mask = boundary, mesh = mesh,
formula = ~ (food_intercept + (shared_spatial + 1e-6)*scaling + food_spatial))
plot(predsMarks2)
```