In this vignette, we introduce how to explore recurrent event data by
mean cumulative function, and modeling the event counts of interest by
gamma frailty model with the **reda** package through
examples. Most functions in the package are S4 methods that produce S4
class objects. The details of function syntax and the produced objects
are available in the package manual, which will thus not be covered in
this vignette.

```
library(reda)
packageVersion("reda")
```

`## [1] '0.5.4'`

First of all, the sample recurrent event data we are going to use in
the following examples is called `simuDat`

, which contains
totally 500 observations of 6 variables.

`head(simuDat)`

```
## ID time event group x1 gender
## 1 1 1 1 Contr -1.93 female
## 2 1 22 1 Contr -1.93 female
## 3 1 23 1 Contr -1.93 female
## 4 1 57 1 Contr -1.93 female
## 5 1 112 0 Contr -1.93 female
## 6 2 140 0 Treat -0.11 female
```

`str(simuDat)`

```
## 'data.frame': 500 obs. of 6 variables:
## $ ID : num 1 1 1 1 1 2 3 3 4 4 ...
## $ time : num 1 22 23 57 112 140 40 168 14 112 ...
## $ event : int 1 1 1 1 0 0 1 0 1 0 ...
## $ group : Factor w/ 2 levels "Contr","Treat": 1 1 1 1 1 2 1 1 1 1 ...
## $ x1 : num -1.93 -1.93 -1.93 -1.93 -1.93 -0.11 0.2 0.2 -0.43 -0.43 ...
## $ gender: Factor w/ 2 levels "female","male": 1 1 1 1 1 1 1 1 1 1 ...
```

where

`ID`

: Subjects identification (ID).`time`

: Event or censoring time.`event`

: Event indicator, 1 = event; 0 = censored.`group`

: Treatment group indicator.`x1`

: Continuous variable.`gender`

: Gender of subjects.

The dataset was simulated by thinning method (Lewis and Shedler 1979) and further processed
for a better demonstration purpose. (Note that **reda**
also provides functions for simulating survival data, and recurrent
event data. See `vignette("reda-simulate")`

for details.)

The process’s ID, event times, event indicators or costs, time
origins, and possible terminal events of the follow-up is specified in
the function `Recur()`

, which serves as the formula response
and contains considerate data checking procedures for recurrent event
data. See `vignette("reda-Recur")`

for details.

The nonparametric mean cumulative function (MCF) estimates are widely utilized in exploring the trend of recurrent event data. MCF is also called cumulative mean function (CMF) in literature (see e.g., Lawless and Nadeau 1995). Let \(N_i(t)\) denote the number of events that occurred up to time \(t\) of process \(i\). The MCF of \(N_i(t)\) denoted by \(M_i(t)\), is defined as follows: \[M_i(t)=\mathbb{E}\{N_i(t)\}.\] For \(k\) independent processes having the same MCF, the Nelson-Aalen Estimator (Nelson 2003) is often used, which is defined as follows: \[ \hat{M}(t) = \int_0^t \frac{dN(s)}{\delta(s)}, \] where \(dN(s)=\sum_{i=1}^k dN_i(s)\), \(dN_i(s)\) is the jump size of process \(i\) at time \(s\), \(\delta(s) = \sum_{i=1}^k \delta_i(s)\), \(\delta_i(s)\) is the at-risk indicator of process \(i\) at time \(s\). One variant is called the cumulative sample mean (CSM) function introduced by Cook and Lawless (2007), which assumes that \(\delta_i(s) = 1\) for \(s \ge 0\).

The nonparametric estimate of MCF at each time point does not assume any particular underlying model. The variance estimates at each time point can be computed by the Lawless and Nadeau method (Lawless and Nadeau 1995), Poisson process method, the bootstrap method (Efron 1979) with subjects as resampling units. For CSM, the cumulative sample variance (CSV) method can be used instead. The approximate confidence intervals are provided as well, which are constructed based on the asymptotic normality of the MCF estimates itself or logarithm of the MCF estimates.

The function `mcf()`

is a generic function for the MCF
estimates from a sample data or a fitted gamma frailty model (as
demonstrated later). If a formula with `Recur()`

as formula
response is specified in function `mcf()`

, the formula method
for estimating the sample MCF will be called. The covariate specified at
the right hand side of the formula should be either `1`

or
any “linear” combination of factor variables in the data. The former
computes the overall sample MCF. The latter computes the sample MCF for
each level of the combination of the factor variable(s) specified,
respectively.

The valve-seat dataset in Nelson (1995) and the simulated sample data are used for demonstration as follows:

```
## Example 1. valve-seat data
valveMcf0 <- mcf(Recur(Days, ID, No.) ~ 1, data = valveSeats)
## Example 2. the simulated data
simuMcf <- mcf(Recur(time, ID, event) ~ group + gender,
data = simuDat, subset = ID %in% seq_len(50))
```

After estimation, we may plot the sample MCF by function
`plot()`

, which returns a `ggplot`

object so that
the plot produced can be easily further customized by
**ggplot2**. The `legendname`

and
`legendLevels`

can be specified to easily customize the
legend in the plot. Two examples are given as follows:

```
## overall sample MCF for valve-seat data in Nelson (1995)
plot(valveMcf0, conf.int = TRUE, mark.time = TRUE, addOrigin = TRUE, col = 2) +
ggplot2::xlab("Days") + ggplot2::theme_bw()
```

```
## sample MCF for different groups (the default theme)
plot(simuMcf, conf.int = TRUE, lty = 1:4, legendName = "Treatment & Gender")
```