Given cumulative transition hazards sample paths through the multi-state model.

mssample(
  Haz,
  trans,
  history = list(state = 1, time = 0, tstate = NULL),
  beta.state = NULL,
  clock = c("forward", "reset"),
  output = c("state", "path", "data"),
  tvec,
  cens = NULL,
  M = 10,
  do.trace = NULL
)

Arguments

Haz

Cumulative hazards to be sampled from. These should be given as a data frame with columns time, Haz, trans, for instance as the Haz list element given by msfit.

trans

Transition matrix describing the multi-state model. See trans in msprep for more detailed information

history

A list with elements state, specifying the starting state(s), time, the starting time(s), and tstate, a numeric vector of length the number of states, specifying at what times states have been visited, if appropriate. The default of tstate is NULL; more information can be found under Details.

The elements state and time may either be scalars or vectors, in which case different sampled paths may start from different states or at different times. By default, all sampled paths start from state 1 at time 0.

beta.state

A matrix of dimension (no states) x (no transitions) specifying estimated effects of times at which earlier states were reached on subsequent transitions. If these are not in the model, the value NULL (default) suffices; more information can be found under Details

clock

Character argument, either "forward" (default) or "reset", specifying whether the time-scale of the cumulative hazards is in forward time ("forward") or duration in the present state ("reset")

output

One of "state", "path", or "data", specifying whether states, paths, or data should be output.

tvec

A numeric vector of time points at which the states or paths should be evaluated. Ignored if output="data"

cens

An independent censoring distribution, given as a data frame with time and Haz

M

The number of sampled trajectories through the multi-state model. The default is 10, since the procedure can become quite time-consuming

do.trace

An integer, specifying that the replication number should be written to the console every do.trace replications. Default is NULL in which case no output is written to the console during the simulation

Value

M simulated paths through the multi-state model given by trans and Haz. It is either a data frame with columns time, pstate1, ..., pstateS for S states when output="state", or with columns time, ppath1,..., ppathP for the P paths specified in paths(trans) when output="path". When output="data", the sampled paths are stored in an "msdata" object, a data frame in long format such as that obtained by msprep. This may be useful for (semi-)parametric bootstrap procedures, in which case cens may be used as censoring distribution (assumed to be independent of all transition times and independent of any covariates).

Details

The procedure is described in detail in Fiocco, Putter & van Houwelingen (2008). The argument beta.state and the element tstate from the argument history are meant to incorporate situations where the time at which some previous states were visited may affect future transition rates. The relation between time of visit of state s and transition k is assumed to be linear on the log-hazards; the corresponding regression coefficient is to be supplied as the (s,k)-element of beta.state, which is 0 if no such effect has been included in the model. If no such effects are present, then beta.state=NULL (default) suffices. In the tstate element of history, the s-th element is the time at which state s was visited. This is only relevant for states which have been visited prior to the beginning of sampling, i.e. before the time element of history; the elements of tstate are internally updated when in the sampling process new states are visited (only if beta.state is not NULL to avoid unnecessary computations).

References

Fiocco M, Putter H, van Houwelingen HC (2008). Reduced-rank proportional hazards regression and simulation-based prediction for multi-state models. Statistics in Medicine 27, 4340--4358.

Author

Marta Fiocco, Hein Putter H.Putter@lumc.nl

Examples

# transition matrix for illness-death model tmat <- trans.illdeath() # data in wide format, for transition 1 this is dataset E1 of # Therneau & Grambsch (T&G) tg <- data.frame(illt=c(1,1,6,6,8,9),ills=c(1,0,1,1,0,1), dt=c(5,1,9,7,8,12),ds=c(1,1,1,1,1,1), x1=c(1,1,1,0,0,0),x2=c(6:1)) # data in long format using msprep tglong <- msprep(time=c(NA,"illt","dt"),status=c(NA,"ills","ds"), data=tg,keep=c("x1","x2"),trans=tmat) # expanded covariates tglong <- expand.covs(tglong,c("x1","x2")) # Cox model with different covariate cx <- coxph(Surv(Tstart,Tstop,status)~x1.1+x2.2+strata(trans), data=tglong,method="breslow") # new data, to check whether results are the same for transition 1 as T&G newdata <- data.frame(trans=1:3,x1.1=c(0,0,0),x2.2=c(0,1,0),strata=1:3) fit <- msfit(cx,newdata,trans=tmat) tv <- unique(fit$Haz$time) # mssample set.seed(1234) mssample(Haz=fit$Haz,trans=tmat,tvec=tv,M=100)
#> time pstate1 pstate2 pstate3 #> 1 1 0.91 0.07 0.02 #> 2 5 0.91 0.00 0.09 #> 3 6 0.69 0.22 0.09 #> 4 7 0.69 0.11 0.20 #> 5 8 0.53 0.11 0.36 #> 6 9 0.00 0.53 0.47 #> 7 12 0.00 0.00 1.00
set.seed(1234) paths(tmat)
#> [,1] [,2] [,3] #> [1,] 1 NA NA #> [2,] 1 2 NA #> [3,] 1 2 3 #> [4,] 1 3 NA
mssample(Haz=fit$Haz,trans=tmat,tvec=tv,M=100,output="path")
#> time ppath1 ppath2 ppath3 ppath4 #> 1 1 0.91 0.07 0.00 0.02 #> 2 5 0.91 0.00 0.07 0.02 #> 3 6 0.69 0.22 0.07 0.02 #> 4 7 0.69 0.11 0.18 0.02 #> 5 8 0.53 0.11 0.18 0.18 #> 6 9 0.00 0.53 0.29 0.18 #> 7 12 0.00 0.00 0.82 0.18
set.seed(1234) mssample(Haz=fit$Haz,trans=tmat,tvec=tv,M=100,output="data",do.trace=25)
#> Replication 25 finished at Thu Dec 17 11:35:38 2020 #> Replication 50 finished at Thu Dec 17 11:35:38 2020 #> Replication 75 finished at Thu Dec 17 11:35:38 2020 #> Replication 100 finished at Thu Dec 17 11:35:38 2020
#> An object of class 'msdata' #> #> Data: #> [1] id Tstart Tstop duration from to status trans #> <0 rows> (or 0-length row.names)