Inference

infer

infer(fpm::FittedPumasModel[, type]; level=0.95)

The infer function computes uncertainty estimates for the population parameter estimates of the fitted model fpm and returns a FittedPumasModelInference structure that is used for inference based on that model. The uncertainty estimates are displayed as standard errors and confidence intervals, but the internal representation depends on the type of uncertainty estimator used. The standard errors can be extracted from the FittedPumasModelInference structure with the stderror function. The default level used in the calculation of the confidence intervals is 0.95.

Robust standard errors

The default type of uncertainty estimator is the robust (also called sandwich) covariance estimator of Halbert White (1982). Internally, this is represented as the (robust) covariance matrix, which can also be computed with the vcov function directly from the FittedPumasModel structure. The method requires that the observed information matrix is (numerically) positive definite and invertible, which is not always the case. When it is not possible to compute the robust covariance estimate, infer returns a failed FittedPumasModelInference.

Example syntax for obtaining confidence intervals based on robust standard errors

my_infer = infer(my_fit)

The object returned by the above code has a show method that prints the estimates, standard errors and confidence intervals in the REPL. To output the results as a DataFrame instead, use the coeftable function. Such data frame outputs can be used for markdown reporting.

coeftable(my_infer)
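For reference, the sandwich estimator named above is usually written in the following textbook form. The symbols are notation introduced here for illustration and do not come from the Pumas documentation: \ell_i is the log-likelihood contribution of subject i, \widehat{\theta} is the maximum likelihood estimate, and n is the number of subjects. The exact computation in Pumas may differ in details such as scaling or finite-sample corrections.

```latex
\widehat{V}_{\text{robust}} = \widehat{H}^{-1}\,\widehat{S}\,\widehat{H}^{-1},
\qquad
\widehat{H} = -\sum_{i=1}^{n} \nabla^2_{\theta}\,\ell_i(\widehat{\theta}),
\qquad
\widehat{S} = \sum_{i=1}^{n} \nabla_{\theta}\,\ell_i(\widehat{\theta})\,\nabla_{\theta}\,\ell_i(\widehat{\theta})^{\top}
```

Because the observed information matrix \widehat{H} appears inverted on both sides, the estimate cannot be formed when \widehat{H} is singular or numerically close to singular.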

The robust covariance estimator is robust to some degree of model misspecification, but the estimator relies on asymptotic normality of the maximum likelihood estimator. While this is often a reasonable approximation for the parameters associated with the mean of the response variable, it is often a poor approximation for the variance parameters. As a consequence, it is not unlikely that confidence intervals for the variance parameters contain negative values when based on the robust covariance matrix estimator. For that reason, the uncertainty can be assessed with two other types of estimators: Bootstrap and SIR.
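To make the point about negative bounds concrete, here is a small arithmetic sketch of a symmetric, normal-theory (Wald-type) interval. The estimate and standard error are hypothetical numbers chosen only for illustration, and the sketch assumes the interval has the form estimate ± z × standard error. It does not call Pumas.

```julia
# Hypothetical values for a variance parameter, not output from a real fit.
estimate = 0.04              # estimated variance parameter
se = 0.03                    # its standard error
z = 1.959963984540054        # 0.975 quantile of the standard normal (level = 0.95)

# Symmetric interval based on asymptotic normality.
lower = estimate - z * se    # about -0.019, which is not a valid variance
upper = estimate + z * se

println((lower, upper))
```

The lower bound falls below zero whenever the standard error exceeds roughly half of the estimate, which is common for variance parameters estimated from few subjects.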

Bootstrapped standard errors (Pumas.Bootstrap)

The quality of the standard errors and confidence intervals can typically be improved by a bootstrap procedure. By passing the Pumas.Bootstrap structure as the second argument, the uncertainty estimates will be based on a paired bootstrap (Bradley Efron, 1979), possibly with stratification on one or more variables (see below).

The Pumas.Bootstrap structure constructor takes several optional arguments to specify the details of the bootstrap.

Pumas.Bootstrap(; rng::AbstractRNG, samples::Integer, stratify_by::Union{Nothing,Symbol}, ensemblealg::Union{EnsembleSerial,EnsembleThreads,EnsembleDistributed})

Example syntax to run a bootstrap

my_bootstrap = infer(my_fit, Pumas.Bootstrap())

By default, the bootstrap uses the global RNG of Julia, but it is possible to pass a custom rng. The default number of samples is 200. It is possible to stratify on a discrete covariate by passing the name of the covariate as the stratify_by argument. The default is no stratification. Finally, it is possible to specify the mode of parallelism through the ensemblealg argument. The default is EnsembleThreads. When running Pumas in a distributed setting, it can be beneficial to specify EnsembleThreads when fitting the model and then EnsembleDistributed for the bootstrap. That way, the bootstrap samples are distributed across Julia processes and each associated model fit benefits from multi-threaded execution.

Example syntax to run a bootstrap with stratification

using Random
my_bootstrap = infer(my_fit, Pumas.Bootstrap(rng=MersenneTwister(1234), samples=500, stratify_by=[:sex, :trt], ensemblealg = EnsembleThreads()))
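The idea behind a stratified paired bootstrap can be sketched in plain Julia without Pumas. The following is a toy illustration, not the Pumas implementation: the helper stratified_resample, the data, and the use of a sample mean in place of a model refit are all assumptions made for the example. The percentile interval at the end is one common choice and is not necessarily how Pumas forms its bootstrap intervals.

```julia
using Random, Statistics

# Resample subject indices with replacement, separately within each stratum,
# so every bootstrap data set keeps the original stratum sizes.
# (Hypothetical helper, not part of Pumas.)
function stratified_resample(rng::AbstractRNG, strata::AbstractVector)
    idx = Int[]
    for s in unique(strata)
        members = findall(==(s), strata)
        append!(idx, rand(rng, members, length(members)))
    end
    return idx
end

rng = MersenneTwister(1234)

# Hypothetical per-subject data: a stratum label and one summary value each.
sex = [:f, :f, :f, :m, :m, :m, :m]
y = [1.2, 0.9, 1.5, 2.1, 1.8, 2.4, 2.0]

# Toy "refit": the sample mean stands in for re-estimating the model
# on each resampled data set.
boot = [mean(y[stratified_resample(rng, sex)]) for _ in 1:200]

se = std(boot)                        # bootstrap standard error
ci = quantile(boot, [0.025, 0.975])   # percentile interval
```

Resampling whole subjects keeps each subject's observations together, and resampling within strata guarantees that no bootstrap data set ends up without any subject from a given group.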

The coeftable function also works on the output of a bootstrap run, in which case the reported confidence intervals are bootstrap based.

Sampling importance resampling (Pumas.SIR)

While the bootstrap usually improves the quality of the standard errors and confidence intervals, it can be computationally prohibitive since it requires rerunning a potentially costly fitting procedure many times. Furthermore, when only a few subjects are available, a non-parametric procedure might not work well or at all.

As an alternative to the bootstrap procedures, Anne-Gaëlle Dosne, Martin Bergstrand, Kajsa Harling, Mats O Karlsson (2016) suggest the sampling importance resampling procedure for estimating the uncertainty. The proposal distribution for the procedure is based on the robust covariance matrix and will therefore fail whenever the computation of the robust covariance matrix fails.

The procedure is selected by passing the Pumas.SIR structure to infer.

Pumas.SIR(; rng::AbstractRNG, samples::Integer, resamples::Integer)

Example syntax to run a SIR

my_sir = infer(my_fit, Pumas.SIR(samples=200))

The samples argument specifies the number of samples to be drawn from the proposal distribution and is mandatory. The arguments rng and resamples are optional and specify the RNG to use for the random draws and the number of resamples to draw from the proposal samples, respectively. The defaults are the standard Julia RNG and samples ÷ 3, respectively.
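The roles of samples and resamples can be illustrated with a one-parameter toy version of sampling importance resampling in plain Julia. This is a sketch under stated assumptions, not the Pumas implementation: the target density, the normal proposal, and the helper resample_without_replacement are all hypothetical. The sketch resamples without replacement, and it needs at least resamples proposal draws with positive weight.

```julia
using Random, Statistics

# Unnormalized log-density of a skewed, strictly positive target
# (hypothetical stand-in for the likelihood of a variance parameter).
logtarget(θ) = θ > 0 ? 3 * log(θ) - θ / 0.01 : -Inf

# Unnormalized log-density of the normal proposal.
logproposal(θ, μ, σ) = -0.5 * ((θ - μ) / σ)^2

# Draw m distinct indices with probability proportional to the weights w.
# (Hypothetical helper, not part of Pumas.)
function resample_without_replacement(rng::AbstractRNG, w::Vector{Float64}, m::Integer)
    w = copy(w)
    chosen = Int[]
    for _ in 1:m
        c = cumsum(w)
        u = rand(rng) * c[end]
        # The fallback guards against floating-point rounding at the upper edge.
        i = something(findfirst(>(u), c), findlast(>(0), w))
        push!(chosen, i)
        w[i] = 0.0   # remove the chosen draw from further consideration
    end
    return chosen
end

rng = MersenneTwister(1234)
samples = 2000
resamples = samples ÷ 3

# 1. Sample from the proposal (here: normal around a hypothetical estimate).
μ, σ = 0.04, 0.03
draws = μ .+ σ .* randn(rng, samples)

# 2. Importance weights: target density relative to proposal density.
logw = logtarget.(draws) .- logproposal.(draws, μ, σ)
w = exp.(logw .- maximum(logw))

# 3. Resample according to the weights and summarize.
keep = draws[resample_without_replacement(rng, w, resamples)]
ci = quantile(keep, [0.025, 0.975])
```

Proposal draws outside the support of the target receive zero weight, so in this toy example the resulting interval cannot contain negative values even though the normal proposal does.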

The coeftable function also works on the output of SIR, in which case the reported confidence intervals are SIR based.

stderror

stderror(fpm::FittedPumasModelInference)
stderror(fpm::FittedPumasModel)

The stderror(fpm::FittedPumasModelInference) method extracts the standard errors of the estimates from the structure returned by a call to infer. The type of standard errors returned by stderror therefore depends on the options passed to infer. It is also possible to call stderror directly on a FittedPumasModel returned from the fit function, in which case the robust standard errors are returned. Calling stderror(fpm::FittedPumasModel) is not recommended, however, because it recomputes the covariance matrix. The preferred solution is to call infer and then call stderror on the FittedPumasModelInference structure returned from infer.