Inference
infer
infer(fpm::FittedPumasModel[, type]; level=0.95)
The infer function computes uncertainty estimates for the population parameter estimates of the fpm structure and returns a FittedPumasModelInference structure used for inference based on the fitted model fpm. The uncertainty estimates are displayed as standard errors and confidence intervals, but the internal representation depends on the type of uncertainty estimator used. The standard errors can be extracted from the FittedPumasModelInference structure with the stderror function. The default level used in the calculation of the confidence intervals is 0.95.
Robust standard errors
The default type of uncertainty estimator is the robust (also called sandwich) covariance estimator of Halbert White (1982). Internally, this is represented as the (robust) covariance matrix, which can also be computed with the vcov function directly from the FittedPumasModel structure. The method requires that the observed information matrix is (numerically) positive definite, and hence invertible, which is not always the case. When it is not possible to compute the robust covariance estimate, infer returns a failed FittedPumasModelInference.
Example syntax for obtaining confidence intervals based on robust standard errors
my_infer = infer(my_fit)
The returned object has a show method that prints the estimates, standard errors and confidence intervals in the REPL. To output the results as a DataFrame, use the coeftable function. Such data frames can be used for markdown reporting.
coeftable(my_infer)
The robust covariance estimator is robust to some degree of model misspecification, but the estimator relies on asymptotic normality of the maximum likelihood estimator. While this is often a reasonable approximation for the parameters associated with the mean of the response variable, it is often a poor approximation for the variance parameters. As a consequence, it is not uncommon for confidence intervals for the variance parameters to contain negative values when they are based on the robust covariance matrix estimator. For that reason, the uncertainty can also be assessed with two other types of estimators: Bootstrap and SIR.
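For orientation, the textbook form of the sandwich estimator and the resulting Wald-type interval can be written as below. This is a sketch of the general technique, not a statement of how Pumas implements it. The symbols are assumptions introduced here: H is the observed information matrix at the estimate, g_i is the score contribution of subject i, n is the number of subjects, and 1 - α is the confidence level.

```latex
\widehat{V}_{\text{robust}}
  = H^{-1} \Big( \sum_{i=1}^{n} g_i g_i^{\top} \Big) H^{-1},
\qquad
\hat{\theta}_j \pm z_{1-\alpha/2} \sqrt{\big[\widehat{V}_{\text{robust}}\big]_{jj}}
```

Because the interval is symmetric around the estimate, its lower bound falls below zero whenever the standard error is large relative to a small variance estimate, which is how the negative values mentioned above arise.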
Bootstrapped standard errors (Pumas.Bootstrap)
The quality of the standard errors and confidence intervals can typically be improved by a bootstrap procedure. By passing the Pumas.Bootstrap structure as the second argument, the uncertainty estimates will be based on a paired bootstrap (Bradley Efron, 1979), possibly with stratification, see below.
The Pumas.Bootstrap structure constructor takes several optional arguments to specify the details of the bootstrap.
Pumas.Bootstrap(; rng::AbstractRNG, samples::Integer, stratify_by::Union{Nothing,Symbol}, ensemblealg::Union{EnsembleSerial,EnsembleThreads,EnsembleDistributed})
Example syntax to run a bootstrap
my_bootstrap = infer(my_fit, Pumas.Bootstrap())
By default, the bootstrap uses the global RNG of Julia, but it is possible to pass a custom rng. The default number of samples is 200. It is possible to stratify on a discrete covariate by passing the name of the covariate as the stratify_by argument. The default is no stratification. Finally, it is possible to specify the mode of parallelism through the ensemblealg argument. The default is EnsembleThreads. When running Pumas in a distributed setting, it can be beneficial to specify EnsembleThreads when fitting the model and then EnsembleDistributed for the bootstrap. The bootstrap samples are then distributed across Julia processes, and each associated model fit benefits from multi-threaded execution.
Example syntax to run a bootstrap with stratification
using Random
my_bootstrap = infer(my_fit, Pumas.Bootstrap(rng=MersenneTwister(1234), samples=500, stratify_by=:sex, ensemblealg=EnsembleThreads()))
The coeftable function also works on the output of Bootstrap, in which case the reported confidence intervals are bootstrap based.
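To make the resampling idea concrete, here is a minimal, self-contained sketch of a stratified subject-level bootstrap on toy data. It does not use Pumas and is not its implementation. The data, the toy_fit function, the resample_subjects helper and the percentile interval are all assumptions made for illustration; in a real analysis each resample would trigger a full model refit.

```julia
using Random
using Statistics

# Hypothetical toy data: one record per subject with a stratum label and a response.
subjects = [(id = i, sex = isodd(i) ? :F : :M, y = 10.0 + 0.5 * i) for i in 1:20]

# Toy "fit": the estimate is simply the mean response (stand-in for a model refit).
toy_fit(pop) = mean(s.y for s in pop)

# Resample whole subjects with replacement. When `stratify_by` is given,
# resampling is done within each stratum so the stratum sizes are preserved.
function resample_subjects(rng, pop; stratify_by = nothing)
    stratify_by === nothing && return [pop[rand(rng, 1:length(pop))] for _ in pop]
    out = eltype(pop)[]
    for level in unique([getproperty(s, stratify_by) for s in pop])
        stratum = filter(s -> getproperty(s, stratify_by) == level, pop)
        append!(out, [stratum[rand(rng, 1:length(stratum))] for _ in stratum])
    end
    return out
end

rng = MersenneTwister(1234)
estimates = [toy_fit(resample_subjects(rng, subjects; stratify_by = :sex)) for _ in 1:200]

boot_se = std(estimates)                                 # bootstrap standard error
lower, upper = quantile(estimates, [0.025, 0.975])       # 95% percentile interval
```

The cost scales with the number of samples times the cost of one fit, which is why the choice of ensemblealg matters for real models.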
Sampling importance resampling (Pumas.SIR)
While the bootstrap usually improves the quality of the standard errors and confidence intervals, it can be computationally prohibitive since it requires rerunning a potentially costly fitting procedure many times. Furthermore, when only a few subjects are available, a non-parametric procedure might not work well or at all.
As an alternative to the bootstrap procedures, Anne-Gaëlle Dosne, Martin Bergstrand, Kajsa Harling, and Mats O Karlsson (2016) suggest the sampling importance resampling procedure for estimating the uncertainty. The proposal distribution for the procedure is based on the robust covariance matrix, so the procedure will fail whenever the computation of the robust covariance matrix fails.
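In generic terms, sampling importance resampling can be summarized as follows. This is a sketch of the general technique under a normal proposal, not a description of Pumas internals. The notation is assumed here: M is the number of proposal samples, m the number of resamples, L the likelihood of the fitted model, q the proposal density, and V-hat the robust covariance matrix.

```latex
\theta_k \sim q(\theta) = \mathcal{N}\big(\hat{\theta},\, \widehat{V}\big),
\quad k = 1, \dots, M,
\qquad
w_k = \frac{L(\theta_k)}{q(\theta_k)},
\qquad
\Pr(\text{select } \theta_k) \propto w_k
```

The m resampled parameter vectors are then treated as draws from the uncertainty distribution, and standard errors and confidence intervals are computed from them.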
The procedure is selected by passing the Pumas.SIR structure to infer.
Pumas.SIR(; rng::AbstractRNG, samples::Integer, resamples::Integer)
Example syntax to run a SIR
my_sir = infer(my_fit, Pumas.SIR(samples=200))
The samples argument is mandatory and specifies the number of samples to be drawn from the proposal distribution. The arguments rng and resamples are optional; they specify the random number generator to use for the random draws and the number of resamples to draw from the proposal samples. The defaults are the standard Julia RNG and samples ÷ 3, respectively.
The coeftable function also works on the output of SIR, in which case the reported confidence intervals are SIR based.
stderror
stderror(fpm::FittedPumasModelInference)
stderror(fpm::FittedPumasModel)
The stderror(fpm::FittedPumasModelInference) method extracts the standard errors of the estimates from the fpm structure returned from a call to infer. The type of standard errors returned by stderror therefore depends on the options passed to infer. It is also possible to call stderror directly on a FittedPumasModel returned from the fit function. Calling stderror on a FittedPumasModel returns the robust standard errors. Calling stderror(fpm::FittedPumasModel) is not recommended, as it will recompute the covariance matrix. The preferred solution is to call infer and then call stderror on the FittedPumasModelInference structure returned from infer.