arviz_stats.reloo


arviz_stats.reloo(wrapper, loo_orig=None, k_threshold=-inf, pointwise=None)

Recalculate Leave-One-Out cross validation, refitting the model for exact results where the approximation fails.

arviz_stats.loo estimates Leave-One-Out (LOO) cross validation using Pareto Smoothed Importance Sampling (PSIS). PSIS works well when the full posterior and posterior_i (the posterior obtained after excluding observation i from the data used to fit) are similar. For highly influential observations, PSIS cannot approximate LOO-CV reliably, and ArviZ issues a warning about large Pareto shape values. Such cases typically have a handful of bad or very bad Pareto shapes and a majority of good or ok shapes.

Such a warning does not necessarily mean that the model is not robust enough or that these observations are inherently bad, only that PSIS cannot approximate LOO-CV correctly for them. We can therefore use PSIS for all observations where the Pareto shape is below a threshold and refit the model to perform exact cross validation for the handful of observations where PSIS cannot be used. This approach yields a proper LOO-CV estimate with only a handful of refits, which in most cases is much less computationally expensive than exact LOO-CV, which needs one refit per observation.
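The combination step described above can be sketched in plain NumPy. This is only an illustration of the idea, not the arviz_stats implementation: the function name `combine_psis_and_exact`, the callback `exact_elpd_fn` (standing in for "refit without observation i and evaluate its log predictive density"), and all numbers are hypothetical.

```python
import numpy as np


def combine_psis_and_exact(elpd_psis, pareto_k, k_threshold, exact_elpd_fn):
    """Keep PSIS estimates where k <= threshold; replace the rest with exact values.

    exact_elpd_fn is a hypothetical callback that, in a real workflow, would
    refit the model without observation i and return its exact pointwise elpd.
    """
    elpd = np.asarray(elpd_psis, dtype=float).copy()
    pareto_k = np.asarray(pareto_k, dtype=float)
    # Indices of observations where the PSIS approximation is not trusted.
    refit_idx = np.flatnonzero(pareto_k > k_threshold)
    for i in refit_idx:
        elpd[i] = exact_elpd_fn(int(i))
    return elpd, refit_idx


# Made-up pointwise values for four observations; only the third has a large k.
elpd_psis = np.array([-1.2, -0.9, -3.5, -1.1])
pareto_k = np.array([0.1, 0.3, 0.9, 0.2])
exact_values = {2: -4.0}  # pretend result of one refit

elpd, refit_idx = combine_psis_and_exact(
    elpd_psis, pareto_k, 0.7, exact_values.__getitem__
)
total_elpd = elpd.sum()
```

Only one refit is requested here, against four that exact LOO-CV would need for the same data.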

Parameters:

wrapper (SamplingWrapper): An instance of a SamplingWrapper subclass that implements the necessary methods for model refitting. This wrapper allows reloo to work with any modeling framework.
loo_orig (ELPDData, optional): Existing LOO results with pointwise data. If None, PSIS-LOO-CV is computed first using the data from wrapper.
k_threshold (float, optional): Pareto shape threshold. Observations with k values above this threshold are refit. Defaults to \(\min(1 - 1/\log_{10}(S), 0.7)\), where S is the number of samples.
pointwise (bool, optional): If True, return pointwise LOO data. Defaults to rcParams["stats.ic_pointwise"].
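As a quick illustration of the default threshold formula given for k_threshold, the following sketch evaluates it for two sample sizes. The helper name `default_k_threshold` is hypothetical and is not part of arviz_stats; it only transcribes the formula stated above.

```python
import math


def default_k_threshold(n_samples):
    """Evaluate min(1 - 1/log10(S), 0.7) for S = n_samples (requires S > 1)."""
    return min(1 - 1 / math.log10(n_samples), 0.7)


# With 1000 samples the sample-size term is the binding one (about 0.667);
# with 4000 samples the cap of 0.7 applies.
threshold_1000 = default_k_threshold(1000)
threshold_4000 = default_k_threshold(4000)
```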

Returns:

ELPDData: Updated LOO results where high Pareto k observations have been replaced with exact LOO-CV values from refitting.

Warning

Refitting can be computationally expensive. Check the number of high Pareto k values before using reloo to ensure the computation time is acceptable.
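One way to do that check is to count the observations above the threshold and multiply by a rough per-fit time. The sketch below assumes the pointwise Pareto k values are already available as an array; the helper name `count_refits`, the example values, and the 90-second fit time are all made up for illustration.

```python
import numpy as np


def count_refits(pareto_k, k_threshold):
    """Return how many observations have a Pareto k above the threshold."""
    pareto_k = np.asarray(pareto_k, dtype=float)
    return int(np.sum(pareto_k > k_threshold))


pareto_k = np.array([0.05, 0.42, 0.71, 0.15, 1.3, 0.66])  # made-up values
seconds_per_fit = 90.0  # hypothetical time for one model fit

n_refits = count_refits(pareto_k, k_threshold=0.7)
estimated_seconds = n_refits * seconds_per_fit
```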
