Stats
Jackknife and bootstrap resampling for statistical error estimation, covariance matrices, and correlation matrices.
StatsBase
- class labc.stats.StatsBase
StatsBase is the shared abstract base for all resampling objects.
StatsType, StatsJack, and StatsBoot all inherit
from it. In normal usage you never instantiate StatsBase directly —
use StatsType.Jack() or StatsType.Boot() instead.
StatsType
- class labc.stats.StatsType(statsID: str)
Bases: StatsBase
Base class for statistical analysis.
The standard usage is through the static methods StatsType.Jack() and StatsType.Boot(), which return fully initialised StatsJack and StatsBoot objects:

    stats = StatsType.Jack(num_config=100)
    mean, err, bins = stats.generate_stats(raw_data)

    stats = StatsType.Boot(num_config=100, num_bins=500)
    mean, err, bins = stats.generate_stats(raw_data)

StatsType can also be instantiated directly for a bins-only workflow, when only pre-computed bins are available and raw configurations are not. For example, with jackknife bins:

    stats = StatsType('Jack')
    mean = np.mean(bins, axis=0)
    err = stats.err_func(mean, bins)
- Parameters:
- statsID (str): Identifier for the resampling method. Known values: 'Jack', 'Boot' (see _KNOWN_IDS).
- Attributes:
Methods
- Boot(*, num_config, num_bins[, seed]): Bootstrap factory; see StatsBoot for full documentation.
- Jack(*, num_config[, rebin]): Jackknife factory; see StatsJack for full documentation.
- corr(data_x[, data_y]): Compute the correlation matrix of one or two DataStats objects.
- cov(data_x[, data_y]): Compute the covariance matrix of one or two DataStats objects.
- cov_blocks(*data): Covariance matrix of multiple DataStats objects merged into one.
- cov_blocks_diag(*data): Block-diagonal covariance matrix of multiple DataStats objects.
- err_func(array_mean, array_bins): Compute the statistical error from resampled bins.
- generate_bins(_array_raw): Generate resampled bins from raw data.
- generate_stats(array_raw): Compute mean, error, and resampled bins from raw configurations.
- Raises:
- ValueError: If statsID is not in _KNOWN_IDS.
- static Jack(*, num_config: int, rebin: int = 1) -> StatsJack
Jackknife factory; see StatsJack for full documentation.
Jackknife
- class labc.stats.StatsJack(num_config: int, rebin: int = 1)
Bases: StatsBase
Jackknife resampling for statistical error estimation.
Supports optional rebinning: neighbouring configurations are averaged into blocks of size rebin before the leave-one-out procedure. This reduces the effective autocorrelation length of the ensemble. Instantiate via Jack().
- Parameters:
- Attributes:
Methods
- corr(data_x[, data_y]): Compute the correlation matrix of one or two DataStats objects.
- cov(data_x[, data_y]): Compute the covariance matrix of one or two DataStats objects.
- cov_blocks(*data): Covariance matrix of multiple DataStats objects merged into one.
- cov_blocks_diag(*data): Block-diagonal covariance matrix of multiple DataStats objects.
- err_func(array_mean, array_bins): Compute the statistical error from resampled bins.
- generate_bins(array_raw): Generate jackknife bins from raw configurations.
- generate_stats(array_raw): Compute mean, error, and resampled bins from raw configurations.
- Raises:
- ValueError: If rebin > _MAX_REBIN. For larger rebinning factors, pre-average your configurations manually before constructing this object.
- Warns:
- UserWarning: If rebin == _MAX_REBIN, since this is unusual and may indicate significant autocorrelations in the data.
Notes
The prefactor used in error and covariance estimation is \(f = N_\mathrm{bins} - 1\).
- corr(data_x: DataStats, data_y: DataStats | None = None) -> ndarray
Compute the correlation matrix of one or two DataStats objects.
- Parameters:
- data_x (DataStats): First dataset.
- data_y (DataStats, optional): Second dataset. If None, the auto-correlation of data_x is returned. Diagonal entries are exactly 1 in the auto-correlation case.
- Returns:
- np.ndarray: Correlation matrix, shape (len(data_x), len(data_y)).
Notes
Slicing for fit ranges or thinning should be applied to the inputs before calling this method.
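As a rough illustration of how a correlation matrix can be built from resampled bins, the sketch below works on plain NumPy arrays rather than DataStats objects. The function name correlation_from_bins and its arguments are hypothetical and not part of labc, and the library's actual implementation may differ. Any overall prefactor f and the 1/N_bins normalisation cancel in the ratio, so they are omitted here.

```python
import numpy as np

def correlation_from_bins(bins_x, mean_x, bins_y=None, mean_y=None):
    """Correlation matrix from resampled bins (hypothetical helper, not labc API).

    bins_* have shape (num_bins, T); mean_* have shape (T,).
    """
    if bins_y is None:
        # Auto-correlation case: correlate the dataset with itself.
        bins_y, mean_y = bins_x, mean_x
    dev_x = bins_x - mean_x  # deviations of every bin from the mean
    dev_y = bins_y - mean_y
    cross = dev_x.T @ dev_y  # unnormalised covariance, shape (T_x, T_y)
    norm_x = np.sqrt(np.sum(dev_x**2, axis=0))
    norm_y = np.sqrt(np.sum(dev_y**2, axis=0))
    return cross / np.outer(norm_x, norm_y)
```

In the auto-correlation case the diagonal entries come out as 1 up to floating-point rounding, consistent with the description above.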
- cov(data_x: DataStats, data_y: DataStats | None = None) -> ndarray
Compute the covariance matrix of one or two DataStats objects.
- Parameters:
- data_x (DataStats): First dataset.
- data_y (DataStats, optional): Second dataset. If None, the auto-covariance of data_x is returned.
- Returns:
- np.ndarray: Covariance matrix, shape (len(data_x), len(data_y)).
Notes
Slicing for fit ranges or thinning should be applied to the inputs before calling this method.
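The class notes state that the same prefactor f enters both the error and the covariance estimate. Assuming the covariance mirrors the error formula, i.e. C_ij = f / N_bins * sum_b (b_i - mean_i)(b_j - mean_j), a minimal sketch on plain NumPy arrays could look as follows. The name covariance_from_bins and its signature are hypothetical; labc itself operates on DataStats objects, whose internals are not shown here.

```python
import numpy as np

def covariance_from_bins(bins_x, mean_x, bins_y, mean_y, prefactor):
    """Covariance from resampled bins (hypothetical helper, not labc API).

    Assumes C_ij = f / N_bins * sum_b (x_bi - mean_x_i) * (y_bj - mean_y_j).
    """
    num_bins = bins_x.shape[0]
    dev_x = bins_x - mean_x
    dev_y = bins_y - mean_y
    return prefactor * (dev_x.T @ dev_y) / num_bins

# Example with leave-one-out jackknife bins, where f = N_bins - 1.
rng = np.random.default_rng(0)
raw = rng.normal(size=(40, 3))
mean = raw.mean(axis=0)
bins = (raw.sum(axis=0) - raw) / (raw.shape[0] - 1)
cov = covariance_from_bins(bins, mean, bins, mean, prefactor=bins.shape[0] - 1)
```

Under this assumption the diagonal of cov is the squared jackknife error of each observable.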
- cov_blocks(*data: DataStats) -> ndarray
Covariance matrix of multiple DataStats objects merged into one.
All datasets are concatenated along the observable axis before the covariance is computed, preserving cross-correlations between them.
- Parameters:
- *data (DataStats): Datasets to merge.
- Returns:
- np.ndarray: Full covariance matrix of the concatenated dataset.
- cov_blocks_diag(*data: DataStats) -> ndarray
Block-diagonal covariance matrix of multiple DataStats objects.
Computes the covariance of each dataset independently and assembles them into a block-diagonal matrix, assuming no cross-correlations.
- Parameters:
- *data (DataStats): Datasets for each diagonal block.
- Returns:
- np.ndarray: Block-diagonal covariance matrix.
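To make the difference between the merged and the block-diagonal covariance concrete, here is a sketch on plain NumPy arrays. All function names are hypothetical stand-ins for illustration, the covariance formula (prefactor times the bin-averaged product of deviations) is an assumption, and labc's own methods take DataStats objects instead.

```python
import numpy as np

def _cov(bins, mean, prefactor):
    # Assumed form: f / N_bins * sum_b (b_i - mean_i)(b_j - mean_j).
    dev = bins - mean
    return prefactor * (dev.T @ dev) / bins.shape[0]

def cov_full(bins_list, mean_list, prefactor):
    """Concatenate along the observable axis, then take one covariance."""
    bins = np.concatenate(bins_list, axis=1)
    mean = np.concatenate(mean_list)
    return _cov(bins, mean, prefactor)

def cov_block_diagonal(bins_list, mean_list, prefactor):
    """Per-dataset covariances placed on the diagonal, zeros elsewhere."""
    blocks = [_cov(b, m, prefactor) for b, m in zip(bins_list, mean_list)]
    size = sum(block.shape[0] for block in blocks)
    out = np.zeros((size, size))
    offset = 0
    for block in blocks:
        n = block.shape[0]
        out[offset:offset + n, offset:offset + n] = block
        offset += n
    return out
```

Both results share the same diagonal blocks. They differ only in the off-diagonal blocks, which hold the cross-covariances in the merged case and zeros in the block-diagonal case.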
- err_func(array_mean: ndarray, array_bins: ndarray) -> ndarray
Compute the statistical error from resampled bins.
- Parameters:
- array_mean (np.ndarray): Sample mean, shape (T,).
- array_bins (np.ndarray): Resampled bins, shape (num_bins, T).
- Returns:
- np.ndarray: Statistical errors, shape (T,).
Notes
The error is computed as:
\[\sigma_i = \sqrt{f \cdot \frac{1}{N_\mathrm{bins}} \sum_b \left(b_i - \bar{x}_i\right)^2}\]
where \(f\) is the prefactor defined by the concrete subclass.
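The error formula translates almost line by line into NumPy. The sketch below is not the labc implementation; the function name resampling_error and the explicit prefactor argument are assumptions made for illustration. It is applied to hand-built leave-one-out jackknife bins with f = N_bins - 1.

```python
import numpy as np

def resampling_error(array_mean, array_bins, prefactor):
    """sigma_i = sqrt(f / N_bins * sum_b (b_i - mean_i)^2), per observable i."""
    num_bins = array_bins.shape[0]
    deviations = array_bins - array_mean  # broadcasts over the bin axis
    return np.sqrt(prefactor * np.sum(deviations**2, axis=0) / num_bins)

rng = np.random.default_rng(0)
raw = rng.normal(size=(50, 4))  # (num_config, T)
mean = raw.mean(axis=0)
# Leave-one-out jackknife bins: mean of all configurations except one.
bins = (raw.sum(axis=0) - raw) / (raw.shape[0] - 1)
err = resampling_error(mean, bins, prefactor=bins.shape[0] - 1)
```

For the plain mean of uncorrelated data, this jackknife error coincides with the usual standard error of the mean, which is a convenient sanity check.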
- generate_bins(array_raw: ndarray) -> ndarray
Generate jackknife bins from raw configurations.
- Parameters:
- array_raw (np.ndarray): Raw configurations, shape (num_config, T).
- Returns:
- np.ndarray: Jackknife bins, shape (num_bins, T), where num_bins = ceil(num_config / rebin).
Notes
If num_config is not divisible by rebin, the last configuration is repeated to pad to the nearest multiple. Rebinning averages consecutive blocks of rebin configurations before the leave-one-out jackknife is applied to the num_bins block averages.
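The padding, rebinning, and leave-one-out steps described in the notes can be sketched in NumPy as below. This is an illustration under assumptions, not the library's code: the name jackknife_bins is hypothetical, and "the last configuration is repeated" is read here as appending as many copies of the final row as needed to reach a multiple of rebin.

```python
import numpy as np

def jackknife_bins(array_raw, rebin=1):
    """Rebinned leave-one-out jackknife bins (hypothetical helper, not labc API)."""
    num_config = array_raw.shape[0]
    num_bins = -(-num_config // rebin)  # integer ceil(num_config / rebin)
    pad = num_bins * rebin - num_config
    if pad:
        # Assumed padding rule: repeat the last configuration `pad` times.
        padding = np.repeat(array_raw[-1:], pad, axis=0)
        array_raw = np.concatenate([array_raw, padding], axis=0)
    # Average consecutive blocks of `rebin` configurations.
    blocks = array_raw.reshape(num_bins, rebin, *array_raw.shape[1:]).mean(axis=1)
    # Leave-one-out: each bin is the mean of all block averages except one.
    return (blocks.sum(axis=0) - blocks) / (num_bins - 1)

raw = np.arange(10.0).reshape(5, 2)    # 5 configurations, T = 2
bins = jackknife_bins(raw, rebin=2)    # 3 bins after padding to 6 rows
```

With rebin=1 no padding or averaging takes place and the function reduces to the ordinary leave-one-out jackknife.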
- generate_stats(array_raw: ndarray) -> tuple[ndarray, ndarray, ndarray]
Compute mean, error, and resampled bins from raw configurations.
- Parameters:
- array_raw (np.ndarray): Raw configurations, shape (num_config, T).
- Returns:
- mean (np.ndarray): Sample mean, shape (T,).
- err (np.ndarray): Statistical error, shape (T,).
- bins (np.ndarray): Resampled bins, shape (num_bins, T).
Bootstrap
- class labc.stats.StatsBoot(num_config: int, num_bins: int, seed: int = 0)
Bases: StatsBase
Bootstrap resampling for statistical error estimation.
Instantiate via Boot().
- Parameters:
- Attributes:
Methods
- corr(data_x[, data_y]): Compute the correlation matrix of one or two DataStats objects.
- cov(data_x[, data_y]): Compute the covariance matrix of one or two DataStats objects.
- cov_blocks(*data): Covariance matrix of multiple DataStats objects merged into one.
- cov_blocks_diag(*data): Block-diagonal covariance matrix of multiple DataStats objects.
- err_func(array_mean, array_bins): Compute the statistical error from resampled bins.
- generate_bins(array_raw): Generate bootstrap bins from raw configurations.
- generate_stats(array_raw): Compute mean, error, and resampled bins from raw configurations.
Notes
The prefactor used in error and covariance estimation is \(f = 1\).
- corr(data_x: DataStats, data_y: DataStats | None = None) -> ndarray
Compute the correlation matrix of one or two DataStats objects.
- Parameters:
- data_x (DataStats): First dataset.
- data_y (DataStats, optional): Second dataset. If None, the auto-correlation of data_x is returned. Diagonal entries are exactly 1 in the auto-correlation case.
- Returns:
- np.ndarray: Correlation matrix, shape (len(data_x), len(data_y)).
Notes
Slicing for fit ranges or thinning should be applied to the inputs before calling this method.
- cov(data_x: DataStats, data_y: DataStats | None = None) -> ndarray
Compute the covariance matrix of one or two DataStats objects.
- Parameters:
- data_x (DataStats): First dataset.
- data_y (DataStats, optional): Second dataset. If None, the auto-covariance of data_x is returned.
- Returns:
- np.ndarray: Covariance matrix, shape (len(data_x), len(data_y)).
Notes
Slicing for fit ranges or thinning should be applied to the inputs before calling this method.
- cov_blocks(*data: DataStats) -> ndarray
Covariance matrix of multiple DataStats objects merged into one.
All datasets are concatenated along the observable axis before the covariance is computed, preserving cross-correlations between them.
- Parameters:
- *data (DataStats): Datasets to merge.
- Returns:
- np.ndarray: Full covariance matrix of the concatenated dataset.
- cov_blocks_diag(*data: DataStats) -> ndarray
Block-diagonal covariance matrix of multiple DataStats objects.
Computes the covariance of each dataset independently and assembles them into a block-diagonal matrix, assuming no cross-correlations.
- Parameters:
- *data (DataStats): Datasets for each diagonal block.
- Returns:
- np.ndarray: Block-diagonal covariance matrix.
- err_func(array_mean: ndarray, array_bins: ndarray) -> ndarray
Compute the statistical error from resampled bins.
- Parameters:
- array_mean (np.ndarray): Sample mean, shape (T,).
- array_bins (np.ndarray): Resampled bins, shape (num_bins, T).
- Returns:
- np.ndarray: Statistical errors, shape (T,).
Notes
The error is computed as:
\[\sigma_i = \sqrt{f \cdot \frac{1}{N_\mathrm{bins}} \sum_b \left(b_i - \bar{x}_i\right)^2}\]
where \(f\) is the prefactor defined by the concrete subclass.
- generate_bins(array_raw: ndarray) -> ndarray
Generate bootstrap bins from raw configurations.
- Parameters:
- array_raw (np.ndarray): Raw configurations, shape (num_config, T).
- Returns:
- np.ndarray: Bootstrap bins, shape (num_bins, T).
- Raises:
- ValueError: If the number of configurations in array_raw does not match self.num_config.
Notes
Each bootstrap sample is the mean of num_config configurations drawn with replacement. The random state is seeded with self.seed so results are fully reproducible.
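A seeded bootstrap of this kind can be sketched with NumPy's Generator API as below. The function name bootstrap_bins is hypothetical, and which random number generator labc actually uses is not stated here, so the draws from this sketch should not be expected to match the library's bins for the same seed.

```python
import numpy as np

def bootstrap_bins(array_raw, num_bins, seed=0):
    """Seeded bootstrap bins (hypothetical helper, not labc API).

    Each bin is the mean of num_config configurations drawn with replacement.
    """
    num_config = array_raw.shape[0]
    rng = np.random.default_rng(seed)  # fixed seed gives reproducible draws
    # One row of configuration indices per bootstrap sample.
    indices = rng.integers(0, num_config, size=(num_bins, num_config))
    # Fancy indexing gives shape (num_bins, num_config, T); average the draws.
    return array_raw[indices].mean(axis=1)
```

Calling the function twice with the same seed returns identical bins, which is the reproducibility property the notes describe.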
- generate_stats(array_raw: ndarray) -> tuple[ndarray, ndarray, ndarray]
Compute mean, error, and resampled bins from raw configurations.
- Parameters:
- array_raw (np.ndarray): Raw configurations, shape (num_config, T).
- Returns:
- mean (np.ndarray): Sample mean, shape (T,).
- err (np.ndarray): Statistical error, shape (T,).
- bins (np.ndarray): Resampled bins, shape (num_bins, T).