Functional Tolerance Bands

Introduction

A tolerance band for functional data is a region expected to contain a given fraction of individual curves in the population – the functional analogue of classical tolerance intervals. Unlike confidence bands (which target the mean), tolerance bands characterize the spread of individual curves.

fdars provides four core tolerance band methods, with phase and joint amplitude-phase variants of the elastic method, plus a confidence band for the mean:

Tolerance bands (for individual curves):

Method	Key Properties
FPCA	Bootstrap on PC scores; pointwise or simultaneous
Conformal	Distribution-free; uses calibration/training split
Exponential family	For non-Gaussian data (Binomial, Poisson)
Elastic	Alignment-based; removes phase variability first

Confidence band (for the mean function):

Method	Key Properties
SCB (Degras)	Simultaneous confidence band for the mean via multiplier bootstrap

The distinction matters: a tolerance band captures the spread of individual curves in the population, while a confidence band quantifies the uncertainty in the estimated mean function. At the same nominal level, tolerance bands are typically much wider than confidence bands, because individual curve variability exceeds mean estimation uncertainty, and the gap grows with the sample size.

How It Works (Intuition)

Imagine you have a collection of temperature curves measured over a year. A tolerance band answers: “If I measure one more year, where will the new curve likely fall?” The band should be wide enough to contain, say, 95% of future curves.

The FPCA method breaks each curve into a mean shape plus a few dominant modes of variation (principal components). It resamples the scores on these modes to estimate where new curves might land.

The conformal method takes a simpler, assumption-free approach: it holds out some curves, measures how far they deviate from the rest, and uses those deviations directly to set band width. No distributional assumptions needed.

The elastic method first aligns curves to remove timing differences (phase variability), then builds a tolerance band on the aligned data. This is useful when curves have the same shape but differ in timing – the band is tighter because timing differences no longer inflate the pointwise spread.

The exponential family method handles non-Gaussian data (e.g., count data) by transforming to a natural parameter scale, computing bands there, and transforming back.

Finally, the SCB Degras method is different in kind: it builds a confidence band for the mean function rather than individual curves. It answers “where does the true population mean lie?” rather than “where will the next curve fall?”

Mathematical Framework

Setup

Let $X_1, \ldots, X_n$ be i.i.d. random functions observed on a grid $t_1, \ldots, t_T \in [a, b]$ with mean function $\mu(t) = E[X(t)]$ and covariance function $C(s, t) = \text{Cov}(X(s), X(t))$.

A $(1 - \alpha)$-tolerance band is a region $[\ell(t), u(t)]$ such that

$P\bigl(X_{\text{new}}(t) \in [\ell(t), u(t)] \;\text{for all } t\bigr) \geq 1 - \alpha$

for a new independent draw $X_{\text{new}}$ from the same process.
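
The definition can be checked empirically on any candidate band. The following base R sketch contrasts the "for all $t$" requirement with the weaker pointwise notion on toy curves. The helper names `simultaneous_coverage` and `pointwise_coverage`, the simulated data, and the fixed-width band are all illustrative assumptions and are not part of fdars.

```r
# Fraction of curves (rows of X) that lie entirely inside [lower, upper].
simultaneous_coverage <- function(X, lower, upper) {
  inside <- sweep(X, 2, lower, ">=") & sweep(X, 2, upper, "<=")
  mean(apply(inside, 1, all))
}

# Fraction of (curve, grid point) pairs that fall inside the band.
pointwise_coverage <- function(X, lower, upper) {
  inside <- sweep(X, 2, lower, ">=") & sweep(X, 2, upper, "<=")
  mean(inside)
}

set.seed(1)
t_grid <- seq(0, 1, length.out = 50)
# Toy curves: common shape, random vertical offset, small pointwise noise
X <- t(replicate(200, sin(2 * pi * t_grid) + rnorm(1, 0, 0.3) + rnorm(50, 0, 0.1)))

# A hypothetical candidate band of constant half-width 0.5 around the true shape
lower <- sin(2 * pi * t_grid) - 0.5
upper <- sin(2 * pi * t_grid) + 0.5

simultaneous_coverage(X, lower, upper)
pointwise_coverage(X, lower, upper)
```

Simultaneous coverage can never exceed pointwise coverage, since a curve counts as covered only if every one of its grid points is inside the band.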

FPCA Method (Rathnayake and Choudhary, 2016)

By the Karhunen-Loève expansion, truncated to the first $K$ components, each curve is approximated by

$X_i(t) \approx \mu(t) + \sum_{k=1}^K \xi_{ik} \phi_k(t)$

where $\phi_k$ are the eigenfunctions of $C$ and $\xi_{ik}$ are uncorrelated PC scores with $\text{Var}(\xi_{ik}) = \lambda_k$. The method proceeds:

1. Estimate $\hat\mu$, $\hat\phi_k$, and scores $\hat\xi_{ik}$ from data
2. Bootstrap: resample $\hat\xi^*_{ik}$ from the empirical score distribution and reconstruct $X^*_i(t) = \hat\mu(t) + \sum_{k=1}^K \hat\xi^*_{ik} \hat\phi_k(t)$
3. Pointwise band: At each $t_j$, set $\ell(t_j)$ and $u(t_j)$ to the $\alpha/2$ and $1 - \alpha/2$ quantiles of the bootstrap distribution
4. Simultaneous band: Find the smallest $c > 0$ such that a fraction $\geq 1 - \alpha$ of bootstrap curves lie entirely within $\hat\mu(t) \pm c \cdot \hat\sigma(t)$, where $\hat\sigma(t)$ is the pointwise bootstrap standard deviation

The simultaneous band is wider (controls family-wise coverage) while the pointwise band is narrower (controls marginal coverage at each $t$).
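
The procedure above can be sketched in base R with `prcomp`. This is a minimal illustration under stated assumptions, not the fdars implementation: the toy data, the choice $K = 2$, and resampling each score component independently are all assumptions made for the example.

```r
set.seed(42)
n <- 40; m <- 50; K <- 2; alpha <- 0.05; nb <- 500
t_grid <- seq(0, 1, length.out = m)
# Toy curves: random amplitude, random vertical offset, small noise
X <- t(replicate(n, rnorm(1, 1, 0.2) * sin(2 * pi * t_grid) +
                    rnorm(1, 0, 0.2) + rnorm(m, 0, 0.05)))

# Estimate the mean, the first K principal directions, and the scores
pca    <- prcomp(X, center = TRUE, scale. = FALSE)
mu_hat <- pca$center
phi    <- pca$rotation[, 1:K, drop = FALSE]   # m x K
scores <- pca$x[, 1:K, drop = FALSE]          # n x K

# Bootstrap: resample each score component (assumed independent) and reconstruct
scores_star <- apply(scores, 2, sample, size = nb, replace = TRUE)  # nb x K
X_star <- sweep(scores_star %*% t(phi), 2, mu_hat, "+")             # nb x m

# Pointwise band: per-grid-point bootstrap quantiles
lower_pw <- apply(X_star, 2, quantile, probs = alpha / 2)
upper_pw <- apply(X_star, 2, quantile, probs = 1 - alpha / 2)

# Simultaneous band: smallest c so that >= 1 - alpha of bootstrap curves
# stay within mu_hat +/- c * sigma_hat over the whole grid
sigma_hat <- apply(X_star, 2, sd)
Z         <- sweep(sweep(X_star, 2, mu_hat, "-"), 2, sigma_hat, "/")
max_dev   <- apply(abs(Z), 1, max)
c_sim     <- unname(quantile(max_dev, 1 - alpha, type = 1))
lower_sim <- mu_hat - c_sim * sigma_hat
upper_sim <- mu_hat + c_sim * sigma_hat

# Mean half-widths of the two bands
mean(upper_pw - lower_pw) / 2
mean(upper_sim - lower_sim) / 2
```

Using `type = 1` in `quantile` returns an observed order statistic, so the chosen `c_sim` is one of the bootstrap maxima rather than an interpolated value.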

Conformal Method (Lei and Wasserman, 2014)

The conformal approach is distribution-free. Split the data into a training set of size $n_{\text{train}}$ and a calibration set of size $n_{\text{cal}}$:

1. Compute $\hat\mu(t)$ from the training set
2. For each calibration curve $X_j$, compute a non-conformity score:
   - Sup-norm: $R_j = \sup_t |X_j(t) - \hat\mu(t)|$
   - $L^2$: $R_j = \bigl(\int |X_j(t) - \hat\mu(t)|^2 \, dt\bigr)^{1/2}$
3. Set $\hat{q}$ to the $\lceil (1 - \alpha)(n_{\text{cal}} + 1) \rceil / n_{\text{cal}}$ quantile of $\{R_1, \ldots, R_{n_{\text{cal}}}\}$
4. The band is $\hat\mu(t) \pm \hat{q}$ (sup-norm) or $\hat\mu(t) \pm \hat{q} \cdot w(t)$ ($L^2$, with local weights)

The key guarantee is finite-sample validity: $P(R_{\text{new}} \leq \hat{q}) \geq 1 - \alpha$, with no distributional assumptions beyond exchangeability of the curves (which holds for i.i.d. data).
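
A minimal base R sketch of the sup-norm variant follows. The toy data, the 50/50 split, and $\alpha = 0.1$ are assumptions chosen for illustration. The quantile is taken as the $k$-th smallest calibration score with $k = \lceil (1 - \alpha)(n_{\text{cal}} + 1) \rceil$, which is the same level as in the recipe above.

```r
set.seed(42)
n <- 100; m <- 50; alpha <- 0.1
t_grid <- seq(0, 1, length.out = m)
X <- t(replicate(n, sin(2 * pi * t_grid) + rnorm(1, 0, 0.3) + rnorm(m, 0, 0.05)))

# Split: first half for training, second half for calibration
train <- X[1:50, , drop = FALSE]
cal   <- X[51:100, , drop = FALSE]
n_cal <- nrow(cal)

# Centre estimated from the training curves only
mu_hat <- colMeans(train)

# Sup-norm non-conformity score of each calibration curve
R <- apply(abs(sweep(cal, 2, mu_hat, "-")), 1, max)

# Conformal quantile: k-th smallest score; here ceiling(0.9 * 51) = 46
k     <- ceiling((1 - alpha) * (n_cal + 1))
q_hat <- if (k > n_cal) Inf else sort(R)[k]

# Constant-width band around the training mean
lower <- mu_hat - q_hat
upper <- mu_hat + q_hat
```

When the calibration set is so small that $k > n_{\text{cal}}$, no finite band can carry the guarantee, which is why the sketch returns `Inf` in that case.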

SCB Degras Method (Degras, 2011)

This constructs a simultaneous confidence band for the mean $\mu(t)$ rather than a tolerance band for individual curves. Let

$S_n(t) = \frac{\sqrt{n}(\bar{X}_n(t) - \mu(t))}{\hat\sigma(t)}$

be the standardized process. Under regularity conditions, $S_n$ converges to a Gaussian process $G$ whose covariance is the correlation function of $X$, which can be estimated from the data. The critical value $c_\alpha$ is obtained via a multiplier bootstrap:

1. Generate $W_1^*, \ldots, W_n^* \stackrel{\text{iid}}{\sim} N(0, 1)$
2. Compute $G^*(t) = \frac{1}{\sqrt{n} \hat\sigma(t)} \sum_{i=1}^n W_i^* (X_i(t) - \bar{X}_n(t))$
3. Set $c_\alpha$ to the $(1-\alpha)$-quantile of $\sup_t |G^*(t)|$ across bootstrap replicates

The SCB is then $\bar{X}_n(t) \pm c_\alpha \hat\sigma(t) / \sqrt{n}$.
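
The multiplier bootstrap can be sketched in a few lines of base R. This is an illustration of the formulas above on simulated curves, not the fdars code. The data, `nb = 1000`, and the use of the pointwise sample standard deviation for $\hat\sigma(t)$ are assumptions.

```r
set.seed(42)
n <- 60; m <- 50; alpha <- 0.05; nb <- 1000
t_grid <- seq(0, 1, length.out = m)
X <- t(replicate(n, sin(2 * pi * t_grid) + rnorm(1, 0, 0.3) + rnorm(m, 0, 0.1)))

x_bar     <- colMeans(X)
sigma_hat <- apply(X, 2, sd)
resid     <- sweep(X, 2, x_bar, "-")   # n x m matrix of centred curves

# One replicate: draw Gaussian multipliers, form G*(t), record sup_t |G*(t)|
sup_G <- replicate(nb, {
  w <- rnorm(n)
  G <- colSums(w * resid) / (sqrt(n) * sigma_hat)  # w[i] scales row i of resid
  max(abs(G))
})

# Critical value and simultaneous confidence band for the mean
c_alpha <- unname(quantile(sup_G, 1 - alpha))
lower   <- x_bar - c_alpha * sigma_hat / sqrt(n)
upper   <- x_bar + c_alpha * sigma_hat / sqrt(n)

c_alpha
mean(upper - lower) / 2   # mean half-width, shrinks like 1 / sqrt(n)
```

Because `w` has length `n` and `resid` has `n` rows, R's column-wise recycling multiplies every observation of curve `i` by the same multiplier `w[i]`, as the formula for $G^*(t)$ requires.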

Exponential Family Method

For functional data from an exponential family with density

$f(x | \theta) = h(x) \exp(\theta x - A(\theta))$

the method applies the canonical link $g$ to transform data to the natural parameter scale, computes FPCA tolerance bands on the transformed data, and maps back through $g^{-1}$:

1. Transform: $Y_i(t) = g(X_i(t))$
2. Compute FPCA band $[\ell_Y(t), u_Y(t)]$ on $\{Y_i\}$
3. Back-transform: $\ell(t) = g^{-1}(\ell_Y(t))$, $u(t) = g^{-1}(u_Y(t))$

For Gaussian data ($g = \text{identity}$), this reduces to the standard FPCA band. For Poisson data ($g = \log$), the back-transform $g^{-1} = \exp$ keeps the band non-negative.
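
The transform, band, back-transform pattern is sketched below for Poisson counts in base R. To keep the example short, a pointwise quantile band stands in for the FPCA band on the transformed scale, and a $+0.5$ offset is added inside the log so that zero counts stay finite. Both choices are assumptions of this sketch and may differ from what fdars does internally.

```r
set.seed(42)
n <- 40; m <- 50; alpha <- 0.05
t_grid <- seq(0, 1, length.out = m)
# Poisson counts with a smooth log-intensity and a random offset per curve
X <- t(replicate(n, rpois(m, exp(2 + 0.5 * sin(2 * pi * t_grid) + rnorm(1, 0, 0.2)))))

# Log link with a +0.5 offset (assumption) and its inverse, clipped at zero
g     <- function(x) log(x + 0.5)
g_inv <- function(y) pmax(exp(y) - 0.5, 0)

# Transform to the log scale
Y <- g(X)

# Band on the transformed scale (pointwise quantiles as a stand-in for FPCA)
lower_Y <- apply(Y, 2, quantile, probs = alpha / 2)
upper_Y <- apply(Y, 2, quantile, probs = 1 - alpha / 2)

# Back-transform to the count scale
lower <- g_inv(lower_Y)
upper <- g_inv(upper_Y)

range(lower)
range(upper)
```

A symmetric band of the form mean plus or minus a multiple of the standard deviation, computed directly on the counts, could dip below zero. Building the band on the log scale avoids that by construction.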

Elastic Method

When curves exhibit phase variability (horizontal shifts), standard tolerance bands are inflated because they treat timing differences as amplitude variation. The elastic method removes this:

1. Compute the Karcher mean $\hat\mu_K$ and warping functions $\hat\gamma_1, \ldots, \hat\gamma_n$ using the elastic (Fisher-Rao) framework
2. Align: $\tilde{X}_i(t) = X_i(\hat\gamma_i(t))$
3. Compute an FPCA tolerance band on the aligned data $\{\tilde{X}_i\}$

The resulting band is tighter because alignment separates phase from amplitude: once timing differences are removed, only amplitude variability contributes to the variance at each grid point.
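
The effect of alignment on pointwise spread can be seen in a small base R experiment. Fisher-Rao alignment is not available in base R, so this sketch swaps in a much simpler technique: shift-only registration against a known template, found by grid search. It illustrates the variance reduction only and is not the elastic method. The template, shift grid, and interior window are assumptions.

```r
set.seed(42)
n <- 30; m <- 101
t_grid <- seq(0, 1, length.out = m)
f      <- function(t) sin(2 * pi * t)

# Curves that differ only in timing: X_i(t) = f(t - s_i)
shifts <- runif(n, -0.05, 0.05)
X <- t(sapply(shifts, function(s) f(t_grid - s)))

# Compare on an interior window so shifted evaluations stay inside [0, 1]
interior   <- t_grid >= 0.1 & t_grid <= 0.9
template   <- f(t_grid[interior])
candidates <- seq(-0.1, 0.1, by = 0.005)

# Evaluate x(t + s) on the interior window by linear interpolation
shift_curve <- function(x, s) {
  approx(t_grid, x, xout = t_grid[interior] + s, rule = 2)$y
}

# Pick the candidate shift that best matches the template (least squares)
align_one <- function(x) {
  sse <- sapply(candidates, function(s) sum((shift_curve(x, s) - template)^2))
  shift_curve(x, candidates[which.min(sse)])
}
X_aligned <- t(apply(X, 1, align_one))

sd_raw     <- mean(apply(X[, interior], 2, sd))
sd_aligned <- mean(apply(X_aligned, 2, sd))
c(raw = sd_raw, aligned = sd_aligned)
```

Since these toy curves have no amplitude variation at all, nearly all of the pointwise spread before alignment is due to timing, and it largely disappears afterwards.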

Generate Sample Data
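
The examples below use an object `fd` together with the variables `n` and `argvals`. The following is a hedged sketch of one way to create them. The number of curves, the $[0, 1]$ domain, and the noise levels are assumptions; only the 50 grid points (matching the printed output below) and the `fdata(data, argvals = argvals)` call (mirroring its use later in this article) come from the text. Printed half-widths will therefore not match the outputs shown in the following sections.

```r
set.seed(42)
n <- 30                                  # number of curves (assumed value)
argvals <- seq(0, 1, length.out = 50)    # 50 grid points on an assumed [0, 1] domain

# Smooth curves: common sine shape, random vertical offset, small noise
data <- matrix(0, n, 50)
for (i in 1:n) {
  data[i, ] <- sin(2 * pi * argvals) + rnorm(1, 0, 0.2) + rnorm(50, 0, 0.05)
}

# Wrap as a functional data object when fdars is installed
if (requireNamespace("fdars", quietly = TRUE)) {
  fd <- fdars::fdata(data, argvals = argvals)
}
```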

FPCA Bootstrap Band

The FPCA method reconstructs curves from their principal component scores and uses bootstrap resampling to estimate quantiles. Two types are available:

- Pointwise: Independent interval at each evaluation point (narrower)
- Simultaneous: Single scaling factor across all points (wider, controls family-wise error)

band_pw <- tolerance.band(fd, method = "fpca", coverage = 0.95,
                          band.type = "pointwise", nb = 200, seed = 42)
print(band_pw)
#> Functional Tolerance Band
#>   Method: fpca 
#>   Coverage: 0.95 
#>   Grid points: 50 
#>   Mean half-width: 0.7163
plot(band_pw)

band_sim <- tolerance.band(fd, method = "fpca", coverage = 0.95,
                           band.type = "simultaneous", nb = 200, seed = 42)
plot(band_sim)

Conformal Prediction Band

The conformal method is distribution-free: it makes no parametric assumptions about the data-generating process. It splits data into a training set and calibration set, computing non-conformity scores on the calibration set to determine band width.

Two score types are available:

- supnorm: Maximum absolute deviation (constant band width)
- l2: $L^2$ norm of the deviation (variable band width)

band_conf <- tolerance.band(fd, method = "conformal", coverage = 0.95,
                            score.type = "supnorm", seed = 42)
plot(band_conf)

Mean Confidence Band (SCB Degras)

The Degras (2011) method is fundamentally different from the tolerance band methods above. It constructs a simultaneous confidence band for the mean function $\mu(t) = E[X(t)]$, not a tolerance band for individual curves.

What it tells you: “The true population mean lies within this band with 95% confidence.” The band shrinks as $n \to \infty$ because we estimate the mean more precisely. In contrast, tolerance bands do not shrink with $n$ – they target the spread of individual curves, which is a fixed population property.

How it works: The method standardizes the empirical process $\sqrt{n}(\bar{X}_n(t) - \mu(t)) / \hat\sigma(t)$ and uses a multiplier bootstrap to estimate the distribution of its supremum. The critical value $c_\alpha$ controls the family-wise coverage across all $t$ simultaneously – the band is valid uniformly over the entire domain, not just pointwise.

When to use it:

- Testing whether a hypothesized mean function falls within the band
- Comparing mean functions across groups (non-overlapping SCBs indicate significant differences)
- Assessing sample size adequacy (very wide SCBs suggest more data is needed)

band_scb <- tolerance.band(fd, method = "scb", coverage = 0.95,
                           nb = 200, seed = 42)
plot(band_scb)

Note how much narrower the SCB is compared to the tolerance bands above – it targets the mean, not individual curves. The width scales as $O(1/\sqrt{n})$.

Exponential Family Band

For data from exponential family distributions (e.g., count data), the exponential family method applies the appropriate link function transformation before computing FPCA bands:

# Gaussian data (identity link)
band_exp <- tolerance.band(fd, method = "exponential", family = "gaussian",
                           nb = 200, seed = 42)
plot(band_exp)

Elastic (Alignment-Based) Band

The elastic method first aligns curves using the Karcher mean in the elastic metric, removing phase variability. It then computes an FPCA tolerance band on the aligned data. This produces tighter bands when curves exhibit timing differences:

# Generate data with phase variability
set.seed(42)
data_phase <- matrix(0, n, 50)
for (i in 1:n) {
  shift <- runif(1, -0.05, 0.05)
  data_phase[i, ] <- sin(2 * pi * (argvals - shift)) + rnorm(1, 0, 0.2) +
                     rnorm(50, 0, 0.05)
}
fd_phase <- fdata(data_phase, argvals = argvals)

band_elastic <- tolerance.band(fd_phase, method = "elastic", coverage = 0.95,
                               nb = 200, max.iter = 10, seed = 42)
print(band_elastic)
#> Functional Tolerance Band
#>   Method: elastic 
#>   Coverage: 0.95 
#>   Grid points: 50 
#>   Mean half-width: 0.5379
plot(band_elastic)

Compare the elastic band to FPCA on the same phase-shifted data:

band_fpca_phase <- tolerance.band(fd_phase, method = "fpca", coverage = 0.95,
                                  nb = 200, seed = 42)
cat("FPCA mean half-width:   ", round(mean(band_fpca_phase$half_width), 4), "\n")
#> FPCA mean half-width:    0.5559
cat("Elastic mean half-width:", round(mean(band_elastic$half_width), 4), "\n")
#> Elastic mean half-width: 0.5379

The elastic band is slightly narrower here (0.5379 versus 0.5559) because alignment removes the phase component, so only amplitude variability contributes to the band width. The gain is modest because the simulated shifts are small.

Phase Tolerance Bands

When curves differ primarily in timing (phase variation), a phase tolerance band captures the expected range of warping functions. This tells you “how much timing variation is normal”:

band_phase <- tolerance.band(fd_phase, method = "phase",
                             ncomp = 3, coverage = 0.95, nb = 200, seed = 42)
plot(band_phase)

The result includes $phase.lower and $phase.upper (warping function bounds) in addition to the standard amplitude band.

Joint Elastic Tolerance Bands with Config

The "elastic.config" method provides fine-grained control over elastic tolerance bands by separately configuring amplitude and phase components:

band_joint <- tolerance.band(fd_phase, method = "elastic.config",
                             ncomp = 3, coverage = 0.95, nb = 200,
                             lambda = 0.0, max_iter = 20, seed = 42)
plot(band_joint)

This returns both amplitude bounds ($lower, $upper) and phase bounds ($phase.lower, $phase.upper), allowing you to understand whether new curves deviate in shape, timing, or both.

Choosing a Method

Your goal	Recommended method
General-purpose tolerance band	FPCA (`method = "fpca"`)
No distributional assumptions	Conformal (`method = "conformal"`)
Data with timing differences	Elastic (`method = "elastic"`)
Phase variation bounds only	Phase (`method = "phase"`)
Separate amplitude + phase bounds	Elastic config (`method = "elastic.config"`)
Count or binary functional data	Exponential (`method = "exponential"`)
Confidence band for the mean	SCB Degras (`method = "scb"`)

References

Rathnayake, L.N. and Choudhary, P.K. (2016). Tolerance bands for functional data. Biometrics, 72(2):503–512.
Lei, J. and Wasserman, L. (2014). Distribution-free prediction bands for non-parametric regression. Journal of the Royal Statistical Society: Series B, 76(1):71–96.
Degras, D. (2011). Simultaneous confidence bands for nonparametric regression with functional data. Statistica Sinica, 21(4):1735–1765.