Introduction

A tolerance band for functional data is a region expected to contain a given fraction of individual curves in the population – the functional analogue of classical tolerance intervals. Unlike confidence bands (which target the mean), tolerance bands characterize the spread of individual curves.

fdars provides four tolerance band methods, plus a confidence band for the mean:

Tolerance bands (for individual curves):

| Method | Key Properties |
|---|---|
| FPCA | Bootstrap on PC scores; pointwise or simultaneous |
| Conformal | Distribution-free; uses calibration/training split |
| Exponential family | For non-Gaussian data (Binomial, Poisson) |
| Elastic | Alignment-based; removes phase variability first |

Confidence band (for the mean function):

| Method | Key Properties |
|---|---|
| SCB (Degras) | Simultaneous confidence band for the mean via multiplier bootstrap |

The distinction matters: a tolerance band captures the spread of individual curves in the population, while a confidence band quantifies the uncertainty in the estimated mean function. Tolerance bands are typically much wider than confidence bands because the variability of individual curves exceeds the uncertainty in estimating the mean, which shrinks as the sample grows.

How It Works (Intuition)

Imagine you have a collection of temperature curves measured over a year. A tolerance band answers: “If I measure one more year, where will the new curve likely fall?” The band should be wide enough to contain, say, 95% of future curves.

The FPCA method breaks each curve into a mean shape plus a few dominant modes of variation (principal components). It resamples the scores on these modes to estimate where new curves might land.

The conformal method takes a simpler, assumption-free approach: it holds out some curves, measures how far they deviate from the rest, and uses those deviations directly to set band width. No distributional assumptions needed.

The elastic method first aligns curves to remove timing differences (phase variability), then builds a tolerance band on the aligned data. This is useful when curves have the same shape but differ in timing – the band is tighter because alignment concentrates the variability.

The exponential family method handles non-Gaussian data (e.g., count data) by transforming to a natural parameter scale, computing bands there, and transforming back.

Finally, the SCB Degras method is different in kind: it builds a confidence band for the mean function rather than individual curves. It answers “where does the true population mean lie?” rather than “where will the next curve fall?”

Mathematical Framework

Setup

Let $X_1, \ldots, X_n$ be i.i.d. random functions observed on a grid $t_1, \ldots, t_T \in [a, b]$ with mean function $\mu(t) = E[X(t)]$ and covariance function $C(s, t) = \text{Cov}(X(s), X(t))$.

A $(1 - \alpha)$-tolerance band is a region $[\ell(t), u(t)]$ such that

$$P\bigl(X_{\text{new}}(t) \in [\ell(t), u(t)] \;\text{for all } t\bigr) \geq 1 - \alpha$$

for a new independent draw $X_{\text{new}}$ from the same process.
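The coverage statement above can be checked empirically on simulated curves. The following base R sketch is illustrative only: the helpers `simultaneous_coverage()` and `pointwise_coverage()` and the toy band are hypothetical names introduced here, not part of fdars.

```r
# Hypothetical helpers (not fdars functions) for checking a band empirically.

# Fraction of curves (rows) that stay inside [lower, upper] at every grid point.
simultaneous_coverage <- function(curves, lower, upper) {
  inside <- apply(curves, 1, function(x) all(x >= lower & x <= upper))
  mean(inside)
}

# Fraction of (curve, grid point) pairs inside the band: a weaker, pointwise notion.
pointwise_coverage <- function(curves, lower, upper) {
  covered <- sweep(curves, 2, lower, ">=") & sweep(curves, 2, upper, "<=")
  mean(covered)
}

set.seed(1)
toy_grid <- seq(0, 1, length.out = 50)
# Toy curves: sine wave + random vertical offset + measurement noise
toy_curves <- t(replicate(
  500,
  sin(2 * pi * toy_grid) + rnorm(1, 0, 0.3) + rnorm(50, 0, 0.1)
))
# An arbitrary band of constant half-width 0.6 around the true mean
toy_lower <- sin(2 * pi * toy_grid) - 0.6
toy_upper <- sin(2 * pi * toy_grid) + 0.6

cov_sim <- simultaneous_coverage(toy_curves, toy_lower, toy_upper)
cov_pw  <- pointwise_coverage(toy_curves, toy_lower, toy_upper)
c(simultaneous = cov_sim, pointwise = cov_pw)
```

The simultaneous fraction can never exceed the pointwise one, because a curve counts as covered only if it stays inside the band over the whole grid.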

FPCA Method (Rathnayake and Choudhary, 2016)

By the Karhunen-Loève expansion, each curve can be represented as

$$X_i(t) = \mu(t) + \sum_{k=1}^K \xi_{ik} \phi_k(t)$$

where $\phi_k$ are the eigenfunctions of $C$ and $\xi_{ik}$ are uncorrelated PC scores with $\text{Var}(\xi_{ik}) = \lambda_k$. The method proceeds:

  1. Estimate $\hat\mu$, $\hat\phi_k$, and scores $\hat\xi_{ik}$ from data
  2. Bootstrap: resample $\hat\xi^*_{ik}$ from the empirical score distribution and reconstruct $X^*_i(t) = \hat\mu(t) + \sum_{k=1}^K \hat\xi^*_{ik} \hat\phi_k(t)$
  3. Pointwise band: At each $t_j$, set $\ell(t_j)$ and $u(t_j)$ to the $\alpha/2$ and $1 - \alpha/2$ quantiles of the bootstrap distribution
  4. Simultaneous band: Find the smallest $c > 0$ such that a fraction $\geq 1 - \alpha$ of bootstrap curves lie entirely within $\hat\mu(t) \pm c \cdot \hat\sigma(t)$, where $\hat\sigma(t)$ is the pointwise bootstrap standard deviation

The simultaneous band is wider (controls family-wise coverage) while the pointwise band is narrower (controls marginal coverage at each $t$).
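The four steps above can be sketched in base R. This is a minimal illustration under stated assumptions, not the fdars implementation: `fpca_band_sketch()` is a hypothetical helper, the eigenfunctions are taken from an SVD of the centered data matrix without quadrature weights, each score column is resampled independently, and the residual beyond the first `K` components is ignored.

```r
# Hypothetical helper (not an fdars function): FPCA bootstrap band in base R.
fpca_band_sketch <- function(curves, K = 2, coverage = 0.95, nb = 500) {
  alpha <- 1 - coverage
  mu_hat <- colMeans(curves)
  centered <- sweep(curves, 2, mu_hat)

  # Step 1: eigenvectors from the SVD of the centered data; scores by projection
  phi_hat <- svd(centered)$v[, seq_len(K), drop = FALSE]  # T x K
  scores <- centered %*% phi_hat                           # n x K

  # Step 2: resample each score column independently (an assumption of this
  # sketch) and reconstruct bootstrap curves
  boot_scores <- matrix(0, nb, K)
  for (k in seq_len(K)) {
    boot_scores[, k] <- sample(scores[, k], nb, replace = TRUE)
  }
  boot_curves <- sweep(boot_scores %*% t(phi_hat), 2, mu_hat, "+")  # nb x T

  # Step 3: pointwise quantiles at each grid point
  pw_lower <- apply(boot_curves, 2, stats::quantile,
                    probs = alpha / 2, names = FALSE)
  pw_upper <- apply(boot_curves, 2, stats::quantile,
                    probs = 1 - alpha / 2, names = FALSE)

  # Step 4: smallest c such that at least `coverage` of the bootstrap curves
  # lie entirely inside mu_hat +/- c * sigma_hat (type = 1 is the inverse ECDF)
  sigma_hat <- apply(boot_curves, 2, stats::sd)
  max_dev <- apply(boot_curves, 1,
                   function(x) max(abs(x - mu_hat) / sigma_hat))
  c_hat <- stats::quantile(max_dev, probs = coverage, type = 1, names = FALSE)

  list(
    pointwise_lower = pw_lower, pointwise_upper = pw_upper,
    simultaneous_lower = mu_hat - c_hat * sigma_hat,
    simultaneous_upper = mu_hat + c_hat * sigma_hat,
    c_hat = c_hat,
    boot_coverage = mean(max_dev <= c_hat)
  )
}

set.seed(2)
toy_grid <- seq(0, 1, length.out = 50)
toy_curves <- t(replicate(
  60,
  sin(2 * pi * toy_grid) + rnorm(1, 0, 0.3) + rnorm(50, 0, 0.1)
))
fpca_toy <- fpca_band_sketch(toy_curves, K = 2, coverage = 0.95, nb = 500)
fpca_toy$c_hat
```

The sketch standardizes each bootstrap curve's deviation by $\hat\sigma(t)$ and takes a quantile of the per-curve maxima, so the simultaneous band is controlled by a single scaling factor.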

Conformal Method (Lei and Wasserman, 2014)

The conformal approach is distribution-free. Split the data into a training set of size $n_{\text{train}}$ and a calibration set of size $n_{\text{cal}}$:

  1. Compute $\hat\mu(t)$ from the training set
  2. For each calibration curve $X_j$, compute a non-conformity score:
    • Sup-norm: $R_j = \sup_t |X_j(t) - \hat\mu(t)|$
    • $L^2$: $R_j = \bigl(\int |X_j(t) - \hat\mu(t)|^2 \, dt\bigr)^{1/2}$
  3. Set $\hat{q}$ to the $\lceil (1 - \alpha)(n_{\text{cal}} + 1) \rceil / n_{\text{cal}}$ quantile of $\{R_1, \ldots, R_{n_{\text{cal}}}\}$
  4. The band is $\hat\mu(t) \pm \hat{q}$ (sup-norm) or $\hat\mu(t) \pm \hat{q} \cdot w(t)$ ($L^2$, with local weights)

The key guarantee is finite-sample validity: $P(R_{\text{new}} \leq \hat{q}) \geq 1 - \alpha$, with no distributional assumptions.
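The split-conformal steps above, for the sup-norm score, can be sketched in base R as follows. This is an illustration rather than the fdars code: `conformal_supnorm_band()` is a hypothetical helper, the 50/50 split is an assumption, and the quantile in step 3 is read as the $\lceil (1 - \alpha)(n_{\text{cal}} + 1) \rceil$-th smallest calibration score.

```r
# Hypothetical helper (not an fdars function): split-conformal sup-norm band.
conformal_supnorm_band <- function(curves, coverage = 0.95, train_frac = 0.5) {
  n_all <- nrow(curves)
  idx_train <- sample.int(n_all, floor(train_frac * n_all))
  train <- curves[idx_train, , drop = FALSE]
  calib <- curves[-idx_train, , drop = FALSE]
  n_cal <- nrow(calib)

  # Step 1: center estimated from the training half only
  mu_hat <- colMeans(train)

  # Step 2: sup-norm non-conformity score for each calibration curve
  scores <- apply(calib, 1, function(x) max(abs(x - mu_hat)))

  # Step 3: the ceiling((1 - alpha) * (n_cal + 1))-th smallest score;
  # if that rank exceeds n_cal the calibration set is too small and the
  # band is infinite
  k <- ceiling(coverage * (n_cal + 1))
  q_hat <- if (k > n_cal) Inf else sort(scores)[k]

  # Step 4: constant-width band around the training mean
  list(lower = mu_hat - q_hat, upper = mu_hat + q_hat, q_hat = q_hat)
}

set.seed(3)
toy_grid <- seq(0, 1, length.out = 50)
toy_curves <- t(replicate(
  60,
  sin(2 * pi * toy_grid) + rnorm(1, 0, 0.3) + rnorm(50, 0, 0.1)
))
conf_toy <- conformal_supnorm_band(toy_curves, coverage = 0.95)
conf_toy$q_hat
```

With 30 calibration curves and 95% coverage the required rank is $\lceil 0.95 \times 31 \rceil = 30$, so $\hat{q}$ is the largest calibration score. Smaller calibration sets cannot certify 95% coverage with a finite band.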

SCB Degras Method (Degras, 2011)

This constructs a simultaneous confidence band for the mean $\mu(t)$ rather than a tolerance band for individual curves. Let

$$S_n(t) = \frac{\sqrt{n}(\bar{X}_n(t) - \mu(t))}{\hat\sigma(t)}$$

be the standardized process. Under regularity conditions, $S_n$ converges to a Gaussian process $G$ with known covariance structure. The critical value $c_\alpha$ is obtained via a multiplier bootstrap:

  1. Generate $W_1^*, \ldots, W_n^* \stackrel{\text{iid}}{\sim} N(0, 1)$
  2. Compute $G^*(t) = \frac{1}{\sqrt{n} \hat\sigma(t)} \sum_{i=1}^n W_i^* (X_i(t) - \bar{X}_n(t))$
  3. Set $c_\alpha$ to the $(1-\alpha)$-quantile of $\sup_t |G^*(t)|$ across bootstrap replicates

The SCB is then $\bar{X}_n(t) \pm c_\alpha \hat\sigma(t) / \sqrt{n}$.
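The multiplier bootstrap above translates almost line by line into base R. The sketch below is illustrative and makes simplifying assumptions: `scb_multiplier_sketch()` is a hypothetical helper, the raw curves are used without any smoothing step, and $\hat\sigma(t)$ is the pointwise sample standard deviation.

```r
# Hypothetical helper (not an fdars function): multiplier-bootstrap SCB for the mean.
scb_multiplier_sketch <- function(curves, coverage = 0.95, nb = 500) {
  n_obs <- nrow(curves)
  x_bar <- colMeans(curves)
  sigma_hat <- apply(curves, 2, stats::sd)
  centered <- sweep(curves, 2, x_bar)

  # Steps 1-2: draw Gaussian multipliers and form sup_t |G*(t)| per replicate.
  # `w * centered` recycles w down the columns, so row i is scaled by w[i].
  sup_stats <- replicate(nb, {
    w <- rnorm(n_obs)
    g_star <- colSums(w * centered) / (sqrt(n_obs) * sigma_hat)
    max(abs(g_star))
  })

  # Step 3: critical value from the bootstrap distribution of the supremum
  c_alpha <- stats::quantile(sup_stats, probs = coverage, names = FALSE)

  half_width <- c_alpha * sigma_hat / sqrt(n_obs)
  list(lower = x_bar - half_width, upper = x_bar + half_width,
       c_alpha = c_alpha, half_width = half_width, sigma_hat = sigma_hat)
}

set.seed(4)
toy_grid <- seq(0, 1, length.out = 50)
toy_curves <- t(replicate(
  60,
  sin(2 * pi * toy_grid) + rnorm(1, 0, 0.3) + rnorm(50, 0, 0.1)
))
scb_toy <- scb_multiplier_sketch(toy_curves, coverage = 0.95, nb = 500)
c(c_alpha = scb_toy$c_alpha, mean_half_width = mean(scb_toy$half_width))
```

Because the half-width is $c_\alpha \hat\sigma(t) / \sqrt{n}$, it is a fraction of the pointwise standard deviation whenever $c_\alpha < \sqrt{n}$, which is why this band is much narrower than a band for individual curves.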

Exponential Family Method

For functional data from an exponential family with density

$$f(x | \theta) = h(x) \exp(\theta x - A(\theta))$$

the method applies the canonical link $g$ to transform data to the natural parameter scale, computes FPCA tolerance bands on the transformed data, and maps back through $g^{-1}$:

  1. Transform: $Y_i(t) = g(X_i(t))$
  2. Compute FPCA band $[\ell_Y(t), u_Y(t)]$ on $\{Y_i\}$
  3. Back-transform: $\ell(t) = g^{-1}(\ell_Y(t))$, $u(t) = g^{-1}(u_Y(t))$

For Gaussian data ($g = \text{identity}$), this reduces to the standard FPCA band. For Poisson data ($g = \log$), the band respects the non-negativity constraint.
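The transform, band, back-transform pattern can be illustrated for the Poisson case in base R. To keep the sketch short, a pointwise mean ± 2 sd band stands in for the FPCA band in step 2; `simple_band()` is a hypothetical helper, and the toy counts are chosen large enough that no zeros occur (zero counts would need separate handling, since $\log 0 = -\infty$).

```r
set.seed(5)
toy_grid <- seq(0, 1, length.out = 50)
n_toy <- 60

# Toy Poisson curves: log-rate = 5 + 0.5 * sine + one random offset per curve
toy_counts <- t(replicate(n_toy, {
  log_rate <- 5 + 0.5 * sin(2 * pi * toy_grid) + rnorm(1, 0, 0.6)
  rpois(50, lambda = exp(log_rate))
}))
# Guard: the log transform below is undefined for zero counts
stopifnot(all(toy_counts > 0))

# Stand-in for step 2 (NOT the FPCA band): pointwise mean +/- 2 sd
simple_band <- function(m) {
  center <- colMeans(m)
  spread <- apply(m, 2, stats::sd)
  list(lower = center - 2 * spread, upper = center + 2 * spread)
}

# Step 1: transform with the log link; step 2: band on the log scale;
# step 3: map both limits back with exp(), the inverse link
band_log  <- simple_band(log(toy_counts))
band_back <- lapply(band_log, exp)

# For comparison: the same stand-in band computed directly on the count scale
band_raw <- simple_band(toy_counts)

c(min_lower_back_transformed = min(band_back$lower),
  min_lower_count_scale = min(band_raw$lower))
```

Since `exp()` is strictly positive, the back-transformed lower limit cannot fall below zero, whereas a symmetric band built directly on the count scale has no such protection.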

Elastic Method

When curves exhibit phase variability (horizontal shifts), standard tolerance bands are inflated because they treat timing differences as amplitude variation. The elastic method removes this:

  1. Compute the Karcher mean $\hat\mu_K$ and warping functions $\hat\gamma_1, \ldots, \hat\gamma_n$ using the elastic (Fisher-Rao) framework
  2. Align: $\tilde{X}_i(t) = X_i(\hat\gamma_i(t))$
  3. Compute an FPCA tolerance band on the aligned data $\{\tilde{X}_i\}$

The resulting band is tighter because alignment concentrates variability into the amplitude component, reducing the effective variance at each grid point.
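The effect of alignment on pointwise variability can be illustrated with a much simpler stand-in than the elastic framework. The base R sketch below uses rigid shift registration ($\gamma_i(t) = t + s_i$, chosen by grid search against the cross-sectional mean), which is an assumption made for brevity. It is not Fisher-Rao alignment and computes no Karcher mean. `align_by_shift()` is a hypothetical helper.

```r
set.seed(6)
toy_grid <- seq(0, 1, length.out = 50)
toy_shifts <- runif(60, -0.05, 0.05)
# Toy curves that differ only by a horizontal shift plus small noise
toy_phase <- t(sapply(
  toy_shifts,
  function(s) sin(2 * pi * (toy_grid - s)) + rnorm(50, 0, 0.05)
))

# Hypothetical helper: align each curve to the cross-sectional mean by the
# rigid shift that minimizes the squared distance (one pass, grid search).
align_by_shift <- function(curves, grid, candidates = (-20:20) * 0.005) {
  template <- colMeans(curves)
  aligned <- curves
  for (i in seq_len(nrow(curves))) {
    sse <- sapply(candidates, function(s) {
      # Evaluate X_i(t + s); rule = 2 holds the end values outside the grid
      warped <- stats::approx(grid, curves[i, ], xout = grid + s, rule = 2)$y
      sum((warped - template)^2)
    })
    best <- candidates[which.min(sse)]
    aligned[i, ] <- stats::approx(grid, curves[i, ],
                                  xout = grid + best, rule = 2)$y
  }
  aligned
}

toy_aligned <- align_by_shift(toy_phase, toy_grid)

# Total pointwise variance before and after alignment
total_var <- function(m) sum(apply(m, 2, stats::var))
var_before <- total_var(toy_phase)
var_after  <- total_var(toy_aligned)
c(before = var_before, after = var_after)
```

Because a shift of zero is among the candidates, no curve moves further from the template, so the total pointwise variance cannot increase. A band built on the aligned curves is correspondingly no wider on average.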

Generate Sample Data

library(fdars)
#> 
#> Attaching package: 'fdars'
#> The following objects are masked from 'package:stats':
#> 
#>     cov, decompose, deriv, median, sd, var
#> The following object is masked from 'package:base':
#> 
#>     norm
set.seed(42)

argvals <- seq(0, 1, length.out = 50)
n <- 60
data <- matrix(0, n, 50)
for (i in 1:n) {
  data[i, ] <- sin(2 * pi * argvals) + rnorm(1, 0, 0.3) +
               rnorm(50, 0, 0.1)
}
fd <- fdata(data, argvals = argvals)
plot(fd)

FPCA Bootstrap Band

The FPCA method reconstructs curves from their principal component scores and uses bootstrap resampling to estimate quantiles. Two types are available:

  • Pointwise: Independent interval at each evaluation point (narrower)
  • Simultaneous: Single scaling factor across all points (wider, controls family-wise error)
band_pw <- tolerance.band(fd, method = "fpca", coverage = 0.95,
                          band.type = "pointwise", nb = 200, seed = 42)
print(band_pw)
#> Functional Tolerance Band
#>   Method: fpca 
#>   Coverage: 0.95 
#>   Grid points: 50 
#>   Mean half-width: 0.7171
plot(band_pw)

band_sim <- tolerance.band(fd, method = "fpca", coverage = 0.95,
                           band.type = "simultaneous", nb = 200, seed = 42)
plot(band_sim)

Conformal Prediction Band

The conformal method is distribution-free: it makes no parametric assumptions about the data-generating process. It splits data into a training set and calibration set, computing non-conformity scores on the calibration set to determine band width.

Two score types are available:

  • supnorm: Maximum deviation (constant band width)
  • l2: Integrated squared deviation (variable band width)

band_conf <- tolerance.band(fd, method = "conformal", coverage = 0.95,
                            score.type = "supnorm", seed = 42)
plot(band_conf)
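To make the two score types concrete, the base R sketch below computes both for a single curve against a reference mean. The toy curve, the variable names, and the trapezoid rule for the integral are assumptions of this illustration, not a description of how fdars evaluates the scores.

```r
toy_grid <- seq(0, 1, length.out = 50)
toy_mu <- sin(2 * pi * toy_grid)
# A toy curve whose deviation from the mean grows linearly from 0 to 0.2
toy_new <- toy_mu + 0.2 * toy_grid

toy_resid <- toy_new - toy_mu

# Sup-norm score: the largest absolute deviation over the grid
score_sup <- max(abs(toy_resid))

# L2 score: square root of the integrated squared deviation (trapezoid rule)
sq <- toy_resid^2
score_l2 <- sqrt(sum(diff(toy_grid) * (head(sq, -1) + tail(sq, -1)) / 2))

c(supnorm = score_sup, l2 = score_l2)
```

On the unit interval the $L^2$ score can never exceed the sup-norm score. The sup-norm score reacts to the single worst grid point, while the $L^2$ score averages the deviation over the domain.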

Mean Confidence Band (SCB Degras)

The Degras (2011) method is fundamentally different from the tolerance band methods above. It constructs a simultaneous confidence band for the mean function $\mu(t) = E[X(t)]$, not a tolerance band for individual curves.

What it tells you: “The true population mean lies within this band with 95% confidence.” The band shrinks as $n \to \infty$ because we estimate the mean more precisely. In contrast, tolerance bands do not shrink with $n$ – they target the spread of individual curves, which is a fixed population property.

How it works: The method standardizes the empirical process $\sqrt{n}(\bar{X}_n(t) - \mu(t)) / \hat\sigma(t)$ and uses a multiplier bootstrap to estimate the distribution of its supremum. The critical value $c_\alpha$ controls the family-wise coverage across all $t$ simultaneously – the band is valid uniformly over the entire domain, not just pointwise.

When to use it:

  • Testing whether a hypothesized mean function falls within the band
  • Comparing mean functions across groups (non-overlapping SCBs indicate significant differences)
  • Assessing sample size adequacy (very wide SCBs suggest more data is needed)
band_scb <- tolerance.band(fd, method = "scb", coverage = 0.95,
                           nb = 200, seed = 42)
plot(band_scb)

Note how much narrower the SCB is compared to the tolerance bands above – it targets the mean, not individual curves. The width scales as $O(1/\sqrt{n})$.
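The $1/\sqrt{n}$ scaling can be seen in a small base R simulation. This is a rough illustration using pointwise normal-approximation half-widths (multiplier 1.96), not the simultaneous critical value of the SCB. The names `toy_sizes` and `toy_widths` are introduced here for the example.

```r
set.seed(7)
toy_grid <- seq(0, 1, length.out = 50)
toy_sizes <- c(50, 200, 800)

toy_widths <- t(sapply(toy_sizes, function(n_toy) {
  toy <- t(replicate(
    n_toy,
    sin(2 * pi * toy_grid) + rnorm(1, 0, 0.3) + rnorm(50, 0, 0.1)
  ))
  # Average pointwise standard deviation of the curves
  s <- mean(apply(toy, 2, stats::sd))
  c(n = n_toy,
    curve_spread = 1.96 * s,                     # spread of individual curves
    mean_uncertainty = 1.96 * s / sqrt(n_toy))   # uncertainty of the mean
}))
toy_widths
```

Each fourfold increase in sample size roughly halves the uncertainty of the mean, while the spread of individual curves stays at about the same level.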

Exponential Family Band

For data from exponential family distributions (e.g., count data), the exponential family method applies the appropriate link function transformation before computing FPCA bands:

# Gaussian data (identity link)
band_exp <- tolerance.band(fd, method = "exponential", family = "gaussian",
                           nb = 200, seed = 42)
plot(band_exp)

Elastic (Alignment-Based) Band

The elastic method first aligns curves using the Karcher mean in the elastic metric, removing phase variability. It then computes an FPCA tolerance band on the aligned data. This produces tighter bands when curves exhibit timing differences:

# Generate data with phase variability
set.seed(42)
data_phase <- matrix(0, n, 50)
for (i in 1:n) {
  shift <- runif(1, -0.05, 0.05)
  data_phase[i, ] <- sin(2 * pi * (argvals - shift)) + rnorm(1, 0, 0.2) +
                     rnorm(50, 0, 0.05)
}
fd_phase <- fdata(data_phase, argvals = argvals)

band_elastic <- tolerance.band(fd_phase, method = "elastic", coverage = 0.95,
                               nb = 200, max.iter = 10, seed = 42)
print(band_elastic)
#> Functional Tolerance Band
#>   Method: elastic 
#>   Coverage: 0.95 
#>   Grid points: 50 
#>   Mean half-width: 0.5351
plot(band_elastic)

Compare the elastic band to FPCA on the same phase-shifted data:

band_fpca_phase <- tolerance.band(fd_phase, method = "fpca", coverage = 0.95,
                                  nb = 200, seed = 42)
cat("FPCA mean half-width:   ", round(mean(band_fpca_phase$half_width), 4), "\n")
#> FPCA mean half-width:    0.5563
cat("Elastic mean half-width:", round(mean(band_elastic$half_width), 4), "\n")
#> Elastic mean half-width: 0.5351

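The elastic band is slightly narrower here (mean half-width 0.5351 versus 0.5563) because alignment concentrates variance into amplitude rather than spreading it across both amplitude and phase. The gain is modest in this example because the simulated shifts are small (at most 0.05) relative to the amplitude variation.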

Choosing a Method

| Your goal | Recommended method |
|---|---|
| General-purpose tolerance band | FPCA (method = "fpca") |
| No distributional assumptions | Conformal (method = "conformal") |
| Data with timing differences | Elastic (method = "elastic") |
| Count or binary functional data | Exponential (method = "exponential") |
| Confidence band for the mean | SCB Degras (method = "scb") |

References

  • Rathnayake, L.N. and Choudhary, P.K. (2016). Tolerance bands for functional data. Biometrics, 72(2):503–512.
  • Lei, J. and Wasserman, L. (2014). Distribution-free prediction bands for non-parametric regression. Journal of the Royal Statistical Society: Series B, 76(1):71–96.
  • Degras, D. (2011). Simultaneous confidence bands for nonparametric regression with functional data. Statistica Sinica, 21(4):1735–1765.