Conformal Prediction

Conformal prediction provides distribution-free, finite-sample prediction intervals (for regression) and prediction sets (for classification). Unlike asymptotic confidence intervals, conformal guarantees hold for any sample size and any data distribution, provided only that the data are exchangeable (e.g., i.i.d.):

\[ P\bigl(Y_{\text{new}} \in \hat{C}(X_{\text{new}})\bigr) \geq 1 - \alpha \]

fdars implements split conformal methods for functional regression and classification.


How split conformal works

  1. Split the training data into a proper training set and a calibration set.
  2. Fit the model on the proper training set.
  3. Compute residuals (nonconformity scores) on the calibration set.
  4. Construct prediction intervals/sets for new observations using the calibration quantile.
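The four steps above can be sketched end-to-end with plain NumPy. This is a standalone toy on 1-D scalar data, not fdars's implementation, but the calibration logic is the same:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D regression data
n = 200
x = rng.uniform(0, 1, n)
y = 2.0 * x + 0.3 * rng.standard_normal(n)

# 1. Split into a proper training set and a calibration set
n_cal = n // 4
x_tr, y_tr = x[:-n_cal], y[:-n_cal]
x_cal, y_cal = x[-n_cal:], y[-n_cal:]

# 2. Fit a simple model on the proper training set only
slope, intercept = np.polyfit(x_tr, y_tr, 1)

def predict(x_new):
    return slope * x_new + intercept

# 3. Nonconformity scores: absolute residuals on the calibration set
scores = np.abs(y_cal - predict(x_cal))

# 4. Conformal quantile: the ceil((n_cal + 1)(1 - alpha))-th smallest score
alpha = 0.1
k = int(np.ceil((n_cal + 1) * (1 - alpha)))
q = np.sort(scores)[k - 1]

# Prediction interval for a new observation
x_new = 0.5
lower, upper = predict(x_new) - q, predict(x_new) + q
print(f"90% interval at x=0.5: [{lower:.3f}, {upper:.3f}]")
```

The only model-specific part is step 2; swapping in an FPC regression or a kernel smoother leaves steps 1, 3, and 4 unchanged, which is why the same calibration wrapper works for all the base models below.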

Coverage guarantee

For a calibration set of size \(n_{\text{cal}}\) and miscoverage level \(\alpha\), the coverage guarantee is:

\[ P\bigl(Y_{\text{new}} \in \hat{C}(X_{\text{new}})\bigr) \geq 1 - \alpha \]

This guarantee is marginal (averaged over both the calibration set and the new observation) and requires only exchangeability of the data; no other distributional assumptions are needed.
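Concretely, in the standard split conformal construction with absolute residuals as scores, the interval is built from the order statistics of the calibration scores:

\[ \hat{q} = s_{(\lceil (n_{\text{cal}} + 1)(1 - \alpha) \rceil)}, \qquad \hat{C}(X_{\text{new}}) = \bigl[\hat{\mu}(X_{\text{new}}) - \hat{q},\; \hat{\mu}(X_{\text{new}}) + \hat{q}\bigr] \]

where \(s_{(1)} \le \dots \le s_{(n_{\text{cal}})}\) are the sorted calibration scores and \(\hat{\mu}\) is the model fitted on the proper training set. The "+1" in the quantile index is what makes the guarantee exact in finite samples. When the scores are continuous (no ties), coverage is also upper-bounded by \(1 - \alpha + 1/(n_{\text{cal}} + 1)\), so the intervals are not overly conservative.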


Conformal FPC regression

Wraps fregre_lm with split conformal calibration to produce prediction intervals.

import numpy as np
from fdars import Fdata
from fdars.conformal import conformal_fregre_lm

# --- Simulate data ---
np.random.seed(42)
n_train, n_test, m = 200, 50, 81
t = np.linspace(0, 1, m)
beta_true = np.sin(4 * np.pi * t)

def make_data(n):
    raw = np.zeros((n, m))
    for i in range(n):
        raw[i] = (
            np.random.randn() * np.sin(2 * np.pi * t)
            + np.random.randn() * np.cos(2 * np.pi * t)
            + 0.3 * np.random.randn(m)
        )
    fd = Fdata(raw, argvals=t)
    # np.trapezoid replaces np.trapz (removed in NumPy 2.0)
    response = np.trapezoid(fd.data * beta_true, fd.argvals, axis=1) + 0.5 * np.random.randn(n)
    return fd, response

fd_train, train_response = make_data(n_train)
fd_test, test_response = make_data(n_test)

# --- Conformal prediction ---
result = conformal_fregre_lm(
    fd_train.data, train_response, fd_test.data,
    ncomp=3,
    cal_fraction=0.25,   # 25% of training data for calibration
    alpha=0.1,           # 90% prediction intervals
    seed=42,
)

lower       = result["lower"]        # (n_test,)
upper       = result["upper"]        # (n_test,)
predictions = result["predictions"]  # (n_test,)
coverage    = result["coverage"]     # empirical coverage (if test labels provided)

# Check coverage on test set
actual_coverage = np.mean((test_response >= lower) & (test_response <= upper))
print(f"Target coverage:  {1 - 0.1:.0%}")
print(f"Empirical coverage: {actual_coverage:.0%}")
print(f"Mean interval width: {np.mean(upper - lower):.4f}")

Key          Type               Description
lower        ndarray (n_test,)  Lower bounds of prediction intervals
upper        ndarray (n_test,)  Upper bounds of prediction intervals
predictions  ndarray (n_test,)  Point predictions
coverage     float              Empirical coverage (if test labels are provided)

Parameter     Default  Description
ncomp         3        Number of FPC components
cal_fraction  0.25     Fraction of training data reserved for calibration
alpha         0.1      Miscoverage level (\(1 - \alpha\) = coverage target)
seed          42       Random seed for the train/calibration split

Conformal nonparametric regression

Uses kernel regression (fregre_np) as the base model, with conformal calibration on top.

from fdars.conformal import conformal_fregre_np

result = conformal_fregre_np(
    fd_train.data, train_response, fd_test.data, fd_train.argvals,
    cal_fraction=0.25,
    alpha=0.1,
    h_func=1.0,
    h_scalar=1.0,
    seed=42,
)

actual_coverage = np.mean((test_response >= result["lower"]) &
                          (test_response <= result["upper"]))
print(f"NP conformal coverage: {actual_coverage:.0%}")
print(f"Mean interval width:   {np.mean(result['upper'] - result['lower']):.4f}")

Parameter     Default  Description
h_func        1.0      Functional bandwidth
h_scalar      1.0      Scalar bandwidth
cal_fraction  0.25     Calibration fraction
alpha         0.1      Miscoverage level

Conformal classification

Produces prediction sets for classification: a set of possible labels for each test observation, with guaranteed marginal coverage.

import numpy as np
from fdars import Fdata
from fdars.conformal import conformal_classif

# --- Simulate three-class data ---
np.random.seed(7)
n_train, n_test = 150, 30
m = 101
t = np.linspace(0, 1, m)

templates = [
    np.sin(2 * np.pi * t),
    np.cos(2 * np.pi * t),
    np.sin(4 * np.pi * t),
]

def make_classif_data(n):
    raw = np.zeros((n, m))
    labels = np.zeros(n, dtype=np.int64)
    for i in range(n):
        k = i % 3
        raw[i] = templates[k] + 0.4 * np.random.randn(m)
        labels[i] = k
    fd = Fdata(raw, argvals=t)
    return fd, labels

fd_train, train_labels = make_classif_data(n_train)
fd_test, test_labels = make_classif_data(n_test)

result = conformal_classif(
    fd_train.data, train_labels, fd_test.data,
    ncomp=3,
    classifier="lda",
    cal_fraction=0.25,
    alpha=0.1,
    seed=42,
)

pred_sets = result["prediction_sets"]  # list of lists
coverage  = result["coverage"]

# Inspect prediction sets
for i in range(min(5, n_test)):
    correct = test_labels[i] in pred_sets[i]
    print(f"  Test {i}: set={pred_sets[i]}, true={test_labels[i]}, "
          f"covered={'yes' if correct else 'NO'}")

actual_coverage = np.mean([test_labels[i] in pred_sets[i] for i in range(n_test)])
print(f"\nTarget coverage:   {1 - 0.1:.0%}")
print(f"Empirical coverage: {actual_coverage:.0%}")
print(f"Mean set size:      {np.mean([len(s) for s in pred_sets]):.2f}")

Key              Type             Description
prediction_sets  list[list[int]]  Prediction set for each test observation
coverage         float            Reported coverage

Parameter     Default  Description
classifier    "lda"    Base classifier: "lda", "qda", or "knn"
ncomp         3        Number of FPC components
cal_fraction  0.25     Calibration fraction
alpha         0.1      Miscoverage level

Interpreting prediction set sizes

  • Set size = 1: the model is confident about a single class.
  • Set size > 1: ambiguity; multiple classes are plausible at the specified confidence level.
  • Empty set: can occur in rare edge cases when no class clears the conformal threshold, often a sign that the calibration set was too small.
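For intuition, here is how a score-based split conformal classifier typically forms these sets. This is a standalone NumPy sketch using 1 minus the true-class probability as the nonconformity score; the exact score used by conformal_classif may differ:

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in calibration outputs: class probabilities and true labels, 3 classes
n_cal, n_classes = 100, 3
probs_cal = rng.dirichlet(np.ones(n_classes) * 2.0, size=n_cal)
labels_cal = rng.integers(0, n_classes, size=n_cal)

# Nonconformity score: 1 - probability assigned to the true class
scores = 1.0 - probs_cal[np.arange(n_cal), labels_cal]

# Conformal threshold from the calibration quantile
alpha = 0.1
k = int(np.ceil((n_cal + 1) * (1 - alpha)))
q = np.sort(scores)[k - 1]

# Prediction set for a new observation: every class whose score clears q
probs_new = np.array([0.7, 0.25, 0.05])
pred_set = [c for c in range(n_classes) if 1.0 - probs_new[c] <= q]
print(pred_set)
```

A confident calibration set pushes q down, shrinking the sets; a poorly fit base model pushes q up, and the sets grow to maintain coverage.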

Practical considerations

Choosing cal_fraction

The calibration fraction controls a data-allocation trade-off:

  • Larger calibration set (e.g., 0.3-0.5): more stable empirical coverage, but the model is fit on less data, which can widen the intervals.
  • Smaller calibration set (e.g., 0.1-0.2): more training data and potentially sharper predictions, but noisier coverage from the smaller calibration sample.

A common default is cal_fraction=0.25.
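The coverage noise from a finite calibration set is easy to simulate: realized coverage fluctuates around the target, and the fluctuations shrink as the calibration size grows. A standalone NumPy sketch, independent of fdars:

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 0.1

for n_cal in [20, 50, 200]:
    coverages = []
    for _ in range(2000):
        # Calibration and test scores drawn from the same distribution
        cal = rng.standard_normal(n_cal) ** 2
        test = rng.standard_normal(1000) ** 2
        k = int(np.ceil((n_cal + 1) * (1 - alpha)))
        q = np.sort(cal)[k - 1]
        coverages.append(np.mean(test <= q))
    print(f"n_cal={n_cal:4d}: mean coverage={np.mean(coverages):.3f}, "
          f"sd={np.std(coverages):.3f}")
```

The mean stays at or above the 90% target for every calibration size (the marginal guarantee), but the run-to-run spread is several times larger at n_cal=20 than at n_cal=200.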

Choosing alpha

alpha  Coverage target  Typical use case
0.01   99%              Safety-critical applications
0.05   95%              Standard scientific inference
0.10   90%              Exploratory analysis
0.20   80%              Screening / ranking

Full example: comparing conformal methods

import numpy as np
from fdars import Fdata
from fdars.conformal import conformal_fregre_lm, conformal_fregre_np

np.random.seed(123)
n_train, n_test, m = 300, 100, 81
t = np.linspace(0, 1, m)
beta_true = np.exp(-((t - 0.5)**2) / 0.02)

def make_data(n):
    raw = np.zeros((n, m))
    for i in range(n):
        raw[i] = sum(
            np.random.randn() * np.sin((2*k+1) * np.pi * t)
            for k in range(4)
        ) + 0.2 * np.random.randn(m)
    fd = Fdata(raw, argvals=t)
    # np.trapezoid replaces np.trapz (removed in NumPy 2.0)
    resp = np.trapezoid(fd.data * beta_true, fd.argvals, axis=1) + 0.4 * np.random.randn(n)
    return fd, resp

fd_train, train_resp = make_data(n_train)
fd_test, test_resp = make_data(n_test)

for alpha in [0.05, 0.10, 0.20]:
    # Linear conformal
    lm = conformal_fregre_lm(
        fd_train.data, train_resp, fd_test.data,
        ncomp=4, cal_fraction=0.25, alpha=alpha,
    )
    cov_lm = np.mean((test_resp >= lm["lower"]) & (test_resp <= lm["upper"]))
    width_lm = np.mean(lm["upper"] - lm["lower"])

    # Nonparametric conformal
    np_r = conformal_fregre_np(
        fd_train.data, train_resp, fd_test.data, fd_train.argvals,
        cal_fraction=0.25, alpha=alpha,
    )
    cov_np = np.mean((test_resp >= np_r["lower"]) & (test_resp <= np_r["upper"]))
    width_np = np.mean(np_r["upper"] - np_r["lower"])

    print(f"alpha={alpha:.2f} | LM: cov={cov_lm:.0%} width={width_lm:.3f} | "
          f"NP: cov={cov_np:.0%} width={width_np:.3f}")