# Statistical Process Monitoring
Monitor streams of functional data for out-of-control behavior using FPCA-based control charts. The workflow has two phases:
- Phase I -- Estimate the in-control distribution from historical data and compute control limits.
- Phase II -- Project each incoming observation onto the learned subspace and check whether its \(T^2\) or SPE statistic exceeds the limits.
## Concepts

### FPCA-based monitoring
Each curve \(x_i(t)\) is centered by subtracting the mean \(\hat\mu(t)\) and projected onto the first \(K\) functional principal components, yielding a score vector \(\boldsymbol\xi_i \in \mathbb{R}^K\). Two complementary statistics capture different kinds of departure:
| Statistic | What it measures | Formula |
|---|---|---|
| Hotelling \(T^2\) | Systematic shift in the FPC subspace | \(T^2 = \sum_{k=1}^{K} \xi_k^2 / \lambda_k\) |
| SPE (Q) | Residual variation outside the subspace | \(\mathrm{SPE} = \int [\tilde x(t)]^2 \, dt\) where \(\tilde x\) is the reconstruction residual |
Control limits for both are estimated from the Phase I data so that the in-control false-alarm rate is approximately \(\alpha\).
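The two statistics can be sketched in plain NumPy. The example below is illustrative only — it builds its own toy eigenfunctions, eigenvalues, and quadrature weights rather than calling the library: a centered curve's score on each eigenfunction is a weighted inner product, \(T^2\) weights the squared scores by the eigenvalues, and SPE integrates the squared reconstruction residual.

```python
import numpy as np

# Toy setup: 100-point grid, 2 "eigenfunctions", made-up eigenvalues
m = 100
t = np.linspace(0, 1, m)
w = np.full(m, t[1] - t[0])                     # simple quadrature weights

phi = np.stack([np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)], axis=1)
phi /= np.sqrt((w[:, None] * phi**2).sum(axis=0))  # normalize in the weighted inner product
lam = np.array([2.0, 0.5])                      # eigenvalues lambda_1, lambda_2

# A centered curve: mostly inside the subspace, plus a small residual component
x = 1.5 * phi[:, 0] - 0.3 * phi[:, 1] + 0.05 * np.sin(9 * np.pi * t)

xi = (w * x) @ phi                              # FPC scores xi_k = <x, phi_k>
t2 = float(np.sum(xi**2 / lam))                 # Hotelling T^2
resid = x - phi @ xi                            # part of x outside the subspace
spe = float(np.sum(w * resid**2))               # SPE (Q)
print(t2, spe)
```

Here \(T^2\) is driven by the score magnitudes relative to their eigenvalues, while SPE picks up only the high-frequency component that the two eigenfunctions cannot represent.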
## Phase I -- estimating the baseline
```python
import numpy as np
from fdars import Fdata
from fdars.simulation import simulate
from fdars.spm import spm_phase1

# Generate 80 in-control curves on a 100-point grid
argvals = np.linspace(0, 1, 100)
fd_ic = Fdata(simulate(80, argvals, n_basis=5, seed=1), argvals=argvals)

# Phase I estimation (3 components, alpha = 0.05)
p1 = spm_phase1(fd_ic.data, fd_ic.argvals, ncomp=3, alpha=0.05)
```
`spm_phase1` returns a dictionary with the following keys:
| Key | Shape | Description |
|---|---|---|
| `t2` | `(n,)` | \(T^2\) statistic for every Phase I observation |
| `spe` | `(n,)` | SPE statistic for every Phase I observation |
| `t2_limit` | scalar | Upper control limit for \(T^2\) |
| `spe_limit` | scalar | Upper control limit for SPE |
| `mean` | `(m,)` | Estimated mean function \(\hat\mu(t)\) |
| `loadings` | `(m, ncomp)` | FPCA rotation matrix (eigenfunctions) |
| `weights` | `(m,)` | Integration weights for the inner product |
| `eigenvalues` | `(ncomp,)` | Eigenvalues \(\lambda_1, \dots, \lambda_K\) |
> **Choosing `ncomp`:** A common rule of thumb is to retain enough components to explain 90--95% of the total variance. Too few components make the SPE chart over-sensitive; too many inflate the \(T^2\) chart.
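Applying that rule of thumb amounts to taking the smallest \(K\) whose cumulative explained variance crosses the target. A minimal NumPy sketch — the `eigenvalues` array below is a made-up spectrum, not library output; in practice you would inspect the eigenvalues from a Phase I fit with a generous `ncomp`:

```python
import numpy as np

# Hypothetical eigenvalue spectrum, sorted in decreasing order
eigenvalues = np.array([4.1, 1.7, 0.6, 0.2, 0.08, 0.03])

# Cumulative fraction of total variance explained by the first k components
cum = np.cumsum(eigenvalues) / eigenvalues.sum()

# Smallest K reaching 90% explained variance
ncomp = int(np.searchsorted(cum, 0.90) + 1)
print(ncomp, cum[ncomp - 1])
```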
## Phase II -- monitoring new observations
```python
from fdars.spm import spm_monitor

# Simulate 20 new in-control observations + 10 faulty ones
data_new_ic = simulate(20, argvals, n_basis=5, seed=2)

# Inject a mean shift into the last 10 curves
data_fault = simulate(10, argvals, n_basis=5, seed=3) + 2.0
fd_new = Fdata(np.vstack([data_new_ic, data_fault]), argvals=argvals)

# Monitor
p2 = spm_monitor(
    mean=p1["mean"],
    loadings=p1["loadings"],
    weights=p1["weights"],
    eigenvalues=p1["eigenvalues"],
    t2_limit=p1["t2_limit"],
    spe_limit=p1["spe_limit"],
    new_data=fd_new.data,
    argvals=fd_new.argvals,
)
```
The returned dictionary contains:
| Key | Shape | Description |
|---|---|---|
| `t2` | `(n_new,)` | \(T^2\) for each new observation |
| `spe` | `(n_new,)` | SPE for each new observation |
| `t2_alarm` | `(n_new,)` bool | `True` where \(T^2\) exceeds the limit |
| `spe_alarm` | `(n_new,)` bool | `True` where SPE exceeds the limit |
```python
# How many faults were caught?
n_t2_alarms = int(p2["t2_alarm"].sum())
n_spe_alarms = int(p2["spe_alarm"].sum())
print(f"T2 alarms: {n_t2_alarms}, SPE alarms: {n_spe_alarms}")
```
## Hotelling \(T^2\) from scores
If you already have FPC scores (e.g., from your own FPCA), you can compute the \(T^2\) statistic directly:
```python
from fdars.spm import hotelling_t2

# scores: (n, p) array of FPC scores
# eigenvalues: (p,) array
t2_values = hotelling_t2(scores, eigenvalues)
```
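The statistic itself is just the formula from the table above, \(T^2_i = \sum_{k} \xi_{ik}^2 / \lambda_k\) applied row by row. A NumPy equivalent — an illustrative stand-in with made-up numbers, not the library routine:

```python
import numpy as np

def hotelling_t2_np(scores, eigenvalues):
    """Row-wise T^2: sum of squared scores, each divided by its eigenvalue."""
    return np.sum(scores**2 / eigenvalues, axis=1)

scores = np.array([[1.0, 0.5],
                   [0.0, 2.0]])      # two observations, two components
eigenvalues = np.array([2.0, 0.5])

print(hotelling_t2_np(scores, eigenvalues))  # → [1. 8.]
```

Note that a large score on a low-variance component (the second row) inflates \(T^2\) far more than the same score on a high-variance one.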
## Full worked example
The script below ties everything together: simulate in-control data, run Phase I, introduce a fault, monitor in Phase II, and visualize the control chart.
```python
import numpy as np
from fdars import Fdata
from fdars.simulation import simulate
from fdars.spm import spm_phase1, spm_monitor

# ── 1. Simulate in-control data ──────────────────────────────
argvals = np.linspace(0, 1, 100)
fd_ic = Fdata(simulate(100, argvals, n_basis=5, seed=10), argvals=argvals)

# ── 2. Phase I ───────────────────────────────────────────────
p1 = spm_phase1(fd_ic.data, fd_ic.argvals, ncomp=3, alpha=0.05)
print(f"T2 limit : {p1['t2_limit']:.3f}")
print(f"SPE limit: {p1['spe_limit']:.3f}")

# ── 3. Simulate Phase II data (in-control + fault) ──────────
data_ok = simulate(30, argvals, n_basis=5, seed=20)
data_bad = simulate(20, argvals, n_basis=5, seed=30) + 3.0  # mean shift
fd_new = Fdata(np.vstack([data_ok, data_bad]), argvals=argvals)

# ── 4. Phase II monitoring ───────────────────────────────────
p2 = spm_monitor(
    mean=p1["mean"],
    loadings=p1["loadings"],
    weights=p1["weights"],
    eigenvalues=p1["eigenvalues"],
    t2_limit=p1["t2_limit"],
    spe_limit=p1["spe_limit"],
    new_data=fd_new.data,
    argvals=fd_new.argvals,
)

# ── 5. Report ────────────────────────────────────────────────
obs_ids = np.arange(1, len(fd_new) + 1)
alarm_idx = obs_ids[p2["t2_alarm"] | p2["spe_alarm"]]
print(f"Alarm observations: {alarm_idx}")

# ── 6. Visualize (optional, requires matplotlib) ─────────────
try:
    import matplotlib.pyplot as plt

    fig, axes = plt.subplots(2, 1, figsize=(10, 6), sharex=True)
    axes[0].plot(obs_ids, p2["t2"], "o-", markersize=3)
    axes[0].axhline(p1["t2_limit"], color="red", linestyle="--", label="UCL")
    axes[0].set_ylabel("Hotelling T²")
    axes[0].legend()
    axes[1].plot(obs_ids, p2["spe"], "o-", markersize=3)
    axes[1].axhline(p1["spe_limit"], color="red", linestyle="--", label="UCL")
    axes[1].set_ylabel("SPE (Q)")
    axes[1].set_xlabel("Observation index")
    axes[1].legend()
    fig.suptitle("FPCA-based Control Charts")
    plt.tight_layout()
    plt.savefig("spm_control_chart.png", dpi=150)
    plt.show()
except ImportError:
    pass
```
> **Performance note:** Both `spm_phase1` and `spm_monitor` delegate all linear algebra to Rust. Phase I on 500 curves of length 200 typically completes in under 10 ms.