fdars

High-performance Functional Data Analysis for Python, powered by Rust

fdars is a Python toolkit for functional data analysis (FDA) with a Rust backend. Treat entire curves, spectra, and trajectories as single observations -- then smooth, align, decompose, and analyze them.

Built on fdars-core, the same engine that drives the fdars R package, fdars gives you native-speed computation with a familiar NumPy interface.

The `Fdata` Class

The central object in fdars is Fdata -- a functional data container that bundles observation data, evaluation grid, identifiers, and per-observation metadata into a single object (mirroring the R package's fdata class).

import numpy as np
import pandas as pd
from fdars import Fdata

# Create functional data from a (n_obs, n_points) array + grid
t = np.linspace(0, 1, 100)
X = np.random.randn(30, 100)

# Attach metadata as a pandas DataFrame
meta = pd.DataFrame({
    "group": ["control"] * 15 + ["treatment"] * 15,
    "age": np.random.randint(20, 60, 30),
})
fd = Fdata(X, argvals=t, metadata=meta)
fd
# Fdata (1D)  –  30 obs × 100 points  –  range [0.0, 1.0]  –  metadata: group, age

# Methods delegate to the Rust backend
mu = fd.mean()                     # pointwise mean
fd_c = fd.center()                 # centered Fdata
d1 = fd.deriv(nderiv=1)            # first derivative (returns Fdata)
norms = fd.norm(p=2.0)             # L2 norms per curve
depths = fd.depth("fraiman_muniz") # depth values
D = fd.distance(method="lp")       # self-distance matrix

# Subset -- metadata DataFrame and IDs are preserved
fd_sub = fd[0:10]
fd_sub.metadata  # DataFrame with 10 rows

See the Fdata reference and Introduction for a full walkthrough.
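
To make the methods above concrete, here is a plain-NumPy sketch of the quantities they correspond to: the pointwise mean, centering, a finite-difference first derivative, and the L2 norm by the trapezoidal rule. It is an illustration only and does not call fdars. The helper `trapezoid` is a hypothetical name, and the Rust backend may use different numerical schemes (for derivatives or quadrature weights, say), so values need not agree to the last digit.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 100)        # evaluation grid
X = rng.standard_normal((30, 100))    # 30 curves, one per row


def trapezoid(y, x):
    """Trapezoidal rule along the last axis (hypothetical helper)."""
    dx = np.diff(x)
    return np.sum(0.5 * (y[..., 1:] + y[..., :-1]) * dx, axis=-1)


mu = X.mean(axis=0)                   # pointwise mean curve, shape (100,)
X_c = X - mu                          # centered curves, shape (30, 100)
dX = np.gradient(X, t, axis=1)        # finite-difference first derivative
l2 = np.sqrt(trapezoid(X**2, t))      # L2 norm of each curve, shape (30,)

print(mu.shape, X_c.shape, dX.shape, l2.shape)
```

The point of the sketch is the data layout: every operation acts on the whole `(n_obs, n_points)` array at once, with the grid `t` supplying the spacing for derivatives and integrals.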

Learn

Tutorials and guides to get started with functional data analysis in Python.

Introduction to fdars

What is FDA? Core concepts, data layout, and your first analysis with fdars.

Simulation Toolbox

Generate synthetic curves with Karhunen-Loève expansions and Gaussian processes.

Smoothing

Nadaraya-Watson, local polynomial, k-NN, and basis smoothing with automatic bandwidth selection.

Working with Derivatives

Compute first, second, and higher-order derivatives of functional data.

Represent

Basis expansions, dimensionality reduction, depth, and distances for functional data.

Functional PCA

Extract dominant modes of variation with weighted FPCA.

Basis Representation

B-spline, Fourier, and P-spline basis expansions with automatic selection.

Depth Functions

Fraiman-Muniz, band, modal, random projection, Tukey, and spatial depth.

Distance Metrics

Lp, Hausdorff, DTW, Soft-DTW, Fourier, and horizontal-shift distances.

Align

Curve registration and elastic alignment methods.

Elastic Alignment

SRSF-based alignment, Karcher mean, and elastic FPCA.

Shape Analysis

Shape-preserving registration and geodesic computations.

Regression

Functional regression, classification, and prediction.

Scalar-on-Function

FPC linear, PLS, and nonparametric regression with a scalar response.

Function-on-Scalar

FOSR and FANOVA for predicting functional responses.

Classification

LDA, QDA, k-NN, and kernel classifiers with cross-validation.

Elastic Regression

Regression models in the SRSF space for phase-invariant prediction.

Explainability

SHAP, PDP, permutation importance, and significant region detection.

Conformal Prediction

Distribution-free prediction intervals with split conformal and jackknife+.

Robust Regression

Depth-weighted and trimmed regression resistant to outliers.

Monitoring

Statistical process monitoring for functional profiles.

Process Monitoring

Phase I/II control charts, EWMA, CUSUM for functional quality profiles.

Analyze

Clustering, outlier detection, tolerance bands, and seasonal decomposition.

Clustering

K-means, fuzzy c-means, and GMM clustering for functional data.

Outlier Detection

LRT, outliergram, and magnitude-shape methods for anomaly detection.

Tolerance Bands

FPCA-based tolerance bands, conformal bands, and Degras SCBs.

Seasonal Analysis

SAZED, autoperiod, STL, and peak detection for periodic functional data.

Equivalence Testing

TOST-based equivalence tests for functional means.

Covariance Functions

Gaussian, exponential, Matérn, and periodic covariance kernels.

Installation

pip install fdars

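fdars ships pre-built wheels for Linux, macOS, and Windows on Python 3.9+. The only runtime dependency is NumPy. The examples on this page also import pandas to build the metadata DataFrame, so install it separately if you want to run them as written.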

Development install

To build from source (requires a Rust toolchain):

git clone https://github.com/sipemu/pyfda.git
cd pyfda
pip install maturin
maturin develop --release
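
After either install route, a quick way to confirm that the environment is ready for the examples on this page is to check which modules Python can find. The snippet below is a minimal sketch using only the standard library; `is_installed` is a hypothetical helper name, and it only checks that each module can be located, not that it imports cleanly or which version is present.

```python
import importlib.util


def is_installed(name):
    """Return True if a top-level module can be located (without importing it)."""
    return importlib.util.find_spec(name) is not None


# numpy is the runtime dependency; pandas is used by the examples for metadata.
for name in ("numpy", "pandas", "fdars"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```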

Quick Example

A minimal end-to-end workflow: simulate curves, wrap them in an Fdata object, rank them by depth, center them, and cluster them.

import numpy as np
import pandas as pd
from fdars import Fdata
from fdars.simulation import simulate
from fdars.clustering import kmeans_fd

# 1. Simulate 60 curves on a regular grid
argvals = np.linspace(0, 1, 100)
data = simulate(n=60, argvals=argvals, n_basis=7, seed=42)

# 2. Wrap in an Fdata object with metadata
meta = pd.DataFrame({"batch": np.repeat(["A", "B", "C"], 20)})
fd = Fdata(data, argvals=argvals, metadata=meta)
print(fd)
# Fdata (1D)  –  60 obs × 100 points  –  range [0.0, 1.0]  –  metadata: batch

# 3. Rank curves by Fraiman-Muniz depth
depths = fd.depth("fraiman_muniz")
deepest = np.argmax(depths)
print(f"Most central curve: {deepest}, depth = {depths[deepest]:.4f}")

# 4. Center the data
fd_c = fd.center()     # returns Fdata with metadata preserved

# 5. Cluster into 3 groups (kmeans_fd is called with the raw array and grid)
result = kmeans_fd(fd.data, fd.argvals, k=3, seed=0)
print(f"Cluster sizes: {np.bincount(result['cluster'])}")

Rust under the hood

Every method call on Fdata crosses into compiled Rust code via PyO3. There is no Python loop over the 60 curves -- the entire computation runs at native speed with multithreaded parallelism where applicable.
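
For intuition about what the depth ranking in step 3 measures, here is a plain-NumPy sketch of the textbook Fraiman-Muniz construction: at each grid point, compute a univariate depth from the empirical CDF, then average over the grid. This is an assumption-laden illustration, not the fdars implementation. The function name `fraiman_muniz_depth` is hypothetical, a regular grid is assumed so that the integral reduces to a mean, and the library may use a different scaling or tie-handling convention.

```python
import numpy as np


def fraiman_muniz_depth(X):
    """Integrated univariate depth for curves stored as rows of X (n_obs, n_points).

    Assumes a regular grid, so integrating over t reduces to a plain average.
    """
    n = X.shape[0]
    # Rank of each curve at each grid point (1 = smallest); ties broken by order.
    ranks = X.argsort(axis=0).argsort(axis=0) + 1
    ecdf = ranks / n                        # empirical CDF evaluated at x_i(t)
    pointwise = 1.0 - np.abs(0.5 - ecdf)    # univariate depth at each t
    return pointwise.mean(axis=1)           # average over the grid


t = np.linspace(0.0, 1.0, 50)
offsets = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
X = np.sin(2 * np.pi * t) + offsets[:, None]   # five vertically shifted curves
depths = fraiman_muniz_depth(X)
print(depths)
```

In this toy bundle the curves in the middle score higher than the ones at the edges, which is the ordering that `np.argmax(depths)` relies on in the quick example.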

Package Modules

Module	Description
`fdars.Fdata`	Functional data container — the main entry point (1D curves, 2D surfaces, metadata)
`fdars.fdata`	Low-level functional data operations: mean, center, derivatives, norms, normalization
`fdars.depth`	Fraiman-Muniz, modal, band, random projection, Tukey, spatial depth
`fdars.metric`	Lp, Hausdorff, DTW, Soft-DTW, Fourier, horizontal-shift
`fdars.basis`	B-spline, Fourier, P-spline basis operations
`fdars.smoothing`	Nadaraya-Watson, local polynomial, k-NN, bandwidth CV
`fdars.clustering`	K-means, fuzzy c-means, GMM
`fdars.regression`	FPCA, PLS, nonparametric, robust, FOSR, FANOVA
`fdars.alignment`	SRSF alignment, Karcher mean, elastic FPCA
`fdars.outliers`	LRT, outliergram, magnitude-shape
`fdars.seasonal`	SAZED, autoperiod, STL, peak detection
`fdars.spm`	Phase I/II, EWMA, CUSUM process monitoring
`fdars.classification`	LDA, QDA, k-NN, kernel classifiers
`fdars.tolerance`	FPCA, conformal, Degras tolerance/confidence bands
`fdars.conformal`	Split conformal, jackknife+ prediction
`fdars.simulation`	Karhunen-Loève simulation, Gaussian processes
`fdars.explain`	SHAP, PDP, permutation importance, significant regions