
Covariance Functions

Covariance functions (kernels) describe the correlation structure of a stochastic process. They are the building blocks for Gaussian process simulation, Kriging, and kernel-based smoothing. fdars provides kernel-based covariance matrix construction and Gaussian process sample generation, all computed in Rust.


Available kernels

| Kernel | Formula | Character |
|---|---|---|
| Gaussian (squared exponential) | \(C(s,t) = \sigma^2 \exp\!\bigl(-\tfrac{(s-t)^2}{2\ell^2}\bigr)\) | Infinitely smooth sample paths |
| Exponential | \(C(s,t) = \sigma^2 \exp\!\bigl(-\tfrac{\lvert s-t\rvert}{\ell}\bigr)\) | Continuous but non-differentiable (rough) paths |
| Matérn (\(\nu=1.5\)) | \(C(s,t) = \sigma^2 \bigl(1 + \tfrac{\sqrt{3}\,\lvert s-t\rvert}{\ell}\bigr)\exp\!\bigl(-\tfrac{\sqrt{3}\,\lvert s-t\rvert}{\ell}\bigr)\) | Once mean-square differentiable paths |
| Periodic | \(C(s,t) = \sigma^2 \exp\!\bigl(-\tfrac{2\sin^2(\pi\lvert s-t\rvert)}{\ell^2}\bigr)\) | Periodic sample paths |

In all formulas, \(\ell\) is the length scale and \(\sigma^2\) is the variance.
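For reference, the Gaussian kernel matrix in the table above can be built directly in plain NumPy (a sketch independent of fdars, useful for checking the library's output): the pairwise differences \(s - t\) are formed by broadcasting, and the result is symmetric and positive semi-definite by construction.

```python
import numpy as np

def gaussian_kernel_matrix(argvals, length_scale=0.2, variance=1.0):
    """Gaussian (squared exponential) covariance matrix on a 1-D grid."""
    d = argvals[:, None] - argvals[None, :]          # pairwise differences s - t
    return variance * np.exp(-d**2 / (2.0 * length_scale**2))

argvals = np.linspace(0, 1, 100)
cov = gaussian_kernel_matrix(argvals)
print(cov.shape)                                # (100, 100)
print(np.allclose(cov, cov.T))                  # True: symmetric
print(np.all(np.linalg.eigvalsh(cov) > -1e-8))  # True: PSD up to rounding
```

The diagonal equals the variance \(\sigma^2\), since \(C(t,t) = \sigma^2\) for every kernel in the table.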


Computing a covariance matrix

import numpy as np
from fdars.simulation import covariance_matrix

argvals = np.linspace(0, 1, 100)

# Gaussian kernel
cov_gauss = covariance_matrix(argvals, kernel="gaussian", length_scale=0.2, variance=1.0)

# Exponential kernel
cov_exp = covariance_matrix(argvals, kernel="exponential", length_scale=0.2, variance=1.0)

print(f"Shape: {cov_gauss.shape}")           # (100, 100)
print(f"Symmetric: {np.allclose(cov_gauss, cov_gauss.T)}")  # True

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| argvals | ndarray (m,) | -- | Evaluation grid |
| kernel | str | "gaussian" | "gaussian" or "exponential" |
| length_scale | float | 0.2 | Kernel length scale \(\ell\) |
| variance | float | 1.0 | Kernel variance \(\sigma^2\) |

Returns an ndarray of shape (m, m).

Effect of length scale

A small \(\ell\) produces rapidly varying (wiggly) functions; a large \(\ell\) produces smooth, slowly varying functions. Try values from 0.05 to 0.5 on \([0,1]\) to build intuition.
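To make this concrete, the correlation between two points a fixed distance apart can be evaluated directly from the Gaussian kernel formula for several length scales (a plain-NumPy sketch, not an fdars call):

```python
import numpy as np

# Gaussian-kernel correlation at a fixed lag |s - t| = 0.1
# for several length scales.
lag = 0.1
for ell in [0.05, 0.1, 0.2, 0.5]:
    corr = np.exp(-lag**2 / (2 * ell**2))
    print(f"length_scale={ell:4.2f}  corr at lag 0.1: {corr:.3f}")
# 0.05 -> 0.135, 0.10 -> 0.607, 0.20 -> 0.882, 0.50 -> 0.980
```

At \(\ell = 0.05\), points a tenth of the domain apart are nearly independent, so the sample paths wiggle; at \(\ell = 0.5\) they remain strongly correlated, so the paths vary slowly.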


Generating Gaussian process samples

gaussian_process draws \(n\) sample paths from a zero-mean GP with the specified kernel. The function internally constructs the covariance matrix and performs a Cholesky decomposition.

from fdars import Fdata
from fdars.simulation import gaussian_process

argvals = np.linspace(0, 1, 200)

# 50 smooth curves (Gaussian kernel)
fd_gauss = Fdata(gaussian_process(50, argvals, kernel="gaussian", length_scale=0.15, seed=1), argvals=argvals)

# 50 rough curves (exponential kernel)
fd_exp = Fdata(gaussian_process(50, argvals, kernel="exponential", length_scale=0.15, seed=1), argvals=argvals)

# 50 Matern curves
fd_mat = Fdata(gaussian_process(50, argvals, kernel="matern", length_scale=0.15, seed=1), argvals=argvals)

# 50 periodic curves
fd_per = Fdata(gaussian_process(50, argvals, kernel="periodic", length_scale=0.15, seed=1), argvals=argvals)
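The covariance-then-Cholesky construction that gaussian_process performs internally can be sketched in plain NumPy (not the fdars implementation; a small jitter term is added here so the factorization succeeds despite rounding):

```python
import numpy as np

def sample_gp(n, argvals, length_scale=0.2, variance=1.0, seed=None):
    """Draw n zero-mean GP paths with a Gaussian kernel via Cholesky."""
    rng = np.random.default_rng(seed)
    d = argvals[:, None] - argvals[None, :]
    cov = variance * np.exp(-d**2 / (2 * length_scale**2))
    cov += 1e-10 * np.eye(len(argvals))          # jitter for numerical stability
    L = np.linalg.cholesky(cov)                  # cov = L @ L.T
    z = rng.standard_normal((len(argvals), n))   # i.i.d. N(0, 1) draws
    return (L @ z).T                             # each row is one sample path

argvals = np.linspace(0, 1, 200)
paths = sample_gp(50, argvals, length_scale=0.15, seed=1)
print(paths.shape)   # (50, 200)
```

Because \(L z \sim N(0, L L^\top) = N(0, C)\), each row has exactly the covariance structure of the chosen kernel.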

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| n | int | -- | Number of sample paths |
| argvals | ndarray (m,) | -- | Evaluation grid |
| kernel | str | "gaussian" | "gaussian", "exponential", "matern", or "periodic" |
| length_scale | float | 0.2 | Kernel length scale |
| variance | float | 1.0 | Kernel variance |
| seed | int | None | Random seed (omit for non-deterministic output) |

Returns an ndarray of shape (n, m).


Comparing kernel shapes

The following script visualizes the covariance structure and sample paths for each kernel side by side.

import numpy as np
from fdars import Fdata
from fdars.simulation import covariance_matrix, gaussian_process

argvals = np.linspace(0, 1, 200)
kernels = ["gaussian", "exponential"]

try:
    import matplotlib.pyplot as plt

    fig, axes = plt.subplots(2, 2, figsize=(12, 8))

    for col, kern in enumerate(kernels):
        # Covariance row from the midpoint
        cov = covariance_matrix(argvals, kernel=kern, length_scale=0.15)
        mid = len(argvals) // 2
        axes[0, col].plot(argvals, cov[mid, :])
        axes[0, col].set_title(f"{kern.title()} kernel — C(0.5, t)")
        axes[0, col].set_xlabel("t")

        # Sample paths
        fd_samples = Fdata(gaussian_process(10, argvals, kernel=kern, length_scale=0.15, seed=42), argvals=argvals)
        for s in fd_samples.data:
            axes[1, col].plot(argvals, s, linewidth=0.7, alpha=0.7)
        axes[1, col].set_title(f"{kern.title()} — 10 sample paths")
        axes[1, col].set_xlabel("t")

    plt.tight_layout()
    plt.savefig("covariance_kernels.png", dpi=150)
    plt.show()
except ImportError:
    print("matplotlib is not installed; skipping the plots")

Using GP samples for simulation studies

GP samples provide a convenient way to create realistic synthetic functional data for benchmarking FDA methods. Here is a complete example that generates data from two different kernels and compares clustering results.

import numpy as np
from fdars import Fdata
from fdars.simulation import gaussian_process
from fdars.clustering import kmeans_fd, silhouette_score
from fdars.metric import lp_self_1d

argvals = np.linspace(0, 1, 150)

# Group 1: smooth Gaussian kernel
g1 = gaussian_process(30, argvals, kernel="gaussian", length_scale=0.25, seed=1)

# Group 2: rough exponential kernel + vertical shift
g2 = gaussian_process(30, argvals, kernel="exponential", length_scale=0.10, seed=2) + 2.0

fd = Fdata(np.vstack([g1, g2]), argvals=argvals)

# Cluster
km = kmeans_fd(fd.data, fd.argvals, k=2, seed=42)
print(f"Converged: {km['converged']}, iterations: {km['iter']}")

# Evaluate
dist = lp_self_1d(fd.data, fd.argvals)
labels = km["cluster"].astype(np.int64)
sil = silhouette_score(dist, labels)
print(f"Mean silhouette: {np.mean(sil):.3f}")
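Because the two groups are constructed in order (first 30 curves from group 1, last 30 from group 2), cluster quality can also be measured as label agreement up to a label swap. A small helper for the two-cluster case (hypothetical, not part of fdars):

```python
import numpy as np

def best_match_accuracy(labels, true_labels):
    """Accuracy of a 2-cluster labeling against ground truth, up to label swap."""
    acc = np.mean(labels == true_labels)
    return max(acc, 1.0 - acc)      # cluster IDs 0/1 are arbitrary

true_labels = np.repeat([0, 1], 30)             # first 30 = group 1, last 30 = group 2
pred = np.array([0] * 28 + [1] * 2 + [1] * 30)  # e.g. 2 curves misassigned
print(best_match_accuracy(pred, true_labels))   # 58/60 correct, about 0.967
```

For more than two clusters the same idea requires matching cluster IDs to true labels, e.g. with a Hungarian assignment.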

Performance

Generating 1000 GP samples on a 500-point grid takes roughly 50 ms. The bottleneck is the Cholesky decomposition of the \(m \times m\) covariance matrix, which is \(O(m^3)\).
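To check the \(O(m^3)\) scaling yourself, time the Cholesky factorization alone at a few grid sizes (a rough sketch using NumPy's factorization; absolute times depend on hardware and BLAS, so no figures are hard-coded):

```python
import time
import numpy as np

# Doubling m should multiply the Cholesky time by roughly 8 once m is large.
for m in [125, 250, 500]:
    argvals = np.linspace(0, 1, m)
    d = argvals[:, None] - argvals[None, :]
    cov = np.exp(-d**2 / (2 * 0.2**2)) + 1e-10 * np.eye(m)  # jittered Gaussian kernel
    t0 = time.perf_counter()
    np.linalg.cholesky(cov)
    dt = time.perf_counter() - t0
    print(f"m={m:4d}  cholesky: {dt * 1e3:7.2f} ms")
```

The factorization cost dominates because the \(n\) sample paths themselves only require an \(O(n m^2)\) matrix product afterwards.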