Working with Derivatives¶
Derivatives of functional data reveal how curves change -- velocity,
acceleration, curvature, and higher-order dynamics. The Fdata class provides a
deriv() convenience method for both 1D and 2D functional data (low-level
functions are still available in fdars.fdata).
Why Derivatives Matter¶
Many scientific questions are naturally about rates of change rather than absolute levels:
- Growth curves -- velocity (first derivative) and acceleration (second derivative) of growth.
- Spectroscopy -- first and second derivatives enhance peaks and remove baseline drift.
- Motion capture -- velocity and acceleration from position trajectories.
- Process monitoring -- rate-of-change charts detect shifts earlier than level charts.
Notation
For a functional observation \(x_i(t)\), the \(r\)-th derivative is \(x_i^{(r)}(t) = \frac{d^r x_i}{dt^r}\).
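The finite-difference idea behind numerical differentiation can be seen with plain NumPy (this is an illustrative sketch, not fdars' internal scheme):

```python
import numpy as np

# Repeated finite differences via np.gradient approximate successive
# derivatives of a sampled curve.
t = np.linspace(0, 2 * np.pi, 1000)
x = np.sin(t)

d1 = np.gradient(x, t)   # approximates cos(t)
d2 = np.gradient(d1, t)  # approximates -sin(t)

err1 = np.max(np.abs(d1 - np.cos(t)))
err2 = np.max(np.abs(d2 + np.sin(t)))
print(err1, err2)  # both small; err2 is larger than err1
```

Note that the error grows with each order, which foreshadows the noise-amplification warning below.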
First Derivative (1D)¶
fd.deriv() computes numerical derivatives using finite differences. The Fdata
object already carries the data matrix and evaluation grid argvals.
import numpy as np

from fdars import Fdata
from fdars.simulation import simulate
# Simulate smooth curves
argvals = np.linspace(0, 1, 200)
data = simulate(n=30, argvals=argvals, n_basis=5, seed=42)
fd = Fdata(data, argvals=argvals)
print(fd) # Fdata (1D) – 30 obs × 200 points – range [0.0, 1.0]
# First derivative
fd_d1 = fd.deriv()
print(fd_d1) # Fdata (1D) – 30 obs × 200 points – range [0.0, 1.0]
The result is an Fdata object of the same shape: each row is the first
derivative of the corresponding input curve, evaluated at the same grid points.
Interpreting the Output¶
# Mean of the original curves
mu = fd.mean()
# Mean of the first derivatives
mu_deriv = fd_d1.mean()
# At a point where mu_deriv ~ 0, the mean function has a local extremum
zero_crossings = np.where(np.diff(np.sign(mu_deriv)))[0]
print(f"Approximate extrema of the mean at t = {fd.argvals[zero_crossings]}")
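The sign-change idiom can be checked with NumPy alone on a function whose extremum is known in closed form:

```python
import numpy as np

# d/dt sin(t) = cos(t) crosses zero exactly once on [0, pi],
# at the maximum of sin (t = pi/2).
t = np.linspace(0, np.pi, 500)
deriv = np.cos(t)

zero_crossings = np.where(np.diff(np.sign(deriv)))[0]
print(t[zero_crossings])  # one value close to pi/2 ~ 1.571
```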
Higher-Order Derivatives¶
Use the nderiv parameter to compute second, third, or higher derivatives in
a single call:
# Second derivative (acceleration / curvature)
fd_d2 = fd.deriv(nderiv=2)
print(fd_d2.shape) # (30, 200)
# Third derivative
fd_d3 = fd.deriv(nderiv=3)
Numerical instability
Each successive derivative amplifies noise. On raw noisy data, higher-order derivatives can become meaningless. Always smooth first (see the section below).
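A NumPy-only illustration of the warning: with grid spacing \(h\), each finite-difference pass scales the noise component by roughly \(1/h\), so the noise in the derivative grows rapidly with the order.

```python
import numpy as np

# Track the noise component (noisy minus clean) through successive
# numerical derivatives; by linearity it is exactly the derivative
# of the noise alone.
rng = np.random.default_rng(1)
t = np.linspace(0, 1, 200)
clean = np.sin(2 * np.pi * t)
noisy = clean + rng.normal(0, 0.05, size=t.shape)

noise_stds = []
d_clean, d_noisy = clean, noisy
for order in range(1, 4):
    d_clean = np.gradient(d_clean, t)
    d_noisy = np.gradient(d_noisy, t)
    noise_stds.append(np.std(d_noisy - d_clean))
print(noise_stds)  # each entry much larger than the previous
```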
Comparing Derivative Orders¶
# Compute derivatives of orders 1 through 4
derivs = {}
for order in range(1, 5):
    derivs[order] = fd.deriv(nderiv=order)
# Show the range of values -- note how it grows with order
for order, fd_d in derivs.items():
    r = fd_d.data[0]  # first curve
    print(f"Order {order}: range = [{r.min():.2f}, {r.max():.2f}]")
2D Functional Data Derivatives¶
For functional data observed on a 2D domain (surfaces), fd.deriv() on a 2D
Fdata returns partial derivatives in both directions plus the mixed partial
derivative.
Data Layout for 2D¶
A 2D functional observation is a surface \(x_i(s, t)\) evaluated on an
\(m_1 \times m_2\) grid. In fdars, this is stored as a 2D array of shape
(n_obs, m1 * m2) -- each surface is flattened into a single row.
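The flattening convention is NumPy's default row-major order, so grid point \((j, k)\) of a surface maps to column `j * m2 + k` of its data row:

```python
import numpy as np

# Row-major flattening: position (j, k) on the m1 x m2 grid maps to
# column j * m2 + k, and reshape recovers the original grid.
m1, m2 = 3, 4
surface = np.arange(m1 * m2).reshape(m1, m2)

row = surface.ravel()                    # shape (m1 * m2,) = (12,)
print(row[1 * m2 + 2] == surface[1, 2])  # True

back = row.reshape(m1, m2)               # round trip
print(np.array_equal(back, surface))     # True
```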
# Create a simple 2D functional dataset
m1, m2 = 30, 40
argvals_s = np.linspace(0, 1, m1)
argvals_t = np.linspace(0, 1, m2)
S, T = np.meshgrid(argvals_s, argvals_t, indexing="ij")
# 10 surfaces: sin(a*s) * cos(b*t) with random a, b
rng = np.random.default_rng(0)
n_obs = 10
data_2d = np.zeros((n_obs, m1 * m2))
for i in range(n_obs):
    a, b = rng.uniform(1, 5, size=2)
    surface = np.sin(a * S) * np.cos(b * T)
    data_2d[i] = surface.ravel()
fd_2d = Fdata(data_2d, argvals=(argvals_s, argvals_t))
print(fd_2d) # Fdata (2D) – 10 obs × 30×40 grid – ...
Computing Partial Derivatives¶
fd_ds, fd_dt, fd_dsdt = fd_2d.deriv()
print(f"d/ds shape: {fd_ds.shape}") # (10, 1200)
print(f"d/dt shape: {fd_dt.shape}") # (10, 1200)
print(f"d2/dsdt shape: {fd_dsdt.shape}") # (10, 1200)
The three returned arrays are:
| Array | Meaning |
|---|---|
| `fd_ds` | \(\partial x_i / \partial s\) -- partial derivative w.r.t. the first dimension |
| `fd_dt` | \(\partial x_i / \partial t\) -- partial derivative w.r.t. the second dimension |
| `fd_dsdt` | \(\partial^2 x_i / \partial s \, \partial t\) -- mixed partial derivative |
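These three quantities can be sanity-checked with plain NumPy on a surface whose partials have closed forms (again a sketch of the numerics, not fdars' implementation):

```python
import numpy as np

# Analytic test surface x(s, t) = sin(s) * cos(t) on a 50 x 60 grid.
s = np.linspace(0, 1, 50)
t = np.linspace(0, 1, 60)
S, T = np.meshgrid(s, t, indexing="ij")
X = np.sin(S) * np.cos(T)

dX_ds = np.gradient(X, s, axis=0)        # ~  cos(s) * cos(t)
dX_dt = np.gradient(X, t, axis=1)        # ~ -sin(s) * sin(t)
dX_dsdt = np.gradient(dX_ds, t, axis=1)  # ~ -cos(s) * sin(t)

print(np.max(np.abs(dX_dsdt + np.cos(S) * np.sin(T))))  # small
```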
Reshaping for Visualization¶
To inspect a single surface, reshape back to the grid:
# Partial derivative of the first surface w.r.t. s
ds_surface = fd_ds.data[0].reshape(m1, m2)
print(ds_surface.shape) # (30, 40)
Combining Derivatives with Smoothing¶
Differentiating noisy data amplifies the noise. The standard workflow is:
- Smooth the raw curves.
- Differentiate the smoothed curves.
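The effect of this two-step workflow can be demonstrated with NumPy alone, using a crude moving average as a stand-in for a proper smoother:

```python
import numpy as np

# Derivative of the smoothed signal tracks the true derivative far
# better than the derivative of the raw noisy signal.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 400)
noisy = np.sin(2 * np.pi * t) + rng.normal(0, 0.1, size=t.shape)

window = 15
smoothed = np.convolve(noisy, np.ones(window) / window, mode="same")

true_d = 2 * np.pi * np.cos(2 * np.pi * t)
err_raw = np.std(np.gradient(noisy, t) - true_d)
err_smooth = np.std(np.gradient(smoothed, t) - true_d)
print(err_smooth < err_raw)  # True
```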
Kernel Smoothing + Derivative¶
from fdars.simulation import simulate
from fdars.smoothing import nadaraya_watson, optim_bandwidth
# Noisy data
argvals = np.linspace(0, 1, 200)
clean = simulate(n=20, argvals=argvals, n_basis=5, seed=42)
fd_clean = Fdata(clean, argvals=argvals)
rng = np.random.default_rng(0)
fd_noisy = fd_clean + rng.normal(0, 0.3, size=clean.shape)
# Smooth each curve with an optimal bandwidth
smoothed = np.zeros_like(fd_noisy.data)
for i in range(fd_noisy.n_obs):
    bw = optim_bandwidth(fd_noisy.argvals, fd_noisy.data[i])
    smoothed[i] = nadaraya_watson(
        fd_noisy.argvals, fd_noisy.data[i], fd_noisy.argvals, bandwidth=bw["h_opt"]
    )
fd_smoothed = Fdata(smoothed, argvals=argvals)
# Now differentiate using the Fdata convenience method
fd_d1_smooth = fd_smoothed.deriv()
fd_d1_noisy = fd_noisy.deriv()
# Compare noise levels
print(f"Std of d1(noisy): {fd_d1_noisy.data.std():.2f}")
print(f"Std of d1(smooth): {fd_d1_smooth.data.std():.2f}")
Basis Smoothing + Derivative¶
Basis smoothing is more efficient for large datasets because you smooth all curves at once:
from fdars.basis import smooth_basis_gcv
# Smooth all 20 curves simultaneously
result = smooth_basis_gcv(fd_noisy.data, fd_noisy.argvals, n_basis=25)
fd_smooth_basis = Fdata(result["fitted"], argvals=fd_noisy.argvals)
# Differentiate using Fdata convenience method
fd_d1_basis = fd_smooth_basis.deriv()
fd_d2_basis = fd_smooth_basis.deriv(nderiv=2)
print(f"d1 range: [{fd_d1_basis.data.min():.2f}, {fd_d1_basis.data.max():.2f}]")
print(f"d2 range: [{fd_d2_basis.data.min():.2f}, {fd_d2_basis.data.max():.2f}]")
Recommended pipeline
For most applications, basis smoothing followed by numerical
differentiation gives the best balance of speed, smoothness, and accuracy.
Use smooth_basis_gcv with enough basis functions and let GCV choose
the penalty.
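The core advantage of a basis representation is that the fitted expansion can be differentiated analytically instead of numerically. A minimal sketch of that idea with NumPy's Legendre class (not fdars' own smoother):

```python
import numpy as np

# Fit a low-order Legendre expansion by least squares, then
# differentiate the fitted representation exactly.
rng = np.random.default_rng(3)
t = np.linspace(0, 1, 200)
noisy = np.sin(2 * np.pi * t) + rng.normal(0, 0.1, size=t.shape)

fit = np.polynomial.legendre.Legendre.fit(t, noisy, deg=7)
d1_basis = fit.deriv(1)(t)       # exact derivative of the fitted expansion
d1_raw = np.gradient(noisy, t)   # finite differences on the raw data

true_d = 2 * np.pi * np.cos(2 * np.pi * t)
err_basis = np.max(np.abs(d1_basis - true_d))
err_raw = np.max(np.abs(d1_raw - true_d))
print(err_basis < err_raw)  # True
```

This is the same principle `smooth_basis_gcv` exploits: once the curves live in a basis, derivatives come from the basis functions rather than from differencing noisy samples.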
Full Example: Growth Curve Analysis¶
A classic FDA application is analyzing human growth data. Here we simulate growth-like curves and extract velocity and acceleration:
import numpy as np
from fdars import Fdata
from fdars.simulation import simulate
from fdars.basis import smooth_basis_gcv
# Simulate growth-like curves (monotone, decelerating)
age = np.linspace(1, 18, 200) # ages 1 to 18
argvals_01 = np.linspace(0, 1, 200) # simulation on [0,1]
raw = simulate(n=40, argvals=argvals_01, n_basis=4,
               efun_type="poly", eval_type="exponential", seed=7)
# Transform to look like growth: cumulative sum + scaling
growth = np.cumsum(np.abs(raw), axis=1)
growth = 50 + 130 * (growth - growth.min(axis=1, keepdims=True)) / \
         (growth.max(axis=1, keepdims=True) - growth.min(axis=1, keepdims=True))
# Smooth and wrap in Fdata
result = smooth_basis_gcv(growth, age, n_basis=20)
fd_growth = Fdata(result["fitted"], argvals=age)
# Velocity (cm/year) -- use Fdata convenience method
fd_velocity = fd_growth.deriv()
mean_vel = fd_velocity.mean()
# Acceleration (cm/year^2)
fd_acceleration = fd_growth.deriv(nderiv=2)
mean_acc = fd_acceleration.mean()
# Find the age of peak velocity
peak_age_idx = np.argmax(mean_vel)
print(f"Mean peak growth velocity at age {fd_growth.argvals[peak_age_idx]:.1f}")
# Find where acceleration crosses zero (inflection points)
zero_crossings = np.where(np.diff(np.sign(mean_acc)))[0]
print(f"Growth acceleration sign changes at ages: {fd_growth.argvals[zero_crossings].round(1)}")
Summary¶
| Method / Function | Usage | Description |
|---|---|---|
| `fd.deriv(nderiv=1)` | `Fdata` convenience | Numerical derivative; returns a new `Fdata` |
| `fd.deriv()` (2D) | `Fdata` convenience | Returns `(fd_ds, fd_dt, fd_dsdt)` tuple of `Fdata` |
| `deriv_1d(data, argvals, nderiv=1)` | `fdars.fdata` | Low-level: derivative of a raw 2D array |
| `deriv_2d(data, argvals_s, argvals_t)` | `fdars.fdata` | Low-level: partial derivatives of raw 2D data |
Key points:
- Smooth before differentiating -- raw derivatives of noisy data are unreliable.
- nderiv -- set to 2 or 3 for second/third derivatives in a single call.
- 2D data -- returns \(\partial/\partial s\), \(\partial/\partial t\), and \(\partial^2/\partial s \, \partial t\).
Next Steps¶
- Smoothing -- choose the right smoother before differentiating.
- Functional PCA -- decompose derivatives into principal components.
- Basis Representation -- smooth with basis expansions for optimal pre-processing.