Working with Derivatives¶
Derivatives of functional data reveal how curves change -- velocity,
acceleration, curvature, and higher-order dynamics. The Fdata class provides a
deriv() convenience method for both 1D and 2D functional data (low-level
functions are still available in fdars.fdata).
Why Derivatives Matter¶
Many scientific questions are naturally about rates of change rather than absolute levels:
- Growth curves -- velocity (first derivative) and acceleration (second derivative) of growth.
- Spectroscopy -- first and second derivatives enhance peaks and remove baseline drift.
- Motion capture -- velocity and acceleration from position trajectories.
- Process monitoring -- rate-of-change charts detect shifts earlier than level charts.
Notation
For a functional observation \(x_i(t)\), the \(r\)-th derivative is \(x_i^{(r)}(t) = \frac{d^r x_i}{dt^r}\).
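The finite-difference idea behind numerical differentiation can be seen with plain NumPy (this is an illustrative sketch, not fdars' internal scheme):

```python
import numpy as np

# Repeated finite differences via np.gradient approximate successive
# derivatives of a sampled curve.
t = np.linspace(0, 2 * np.pi, 1000)
x = np.sin(t)

d1 = np.gradient(x, t)   # approximates cos(t)
d2 = np.gradient(d1, t)  # approximates -sin(t)

err1 = np.max(np.abs(d1 - np.cos(t)))
err2 = np.max(np.abs(d2 + np.sin(t)))
print(err1, err2)  # both small; err2 is larger than err1
```

Note that the error grows with each order, which foreshadows the noise-amplification warning below.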
First Derivative (1D)¶
fd.deriv() computes numerical derivatives using finite differences. The Fdata
object already carries the data matrix and evaluation grid argvals.
import numpy as np

from fdars import Fdata
from fdars.simulation import simulate
# Simulate smooth curves
argvals = np.linspace(0, 1, 200)
data = simulate(n=30, argvals=argvals, n_basis=5, seed=42)
fd = Fdata(data, argvals=argvals)
print(fd) # Fdata (1D) – 30 obs × 200 points – range [0.0, 1.0]
# First derivative
fd_d1 = fd.deriv()
print(fd_d1) # Fdata (1D) – 30 obs × 200 points – range [0.0, 1.0]
The result is an Fdata object of the same shape: each row is the first
derivative of the corresponding input curve, evaluated at the same grid points.
Interpreting the Output¶
# Mean of the original curves
mu = fd.mean()
# Mean of the first derivatives
mu_deriv = fd_d1.mean()
# At a point where mu_deriv ~ 0, the mean function has a local extremum
zero_crossings = np.where(np.diff(np.sign(mu_deriv)))[0]
print(f"Approximate extrema of the mean at t = {fd.argvals[zero_crossings]}")
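The sign-change idiom can be checked with NumPy alone on a function whose extremum is known in closed form:

```python
import numpy as np

# d/dt sin(t) = cos(t) crosses zero exactly once on [0, pi],
# at the maximum of sin (t = pi/2).
t = np.linspace(0, np.pi, 500)
deriv = np.cos(t)

zero_crossings = np.where(np.diff(np.sign(deriv)))[0]
print(t[zero_crossings])  # one value close to pi/2 ~ 1.571
```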
Higher-Order Derivatives¶
Use the nderiv parameter to compute second, third, or higher derivatives in
a single call:
# Second derivative (acceleration / curvature)
fd_d2 = fd.deriv(nderiv=2)
print(fd_d2.shape) # (30, 200)
# Third derivative
fd_d3 = fd.deriv(nderiv=3)
Numerical instability
Each successive derivative amplifies noise. On raw noisy data, higher-order derivatives can become meaningless. Always smooth first (see the section below).
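A NumPy-only illustration of the warning: with grid spacing \(h\), each finite-difference pass scales the noise component by roughly \(1/h\), so the noise in the derivative grows rapidly with the order.

```python
import numpy as np

# Track the noise component (noisy minus clean) through successive
# numerical derivatives; by linearity it is exactly the derivative
# of the noise alone.
rng = np.random.default_rng(1)
t = np.linspace(0, 1, 200)
clean = np.sin(2 * np.pi * t)
noisy = clean + rng.normal(0, 0.05, size=t.shape)

noise_stds = []
d_clean, d_noisy = clean, noisy
for order in range(1, 4):
    d_clean = np.gradient(d_clean, t)
    d_noisy = np.gradient(d_noisy, t)
    noise_stds.append(np.std(d_noisy - d_clean))
print(noise_stds)  # each entry much larger than the previous
```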
Comparing Derivative Orders¶
# Compute derivatives of orders 1 through 4
derivs = {}
for order in range(1, 5):
    derivs[order] = fd.deriv(nderiv=order)
# Show the range of values -- note how it grows with order
for order, fd_d in derivs.items():
    r = fd_d.data[0]  # first curve
    print(f"Order {order}: range = [{r.min():.2f}, {r.max():.2f}]")
2D Functional Data Derivatives¶
For functional data observed on a 2D domain (surfaces), fd.deriv() on a 2D
Fdata returns partial derivatives in both directions plus the mixed partial
derivative.
Data Layout for 2D¶
A 2D functional observation is a surface \(x_i(s, t)\) evaluated on an
\(m_1 \times m_2\) grid. In fdars, this is stored as a 2D array of shape
(n_obs, m1 * m2) -- each surface is flattened into a single row.
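The flattening convention is NumPy's default row-major order, so grid point \((j, k)\) of a surface maps to column `j * m2 + k` of its data row:

```python
import numpy as np

# Row-major flattening: position (j, k) on the m1 x m2 grid maps to
# column j * m2 + k, and reshape recovers the original grid.
m1, m2 = 3, 4
surface = np.arange(m1 * m2).reshape(m1, m2)

row = surface.ravel()                    # shape (m1 * m2,) = (12,)
print(row[1 * m2 + 2] == surface[1, 2])  # True

back = row.reshape(m1, m2)               # round trip
print(np.array_equal(back, surface))     # True
```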
# Create a simple 2D functional dataset
m1, m2 = 30, 40
argvals_s = np.linspace(0, 1, m1)
argvals_t = np.linspace(0, 1, m2)
S, T = np.meshgrid(argvals_s, argvals_t, indexing="ij")
# 10 surfaces: sin(a*s) * cos(b*t) with random a, b
rng = np.random.default_rng(0)
n_obs = 10
data_2d = np.zeros((n_obs, m1 * m2))
for i in range(n_obs):
    a, b = rng.uniform(1, 5, size=2)
    surface = np.sin(a * S) * np.cos(b * T)
    data_2d[i] = surface.ravel()
fd_2d = Fdata(data_2d, argvals=(argvals_s, argvals_t))
print(fd_2d) # Fdata (2D) – 10 obs × 30×40 grid – ...
Computing Partial Derivatives¶
fd_ds, fd_dt, fd_dsdt = fd_2d.deriv()
print(f"d/ds shape: {fd_ds.shape}") # (10, 1200)
print(f"d/dt shape: {fd_dt.shape}") # (10, 1200)
print(f"d2/dsdt shape: {fd_dsdt.shape}") # (10, 1200)
The three returned arrays are:
| Array | Meaning |
|---|---|
| `fd_ds` | \(\partial x_i / \partial s\) -- partial derivative w.r.t. the first dimension |
| `fd_dt` | \(\partial x_i / \partial t\) -- partial derivative w.r.t. the second dimension |
| `fd_dsdt` | \(\partial^2 x_i / \partial s \, \partial t\) -- mixed partial derivative |
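These three quantities can be sanity-checked with plain NumPy on a surface whose partials have closed forms (again a sketch of the numerics, not fdars' implementation):

```python
import numpy as np

# Analytic test surface x(s, t) = sin(s) * cos(t) on a 50 x 60 grid.
s = np.linspace(0, 1, 50)
t = np.linspace(0, 1, 60)
S, T = np.meshgrid(s, t, indexing="ij")
X = np.sin(S) * np.cos(T)

dX_ds = np.gradient(X, s, axis=0)        # ~  cos(s) * cos(t)
dX_dt = np.gradient(X, t, axis=1)        # ~ -sin(s) * sin(t)
dX_dsdt = np.gradient(dX_ds, t, axis=1)  # ~ -cos(s) * sin(t)

print(np.max(np.abs(dX_dsdt + np.cos(S) * np.sin(T))))  # small
```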
Reshaping for Visualization¶
To inspect a single surface, reshape back to the grid:
# Partial derivative of the first surface w.r.t. s
ds_surface = fd_ds.data[0].reshape(m1, m2)
print(ds_surface.shape) # (30, 40)
Combining Derivatives with Smoothing¶
Differentiating noisy data amplifies the noise. The standard workflow is:
- Smooth the raw curves.
- Differentiate the smoothed curves.
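The effect of this two-step workflow can be demonstrated with NumPy alone, using a crude moving average as a stand-in for a proper smoother:

```python
import numpy as np

# Derivative of the smoothed signal tracks the true derivative far
# better than the derivative of the raw noisy signal.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 400)
noisy = np.sin(2 * np.pi * t) + rng.normal(0, 0.1, size=t.shape)

window = 15
smoothed = np.convolve(noisy, np.ones(window) / window, mode="same")

true_d = 2 * np.pi * np.cos(2 * np.pi * t)
err_raw = np.std(np.gradient(noisy, t) - true_d)
err_smooth = np.std(np.gradient(smoothed, t) - true_d)
print(err_smooth < err_raw)  # True
```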
Kernel Smoothing + Derivative¶
from fdars.simulation import simulate
from fdars.smoothing import nadaraya_watson, optim_bandwidth
# Noisy data
argvals = np.linspace(0, 1, 200)
clean = simulate(n=20, argvals=argvals, n_basis=5, seed=42)
fd_clean = Fdata(clean, argvals=argvals)
rng = np.random.default_rng(0)
fd_noisy = fd_clean + rng.normal(0, 0.3, size=clean.shape)
# Smooth each curve with an optimal bandwidth
smoothed = np.zeros_like(fd_noisy.data)
for i in range(fd_noisy.n_obs):
    bw = optim_bandwidth(fd_noisy.argvals, fd_noisy.data[i])
    smoothed[i] = nadaraya_watson(
        fd_noisy.argvals, fd_noisy.data[i], fd_noisy.argvals, bandwidth=bw["h_opt"]
    )
fd_smoothed = Fdata(smoothed, argvals=argvals)
# Now differentiate using the Fdata convenience method
fd_d1_smooth = fd_smoothed.deriv()
fd_d1_noisy = fd_noisy.deriv()
# Compare noise levels
print(f"Std of d1(noisy): {fd_d1_noisy.data.std():.2f}")
print(f"Std of d1(smooth): {fd_d1_smooth.data.std():.2f}")
Basis Smoothing + Derivative¶
Basis smoothing is more efficient for large datasets because you smooth all curves at once:
from fdars.basis import smooth_basis_gcv
# Smooth all 20 curves simultaneously
result = smooth_basis_gcv(fd_noisy.data, fd_noisy.argvals, n_basis=25)
fd_smooth_basis = Fdata(result["fitted"], argvals=fd_noisy.argvals)
# Differentiate using Fdata convenience method
fd_d1_basis = fd_smooth_basis.deriv()
fd_d2_basis = fd_smooth_basis.deriv(nderiv=2)
print(f"d1 range: [{fd_d1_basis.data.min():.2f}, {fd_d1_basis.data.max():.2f}]")
print(f"d2 range: [{fd_d2_basis.data.min():.2f}, {fd_d2_basis.data.max():.2f}]")
Recommended pipeline
For most applications, basis smoothing followed by numerical
differentiation gives the best balance of speed, smoothness, and accuracy.
Use smooth_basis_gcv with enough basis functions and let GCV choose
the penalty.
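The core advantage of a basis representation is that the fitted expansion can be differentiated analytically instead of numerically. A minimal sketch of that idea with NumPy's Legendre class (not fdars' own smoother):

```python
import numpy as np

# Fit a low-order Legendre expansion by least squares, then
# differentiate the fitted representation exactly.
rng = np.random.default_rng(3)
t = np.linspace(0, 1, 200)
noisy = np.sin(2 * np.pi * t) + rng.normal(0, 0.1, size=t.shape)

fit = np.polynomial.legendre.Legendre.fit(t, noisy, deg=7)
d1_basis = fit.deriv(1)(t)       # exact derivative of the fitted expansion
d1_raw = np.gradient(noisy, t)   # finite differences on the raw data

true_d = 2 * np.pi * np.cos(2 * np.pi * t)
err_basis = np.max(np.abs(d1_basis - true_d))
err_raw = np.max(np.abs(d1_raw - true_d))
print(err_basis < err_raw)  # True
```

This is the same principle `smooth_basis_gcv` exploits: once the curves live in a basis, derivatives come from the basis functions rather than from differencing noisy samples.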
Full Example: Growth Curve Analysis¶
A classic FDA application is analyzing human growth data. Here we simulate growth-like curves and extract velocity and acceleration:
import numpy as np
from fdars import Fdata
from fdars.simulation import simulate
from fdars.basis import smooth_basis_gcv
# Simulate growth-like curves (monotone, decelerating)
age = np.linspace(1, 18, 200) # ages 1 to 18
argvals_01 = np.linspace(0, 1, 200) # simulation on [0,1]
raw = simulate(n=40, argvals=argvals_01, n_basis=4,
               efun_type="poly", eval_type="exponential", seed=7)
# Transform to look like growth: cumulative sum + scaling
growth = np.cumsum(np.abs(raw), axis=1)
growth = 50 + 130 * (growth - growth.min(axis=1, keepdims=True)) / \
         (growth.max(axis=1, keepdims=True) - growth.min(axis=1, keepdims=True))
# Smooth and wrap in Fdata
result = smooth_basis_gcv(growth, age, n_basis=20)
fd_growth = Fdata(result["fitted"], argvals=age)
# Velocity (cm/year) -- use Fdata convenience method
fd_velocity = fd_growth.deriv()
mean_vel = fd_velocity.mean()
# Acceleration (cm/year^2)
fd_acceleration = fd_growth.deriv(nderiv=2)
mean_acc = fd_acceleration.mean()
# Find the age of peak velocity
peak_age_idx = np.argmax(mean_vel)
print(f"Mean peak growth velocity at age {fd_growth.argvals[peak_age_idx]:.1f}")
# Find where acceleration crosses zero (inflection points)
zero_crossings = np.where(np.diff(np.sign(mean_acc)))[0]
print(f"Growth acceleration sign changes at ages: {fd_growth.argvals[zero_crossings].round(1)}")
Summary¶
| Method / Function | Usage | Description |
|---|---|---|
| `fd.deriv(nderiv=1)` | `Fdata` convenience | Numerical derivative; returns a new `Fdata` |
| `fd.deriv()` (2D) | `Fdata` convenience | Returns `(fd_ds, fd_dt, fd_dsdt)` tuple of `Fdata` |
| `deriv_1d(data, argvals, nderiv=1)` | `fdars.fdata` | Low-level: derivative of a raw 2D array |
| `deriv_2d(data, argvals_s, argvals_t)` | `fdars.fdata` | Low-level: partial derivatives of raw 2D data |
Key points:
- Smooth before differentiating -- raw derivatives of noisy data are unreliable.
- nderiv -- set to 2 or 3 for second/third derivatives in a single call.
- 2D data -- returns \(\partial/\partial s\), \(\partial/\partial t\), and \(\partial^2/\partial s \, \partial t\).
Next Steps¶
- Smoothing -- choose the right smoother before differentiating.
- Functional PCA -- decompose derivatives into principal components.
- Basis Representation -- smooth with basis expansions for optimal pre-processing.