Wavelets & Time-Varying Frequency Analysis

From the failure of stationarity to the mother wavelet: a concise introduction to wavelet analysis and its advantage over classical Fourier methods.

The stationarity problem

Classical spectral analysis rests on a key assumption: the covariance between observations depends only on the lag between them, not on when they occur. Formally, \(\text{Cov}(X(t),\, X(t+h)) = \gamma(h)\).

When this fails, that is, when \(\text{Cov}(X(t), X(t+h)) = \gamma(t,h)\) depends on \(t\), the process is nonstationary. Its frequency content changes over time. Such data are called time-varying frequency (TVF) data. Examples include seismic signals, insect noises, and financial volatility.

The usual power spectrum \(P(f) = \sum_{k=-\infty}^{\infty} \gamma(k)\,e^{-2\pi i f k}\) is no longer well-defined when \(\gamma\) depends on \(t\) as well as the lag. You need a time-localized frequency representation.
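A small numerical illustration may help here. The sketch below is not from the original text: the sampling rate, the 5 Hz and 20 Hz tones, and the helper name `peak_frequency` are all assumptions chosen for the example. It builds a signal whose frequency switches halfway through and compares the periodogram of each half with the periodogram of the whole record.

```python
import numpy as np

fs = 200.0                      # assumed sampling rate in Hz
t = np.arange(800) / fs         # 4 seconds of samples
# Frequency content changes over time: 5 Hz for t < 2 s, 20 Hz afterwards.
x = np.where(t < 2.0, np.sin(2 * np.pi * 5 * t), np.sin(2 * np.pi * 20 * t))

def peak_frequency(segment, fs):
    """Return the frequency (Hz) of the largest periodogram peak."""
    power = np.abs(np.fft.rfft(segment)) ** 2
    freqs = np.fft.rfftfreq(len(segment), d=1.0 / fs)
    return freqs[np.argmax(power[1:]) + 1]   # skip the zero-frequency bin

half = len(x) // 2
first_peak = peak_frequency(x[:half], fs)    # dominant frequency, first half
second_peak = peak_frequency(x[half:], fs)   # dominant frequency, second half

# The global periodogram contains both peaks, but nothing in it says
# which part of the record each peak came from.
power_all = np.abs(np.fft.rfft(x)) ** 2
freqs_all = np.fft.rfftfreq(len(x), d=1.0 / fs)
top_two = np.sort(freqs_all[np.argsort(power_all)[-2:]])
print(first_peak, second_peak, top_two)
```

Each half has a single clean peak, while the full-record spectrum simply reports both frequencies with no indication of their ordering in time.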

Hierarchy of solutions

1. Fourier Transform (baseline)

\[G(f) = \int_{-\infty}^{\infty} e^{-2\pi ifx}\, g(x)\, dx\] Global by construction, with no time localization. Useless for TVF data.

2. Short-Time Fourier Transform (STFT)

A window \(h(x-t)\) with \(|h(t)| \to 0\) as \(|t| \to \infty\) localizes the transform around time \(t\): \[G(t,f) = \int_{-\infty}^{\infty} g(x)\, h(x-t)\, e^{-2\pi ifx}\, dx\] Partial fix, but the window width is fixed. This creates an unavoidable trade-off: narrow window = good time resolution, poor frequency resolution. Wide window = the reverse. This is the signal-processing analogue of the Heisenberg uncertainty principle.
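The window-width trade-off can be seen numerically. The following is a bare-bones sketch under stated assumptions: a Hann window, a test signal that switches from 10 Hz to 40 Hz at \(t = 2\) s, and a hypothetical helper `stft_peak_track` written for this example (it is not a library function).

```python
import numpy as np

def stft_peak_track(x, fs, win_len, hop):
    """Slide a Hann window along x; return frame-centre times and the
    dominant frequency of each frame. A minimal STFT for illustration."""
    window = np.hanning(win_len)
    freqs = np.fft.rfftfreq(win_len, d=1.0 / fs)
    times, peaks = [], []
    for start in range(0, len(x) - win_len + 1, hop):
        frame = x[start:start + win_len] * window
        spectrum = np.abs(np.fft.rfft(frame))
        peaks.append(freqs[np.argmax(spectrum)])
        times.append((start + win_len / 2) / fs)
    return np.array(times), np.array(peaks)

fs = 200.0
t = np.arange(800) / fs
x = np.where(t < 2.0, np.sin(2 * np.pi * 10 * t), np.sin(2 * np.pi * 40 * t))

# Narrow window: many frames (fine time steps), coarse frequency grid.
t_narrow, f_narrow = stft_peak_track(x, fs, win_len=40, hop=20)
# Wide window: few frames (coarse time steps), fine frequency grid.
t_wide, f_wide = stft_peak_track(x, fs, win_len=400, hop=200)

df_narrow = fs / 40     # spacing between frequency bins, narrow window
df_wide = fs / 400      # spacing between frequency bins, wide window
print(len(t_narrow), df_narrow, len(t_wide), df_wide)
```

The narrow window pins down the switch time to within one short frame but can only resolve frequencies on a coarse grid; the wide window has a fine frequency grid but only a handful of frames, one of which straddles the switch.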

3. Wigner-Ville Spectrum

Generalizes the power spectrum to nonstationary processes: \[W(t,f) = \sum_{k=-\infty}^{\infty} \gamma\!\left(t+\tfrac{k}{2},\, t-\tfrac{k}{2}\right) e^{-2\pi i f k}\] where, in this formula, \(\gamma(s,u) = \text{Cov}(X(s), X(u))\) is written as a function of two time points rather than of time and lag. More theoretically principled, but suffers from cross-term interference in practice (it is bilinear, so products of signal components produce spurious features).
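The cross-term problem is easy to reproduce. The sketch below is an illustration under assumptions, not the estimator for the covariance-based spectrum above: it uses the deterministic-signal analogue (the Wigner-Ville distribution built from the lag product \(z(n+k)\,\overline{z(n-k)}\)), integer lags only, and a hypothetical helper `wigner_ville_slice`. The input is two complex tones at 32 Hz and 64 Hz.

```python
import numpy as np

def wigner_ville_slice(z, n0, half_width):
    """Discrete Wigner-Ville distribution of complex signal z at index n0.

    Forms z[n0 + k] * conj(z[n0 - k]) for k = -half_width..half_width-1
    and takes a DFT over k. The effective lag is 2k samples, so DFT bin m
    corresponds to frequency m * fs / (2 * n_lags), n_lags = 2 * half_width.
    """
    k = np.arange(-half_width, half_width)
    lag_product = z[n0 + k] * np.conj(z[n0 - k])
    # Reorder so lag k = 0 comes first, as np.fft.fft expects.
    return np.fft.fft(np.fft.ifftshift(lag_product)).real

fs = 256.0
n = np.arange(256)
f1, f2 = 32.0, 64.0
# Two complex tones; a real-valued signal would first be made analytic.
z = np.exp(2j * np.pi * f1 * n / fs) + np.exp(2j * np.pi * f2 * n / fs)

half_width = 64
n_lags = 2 * half_width
freqs = np.arange(n_lags) * fs / (2 * n_lags)    # bin index -> Hz

w_a = wigner_ville_slice(z, n0=128, half_width=half_width)
w_b = wigner_ville_slice(z, n0=132, half_width=half_width)

bin_f1 = int(np.argmin(np.abs(freqs - f1)))
bin_f2 = int(np.argmin(np.abs(freqs - f2)))
bin_mid = int(np.argmin(np.abs(freqs - (f1 + f2) / 2)))
# Genuine components sit at f1 and f2; the value at the midpoint
# frequency is a cross-term, and it changes sign between the two times.
print(w_a[bin_f1], w_a[bin_f2], w_a[bin_mid], w_b[bin_mid])
```

Neither tone has any energy at 48 Hz, yet the distribution shows a feature there that is larger in magnitude than the true components and oscillates in time. That is the spurious structure the bilinear form produces.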

4. Wavelets: Adaptive Resolution

Instead of a fixed window, use dilations of a single function, the mother wavelet, whose scale stretches or compresses to match the frequency of interest. High-frequency events get a narrow window; low-frequency trends get a wide one.

Fourier vs. wavelets: the key analogy

The Fourier basis is built from integer dilations of \(e^{ix}\): \[S_2 = \{e^{ikx},\ k \in \mathbb{Z}\}\] This works on a bounded interval, but on the whole real line \(|e^{ix}| = 1\) everywhere, so it never decays. Therefore \(e^{ix} \notin L^2(\mathbb{R})\), meaning \(\int_{-\infty}^{\infty} |e^{ix}|^2\, dx = \infty\).

Theorem (Fourier series). A well-behaved function \(g\) on \([-\pi, \pi]\) can be expressed as a linear combination of the basis functions in \(S_1 = \{1,\, \cos kx,\, \sin kx;\ k=1,2,\ldots\}\), equivalently \(S_2 = \{e^{ikx};\ k \in \mathbb{Z}\}\). The advantage of \(S_2\) is that it is generated by a single function under integral dilations.

Wavelets build on this idea but fix the decay problem. A function is called square integrable if \[\int_{-\infty}^{\infty} |g(x)|^2\, dx < \infty \quad (\text{written } g \in L^2(\mathbb{R}))\] For the well-behaved functions considered here, \(g \in L^2(\mathbb{R})\) requires \(|g(t)| \to 0\) as \(|t| \to \infty\). This is what sine and cosine fail to satisfy, and what the mother wavelet \(\Omega\) must satisfy.

The mother wavelet

Let \(\Omega \in L^2(\mathbb{R})\) be the mother wavelet. Its entire family of basis elements is generated by two operations:

  • Dilation by integer \(j\): stretches or compresses the wavelet (controls scale/frequency)
  • Translation by integer \(k\): shifts it in time (controls location)

\[\Omega_{j,k}(x) = 2^{-j/2}\, \Omega(2^{-j}x - k)\] The factor \(2^{-j/2}\) normalizes so that \(\|\Omega_{j,k}\| = \|\Omega\|\) for all \(j, k\). Large \(j\) → stretched wavelet → captures low-frequency structure. Small \(j\) → compressed wavelet → captures high-frequency bursts.
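The dilation and translation formula can be checked numerically. The sketch below assumes the Haar wavelet as the mother wavelet (one possible choice, picked here because it is piecewise constant) and approximates integrals by Riemann sums on a dyadic grid; the helper names `haar`, `child`, `inner`, and `support_width` are hypothetical.

```python
import numpy as np

def haar(x):
    """Haar mother wavelet: +1 on [0, 0.5), -1 on [0.5, 1), 0 elsewhere."""
    x = np.asarray(x, dtype=float)
    return np.where((x >= 0) & (x < 0.5), 1.0,
                    np.where((x >= 0.5) & (x < 1.0), -1.0, 0.0))

def child(j, k, x, mother=haar):
    """Omega_{j,k}(x) = 2^(-j/2) * Omega(2^(-j) x - k)."""
    return 2.0 ** (-j / 2) * mother(2.0 ** (-j) * x - k)

dx = 2.0 ** -10
x = np.arange(-8.0, 8.0, dx)         # dyadic grid covering [-8, 8)

def inner(u, v):
    """Riemann-sum approximation of the L2 inner product on the grid."""
    return np.sum(u * v) * dx

def support_width(j, k):
    """Length of the interval on which Omega_{j,k} is nonzero."""
    return np.count_nonzero(child(j, k, x)) * dx

# The 2^(-j/2) factor keeps every child at the same norm as the mother.
norms = {(j, k): np.sqrt(inner(child(j, k, x), child(j, k, x)))
         for j in (-2, 0, 1, 2) for k in (-1, 0, 1)}

# Larger j stretches the wavelet, smaller j compresses it.
widths = {j: support_width(j, 0) for j in (-2, 0, 2)}

# Distinct Haar children are orthogonal.
across_scales = inner(child(0, 0, x), child(1, 0, x))
across_shifts = inner(child(0, 0, x), child(0, 1, x))
print(norms[(2, 1)], widths, across_scales, across_shifts)
```

For the Haar wavelet, the support of \(\Omega_{j,k}\) is the interval \([2^{j}k,\ 2^{j}(k+1))\), so the width doubles with each unit increase in \(j\) while the norm stays fixed.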

The function \(\Omega\) is called the mother wavelet because it gives birth to the entire collection of basis elements through dilation and translation alone. Common families: Haar, Daubechies, Morlet, Mexican hat. Each makes different smoothness/support trade-offs.

Fundamental approximation result

When the family \(\{\Omega_{j,k}\}\) forms an orthogonal basis of \(L^2(\mathbb{R})\), any \(g \in L^2(\mathbb{R})\) admits the representation: \[g(x) = \sum_{j=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} d_{j,k}\, \Omega_{j,k}(x)\] with equality in the mean-square sense (\(L^2\) convergence). For a real-valued \(\Omega\), the wavelet transform coefficients are: \[d_{j,k} = \frac{1}{\|\Omega\|^2} \int_{-\infty}^{\infty} g(t)\, \Omega_{j,k}(t)\, dt\]

This is a projection of \(g\) onto each basis element, exactly analogous to the Fourier coefficients \(\hat{g}(k) = \frac{1}{2\pi}\int_{-\pi}^{\pi} g(t)\, e^{-ikt}\, dt\), but now indexed by both scale \(j\) and location \(k\).

A large \(|d_{j,k}|\) means \(g\) has significant activity at scale \(j\) near time \(2^{j}k\), where \(\Omega_{j,k}\) is located. This gives you a time-scale map of the signal, which is precisely what is needed for TVF data.
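As a concrete check of the projection formula, the sketch below computes \(d_{j,k}\) for a synthetic signal. It assumes the Haar wavelet (which has \(\|\Omega\| = 1\) and generates an orthogonal family), a dyadic grid with Riemann-sum integration, and a test signal built from two known basis elements; the helper names are hypothetical.

```python
import numpy as np

def haar(x):
    """Haar mother wavelet: +1 on [0, 0.5), -1 on [0.5, 1), 0 elsewhere."""
    x = np.asarray(x, dtype=float)
    return np.where((x >= 0) & (x < 0.5), 1.0,
                    np.where((x >= 0.5) & (x < 1.0), -1.0, 0.0))

def child(j, k, x):
    """Omega_{j,k}(x) = 2^(-j/2) * Omega(2^(-j) x - k)."""
    return 2.0 ** (-j / 2) * haar(2.0 ** (-j) * x - k)

dx = 2.0 ** -10
x = np.arange(-8.0, 8.0, dx)

# Test signal: a strong short burst (fine scale j = -2) plus a weaker
# slow component (coarse scale j = 1).
g = 3.0 * child(-2, 5, x) + 0.5 * child(1, 0, x)

def coefficient(g, j, k):
    """d_{j,k} = (1 / ||Omega||^2) * integral of g * Omega_{j,k}.
    The Haar wavelet has unit norm, so no extra division is needed."""
    return np.sum(g * child(j, k, x)) * dx

d = {(j, k): coefficient(g, j, k)
     for j in range(-3, 3) for k in range(-8, 9)}

# Locate the dominant coefficient and translate it back to a time.
(j_max, k_max), d_max = max(d.items(), key=lambda item: abs(item[1]))
location = 2.0 ** j_max * k_max
significant = sorted(key for key, value in d.items() if abs(value) > 1e-9)
print(j_max, k_max, d_max, location, significant)
```

Only the two coefficients used to build the signal are nonzero, and the largest one identifies both the scale of the burst and where it starts on the time axis (\(2^{-2} \cdot 5 = 1.25\)).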

Comparison table

Method       | Time resolution | Frequency resolution | Cross-terms | Handles nonstationary data
Fourier      | None (global)   | High                 | No          | No
STFT         | Fixed window    | Fixed                | No          | Partial
Wigner-Ville | Adaptive        | Adaptive             | Yes         | Yes
Wavelets     | Adaptive        | Adaptive             | No          | Yes

Figures

Figure (two panels: "Stationary signal", constant frequency and constant variance; "Nonstationary (TVF) signal", frequency changes across time). Left: a stationary signal with constant frequency. Right: a TVF signal whose frequency content accelerates; the classical Fourier spectrum conflates both regimes into a single smeared estimate.
Figure (curves labelled j=0, k=0 (mother); j=1 (wider scale); j=−1 (narrower scale)). Schematic of the mother wavelet \(\Omega\) and two children \(\Omega_{j,k}\) obtained by dilation. Larger \(j\) stretches the wavelet (low frequency); smaller \(j\) compresses it (high frequency). Translation index \(k\) moves the wavelet along the time axis.