Signal Processing Foundations — Week 5

Fast Fourier
Transform

The DFT costs O(N²) — too slow for real-time audio, radar, and 5G. Cooley-Tukey's divide-and-conquer trick slashes this to O(N log N), making the modern digital world possible.

O(N log N) Cooley-Tukey Butterfly Diagram Radix-2 DIT Twiddle Factor scipy.fft

Why the Naive DFT is Too Slow

The DFT is mathematically correct — but its brute-force O(N²) cost makes it unusable for real-time processing. This section counts the operations and shows why that number grows catastrophically.

After this section you will be able to
  • Count the exact number of complex multiplications required by the naive DFT for any given N.
  • Explain why O(N²) scaling becomes a hard wall for real-time audio, radar, and communications.
  • Compare DFT and FFT operation counts side-by-side for specific values of N (e.g. N = 8, 1024).

Imagine you need to build a real-time audio equalizer. Your app must analyse a 4096-sample audio frame sixty times per second — before the next frame arrives. You implement the DFT formula directly. The CPU runs for a few seconds… and then crashes. The math is right. So what went wrong?

🎯
Why it matters: Every audio codec, radar processor, and 5G base-station alive today runs on the FFT. Without it, real-time frequency analysis was physically impossible — a one-second audio clip required billions of operations on hardware capable of thousands. Understanding why the naive DFT fails is what makes the FFT's elegance visible.
🔗
Think of it this way

The naive DFT is like matching every student in a class with every student individually to check friendship — that's N×N handshakes. Double the class size and the handshakes quadruple. The FFT finds a shortcut: split the class in half, check each half separately, then combine — cutting work from N² to N log N.

N = 8
64 multiplications — fast enough for simple demos
N = 1 024
≈ 1 million multiplications per transform
N = 65 536
4.3 billion muls — impossible in real-time
FFT N=65 536
≈ 1 million muls — 4000× faster than naive DFT
DFT cost
Complex multiplications for a naive DFT of length N
1M
ops at N=1024
1,048,576 complex multiplications for a 1024-point DFT
~100×
typical speedup
FFT vs naive DFT for N = 1024 on modern hardware
1965
Cooley-Tukey
Year the FFT algorithm was (re)published and changed DSP forever
Problem

The DFT Formula and Its Hidden Cost

For a length-N discrete signal $x[n]$, the DFT computes N frequency bins:

$$X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j\frac{2\pi}{N}kn}, \quad k = 0,1,\ldots,N{-}1$$
$X[k] = \sum_{n=0}^{N-1} x[n] \cdot e^{-j\frac{2\pi}{N}kn}$
$X[k]$
Output
Frequency-domain value at bin k
$x[n]$
Input
Time-domain sample at index n
$e^{-j\frac{2\pi}{N}kn}$
Twiddle
Unit-circle rotation — one complex multiply per term
$N$
Length
Number of input samples (must be ≥1)

Each $X[k]$ requires N complex multiplications. Computing all N outputs costs $N \times N = N^2$ complex multiplications total — the O(N²) bottleneck.

📝 Worked Example — Count operations for N = 8
1
Given: N = 8 (one small audio frame)
2
Multiplications per output bin: Each X[k] sums N = 8 terms, each needing one complex multiply → 8 muls per bin
3
Total bins: k = 0, 1, …, 7 → N = 8 output bins
4
Total complex multiplications: N × N = 8 × 8 = 64
5
Total complex additions: N × (N − 1) = 8 × 7 = 56
N = 8 naive DFT: 64 multiplications + 56 additions = 120 operations total
Quick Check

For N = 16, how many complex multiplications does the naive DFT require?

N × N = 16 × 16 = 256 multiplications (4× more than N=8, because doubling N quadruples the work).

Why Quadratic Growth is a Hard Wall

Doubling N quadruples the work. This is not a constant slowdown — it is an exponentially growing tax:

NDFT muls (N²)FFT muls (~N log₂N / 2)Speedup
86412
644,09619221×
1,0241,048,5765,120205×
65,5364.3 billion1,048,5764,096×

At N = 65,536 (a typical audio FFT frame), the naive DFT requires 4.3 billion complex multiplications — more than a modern CPU core can execute in several seconds. The FFT does it in milliseconds.

📝 Worked Example — Audio codec feasibility check
1
Setup: Audio codec processes N = 4096 samples at 44 frames/sec
2
Naive DFT muls per frame: N² = 4096² = 16,777,216 ≈ 16.8 million
3
Per second: 44 frames × 16.8 M muls = 739 million complex muls/sec
4
Each complex mul = 6 real ops, so real ops/sec ≈ 4.4 billion — beyond most embedded DSPs
5
FFT muls per frame: (N/2) log₂N = 2048 × 12 = 24,576 ≈ 25K → easily real-time
Naive DFT: 739 M muls/sec → impossible. FFT: 1.1 M muls/sec → trivially real-time.
Quick Check

If the codec doubles its frame size to N = 8192, how many times larger does the naive DFT cost become?

Doubling N quadruples cost: 8192² / 4096² = 4×. The naive DFT cost grows by 4×, from 16.8M to 67.1M muls/frame.
Choose output bin k ∈ {0 … N−1} N bins total Multiply N samples x[n] · e^(−j2πkn/N) N muls per bin Sum N products → one X[k] N−1 additions Repeat for all N bins k = 0…N−1 N iterations Total Cost O(N²) N × N muls

DFT operation pipeline: each of N output bins requires N multiplications → N × N = O(N²) total.

⚠️
Common Mistake

"Bigger N just means more data to process — it can't be that much slower."

Linear thinking is wrong here. If N doubles from 1024 to 2048, work does not double — it quadruples (4× more operations). At N = 65,536, the DFT is 4,096 times slower than FFT. This is why the FFT is not an optimisation — it is a categorical change in what is possible.

Solution
Pause & Predict

Before exploring the widget: if N doubles from 64 to 128, does the DFT operation count increase by 2× or 4×? What about the FFT (N log₂N)?

Hint: Write out 64² and 128² side-by-side, then compare 64×6 vs 128×7.

Try It: DFT vs FFT Operation Count

Drag the slider to set N (power of 2). Watch how DFT (O(N²), red) and FFT (O(N log₂N), teal) operation counts diverge.

DFT: N² muls FFT: (N/2) log₂N muls
Live Calculation
N = 64
Implementation
Python · NumPy — Timing Naive DFT vs scipy.fft
import numpy as np import time def naive_dft(x): # Builds full N×N twiddle matrix — O(N²) memory and ops N = len(x) n = np.arange(N) k = n.reshape((N, 1)) W = np.exp(-2j * np.pi * k * n / N) return W @ x rng = np.random.default_rng(42) for N in [64, 256, 1024, 4096]: x = rng.standard_normal(N) + 1j * rng.standard_normal(N) t0 = time.perf_counter() for _ in range(20): naive_dft(x) dft_ms = (time.perf_counter() - t0) / 20 * 1000 t0 = time.perf_counter() from scipy import fft for _ in range(200): fft.fft(x) fft_ms = (time.perf_counter() - t0) / 200 * 1000 print(f"N={N:5d} | DFT {dft_ms:8.3f} ms | FFT {fft_ms:8.3f} ms | speedup {dft_ms/fft_ms:7.1f}×")
Output
N= 64 | DFT 0.041 ms | FFT 0.004 ms | speedup 10.2× N= 256 | DFT 0.520 ms | FFT 0.006 ms | speedup 86.7× N= 1024 | DFT 7.800 ms | FFT 0.009 ms | speedup 866.7× N= 4096 | DFT 127.400 ms | FFT 0.015 ms | speedup 8493.3×
Key Takeaway
The naive DFT's O(N²) cost makes it 8,000× slower than scipy.fft at N=4096 — not because it is wrong, but because it computes N outputs by repeating N multiplications N times with no reuse.
📡
Real-World Application

Automotive Radar — From Impractical to Real-Time

Modern FMCW radar processes 256-point chirp signals at 10 ms intervals. Naive DFT at N = 256 costs 65,536 muls per sweep. With 100+ sweeps per frame at 100 frames/sec: over 655 million muls/sec — far beyond an embedded DSP at 200 MHz. The FFT reduces this to 256 × 8 = 2,048 muls per sweep, well within embedded range. The FFT is why your car's collision-avoidance radar works in real time.

CheckpointO(N²) Bottleneck — Quick Retrieval

Q1 A naive DFT of length N = 512 requires how many complex multiplications?

N² = 512² = 262,144 complex multiplications.

Q2 If you double N from 512 to 1024, by what factor does the naive DFT cost increase?

4× (factor of 4) — because (2N)² = 4N². Doubling N always quadruples naive DFT work.

Q3 Which line in the DFT formula causes the O(N²) cost — the summation index n, the output index k, or both?

Both: the outer loop over k (N iterations) multiplied by the inner summation over n (N terms per k) → N × N = N² operations.

Cooley-Tukey & the Butterfly

By splitting one N-point DFT into two N/2-point DFTs and recombining with twiddle factors, the algorithm recursively reuses computations — converting O(N²) into O(N log N).

After this section you will be able to
  • Derive the DIT butterfly equation by splitting a DFT sum into even- and odd-indexed halves.
  • Calculate the twiddle factor $W_N^k = e^{-j2\pi k/N}$ for specific values of k and N by hand.
  • Trace the flow of a 4-point FFT through a butterfly diagram, labelling each twiddle multiplication.

In 1965, James Cooley and John Tukey published a paper that Carl Friedrich Gauss had essentially written 160 years earlier (without publishing it). The insight is deceptively simple: the even-indexed and odd-indexed samples of any signal transform completely independently — and their sub-transforms can be recombined in O(N) time. One observation; a million-fold speedup.

✂️
The core idea: Divide the problem in half, solve each half recursively, and merge. Applied log₂N times, this converts N² work into N log N. The "butterfly" is the elementary merge operation — two inputs, one twiddle multiply, two outputs.
🔗
Think of it this way

The FFT is like merge-sort for frequency analysis. Just as merge-sort splits an array in half, sorts each half, then merges — the FFT splits the signal into even/odd halves, transforms each half, then "butterflies" them together. Both algorithms run in O(N log N) for the same reason.

$W_N^k$
Twiddle Factor
Unit-circle rotation by k/N turns; the "glue" between sub-transforms
$W_8^1 = e^{-j\pi/4}$
$X_e[k]$
Even DFT
N/2-point DFT of the even-indexed samples x[0], x[2], …
length N/2
$X_o[k]$
Odd DFT
N/2-point DFT of the odd-indexed samples x[1], x[3], …
length N/2
log₂N
Stages
Number of recursive splits before reaching trivial 1-point DFTs
N=1024 → 10 stages
(N/2)log₂N
FFT muls
Exact complex multiplication count for radix-2 Cooley-Tukey
N log₂N
butterfly ops
Total 2-input butterfly operations across all log₂N stages
|1|
twiddle magnitude
Every twiddle factor lies on the unit circle — no amplitude change, only rotation
Problem

Step 1: Split Even and Odd Indices

Rewrite the DFT sum by separating even-indexed ($n = 2r$) and odd-indexed ($n = 2r+1$) samples:

$$X[k] = \underbrace{\sum_{r=0}^{N/2-1} x[2r]\,e^{-j\frac{2\pi}{N/2}kr}}_{X_e[k]} + \; e^{-j\frac{2\pi}{N}k}\underbrace{\sum_{r=0}^{N/2-1} x[2r+1]\,e^{-j\frac{2\pi}{N/2}kr}}_{X_o[k]}$$
$X[k] = X_e[k] + W_N^k \cdot X_o[k]$
$X_e[k]$
Even DFT
N/2-point DFT of even samples
$W_N^k$
Twiddle
$e^{-j2\pi k/N}$ — one complex mul
$X_o[k]$
Odd DFT
N/2-point DFT of odd samples

This is the butterfly equation. Each stage costs O(N) to combine two N/2-point DFTs — and there are log₂N stages, giving O(N log N) total.

📝 Worked Example — 4-point FFT split (N = 4)
1
Input: x = [x[0], x[1], x[2], x[3]], N = 4
Even: x_e = [x[0], x[2]];  Odd: x_o = [x[1], x[3]]
2
2-point DFTs (N/2 = 2):
$X_e[0] = x[0] + x[2]$,   $X_e[1] = x[0] - x[2]$
$X_o[0] = x[1] + x[3]$,   $X_o[1] = x[1] - x[3]$
3
Twiddle factors $W_4^k = e^{-j\pi k/2}$:
$W_4^0 = e^0 = 1$,   $W_4^1 = e^{-j\pi/2} = -j$
4
Butterfly recombination:
$X[0] = X_e[0] + W_4^0 X_o[0] = (x[0]+x[2]) + (x[1]+x[3])$
$X[1] = X_e[1] + W_4^1 X_o[1] = (x[0]-x[2]) - j(x[1]-x[3])$
$X[2] = X_e[0] - W_4^0 X_o[0] = (x[0]+x[2]) - (x[1]+x[3])$
$X[3] = X_e[1] - W_4^1 X_o[1] = (x[0]-x[2]) + j(x[1]-x[3])$
Naive: 16 muls. After 1 butterfly split: 2×(2²) + 4 = 12 muls. Savings: 25% — and each sub-problem splits again!
Quick Check

What is the twiddle factor $W_8^2$? Express as a+jb (exact values only).

$W_8^2 = e^{-j2\pi\cdot2/8} = e^{-j\pi/2} = \cos(-\pi/2) + j\sin(-\pi/2) = 0 - j = -j$

Step 2: Twiddle Factor Symmetry

The key trick that enables the upper half of X[k] to be computed for free:

$$W_N^{k + N/2} = e^{-j\frac{2\pi}{N}(k+N/2)} = e^{-j\frac{2\pi k}{N}} \cdot e^{-j\pi} = -W_N^k$$
$W_N^{k+N/2} = -W_N^k$
$W_N^k$
Lower half
Twiddle for bin k (already computed)
$-$
Sign flip
Upper half twiddle is just negated — zero extra cost
$N/2$
Half-period
Phase periodicity gives the symmetry

This means the complete butterfly only needs one twiddle multiply per pair, not two. Combined with the even/odd split, each stage costs exactly N/2 multiplications — and log₂N stages gives (N/2)log₂N total muls.

📝 Worked Example — Twiddle symmetry for N = 8
1
N = 8, k = 1:
$W_8^1 = e^{-j2\pi/8} = e^{-j\pi/4} = \frac{\sqrt{2}}{2} - j\frac{\sqrt{2}}{2} \approx 0.707 - 0.707j$
2
Upper half twiddle k = 1 + N/2 = 5:
$W_8^5 = -W_8^1 = -(0.707 - 0.707j) = -0.707 + 0.707j$
3
Butterfly equations for the pair (k=1, k=5):
$X[1] = X_e[1] + W_8^1 \cdot X_o[1]$
$X[5] = X_e[1] - W_8^1 \cdot X_o[1]$   ← same multiply, flipped sign!
One complex mul ($W_8^1 \cdot X_o[1]$) produces both X[1] and X[5]. The symmetry halves the twiddle cost.
Quick Check

For N = 8, k = 3: compute $W_8^3$ and $W_8^7$ (show that they are negatives of each other).

$W_8^3 = e^{-j3\pi/4} = -\tfrac{\sqrt2}{2} - j\tfrac{\sqrt2}{2}$.   $W_8^7 = e^{-j7\pi/4} = \tfrac{\sqrt2}{2} + j\tfrac{\sqrt2}{2} = -W_8^3$. ✓
RADIX-2 DIT BUTTERFLY (N = 4) INPUT STAGE 1 (N/2 DFTs) STAGE 2 (combine) x[0] x[2] x[1] x[3] Xₑ[0] x[0]+x[2] Xₑ[1] x[0]−x[2] X_o[0] x[1]+x[3] X_o[1] x[1]−x[3] X[0] +W⁰ X[1] +W¹ X[2] −W⁰ X[3] −W¹ W⁴₁=−j W⁴₀=1 + (add) − (subtract) twiddle W

Radix-2 DIT butterfly for N=4. Stage 1 computes two 2-point DFTs (even/odd split). Stage 2 combines with twiddle factors. Solid purple = add; dashed red = subtract.

💡
Key Insight

Why "butterfly"? Draw the stage-2 connections on paper: two nodes on the left fan out to two nodes on the right, and the crossing wires form a shape that looks like butterfly wings. Each butterfly operation takes two complex inputs and produces two complex outputs using one twiddle multiply — the most efficient possible merge.

Solution
Pause & Predict

For N = 8, how many butterfly stages are needed? How many twiddle multiplications per stage? What is the total FFT multiplication count?

Hint: stages = log₂(8), twiddle muls per stage = N/2, total = (N/2) × stages.

Try It: Butterfly Stage Tracer

Set N and a stage index. The widget draws the butterfly connections for that stage and shows twiddle factor values.

FFT Operation Count for current N
N = 8 → 3 stages
Implementation
Python · NumPy — Recursive Cooley-Tukey FFT
import numpy as np def fft_recursive(x): # Cooley-Tukey radix-2 DIT FFT — N must be a power of 2 N = len(x) if N <= 1: return x # 1-point DFT: trivially the value itself # ── Split even/odd indices ────────────────────────────────── X_even = fft_recursive(x[::2]) # x[0], x[2], x[4], … X_odd = fft_recursive(x[1::2]) # x[1], x[3], x[5], … # ── Compute twiddle factors W_N^k for k = 0 … N/2−1 ──────── k = np.arange(N // 2) W = np.exp(-2j * np.pi * k / N) # shape (N/2,) # ── Butterfly merge: X[k] and X[k + N/2] ────────────────── twiddle_odd = W * X_odd return np.concatenate([X_even + twiddle_odd, X_even - twiddle_odd]) # ── Verify against scipy.fft ────────────────────────────────── from scipy import fft x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float) our = fft_recursive(x) ref = fft.fft(x) print("Max difference:", np.max(np.abs(our - ref))) # ≈ 1e-14
Output
Max difference: 1.3322676295501878e-14
Key Takeaway
The butterfly equation $X[k] = X_e[k] + W_N^k X_o[k]$ is the FFT's "atom" — one twiddle multiply plus two additions that, applied recursively across log₂N stages, reduces total work from N² to (N/2)log₂N multiplications.
📶
Real-World Application

5G/LTE OFDM — The FFT Powering Every Smartphone

OFDM (Orthogonal Frequency Division Multiplexing) — the modulation scheme in every 4G/5G base station — is literally a bank of inverse FFTs. A 5G base station may compute 4096-point FFTs at millions of frames per second across hundreds of subcarriers simultaneously. The radix-2 butterfly structure enables this at silicon-efficient O(N log N) cost. Without Cooley-Tukey, high-speed wireless communications as we know them would be impossible.

CheckpointButterfly Algorithm — Quick Retrieval

Q1 For N = 8, how many butterfly stages does the radix-2 FFT have, and how many complex multiplications per stage?

3 stages (log₂8 = 3), with N/2 = 4 twiddle multiplications per stage → 12 total complex multiplications.

Q2 What is the exact value of $W_4^1$ (twiddle factor for N=4, k=1)?

$W_4^1 = e^{-j2\pi/4} = e^{-j\pi/2} = \cos(-90°) + j\sin(-90°) = 0 - j = -j$

Q3 Why does $W_N^{k+N/2} = -W_N^k$? Express the proof in one sentence.

$W_N^{k+N/2} = e^{-j2\pi(k+N/2)/N} = e^{-j2\pi k/N} \cdot e^{-j\pi} = W_N^k \cdot (-1) = -W_N^k$, because $e^{-j\pi} = -1$.

Real-Time Spectrum Analysis with scipy.fft

Theory meets practice: compute, interpret, and display a frequency spectrum from a real audio or synthesized signal using scipy.fft. Master zero-padding, the fftfreq axis, and the magnitude spectrum.

After this section you will be able to
  • Use scipy.fft.fft and scipy.fft.fftfreq to compute and label the frequency axis of a signal's spectrum.
  • Explain why only the first N/2 frequency bins are meaningful for a real-valued signal (negative-frequency symmetry).
  • Apply zero-padding to increase frequency-axis resolution without changing the signal's actual spectral content.

You now know that the FFT is fast. But how do you actually use it? What does the output array mean? Why does the second half of the spectrum look like a mirror of the first? And why does zero-padding seem to "add detail" to a spectrum that wasn't there before?

🎛️
Why it matters: scipy.fft.fft is the Swiss-army knife of signal analysis — used in audio equalizers, spectrum analysers, vibration monitoring, and neural signal processing. Reading its output correctly is a foundational skill for any DSP practitioner.
🔗
Think of it this way

The FFT output is like a piano keyboard laid flat. The first bin (k=0) is DC (0 Hz). Each subsequent bin is one step higher in frequency. The second half mirrors the first because a real note played on a physical keyboard sounds the same played forwards or backwards — real signals have symmetric (conjugate) spectra.

Raw Signal x[n], length N float64 array scipy.fft.fft(x) X[k], complex128 length N fftfreq(N, 1/fs) freq axis (Hz) 0 … +fs/2 … −fs/2 np.abs(X[:N//2]) magnitude spectrum only positive freqs Spectrum Plot freq (Hz) vs magnitude peaks = signal freqs
N/2
useful bins
Real signals have conjugate-symmetric spectra — only the first N/2 bins are unique
f_s/N
Hz per bin
Frequency resolution — halving N doubles bin width and reduces resolution
f_s/2
Nyquist limit
Highest representable frequency — bins above this wrap to negative frequencies
Problem

The Frequency Axis: fftfreq

After calling fft(x), bin index k does not directly equal frequency. You need fftfreq to convert:

$$f_k = \frac{k \cdot f_s}{N}, \quad k = 0, 1, \ldots, \frac{N}{2}-1$$
$f_k = k \cdot \dfrac{f_s}{N}$
$f_k$
Frequency
Physical frequency (Hz) at bin k
$k$
Bin Index
Integer index 0 to N−1
$f_s$
Sample Rate
Samples per second (Hz)
$N$
Frame Length
Total samples in the FFT window

The frequency resolution is $\Delta f = f_s/N$ Hz per bin. To detect a 1 Hz difference, you need at least 1 second of signal ($N = f_s$).

📝 Worked Example — Frequency resolution
1
Setup: $f_s = 8000$ Hz, $N = 512$ samples
2
Frequency resolution: $\Delta f = f_s/N = 8000/512 = \mathbf{15.625}$ Hz per bin
3
Nyquist frequency: $f_s/2 = 4000$ Hz — highest detectable frequency
4
Bin containing 440 Hz (concert A):
$k = f/\Delta f = 440/15.625 \approx 28.16 \rightarrow$ bin 28 (nearest integer)
$f_s=8000$, $N=512$: resolution = 15.625 Hz/bin. Concert A (440 Hz) lands at bin k=28.
Quick Check

If you double N to 1024 (same $f_s = 8000$ Hz), what is the new frequency resolution? What bin does 440 Hz land in?

$\Delta f = 8000/1024 = 7.8125$ Hz/bin. Bin for 440 Hz: $k = 440/7.8125 \approx 56.3$ → bin 56. Doubling N halves the bin width, giving twice the frequency resolution.

Zero-Padding: Free Resolution (or Is It?)

Appending N zeros to a length-N signal before calling fft() doubles the number of output bins without changing the spectral content:

$$X_{\text{zp}}[k] = \text{fft}(\underbrace{[x[0],\ldots,x[N{-}1],0,\ldots,0]}_{2N \text{ samples}})$$

Zero-padding interpolates the spectrum — it adds points between existing bins, making peaks look sharper. It does not add new spectral information (that would require a longer signal).

💡

Use scipy.fft.fft(x, n=2*N) to zero-pad to length 2N in one call. Typical practice: pad to the next power of 2 for maximum FFT efficiency.

📝 Worked Example — Zero-padding effect
1
Original: N=8 samples, $f_s=8$ Hz → $\Delta f = 1$ Hz/bin, bins at 0,1,2,3,4 Hz
2
After zero-padding to 2N=16: $\Delta f = 8/16 = 0.5$ Hz/bin — twice as many bins
3
Spectral content unchanged: a 2 Hz tone still appears at 2 Hz; zero-padding only adds interpolated values between existing bins
4
What zero-padding cannot do: resolve two tones at 2.0 Hz and 2.3 Hz — only a longer signal can do that
Zero-padding: interpolation (cosmetic resolution). Longer signal: true spectral resolution (new information).
Quick Check

A signal has N = 256 samples at $f_s = 44100$ Hz. What is the frequency resolution? What happens to this resolution if you zero-pad to N = 512?

Original: $\Delta f = 44100/256 \approx 172.3$ Hz/bin. After zero-padding to 512: $\Delta f = 44100/512 \approx 86.1$ Hz/bin — the bins are halved in width, but no new frequencies are resolved. The spectral envelope is interpolated, not sharpened.
BEFORE: N = 8 bins, Δf = 1 Hz 0 1 2 3 frequency (Hz) zero-pad to 2N AFTER: 2N = 16 bins, Δf = 0.5 Hz 0 1 2 3 frequency (Hz) same envelope — 2× more bins

Zero-padding doubles the bin density (interpolation) without adding new frequency information. The spectral shape is preserved; peaks appear smoother.

⚠️
Common Mistake

"Zero-padding gives better frequency resolution."

It gives better display resolution (interpolation), not better spectral resolution (ability to distinguish two nearby tones). To actually resolve two frequencies separated by $\Delta f$ Hz, you need a signal of length at least $1/\Delta f$ seconds — no amount of zero-padding can substitute for more data.

Solution
Pause & Predict

A signal has $f_s = 1000$ Hz and N = 100 samples, containing tones at 50 Hz and 200 Hz. Before running the widget: which bins do you expect to see peaks at? How many useful bins total?

Hint: bin k ≈ f / (fs/N). Useful bins = N/2 = 50.

Try It: Live Spectrum Analyser

Adjust the signal's frequency components and FFT settings. Watch how the spectrum responds and how zero-padding changes display resolution.

2⁹
Magnitude spectrum (one-sided) Peak bins
Frequency Resolution
Δf = fs / N
Implementation
Python · scipy.fft — One-Sided Magnitude Spectrum
import numpy as np from scipy import fft import matplotlib.pyplot as plt # ── 1. Synthesise signal: two tones at 440 Hz and 880 Hz ────── fs = 8000 # sample rate (Hz) N = 512 # frame length (samples) t = np.arange(N) / fs signal = (np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)) # ── 2. Compute FFT (optional: zero-pad to 2N for denser bins) ─ N_fft = 2 * N # zero-pad to 2N X = fft.fft(signal, n=N_fft) # complex spectrum freqs = fft.fftfreq(N_fft, d=1/fs) # frequency axis (Hz) # ── 3. Keep only positive frequencies (one-sided spectrum) ──── pos = freqs > 0 f_pos = freqs[pos] mag = np.abs(X[pos]) * 2 / N # scale for amplitude # ── 4. Plot ────────────────────────────────────────────────── plt.figure(figsize=(10, 4)) plt.plot(f_pos, mag, color='#0d9488', linewidth=1.5) plt.xlabel('Frequency (Hz)') plt.ylabel('Magnitude') plt.title('One-Sided Magnitude Spectrum (zero-padded)') plt.xlim(0, 2000) plt.grid(True, alpha=0.3) plt.tight_layout() plt.show()
Output
Key Takeaway
The complete scipy.fft pipeline is: fft(x) → complex spectrum X[k] → fftfreq(N, 1/fs) for the Hz axis → np.abs(X[:N//2]) for the one-sided magnitude spectrum — where peaks in the magnitude correspond directly to sinusoidal frequencies in the original signal.
🎵
Real-World Application

Audio Fingerprinting — How Shazam Identifies Music in Seconds

Shazam applies the FFT to overlapping 3-second audio windows from a song, extracts peaks in the magnitude spectrum across multiple frequency bands, and stores them as a "constellation map" fingerprint. When you hum or record a clip, the same FFT + peak-detection pipeline runs in real time. The fingerprint is then matched against a database of 100+ million songs in milliseconds. The entire pipeline — recording, FFT, peak detection, hash lookup — runs on a mobile CPU in under two seconds, made possible by the O(N log N) FFT.

Checkpointscipy.fft API — Quick Retrieval

Q1 For $f_s = 16000$ Hz and $N = 1024$, what is the frequency resolution (Hz per bin) and at which bin index does a 2000 Hz tone appear?

$\Delta f = 16000/1024 \approx 15.625$ Hz/bin. Bin for 2000 Hz: $k = 2000/15.625 = 128$.

Q2 Why do we keep only the first N/2 output bins of the FFT for a real-valued signal?

Real-valued signals have conjugate-symmetric spectra: X[N−k] = X[k]*. The second half (bins N/2+1 to N−1) is a mirror of the first half and carries no new information.

Q3 You zero-pad from N=256 to N=1024. What changes and what does not change?

Changes: bin width halves ($\Delta f$ decreases by 4×), the spectrum looks smoother and peaks appear sharper. Does not change: the actual frequency resolution (ability to distinguish close tones), the signal's energy, or its spectral content.

FFT Explorer Dashboard

An integrated dashboard combining all three topics: compare DFT vs FFT complexity, trace a butterfly computation, and analyse a multi-tone signal spectrum — all live.

See how O(N²) and O(N log N) diverge as N increases. The teal line stays nearly flat while red explodes.

Naive DFT: N² muls Radix-2 FFT: (N/2)log₂N muls

Key Takeaways

Five ideas from this week that every signal processing practitioner must carry.

DFT is O(N²) — Catastrophically Slow

The naive DFT requires N² complex multiplications. At N = 4096, that is 16.7 million muls per frame — real-time audio processing is impossible. Doubling N quadruples cost.

✂️

Cooley-Tukey: Divide-and-Conquer

Split the signal into even- and odd-indexed halves. Compute two N/2-point DFTs independently. Recombine with twiddle factors in O(N). Apply recursively log₂N times → O(N log N).

🦋

The Butterfly: One Mul, Two Outputs

Each butterfly takes inputs (a, b) and a twiddle W, and produces (a + Wb, a − Wb) — two outputs for the price of one complex multiplication. Twiddle symmetry W^(k+N/2) = −W^k means the upper half is free.

📊

scipy.fft: fft → fftfreq → abs

The canonical pipeline: fft(x) → complex spectrum → fftfreq(N, 1/fs) → Hz axis → abs(X[:N//2]) → one-sided magnitude. Frequency resolution = f_s / N Hz per bin.

0️⃣

Zero-Padding: Interpolation, Not Resolution

Zero-padding increases bin density (smoother display) but does not improve the ability to resolve closely-spaced tones. Only a longer signal (more time data) gives true spectral resolution.

🔭

Coming Up — Week 6: Spectral Analysis & Denoising

Now that you have the FFT, learn to use it for practical signal analysis: window functions to reduce spectral leakage, Welch's method for averaged power spectra, and FFT-based denoising pipelines for audio and ECG signals.

Go Deeper

Curated resources for building a richer understanding of the FFT — from intuition to implementation.

Week 5 Exercises

8 exercises (3 topics × 2 + 2 synthesis). Expand the hint if you are stuck — only after attempting the problem first.

1 Theory · O(N²) Bottleneck Easy

Counting DFT Operations

A radar system processes chirp signals of length N = 128. (a) How many complex multiplications does the naive DFT require per transform? (b) How many does the radix-2 FFT require? (c) By what factor is the FFT faster?

DFT: N² muls. FFT: (N/2)log₂N muls. For (c) compute ratio and round to nearest integer.
2 Code · O(N²) Bottleneck Easy

Timing Benchmark

Implement the naive_dft function using NumPy matrix multiplication (the N×N twiddle matrix approach). Use time.perf_counter to measure the average execution time over 10 repeats for N ∈ {32, 64, 128, 256}. Plot DFT time vs N² on a log-log axis and verify the slope is approximately 2.

Build the twiddle matrix W with shape (N, N): W = np.exp(-2j*np.pi*k*n/N) where k = n.reshape((N,1)). Then X = W @ x. For the log-log plot use plt.loglog.
3 Theory · Cooley-Tukey & Butterfly Medium

4-Point FFT by Hand

Given x = [1, 2, 3, 4], compute X[0], X[1], X[2], X[3] using the radix-2 DIT butterfly exactly as derived in the slides: (1) split even/odd, (2) compute two 2-point DFTs, (3) apply the butterfly recombination with the appropriate $W_4^k$ twiddle factors. Verify by computing the DFT directly from the sum formula.

Even: x_e = [1,3], Odd: x_o = [2,4]. 2-point DFTs: X_e = [4, -2], X_o = [6, -2]. Twiddle: W₄⁰=1, W₄¹=−j. Then X[0]=X_e[0]+W₄⁰·X_o[0], X[1]=X_e[1]+W₄¹·X_o[1], X[2]=X_e[0]−W₄⁰·X_o[0], X[3]=X_e[1]−W₄¹·X_o[1].
4 Code · Cooley-Tukey & Butterfly Medium

Implement Iterative FFT

Implement an iterative (non-recursive) radix-2 Cooley-Tukey FFT using bit-reversal permutation and in-place butterfly updates. Your function signature: fft_iterative(x). Test it on random complex arrays of length 8, 16, 64, and 256, verifying np.allclose(fft_iterative(x), scipy.fft.fft(x)) for each.

Step 1: bit-reverse the input order. Step 2: iterate stages s = 1, 2, …, log₂N. For each stage, butterfly stride = 2^s, twiddle = exp(−2πj·k/(2^s)) for k=0…stride/2−1. Update pairs (a, b) → (a+W·b, a−W·b) in place.
5 Theory · scipy.fft in Practice Easy

Frequency Axis Calculation

A physiological sensor records a signal at $f_s = 250$ Hz with N = 500 samples. (a) What is the frequency resolution (Hz per bin)? (b) What bin index corresponds to 50 Hz (power-line interference)? (c) After zero-padding to 2000 samples, what is the new bin width? (d) Does zero-padding help resolve two tones at 49.8 Hz and 50.2 Hz? Explain why or why not.

Δf = fs/N. Bin k = f/Δf. True resolution requires a longer signal: to resolve 0.4 Hz separation you need at least T = 1/0.4 = 2.5 seconds = 625 samples at 250 Hz.
6 Code · scipy.fft in Practice Medium

Multi-Tone Spectrum Detector

Synthesise a signal at $f_s = 8000$ Hz containing three tones: 330 Hz (E4), 440 Hz (A4), and 550 Hz (C#5), each for 0.5 seconds. Use scipy.fft.fft to compute the one-sided magnitude spectrum. Find the three dominant frequency bins using scipy.signal.find_peaks with a minimum distance of 50 bins and print the detected frequencies. Plot the magnitude spectrum from 0–1000 Hz.

N = int(0.5 * 8000) = 4000 samples. After fft, keep first N//2 bins. Use fftfreq(N, 1/fs)[:N//2] for the axis. find_peaks(magnitude, distance=50) returns peak indices; convert to Hz by freqs[peak_idx].
7 Synthesis · Theory: FFT Complexity Derivation Hard

Prove the O(N log N) Operation Count

Let $C(N)$ be the number of complex multiplications for an N-point radix-2 FFT. (a) Write the recurrence relation: splitting into two N/2-point FFTs plus N/2 butterfly muls. (b) Solve the recurrence to show $C(N) = (N/2)\log_2 N$. (c) For N = 1024, compute both $N^2$ and $(N/2)\log_2 N$ and express the speedup ratio as an integer. (d) Explain why the FFT speedup grows with N and what this means architecturally for choosing hardware.

Recurrence: C(N) = 2·C(N/2) + N/2. Unroll: after k levels, C(N) = N/2·k + 2^k·C(N/2^k). Base case C(1)=0. After k=log₂N levels: C(N) = (N/2)·log₂N.
8 Synthesis · Code: Spectrum Analyser Pipeline Hard

Build a Real-Time Spectrum Analyser

Build a complete spectrum analyser function spectrum_analyser(fs, duration=1.0, n_fft=2048) that: (1) synthesises a "mystery" signal combining a 261 Hz tone (C4), a 329 Hz tone (E4), and a 392 Hz tone (G4) — forming a C major chord; (2) applies a Hann window (scipy.signal.windows.hann) to reduce spectral leakage; (3) computes the FFT and one-sided magnitude spectrum; (4) detects peaks and prints the identified chord tones; (5) plots the magnitude spectrum from 100–600 Hz with peak markers. Compare results with and without the window function — which approach gives sharper peaks?

Apply window before FFT: x_windowed = x * hann(len(x)). Scale the magnitude by 2/sum(window) to compensate for the amplitude reduction. Use find_peaks with height=0.1 (relative threshold) to detect the three chord tones.