Feature Detection

Module 5 - Corners, Harris, SIFT, ORB, Matching

What are Image Features?

A feature is a compact, meaningful piece of information extracted from an image. Good features are repeatable (found at the same location despite viewpoint/lighting changes), distinctive (distinguishable from each other), and efficient (fast to compute).

Features enable: image stitching (panoramas), 3D reconstruction, object recognition, AR tracking, and more.

Corners vs Edges vs Flat Regions

Consider a small window over the image and how its contents change when the window is shifted slightly:

Flat region

The patch looks the same under a shift in any direction. It carries no information, so it can't be localized.

Edge

The patch changes when shifted perpendicular to the edge but not along it, so its position along the edge is ambiguous.

Corner

Looks different in all directions. Uniquely localizable. Best feature!
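The three cases can be checked numerically. The sketch below is only an illustration, assuming NumPy and three made-up 20×20 test images; `shift_ssd` is a hypothetical helper that sums the squared differences between a patch and a shifted copy of it.

```python
import numpy as np

def shift_ssd(image, center, shift, half=2):
    """Sum of squared differences between a patch and the same patch shifted by (dy, dx)."""
    y, x = center
    dy, dx = shift
    patch = image[y - half:y + half + 1, x - half:x + half + 1]
    moved = image[y + dy - half:y + dy + half + 1, x + dx - half:x + dx + half + 1]
    return float(np.sum((patch - moved) ** 2))

# Hypothetical synthetic images (values 0.0 or 1.0), not taken from the lesson.
flat = np.zeros((20, 20))
edge = np.zeros((20, 20))
edge[:, 10:] = 1.0            # vertical edge at column 10
corner = np.zeros((20, 20))
corner[10:, 10:] = 1.0        # bright quadrant with its corner at (10, 10)

shifts = {"right": (0, 1), "down": (1, 0)}
for name, img in [("flat", flat), ("edge", edge), ("corner", corner)]:
    print(name, {k: shift_ssd(img, (10, 10), s) for k, s in shifts.items()})
```

The flat patch gives zero for both shifts, the edge patch changes only for the horizontal shift (across the edge), and the corner patch changes for both.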

Harris Corner Detector

The Harris corner detector (Harris and Stephens, 1988) measures how much the intensity changes when a small window is shifted in any direction, using the structure tensor M:

M = Σ w(x,y) [ Iₓ²    IₓIᵧ ]
             [ IₓIᵧ   Iᵧ²  ]

Symbol guide

M	structure tensor (2×2 matrix) - summarizes how intensity changes in all directions around a pixel
Σ w(x,y)	weighted sum over a local neighborhood window; w is a Gaussian that gives more weight to the center
Iₓ	horizontal image gradient - how much intensity changes left/right at each pixel
Iᵧ	vertical image gradient - how much intensity changes up/down at each pixel
Iₓ²	squared horizontal gradient - strong in regions with vertical edges
Iᵧ²	squared vertical gradient - strong in regions with horizontal edges
IₓIᵧ	cross term - nonzero where the horizontal and vertical gradients are both present and correlated (e.g. along a diagonal edge); on its own it does not indicate a corner
[ … ]	2×2 matrix notation; the off-diagonal IₓIᵧ terms capture edge orientation correlation
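As a rough sketch of how M could be computed, the code below assumes NumPy, central-difference gradients from `np.gradient`, and a separable Gaussian window. The helper names (`smooth`, `structure_tensor`, `tensor_at`) and the synthetic test images are hypothetical, not part of the lesson.

```python
import numpy as np

def smooth(img, sigma):
    """Separable Gaussian blur (the window w), with edge padding."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    k /= k.sum()
    padded = np.pad(img, radius, mode="edge")
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)

def structure_tensor(img, sigma=1.0):
    """Return the windowed sums of Ix^2, Ix*Iy and Iy^2 for every pixel."""
    Iy, Ix = np.gradient(img.astype(float))   # np.gradient returns d/d(row), then d/d(col)
    return smooth(Ix * Ix, sigma), smooth(Ix * Iy, sigma), smooth(Iy * Iy, sigma)

def tensor_at(img, y, x, sigma=1.0):
    """Assemble the 2x2 matrix M at one pixel."""
    Sxx, Sxy, Syy = structure_tensor(img, sigma)
    return np.array([[Sxx[y, x], Sxy[y, x]],
                     [Sxy[y, x], Syy[y, x]]])

# Hypothetical test images: a vertical edge and a corner.
edge = np.zeros((20, 20))
edge[:, 10:] = 1.0
corner = np.zeros((20, 20))
corner[10:, 10:] = 1.0

M_edge = tensor_at(edge, 10, 10)
M_corner = tensor_at(corner, 10, 10)
print("edge M:\n", M_edge)
print("corner M:\n", M_corner)
```

On the vertical edge only the Iₓ² entry is nonzero, while at the corner both diagonal entries are nonzero and both eigenvalues of M are positive.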

The Harris response R is:

R = det(M) - k · trace(M)² = λ₁λ₂ - k(λ₁+λ₂)²

Symbol guide

R	Harris response score - positive = corner, negative = edge, near zero = flat region
det(M)	determinant of M = λ₁ × λ₂ - large when intensity changes strongly in both directions (corner)
trace(M)	trace of M = λ₁ + λ₂ - sum of eigenvalues, measuring total gradient energy in the window
k	sensitivity constant, typically 0.04–0.06 - trades off corner vs edge detection; higher k = fewer corners
λ₁, λ₂	eigenvalues of M - represent the dominant gradient strengths in two orthogonal directions
λ₁λ₂	product of eigenvalues - large only if both directions have strong gradients (true corner)
(λ₁+λ₂)²	squared sum - penalizes edges, where one eigenvalue dominates and the product λ₁λ₂ is small relative to the squared sum

R ≫ 0: Corner (both eigenvalues large)
R ≈ 0: Flat region (both eigenvalues small)
R ≪ 0: Edge (one eigenvalue much larger)
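A short worked example makes the three cases concrete. The eigenvalue pairs below are made-up illustrative numbers, and `harris_response` is a hypothetical helper that evaluates the formula above with k = 0.04.

```python
def harris_response(lam1, lam2, k=0.04):
    """R = λ1·λ2 - k·(λ1 + λ2)², written in terms of the eigenvalues of M."""
    return lam1 * lam2 - k * (lam1 + lam2) ** 2

# Illustrative eigenvalue pairs (assumed values, not measured from an image).
cases = {
    "corner": (10.0, 8.0),   # both eigenvalues large
    "edge": (10.0, 0.1),     # one eigenvalue dominates
    "flat": (0.1, 0.1),      # both eigenvalues small
}
for name, (l1, l2) in cases.items():
    print(name, round(harris_response(l1, l2), 4))
```

With these numbers the corner gives R ≈ 67.04, the edge gives R ≈ -3.08, and the flat region gives R ≈ 0.008.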

After computing R at every pixel, threshold it and apply non-maximum suppression (keep only pixels that are the local maximum of R in their neighborhood) to get the final corner locations.
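The thresholding and non-maximum suppression step might look like the sketch below. It assumes NumPy and a small hand-written response map; `detect_corners` and its `radius` parameter are hypothetical names chosen for illustration.

```python
import numpy as np

def detect_corners(R, threshold, radius=1):
    """Return (row, col) positions where R exceeds the threshold and is the
    maximum of its (2*radius + 1)-wide neighbourhood."""
    corners = []
    h, w = R.shape
    for y in range(h):
        for x in range(w):
            if R[y, x] <= threshold:
                continue                      # too weak: rejected by the threshold
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            if R[y, x] >= R[y0:y1, x0:x1].max():
                corners.append((y, x))        # local maximum: kept
    return corners

# Hypothetical response map with one cluster of strong values and one isolated peak.
R = np.array([
    [0, 0, 0, 0, 0],
    [0, 5, 9, 0, 0],
    [0, 4, 6, 0, 0],
    [0, 0, 0, 0, 7],
    [0, 0, 0, 0, 0],
], dtype=float)

print(detect_corners(R, threshold=3.0))   # [(1, 2), (3, 4)]
```

The cluster around the value 9 collapses to a single corner, and the isolated 7 survives as a second one. Exact ties within a neighbourhood would all be kept by this simple version.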


SIFT and ORB

Harris corners are not scale-invariant: a corner detected at one image scale may not be detected at another. SIFT (Scale-Invariant Feature Transform, Lowe 2004) detects keypoints at multiple scales as extrema in a Difference-of-Gaussian (DoG) pyramid, then builds a 128-dimensional descriptor from histograms of local gradient orientations. The result is invariant to scale and rotation and partially invariant to illumination change.
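The idea behind DoG scale selection can be sketched in a few lines. This is a simplified illustration, not SIFT itself: it assumes NumPy, a single octave with no downsampling, and hypothetical helpers (`gaussian_blur`, `dog_stack`, `best_scale`, `disc`). It only checks which DoG level responds most strongly at the centre of a blob.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur with edge padding."""
    radius = int(3 * sigma) + 1
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    k /= k.sum()
    padded = np.pad(img, radius, mode="edge")
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)

def dog_stack(img, sigmas):
    """Differences between successive Gaussian-blurred copies of the image."""
    blurred = [gaussian_blur(img, s) for s in sigmas]
    return [b2 - b1 for b1, b2 in zip(blurred, blurred[1:])]

def best_scale(img, y, x, sigmas):
    """Index of the DoG level with the strongest absolute response at (y, x)."""
    responses = [abs(d[y, x]) for d in dog_stack(img, sigmas)]
    return int(np.argmax(responses))

def disc(size, radius):
    """Hypothetical test image: a bright disc centred on a dark background."""
    c = size // 2
    yy, xx = np.mgrid[:size, :size]
    return ((yy - c) ** 2 + (xx - c) ** 2 <= radius ** 2).astype(float)

sigmas = [1.6 ** i for i in range(5)]          # 1.0, 1.6, 2.56, ... (assumed scale step)
small = best_scale(disc(61, 3), 30, 30, sigmas)
large = best_scale(disc(61, 7), 30, 30, sigmas)
print("small blob -> level", small, "| large blob -> level", large)
```

The larger blob peaks at a coarser DoG level than the smaller one, which is how a characteristic scale gets attached to each keypoint. A full SIFT detector additionally searches for extrema across position and scale together and works over several octaves.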

ORB (Oriented FAST and Rotated BRIEF) is a fast, free alternative to SIFT. It detects keypoints with FAST, assigns each one an orientation, and describes it with a rotation-aware BRIEF binary descriptor. It is roughly 100× faster than SIFT, which makes it suitable for real-time use.
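To show what a binary descriptor is, here is a minimal BRIEF-style sketch. It assumes NumPy, a random sampling pattern, and a random test image, all hypothetical. It leaves out the patch smoothing and the orientation step that ORB adds, so it is not ORB's actual descriptor.

```python
import numpy as np

def brief_descriptor(img, keypoint, pairs):
    """One bit per point pair: 1 if the first sampled pixel is darker than the second."""
    y, x = keypoint
    bits = [int(img[y + dy1, x + dx1] < img[y + dy2, x + dx2])
            for (dy1, dx1, dy2, dx2) in pairs]
    return np.array(bits, dtype=np.uint8)

def hamming(a, b):
    """Number of bit positions where two descriptors differ."""
    return int(np.count_nonzero(a != b))

rng = np.random.default_rng(0)
pairs = rng.integers(-8, 9, size=(256, 4))   # assumed pattern: 256 random pairs in a 17x17 patch
img = rng.random((64, 64))                   # hypothetical test image
brighter = img * 1.5 + 0.1                   # same scene under a brightness/contrast change

d_here = brief_descriptor(img, (32, 32), pairs)
d_here_brighter = brief_descriptor(brighter, (32, 32), pairs)
d_elsewhere = brief_descriptor(img, (20, 40), pairs)

print("same point, brighter image:", hamming(d_here, d_here_brighter))
print("different point:", hamming(d_here, d_elsewhere))
```

Because each bit depends only on which of two pixels is darker, the descriptor is unchanged by the brightness change, while a different image location produces many differing bits. Comparing two such descriptors is a bit count, which is where much of the speed comes from.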

Detector	Scale-invariant	Rotation-invariant	Speed	License
Harris	✗	✓	Fast	Free
SIFT	✓	✓	Slow	Free (post-2020)
SURF	✓	✓	Medium	Patented
ORB	~✓	✓	Very fast	Free

Feature Matching

Once you have descriptors from two images, find correspondences by comparing descriptors. Common strategies:

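Brute-force: Compare every descriptor in one image with every descriptor in the other. This costs O(n·m) comparisons for n and m descriptors, i.e. O(n²) when both sets are about the same size. Use Hamming distance for binary descriptors (ORB), L2 for float descriptors (SIFT).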
FLANN: Approximate nearest neighbour - much faster for large descriptor sets.
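Ratio test (Lowe's): Accept a match only if the best candidate is significantly better than the second-best: d1 / d2 < 0.75, where d1 and d2 are the distances to the nearest and second-nearest descriptors. This removes ambiguous matches.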
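Brute-force matching and the ratio test can be combined as in the sketch below. It assumes NumPy and synthetic 256-bit descriptors stored one bit per array element; `match_ratio_test` is a hypothetical helper, not a library function.

```python
import numpy as np

def match_ratio_test(desc_a, desc_b, ratio=0.75):
    """Brute-force Hamming matching. Keeps (i, j, distance) only when the best
    distance d1 and second-best distance d2 satisfy d1 < ratio * d2."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.count_nonzero(desc_b != d, axis=1)   # distance to every descriptor in B
        order = np.argsort(dists)
        best, second = order[0], order[1]
        if dists[best] < ratio * dists[second]:
            matches.append((i, int(best), int(dists[best])))
    return matches

# Synthetic data (assumed for illustration): 50 random descriptors in image B.
rng = np.random.default_rng(1)
desc_b = rng.integers(0, 2, size=(50, 256), dtype=np.uint8)

# Image A: noisy copies of the first five, plus one unrelated descriptor.
desc_a = desc_b[:5].copy()
for row in desc_a:
    flip = rng.choice(256, size=10, replace=False)
    row[flip] ^= 1                                       # flip 10 of 256 bits
desc_a = np.vstack([desc_a, rng.integers(0, 2, size=(1, 256), dtype=np.uint8)])

matches = match_ratio_test(desc_a, desc_b)
print(matches)
```

The five noisy copies are matched to their originals at distance 10. The unrelated descriptor is dropped because its best and second-best distances are nearly equal. Writing the test as d1 < ratio · d2 rather than d1 / d2 < ratio avoids dividing by zero when d2 is 0.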
