Rank Intuition
A low-rank factorization expresses a large update with far fewer degrees of freedom. If $A\in\mathbb{R}^{d\times r}$ and $B\in\mathbb{R}^{r\times k}$, then the product $AB\in\mathbb{R}^{d\times k}$ has rank at most $r$, yet it stores only $r(d+k)$ numbers instead of $dk$.
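A quick numerical check (a minimal sketch; the dimensions here are arbitrary) confirms that the product of a $d\times r$ and an $r\times k$ factor never exceeds rank $r$:

import numpy as np
np.random.seed(0)
d, r, k = 8, 2, 8
A = np.random.randn(d, r)
B = np.random.randn(r, k)
print(np.linalg.matrix_rank(A @ B))  # at most r, here 2
print("stored numbers:", A.size + B.size, "vs full:", d * k)  # 32 vs 64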
LoRA Equation
LoRA freezes a base weight $W\in\mathbb{R}^{d\times k}$ and trains only a low-rank update:
$$W' = W + \frac{\alpha}{r}AB$$
With $r \ll \min(d,k)$, this cuts the trainable parameter count from $dk$ to $r(d+k)$; the factor $\alpha/r$ rescales the update so its magnitude stays roughly comparable as $r$ varies.
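For scale, a worked instance (the $4096\times4096$ square weight and rank $r=8$ are illustrative assumptions, not values from this section):

$$\frac{dk}{r(d+k)} = \frac{4096 \cdot 4096}{8\,(4096 + 4096)} = \frac{16{,}777{,}216}{65{,}536} = 256,$$

a 256-fold reduction in trainable parameters for that single matrix.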
NumPy Update
import numpy as np
np.random.seed(4)
d_in, d_out, rank = 6, 4, 2
W = np.random.randn(d_in, d_out)            # frozen base weight
A = np.random.randn(d_in, rank) * 0.01      # small random init
B = np.zeros((rank, d_out))                 # zero init, so the update starts at exactly zero
alpha = 8
x = np.random.randn(3, d_in)                # a batch of 3 inputs
base = x @ W                                # frozen forward pass
lora_update = x @ (A @ B) * (alpha / rank)  # scaled low-rank correction
out = base + lora_update
print("base shape:", base.shape)
print("trainable params:", A.size + B.size, "vs full:", W.size)

Because $B$ starts at zero, out equals base before any training; the adapter only changes the model as $B$ moves away from zero. Why it works: fine-tuning often needs a structured change in the model's behavior, not an arbitrary full-rank update to every weight matrix.
Rank Sweep
Try ranks 1, 2, 4, and 8 on the same synthetic target matrix. Compare reconstruction error and parameter count.
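One way to run the sweep, as a sketch (the $32\times32$ synthetic target is an assumption; truncating the SVD gives the best rank-$r$ approximation in Frobenius norm, by the Eckart–Young theorem):

import numpy as np
np.random.seed(4)
d, k = 32, 32
target = np.random.randn(d, k)                 # synthetic full-rank target
U, s, Vt = np.linalg.svd(target, full_matrices=False)
for r in (1, 2, 4, 8):
    approx = (U[:, :r] * s[:r]) @ Vt[:r]       # best rank-r approximation
    err = np.linalg.norm(target - approx)      # Frobenius reconstruction error
    print(f"rank {r}: error {err:.2f}, params {r * (d + k)} vs full {d * k}")

Reconstruction error decreases monotonically with $r$, while the parameter count $r(d+k)$ grows only linearly in $r$.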