Each factoring pattern can be proven by expanding the right side and verifying it equals the left. These aren't formulas to memorize on faith — they are algebraic identities with clean proofs.
This is the distributive property read in reverse. The distributive property states:

$$a(b + c) = ab + ac$$
This is an axiom — one of the foundational rules of arithmetic accepted without proof. Factoring out the GCF is simply reading this axiom from right to left: if every term shares the factor $a$, we can "undistribute" it.
Example: $6x^2 + 3x = 3x(2x + 1)$. The GCF of $6x^2$ and $3x$ is $3x$. Verify: $3x \cdot 2x + 3x \cdot 1 = 6x^2 + 3x$. ✓
GCF factoring is the first step in every factoring problem. Always look for a common factor before trying any other technique. It simplifies everything that follows.
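As a quick sanity check (a sketch; the helper names are ours, not from the text), the GCF example above can be verified numerically:

```python
import math

# Spot-check that 6x^2 + 3x == 3x(2x + 1) at several points.
def lhs(x):
    return 6 * x**2 + 3 * x

def rhs(x):
    return 3 * x * (2 * x + 1)

for x in [-3, -1, 0, 2, 7]:
    assert lhs(x) == rhs(x)

# The numeric part of the GCF comes from the coefficients:
print(math.gcd(6, 3))  # → 3
```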
Multiply $(a + b)(a - b)$ using distribution (FOIL):

$$(a + b)(a - b) = a^2 - ab + ab - b^2$$
The middle terms $-ab$ and $+ab$ cancel:

$$(a + b)(a - b) = a^2 - b^2$$
$a^2 + b^2$ does not factor over the real numbers. The middle terms would need to cancel, but $(a + bi)(a - bi) = a^2 + b^2$ — the factorization requires complex numbers. Over the reals, $a^2 + b^2$ is irreducible.
$(a + b)^2 = (a + b)(a + b)$. Distribute:

$$(a + b)(a + b) = a^2 + ab + ab + b^2 = a^2 + 2ab + b^2$$
The proof for $(a - b)^2 = a^2 - 2ab + b^2$ is identical but with $b$ replaced by $-b$:

$$(a + (-b))^2 = a^2 + 2a(-b) + (-b)^2 = a^2 - 2ab + b^2$$
A trinomial $x^2 + bx + c$ is a perfect square if and only if $c = (b/2)^2$. Check: is the constant term the square of half the linear coefficient? If yes, it factors as $(x + b/2)^2$. This recognition is the key to completing the square.
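This recognition check is mechanical enough to script. A minimal sketch (the function name is hypothetical):

```python
# Recognize x^2 + bx + c as a perfect square by testing c == (b/2)^2.
def is_perfect_square_trinomial(b, c):
    return c == (b / 2) ** 2

assert is_perfect_square_trinomial(6, 9)       # x^2 + 6x + 9   = (x + 3)^2
assert is_perfect_square_trinomial(-10, 25)    # x^2 - 10x + 25 = (x - 5)^2
assert not is_perfect_square_trinomial(4, 5)   # x^2 + 4x + 5 is not
```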
Distribute $a$ and $b$ separately across the trinomial:

$$(a + b)(a^2 - ab + b^2) = a^3 - a^2b + ab^2 + a^2b - ab^2 + b^3$$
Cancel the middle terms: $-a^2b + a^2b = 0$ and $+ab^2 - ab^2 = 0$:

$$(a + b)(a^2 - ab + b^2) = a^3 + b^3$$
Distribute $a$ and $(-b)$:

$$(a - b)(a^2 + ab + b^2) = a^3 + a^2b + ab^2 - a^2b - ab^2 - b^3$$
Again, the middle terms cancel in pairs:

$$(a - b)(a^2 + ab + b^2) = a^3 - b^3$$
Both cube formulas have the same structure: a binomial times a trinomial, where the trinomial is engineered so the cross terms cancel. The sign pattern is: SOAP — Same, Opposite, Always Positive. The binomial has the Same sign as the original, the first term of the trinomial has the Opposite sign, and the last term is Always Positive.
This generalizes: $a^n - b^n = (a-b)(a^{n-1} + a^{n-2}b + \cdots + ab^{n-2} + b^{n-1})$ for any positive integer $n$.
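The generalized identity can be spot-checked by brute force over small integers (a sketch; `factored` is our own helper name):

```python
# (a - b) * (a^(n-1) + a^(n-2) b + ... + a b^(n-2) + b^(n-1))
def factored(a, b, n):
    return (a - b) * sum(a ** (n - 1 - k) * b ** k for k in range(n))

# Verify a^n - b^n == factored form for many small cases.
for n in range(1, 8):
    for a in range(-4, 5):
        for b in range(-4, 5):
            assert factored(a, b, n) == a ** n - b ** n
```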
The proof constructs $q(x)$ and $r(x)$ through the long division algorithm:
Base case: If $\deg(f) < \deg(d)$, then $q(x) = 0$ and $r(x) = f(x)$. Done.
Inductive step: If $\deg(f) \geq \deg(d)$, let $f(x) = a_n x^n + \cdots$ and $d(x) = b_m x^m + \cdots$. Form the term $\frac{a_n}{b_m} x^{n-m}$ and compute:

$$f_1(x) = f(x) - \frac{a_n}{b_m} x^{n-m}\, d(x)$$
This cancels the leading term of $f$, so $\deg(f_1) < \deg(f)$. Repeat this process with $f_1$ in place of $f$. Each step reduces the degree by at least 1, so the process must terminate when the degree drops below $\deg(d)$. The accumulated terms form $q(x)$; the leftover is $r(x)$.
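The proof's loop translates directly into code. A minimal sketch of polynomial long division, assuming coefficient lists ordered highest degree first and a divisor with nonzero leading coefficient:

```python
def poly_divmod(f, d):
    """Polynomial long division: f = d*q + r with deg(r) < deg(d).

    Each loop iteration mirrors one step of the proof: cancel the leading
    term of f with (a_n / b_m) * x^(n-m) * d(x), strictly reducing the degree.
    """
    f = list(f)
    q = [0.0] * max(len(f) - len(d) + 1, 1)
    while len(f) >= len(d):
        shift = len(f) - len(d)           # n - m
        t = f[0] / d[0]                   # a_n / b_m
        q[len(q) - 1 - shift] = t         # accumulate this term of q(x)
        for i, coef in enumerate(d):
            f[i] -= t * coef              # subtract t * x^shift * d(x)
        f.pop(0)                          # leading coefficient is now 0
    return q, f                           # quotient, remainder

# (x^3 - 1) / (x - 1) = x^2 + x + 1, remainder 0
assert poly_divmod([1, 0, 0, -1], [1, -1]) == ([1, 1, 1], [0])
```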
Suppose $f = dq_1 + r_1$ and $f = dq_2 + r_2$. Then:

$$d(q_1 - q_2) = r_2 - r_1$$
The left side has degree $\geq \deg(d)$ (unless $q_1 = q_2$). The right side has degree $< \deg(d)$ (since both remainders do). The only way these can be equal is if both sides are $0$. Therefore $q_1 = q_2$ and $r_1 = r_2$.
This is exactly the polynomial version of integer division: $17 = 5 \times 3 + 2$. The dividend ($17$) equals the divisor ($5$) times the quotient ($3$) plus the remainder ($2$), and the remainder is strictly smaller than the divisor.
By the Division Algorithm, we can write:

$$f(x) = (x - c)\,q(x) + r$$
Since we're dividing by $(x - c)$ (degree 1), the remainder $r$ must be a constant (degree 0). Now evaluate at $x = c$:

$$f(c) = (c - c)\,q(c) + r = 0 + r = r$$
Therefore $r = f(c)$.
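Synthetic division makes the theorem concrete: the running Horner accumulator builds the quotient and ends at exactly $f(c)$. A sketch (the helper name is ours):

```python
def synthetic_division(coeffs, c):
    """Divide by (x - c), coefficients highest degree first.

    Returns (quotient coefficients, remainder). The remainder equals f(c),
    as the Remainder Theorem predicts.
    """
    q = []
    acc = 0
    for a in coeffs:
        acc = acc * c + a   # Horner step
        q.append(acc)
    return q[:-1], q[-1]

# f(x) = x^3 - 2x + 5 at c = 2: remainder should be f(2) = 8 - 4 + 5 = 9
quot, rem = synthetic_division([1, 0, -2, 5], 2)
assert quot == [1, 2, 2]   # quotient x^2 + 2x + 2
assert rem == 9
```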
If $(x - c)$ is a factor, then $f(x) = (x - c)\,q(x)$ for some polynomial $q(x)$.
Evaluate at $x = c$: $f(c) = (c - c)\,q(c) = 0$.
If $f(c) = 0$, then by the Remainder Theorem, the remainder when dividing $f(x)$ by $(x-c)$ is $0$.
So $f(x) = (x - c)\,q(x) + 0 = (x-c)\,q(x)$, which means $(x-c)$ is a factor.
Since $\frac{p}{q}$ is a root, we have $f\!\left(\frac{p}{q}\right) = 0$:

$$a_n \frac{p^n}{q^n} + a_{n-1} \frac{p^{n-1}}{q^{n-1}} + \cdots + a_1 \frac{p}{q} + a_0 = 0$$
Multiply both sides by $q^n$ to clear all denominators:

$$a_n p^n + a_{n-1} p^{n-1} q + \cdots + a_1 p\,q^{n-1} + a_0 q^n = 0$$
Move $a_0 q^n$ to the right side:

$$a_n p^n + a_{n-1} p^{n-1} q + \cdots + a_1 p\,q^{n-1} = -a_0 q^n$$
Every term on the left has $p$ as a factor. Therefore $p$ divides the right side $a_0 q^n$. Since $\gcd(p, q) = 1$ (the fraction is in lowest terms), $p$ doesn't share any factors with $q^n$, so $p$ must divide $a_0$.
Similarly, move $a_n p^n$ to the right side:

$$a_{n-1} p^{n-1} q + \cdots + a_1 p\,q^{n-1} + a_0 q^n = -a_n p^n$$
Every term on the left has $q$ as a factor. So $q$ divides $a_n p^n$. Since $\gcd(p, q) = 1$, $q$ must divide $a_n$.
A finite list of candidates to test: all fractions $\pm\frac{p}{q}$ where $p$ divides the constant term and $q$ divides the leading coefficient. Test each by evaluating $f(p/q)$. If it equals zero, you've found a root. This is the "test and factor" strategy for polynomials.
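The strategy can be sketched in a few lines using exact rational arithmetic (names here are illustrative, not a standard API):

```python
from fractions import Fraction

def rational_root_candidates(coeffs):
    """All ±p/q with p | constant term, q | leading coeff (coeffs highest degree first)."""
    a_n, a_0 = coeffs[0], coeffs[-1]
    ps = [p for p in range(1, abs(a_0) + 1) if abs(a_0) % p == 0]
    qs = [q for q in range(1, abs(a_n) + 1) if abs(a_n) % q == 0]
    return {s * Fraction(p, q) for p in ps for q in qs for s in (1, -1)}

def evaluate(coeffs, x):
    acc = Fraction(0)
    for a in coeffs:        # Horner's method, exact arithmetic
        acc = acc * x + a
    return acc

# f(x) = 2x^3 - 3x^2 - 8x - 3 = (x - 3)(x + 1)(2x + 1)
coeffs = [2, -3, -8, -3]
roots = sorted(c for c in rational_root_candidates(coeffs) if evaluate(coeffs, c) == 0)
assert roots == [Fraction(-1), Fraction(-1, 2), Fraction(3)]
```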
A rigorous proof requires complex analysis (Liouville's Theorem or winding numbers), but we can build intuition:
Degree 1: $ax + b = 0$ always has root $x = -b/a$. ✓
Degree 2: The quadratic formula always produces two roots (real or complex), since $\sqrt{b^2 - 4ac}$ is always defined in $\mathbb{C}$. ✓
General pattern: Once we find one root $c$ (guaranteed to exist by the theorem), the Factor Theorem gives us $f(x) = (x-c)\,q(x)$ where $q(x)$ has degree $n-1$. Apply the theorem again to $q(x)$, and repeat. After $n$ steps:

$$f(x) = a_n (x - c_1)(x - c_2) \cdots (x - c_n)$$
This gives exactly $n$ roots $c_1, c_2, \ldots, c_n$ (some may repeat).
The deepest part is proving that at least one root exists. This requires the completeness of $\mathbb{C}$ — the complex numbers have "no holes." Real numbers alone aren't enough ($x^2 + 1 = 0$ has no real roots). The complex numbers were essentially invented to make this theorem true.
Add $\left(\frac{b}{2a}\right)^2 = \frac{b^2}{4a^2}$ to both sides:

$$x^2 + \frac{b}{a}x + \frac{b^2}{4a^2} = \frac{b^2}{4a^2} - \frac{c}{a} \quad\Longrightarrow\quad \left(x + \frac{b}{2a}\right)^2 = \frac{b^2 - 4ac}{4a^2} \quad\Longrightarrow\quad x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$
The expression under the square root, $\Delta = b^2 - 4ac$, is called the discriminant. It determines the nature of the roots before you solve:
| Discriminant | Meaning | Roots |
|---|---|---|
| $b^2 - 4ac > 0$ | Positive under the radical | Two distinct real roots |
| $b^2 - 4ac = 0$ | Zero under the radical | One repeated real root |
| $b^2 - 4ac < 0$ | Negative under the radical | Two complex conjugate roots |
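The table can be demonstrated with a small solver; `cmath.sqrt` returns a complex square root, so all three discriminant cases fall out of one formula (a sketch, not a library API):

```python
import cmath

def quadratic_roots(a, b, c):
    """Both roots of ax^2 + bx + c = 0, plus the discriminant."""
    disc = b * b - 4 * a * c
    sq = cmath.sqrt(disc)   # defined even when disc < 0
    return (-b + sq) / (2 * a), (-b - sq) / (2 * a), disc

assert quadratic_roots(1, -3, 2)[:2] == (2, 1)     # Δ = 1 > 0: two real roots
assert quadratic_roots(1, -2, 1)[:2] == (1, 1)     # Δ = 0: repeated root
assert quadratic_roots(1, 0, 1)[:2] == (1j, -1j)   # Δ = -4: conjugate pair
```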
The midpoint $M$ is equidistant from both endpoints. On a number line, the midpoint of $a$ and $b$ is their average: $(a+b)/2$. Coordinates are independent, so:

$$M = \left(\frac{x_1 + x_2}{2},\ \frac{y_1 + y_2}{2}\right)$$
Verification: The distance from $(x_1, y_1)$ to $M$ equals the distance from $M$ to $(x_2, y_2)$:

$$\sqrt{\left(\frac{x_2 - x_1}{2}\right)^2 + \left(\frac{y_2 - y_1}{2}\right)^2} = \frac{1}{2}\sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$$

Both distances reduce to this same value.
Consider two lines through the origin with slopes $m_1$ and $m_2$. Pick a point on each at $x = 1$: $A = (1, m_1)$ and $B = (1, m_2)$.
The lines are perpendicular when triangle $OAB$ has a right angle at $O$. By the Pythagorean Theorem:

$$|OA|^2 + |OB|^2 = |AB|^2$$
Compute: $|OA|^2 = 1 + m_1^2$, $|OB|^2 = 1 + m_2^2$, $|AB|^2 = (m_1 - m_2)^2$.
Substituting into the Pythagorean relation, the $m_1^2$ and $m_2^2$ terms cancel:

$$(1 + m_1^2) + (1 + m_2^2) = m_1^2 - 2m_1 m_2 + m_2^2 \;\Longrightarrow\; 2 = -2m_1 m_2 \;\Longrightarrow\; m_1 m_2 = -1$$
Place the parallelogram with vertices at $A = (0, 0)$, $B = (a, 0)$, $C = (a + b, c)$, $D = (b, c)$. Diagonal $AC$ has midpoint $\left(\frac{a+b}{2}, \frac{c}{2}\right)$, and diagonal $BD$ has midpoint $\left(\frac{a+b}{2}, \frac{c}{2}\right)$.
The midpoints are identical — the diagonals bisect each other.
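The three lines of algebra are equally short in code. A sketch with sample values for $a$, $b$, $c$:

```python
def midpoint(P, Q):
    return ((P[0] + Q[0]) / 2, (P[1] + Q[1]) / 2)

# Generic parallelogram from the proof, with sample values a=4, b=1, c=3
a, b, c = 4, 1, 3
A, B, C, D = (0, 0), (a, 0), (a + b, c), (b, c)
assert midpoint(A, C) == midpoint(B, D)   # diagonals bisect each other
```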
This proof is three lines of algebra. With traditional Euclidean geometry (congruent triangles, alternate interior angles), it takes a full page. This is why Descartes' invention of the coordinate plane was revolutionary.
Place $F = (0, p)$ and directrix at $y = -p$.
Equidistance condition: $\sqrt{x^2 + (y-p)^2} = |y + p|$. Square both sides:

$$x^2 + (y - p)^2 = (y + p)^2$$
Cancel and simplify:

$$x^2 + y^2 - 2py + p^2 = y^2 + 2py + p^2 \;\Longrightarrow\; x^2 = 4py \;\Longrightarrow\; y = \frac{1}{4p}x^2$$
Setting $a = 1/(4p)$ gives $y = ax^2$. So $a$ determines the focal distance: $p = 1/(4a)$. A wider parabola (smaller $a$) has its focus farther from the vertex.
Foci at $(\pm c, 0)$. The defining condition $\sqrt{(x+c)^2 + y^2} + \sqrt{(x-c)^2 + y^2} = 2a$, after isolating and squaring twice, simplifies to:

$$\frac{x^2}{a^2} + \frac{y^2}{a^2 - c^2} = 1$$
Define $b^2 = a^2 - c^2$ (positive since $a > c$):

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$$
Same technique as the ellipse, but with subtraction and $c > a$:

$$\frac{x^2}{a^2} - \frac{y^2}{c^2 - a^2} = 1$$
Define $b^2 = c^2 - a^2$ (positive since $c > a$):

$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1$$
Ellipse: $b^2 = a^2 - c^2$ (plus sign, sum of distances). Hyperbola: $b^2 = c^2 - a^2$ (minus sign, difference of distances). Both are slices of a cone — hence "conic sections."
If the graph is symmetric about the y-axis, then for every point $(x, y)$ on the graph, $(-x, y)$ is also on the graph. This means $f(x) = y = f(-x)$.
If $f(-x) = f(x)$, then for any point $(x, f(x))$ on the graph, the point $(-x, f(-x)) = (-x, f(x))$ is also on the graph — the mirror image across the y-axis.
For $f(x) = x^n$ where $n$ is even: $f(-x) = (-x)^n = (-1)^n x^n = x^n = f(x)$. ✓
For odd $n$: $f(-x) = (-1)^n x^n = -x^n = -f(x)$ — odd function (origin symmetry).
Define $E(x) = \frac{f(x) + f(-x)}{2}$ and $O(x) = \frac{f(x) - f(-x)}{2}$. Then:

$E(-x) = \frac{f(-x) + f(x)}{2} = E(x)$ ✓ (even)
$O(-x) = \frac{f(-x) - f(x)}{2} = -\frac{f(x) - f(-x)}{2} = -O(x)$ ✓ (odd)
$E(x) + O(x) = \frac{f(x)+f(-x)}{2} + \frac{f(x)-f(-x)}{2} = f(x)$ ✓
This is used in signal processing: any signal splits into symmetric and antisymmetric components. Example: $e^x = \cosh(x) + \sinh(x)$, where $\cosh$ is even and $\sinh$ is odd.
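The decomposition is easy to compute pointwise. A sketch verifying $e^x = \cosh(x) + \sinh(x)$ numerically (helper names are ours):

```python
import math

def even_part(f, x):
    return (f(x) + f(-x)) / 2   # E(x): symmetric component

def odd_part(f, x):
    return (f(x) - f(-x)) / 2   # O(x): antisymmetric component

for x in [-2.0, -0.5, 0.0, 1.3]:
    assert math.isclose(even_part(math.exp, x), math.cosh(x))
    assert math.isclose(odd_part(math.exp, x), math.sinh(x), abs_tol=1e-12)
    assert math.isclose(even_part(math.exp, x) + odd_part(math.exp, x), math.exp(x))
```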
Let $x = \log_b a$, so $b^x = a$. Take $\ln$ of both sides:

$$\ln(b^x) = \ln a \;\Longrightarrow\; x \ln b = \ln a \;\Longrightarrow\; x = \frac{\ln a}{\ln b}$$
Since $x = \log_b a$: $\quad \log_b a = \frac{\ln a}{\ln b}$. ✓
Invest $\$1$ at 100% annual interest. If compounded $n$ times per year, the amount after one year is:

$$\left(1 + \frac{1}{n}\right)^n$$
| $n$ | Compounding | $\left(1 + \frac{1}{n}\right)^n$ |
|---|---|---|
| $1$ | Annually | $2.000000$ |
| $12$ | Monthly | $2.613035$ |
| $365$ | Daily | $2.714567$ |
| $10{,}000$ | ~Every 53 min | $2.718146$ |
| $\to \infty$ | Continuously | $e = 2.71828\ldots$ |
As compounding becomes continuous, the result converges to $e$. This is the natural growth constant.
$e$ can also be defined as the sum of an infinite series:

$$e = \sum_{k=0}^{\infty} \frac{1}{k!} = 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \cdots$$
This converges rapidly: just 10 terms already give about six correct decimal places.
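The convergence claim is easy to check (a sketch):

```python
import math

# Sum the first 10 terms of Σ 1/k! and compare against math.e.
partial = sum(1 / math.factorial(k) for k in range(10))
assert abs(partial - math.e) < 1e-6   # truncation error is already ~3e-7
```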
$e^x$ is the only function (up to a constant multiple) that is its own derivative: $\frac{d}{dx}e^x = e^x$. This makes $e$ the natural base for calculus, differential equations, and the Kalman filter state-transition matrix $e^{At}$.
Suppose $e$ is rational: $e = p/q$ with $q \geq 2$ (since $2 < e < 3$, it's not an integer). Multiply both sides by $q!$ and split the series for $e$ at $k = q$:

$$q! \cdot e = \underbrace{\sum_{k=0}^{q} \frac{q!}{k!}}_{A} + \underbrace{\sum_{k=q+1}^{\infty} \frac{q!}{k!}}_{T}$$
$q! \cdot e = (q-1)! \cdot p$ (an integer). $A$ is an integer because each $q!/k!$ is an integer when $k \leq q$. Therefore $T = q! \cdot e - A$ must also be an integer.
Each term of $T$ satisfies $\frac{q!}{k!} \leq \frac{1}{(q+1)^{k-q}}$, so $0 < T < \sum_{j=1}^{\infty} \frac{1}{(q+1)^j} = \frac{1}{q} \leq \frac{1}{2}$. So $T$ is positive and less than $1$. No integer exists between $0$ and $1$. Contradiction.
Hermite later proved (1873) that $e$ is transcendental — not the root of any polynomial with integer coefficients. This is even stronger than irrationality.
The three elementary row operations are:
(i) Swap two rows. (ii) Multiply a row by a nonzero scalar. (iii) Add a multiple of one row to another.
Each operation corresponds to a reversible algebraic manipulation of the system of equations:
(i) Reordering equations doesn't change their solutions.
(ii) Multiplying both sides of an equation by $c \neq 0$ doesn't change its solution set — divide by $c$ to reverse.
(iii) If $E_1$ and $E_2$ are both true, then $E_2 + kE_1$ is also true, and $E_2 = (E_2 + kE_1) - kE_1$ recovers the original. The solution set is unchanged.
Since every step is reversible, the transformed system has exactly the same solution set as the original. ✓
The staircase structure of REF means the last nonzero equation has only one variable (solve directly), the second-to-last has two (substitute the value just found, solve), and so on up the staircase. This is back-substitution, and it works precisely because each row introduces exactly one new variable relative to the row below it.
The algorithm is constructive: (1) Find the leftmost nonzero column. (2) Use row swaps to place a nonzero entry (pivot) at the top. (3) Use row operations to eliminate all entries below the pivot. (4) Repeat on the submatrix below and to the right of the pivot.
Each iteration reduces the submatrix size, so the process terminates in at most $\min(m, n)$ steps for an $m \times n$ matrix. The result is in REF.
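The constructive algorithm can be sketched directly (pivoting on the first usable entry; the function name and zero tolerance are our choices, not a standard API):

```python
def row_echelon(M):
    """Forward elimination to row echelon form, following steps (1)-(4) above."""
    A = [list(map(float, row)) for row in M]
    m, n = len(A), len(A[0])
    r = 0                                     # next pivot row
    for col in range(n):                      # (1) scan columns left to right
        piv = next((i for i in range(r, m) if abs(A[i][col]) > 1e-12), None)
        if piv is None:
            continue                          # no pivot in this column
        A[r], A[piv] = A[piv], A[r]           # (2) swap pivot into place
        for i in range(r + 1, m):             # (3) eliminate below the pivot
            k = A[i][col] / A[r][col]
            for j in range(col, n):
                A[i][j] -= k * A[r][j]
        r += 1                                # (4) recurse on the submatrix
        if r == m:
            break
    return A

# Augmented system: 2x + y - z = 8, -3x - y + 2z = -11, -2x + y + 2z = -3
R = row_echelon([[2, 1, -1, 8], [-3, -1, 2, -11], [-2, 1, 2, -3]])
assert R[1][0] == 0 and R[2][0] == 0 and R[2][1] == 0   # staircase of zeros
```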
Different sequences of row operations can produce different REF matrices for the same system. The solution set is always the same, but the specific pivot values and non-pivot entries may differ. This is why RREF (below) was invented — it is unique.
Starting from REF, apply two more operations:
Scale: Divide each row by its pivot to make every pivot equal to $1$.
Eliminate upward: Use row operations to zero out all entries above each pivot, not just below.
The result is a matrix where each pivot column has exactly one $1$ and the rest zeros — the solution is read off directly without back-substitution.
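Gauss-Jordan elimination (forward elimination plus the two extra operations) can be sketched as one routine; the function name and zero tolerance are our choices:

```python
def rref(M):
    """Reduced row echelon form: scale each pivot to 1, eliminate above and below."""
    A = [list(map(float, row)) for row in M]
    m, n = len(A), len(A[0])
    r = 0
    for col in range(n):
        piv = next((i for i in range(r, m) if abs(A[i][col]) > 1e-12), None)
        if piv is None:
            continue
        A[r], A[piv] = A[piv], A[r]
        A[r] = [v / A[r][col] for v in A[r]]          # scale pivot to 1
        for i in range(m):                            # eliminate above AND below
            if i != r and abs(A[i][col]) > 1e-12:
                A[i] = [a - A[i][col] * b for a, b in zip(A[i], A[r])]
        r += 1
        if r == m:
            break
    return A

# 2x + y - z = 8, -3x - y + 2z = -11, -2x + y + 2z = -3
# has unique solution (x, y, z) = (2, 3, -1), read off the last column.
R = rref([[2, 1, -1, 8], [-3, -1, 2, -11], [-2, 1, 2, -3]])
assert [row[3] for row in R] == [2, 3, -1]
```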
Unlike REF, the RREF of a matrix is unique — regardless of which sequence of row operations you use, you always arrive at the same RREF. Here's why:
The pivot columns of the RREF correspond to the linearly independent columns of the original matrix — this is a property of the column space, not of the reduction path. The values in the non-pivot columns are determined by expressing those columns as linear combinations of the pivot columns — and these relationships are inherent to the matrix, not the reduction method.
Formally: if $R_1$ and $R_2$ are both in RREF and row-equivalent (same solution set), then $R_1 = R_2$. The proof proceeds column by column, showing that the pivot positions must be identical and the non-pivot entries are forced by the requirement that the same linear combinations hold in both forms.
One solution: Every column is a pivot column → the last column gives the unique solution.
No solution: A row of the form $[0 \; 0 \; \cdots \; 0 \;|\; c]$ with $c \neq 0$ → inconsistent system.
Infinitely many solutions: Non-pivot columns correspond to free variables — you can set them to any value, and the pivot variables are determined in terms of them. The solution set is a parametric family.
Row reduction and RREF are the foundation of linear algebra, which is the primary mathematical tool in Kalman filters. The state estimation equations $\hat{\mathbf{x}} = \mathbf{K}\mathbf{z}$ involve solving systems of linear equations at every time step. Understanding how and why these solutions are found (and when they're unique) is essential groundwork.