How QR codes actually work

A QR code encoded at a high error-correction level can lose roughly a quarter of its surface and still scan. Cover a corner with a Coca-Cola logo and the phone camera reads it instantly. This isn’t because phones are smart; it’s because the QR encoder did most of the work years before anyone scanned it. Here’s what actually happens when you generate a QR code.

The visual anatomy

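Every QR code has the same skeleton: three finder patterns (the large squares in three corners), timing patterns (alternating dark and light modules running between the finders), alignment patterns (smaller squares, present from version 2 up), format info, version info (version 7 and up), a quiet zone of light modules around the outside, and a “data + ECC” region that fills everything else.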

The “data + ECC” region contains the actual payload plus Reed-Solomon parity codewords for error correction.

The encoding pipeline

Generating a QR code is roughly seven stages:

  1. Pick mode (numeric / alphanumeric / byte / kanji). Byte handles any UTF-8.
  2. Pack data into a bit stream: 4-bit mode indicator + character count + data bytes.
  3. Pad to fit: terminator (0000), zero-pad to byte, then pad codewords (11101100 and 00010001 alternating).
  4. Pick smallest version (1–40, ranging from 21×21 to 177×177 modules) that fits the data + ECC for the chosen ECC level.
  5. Generate Reed-Solomon ECC for each block; interleave blocks.
  6. Lay out modules: function patterns first, then snake the data + ECC bits into the remaining cells.
  7. Apply mask + write format/version info: try all 8 masks, score each by penalty rules, pick the lowest-score result.

Each stage has constraints baked in by the spec. Get step 3’s padding wrong and the entire payload is unreadable.
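Steps 2 and 3 can be sketched in a few lines. This is a minimal illustration rather than the generator's actual code: it assumes byte mode, an 8-bit character-count field (the width used by versions 1–9 in byte mode), and a caller-supplied capacity in data codewords (19 for version 1 at level L). The function name `build_data_codewords` is hypothetical.

```python
def build_data_codewords(payload: bytes, capacity: int, count_bits: int = 8) -> list[int]:
    """Pack a byte-mode payload into exactly `capacity` data codewords."""
    if len(payload) >= 1 << count_bits:
        raise ValueError("payload too long for the character-count field")
    bits = "0100"                                       # mode indicator: byte mode
    bits += format(len(payload), f"0{count_bits}b")     # character count
    bits += "".join(format(b, "08b") for b in payload)  # data bytes
    max_bits = capacity * 8
    if len(bits) > max_bits:
        raise ValueError("payload does not fit in this capacity")
    bits += "0" * min(4, max_bits - len(bits))          # terminator: up to four zeros
    bits += "0" * (-len(bits) % 8)                      # zero-pad to a byte boundary
    codewords = [int(bits[i:i + 8], 2) for i in range(0, len(bits), 8)]
    pads = (0b11101100, 0b00010001)                     # alternating pad codewords
    index = 0
    while len(codewords) < capacity:
        codewords.append(pads[index % 2])
        index += 1
    return codewords


codewords = build_data_codewords(b"hi", capacity=19)
print([hex(c) for c in codewords[:6]])
```

The two-byte payload occupies only four codewords once the header and terminator are included; the remaining fifteen are pad codewords.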

Why mode matters

The mode tells the scanner how to interpret the bit stream:

Mode                                       Bits per character            When used
Numeric (0–9)                              3.33 (10 bits per 3 digits)   Phone numbers, IDs
Alphanumeric (0–9, A–Z, space, $%*+-./:)   5.5 (11 bits per 2 chars)     URLs in caps, IDs
Byte                                       8                             UTF-8 anything
Kanji                                      13                            Japanese text

A URL with lowercase letters falls back to byte mode (8 bits per character). The same URL in all-caps fits alphanumeric mode (5.5 bits per char), provided it uses only characters from that set, and needs about 31% fewer data bits. Some QR libraries auto-pick mode; ours uses byte mode for everything for simplicity and UTF-8 support.
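To make the density difference concrete, here is a small sketch that picks the densest mode for a string and counts the data bits (excluding the mode indicator and character count). The helper name `data_bits` is made up for this example, and Kanji mode is left out.

```python
ALPHANUMERIC = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"  # the 45 allowed characters


def data_bits(text: str) -> tuple[str, int]:
    """Return (mode, number of data bits) for the densest of three modes."""
    n = len(text)
    if all(c in "0123456789" for c in text):
        # 10 bits per group of 3 digits; a leftover of 1 or 2 digits takes 4 or 7 bits
        return "numeric", 10 * (n // 3) + (0, 4, 7)[n % 3]
    if all(c in ALPHANUMERIC for c in text):
        # 11 bits per pair of characters; a single leftover character takes 6 bits
        return "alphanumeric", 11 * (n // 2) + 6 * (n % 2)
    return "byte", 8 * len(text.encode("utf-8"))


for sample in ("0123456789", "HTTPS://EXAMPLE.COM", "https://example.com"):
    print(sample, data_bits(sample))
```

For the 19-character URL, alphanumeric mode needs 105 bits against 152 in byte mode.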

Reed-Solomon error correction (the magic)

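The key insight: the QR code carries the original data plus mathematically generated parity codewords that let the scanner recover from a bounded number of damaged codewords per block. The math is over GF(256), the 256-element Galois field that AES also uses, although QR reduces by x^8 + x^4 + x^3 + x^2 + 1 (0x11D) while AES uses a different polynomial (0x11B).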

Concretely:

  1. Treat each codeword (1 byte = 1 element of GF(256)) as a coefficient of a polynomial.
  2. Multiply the data polynomial by x^k (where k = number of ECC codewords).
  3. Compute the remainder when divided by a generator polynomial g(x) = (x - α^0)(x - α^1)...(x - α^(k-1)), where α = 2; this remainder is the ECC codewords.
  4. Append the ECC to the data.

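When the scanner reads back the (possibly damaged) codewords, it performs the same polynomial division on the whole block, data and ECC together. If the remainder isn’t zero, errors are present. The Berlekamp-Massey algorithm (or a simpler syndrome decoder) then locates and corrects up to ⌊k/2⌋ wrong codewords per block, because each unknown error costs two parity codewords: one to find its position and one to find its value.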
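The encoding half of this fits in a short sketch. It assumes the field representation the QR specification uses (reduction polynomial 0x11D, α = 2) and generator roots α^0 through α^(k-1); the function names are illustrative, and only error detection via syndromes is shown, not the full Berlekamp-Massey correction.

```python
# Log/antilog tables for GF(256) with reduction polynomial 0x11D and alpha = 2.
EXP = [0] * 512
LOG = [0] * 256
value = 1
for power in range(255):
    EXP[power] = value
    LOG[value] = power
    value <<= 1
    if value & 0x100:
        value ^= 0x11D
for power in range(255, 512):
    EXP[power] = EXP[power - 255]   # duplicate so gf_mul needs no modulo


def gf_mul(a: int, b: int) -> int:
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def generator_poly(k: int) -> list[int]:
    """Coefficients of (x - a^0)(x - a^1)...(x - a^(k-1)), highest degree first."""
    g = [1]
    for i in range(k):
        nxt = g + [0]                          # g * x
        for j, coef in enumerate(g):
            nxt[j + 1] ^= gf_mul(coef, EXP[i])  # plus g * a^i (minus equals XOR here)
        g = nxt
    return g


def rs_ecc(data: list[int], k: int) -> list[int]:
    """Remainder of data(x) * x^k divided by the generator polynomial."""
    gen = generator_poly(k)
    rem = list(data) + [0] * k
    for i in range(len(data)):
        factor = rem[i]
        if factor:
            for j in range(1, len(gen)):        # gen[0] == 1 cancels rem[i]
                rem[i + j] ^= gf_mul(gen[j], factor)
    return rem[len(data):]


def syndromes(codewords: list[int], k: int) -> list[int]:
    """Evaluate the received polynomial at a^0..a^(k-1); all zero means no error seen."""
    out = []
    for i in range(k):
        acc = 0
        for c in codewords:
            acc = gf_mul(acc, EXP[i]) ^ c       # Horner's rule
        out.append(acc)
    return out


data = [32, 91, 11, 120, 209, 114, 220, 77, 67, 64, 236, 17, 236, 17, 236, 17]
block = data + rs_ecc(data, 10)
print(any(syndromes(block, 10)))   # False: a clean block has all-zero syndromes
block[3] ^= 0x55
print(any(syndromes(block, 10)))   # True: the damage shows up
```

Evaluating at the generator's roots is equivalent to checking the division remainder, and the syndrome values are what a decoder feeds into the error-locating step.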

ECC levels

QR has four ECC levels:

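Level   Recovery   k as % of total (typical)
L       ~7%        ~20%
M       ~15%       ~37%
Q       ~25%       ~55%
H       ~30%       ~65%

The k share is at least double the recovery rate because correcting one unknown error uses up two parity codewords; the exact share varies with version.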

H-level is what lets you slap a logo in the centre and still scan.
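The relationship between k and the recovery figure is simple arithmetic. The sketch below uses block sizes for version 10 as (total codewords, data codewords) of the shorter block at each level; these figures are quoted from memory of the specification's block table and should be treated as an assumption to check against the spec.

```python
# (total codewords, data codewords) per shorter block at version 10 -- assumed values
V10_BLOCKS = {"L": (86, 68), "M": (69, 43), "Q": (43, 19), "H": (43, 15)}


def block_stats(total: int, data: int) -> tuple[int, int, float, float]:
    """Return (k, correctable errors, k as % of block, correctable % of block)."""
    k = total - data
    t = k // 2                     # each unknown error costs two parity codewords
    return k, t, round(100 * k / total, 1), round(100 * t / total, 1)


for level, (total, data) in V10_BLOCKS.items():
    print(level, block_stats(total, data))
```

At this version the correctable share comes out a little above the nominal 7/15/25/30% figures, which are approximate lower bounds.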

Block interleaving

Bigger QR codes split the data into multiple Reed-Solomon blocks. A v10 QR at level Q has 8 blocks; v40 at level H has 81 blocks. The encoder generates ECC per block, then interleaves them codeword by codeword:

Block 1: D1 D2 D3 ... + E1 E2 ...
Block 2: D1' D2' D3' ... + E1' E2' ...
Block 3: D1'' ...

Interleaved: D1 D1' D1'' D2 D2' D2'' ... E1 E1' E1'' E2 ...

This way, a localised burst of errors (e.g., a coffee stain across one region) damages a few codewords from each block instead of wiping out one block entirely. Each block can independently correct its share.
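The interleaving itself is a column-wise read of the blocks. A minimal sketch, with hypothetical function names, that also handles the case where later blocks are one data codeword longer than earlier ones:

```python
from itertools import zip_longest


def interleave(blocks: list[list[int]]) -> list[int]:
    """Take codeword 1 of every block, then codeword 2 of every block, and so on."""
    out = []
    for column in zip_longest(*blocks):
        # shorter blocks simply drop out once they run dry
        out.extend(cw for cw in column if cw is not None)
    return out


def final_sequence(data_blocks: list[list[int]], ecc_blocks: list[list[int]]) -> list[int]:
    """All interleaved data codewords first, then all interleaved ECC codewords."""
    return interleave(data_blocks) + interleave(ecc_blocks)


print(final_sequence([[1, 2], [3, 4]], [[9, 8], [7, 6]]))
```

A scanner de-interleaves by running the same column walk in reverse, which requires knowing the block layout for the version and ECC level.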

The 8 masks

After laying out data, the encoder applies a mask to data modules only (not function patterns). Each mask is a simple geometric pattern based on the module’s row/column:

(x = column, y = row, both counted from 0 at the top-left)

0: (x + y) % 2 == 0
1: y % 2 == 0
2: x % 3 == 0
3: (x + y) % 3 == 0
4: (floor(x/3) + floor(y/2)) % 2 == 0
5: ((x*y) % 2) + ((x*y) % 3) == 0
6: (((x*y) % 2) + ((x*y) % 3)) % 2 == 0
7: (((x + y) % 2) + ((x*y) % 3)) % 2 == 0

Where mask is true, the data bit is XORed (flipped). Why? To break up patterns that look like finder patterns or all-same-colour areas, which would confuse the scanner.
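The eight conditions translate directly into predicates. In this sketch x is the column and y is the row, `grid[y][x]` holds 0 or 1, and `is_function[y][x]` marks function-pattern modules that must not be touched; both grid layouts and the name `apply_mask` are assumptions made for the example.

```python
MASKS = [
    lambda x, y: (x + y) % 2 == 0,
    lambda x, y: y % 2 == 0,
    lambda x, y: x % 3 == 0,
    lambda x, y: (x + y) % 3 == 0,
    lambda x, y: (x // 3 + y // 2) % 2 == 0,
    lambda x, y: (x * y) % 2 + (x * y) % 3 == 0,
    lambda x, y: ((x * y) % 2 + (x * y) % 3) % 2 == 0,
    lambda x, y: ((x + y) % 2 + (x * y) % 3) % 2 == 0,
]


def apply_mask(grid, is_function, mask_id):
    """Return a new grid with data modules flipped wherever the mask is true."""
    mask = MASKS[mask_id]
    return [
        [bit if is_function[y][x] or not mask(x, y) else bit ^ 1
         for x, bit in enumerate(row)]
        for y, row in enumerate(grid)
    ]


blank = [[0, 0], [0, 0]]
no_function = [[False, False], [False, False]]
print(apply_mask(blank, no_function, 0))
```

Because masking is an XOR, applying the same mask a second time restores the original grid, which is exactly how the scanner undoes it.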

The encoder applies all 8 masks and scores each result by 4 penalty rules:

  1. Runs of 5+ same-colour modules in a row or column (3 points for a run of 5, plus 1 for each extra module).
  2. 2×2 blocks of same-colour modules (3 each).
  3. Patterns that look like a finder pattern, i.e. the 1:1:3:1:1 sequence with four light modules on one side (40 each).
  4. Imbalance between dark and light modules (10 per full 5% off 50/50).

The mask with the lowest total score wins. This is why two QR codes with identical content can look different when they come from different encoders or settings: they may have ended up with different masks.
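A compact scoring function for the four rules might look like the sketch below. It assumes a square grid of 0/1 values (1 = dark) covering the whole symbol, and it follows the common reading of the rules; real encoders differ in small details such as how overlapping finder-like patterns are counted.

```python
def penalty(grid: list[list[int]]) -> int:
    n = len(grid)
    lines = [list(row) for row in grid] + [list(col) for col in zip(*grid)]
    score = 0

    # Rule 1: runs of five or more same-colour modules in any row or column.
    for line in lines:
        run = 1
        for i in range(1, n + 1):
            if i < n and line[i] == line[i - 1]:
                run += 1
            else:
                if run >= 5:
                    score += 3 + (run - 5)
                run = 1

    # Rule 2: every 2x2 block of one colour (overlapping blocks all count).
    for y in range(n - 1):
        for x in range(n - 1):
            if grid[y][x] == grid[y][x + 1] == grid[y + 1][x] == grid[y + 1][x + 1]:
                score += 3

    # Rule 3: 1:1:3:1:1 finder-like sequence with four light modules on one side.
    patterns = ([1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0], [0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1])
    for line in lines:
        for i in range(n - 10):
            if line[i:i + 11] in patterns:
                score += 40

    # Rule 4: 10 points for each full 5% the dark share deviates from 50%.
    dark_percent = 100 * sum(map(sum, grid)) / (n * n)
    score += 10 * int(abs(dark_percent - 50) // 5)
    return score


checkerboard = [[(x + y) % 2 for x in range(6)] for y in range(6)]
print(penalty(checkerboard), penalty([[0] * 6 for _ in range(6)]))
```

An encoder would call this once per masked candidate and keep the mask with the smallest result.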

Format info

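The 5 bits encoding “ECC level + chosen mask” (2 bits for the level, 3 for the mask) need to be readable before the rest of the code, so they’re stored with their own error correction. Ten BCH(15,5) parity bits are appended to make a 15-bit word, which is XORed with the fixed pattern 101010000010010 so it can never be all zeros, and the result is written twice: once around the top-left finder pattern, and once split between the top-right and bottom-left finders.

The scanner reads both copies, checks each against the BCH polynomial, and picks the most likely original 5 bits.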
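Both directions fit in a few lines. The sketch assumes the constants from the QR specification (generator polynomial 0x537, XOR pattern 0x5412, level bits L=01, M=00, Q=11, H=10); the decoder shown is a brute-force nearest-codeword search, one simple option among several, and the function names are illustrative.

```python
FORMAT_GENERATOR = 0x537    # x^10 + x^8 + x^5 + x^4 + x^2 + x + 1
FORMAT_XOR_MASK = 0x5412    # 101010000010010
ECC_LEVEL_BITS = {"L": 0b01, "M": 0b00, "Q": 0b11, "H": 0b10}


def encode_format(level: str, mask_id: int) -> int:
    """Return the 15-bit format word for an ECC level and mask number."""
    data = (ECC_LEVEL_BITS[level] << 3) | mask_id
    rem = data << 10
    for shift in range(4, -1, -1):               # long division over GF(2)
        if rem & (1 << (shift + 10)):
            rem ^= FORMAT_GENERATOR << shift
    return ((data << 10) | rem) ^ FORMAT_XOR_MASK


def decode_format(raw: int):
    """Pick the valid format word closest to `raw`; give up beyond 3 bit errors."""
    distance, level, mask_id = min(
        (bin(raw ^ encode_format(lv, m)).count("1"), lv, m)
        for lv in ECC_LEVEL_BITS for m in range(8)
    )
    return (level, mask_id) if distance <= 3 else None


word = encode_format("L", 0)
print(format(word, "015b"))
print(decode_format(word ^ 0b000010000100001))   # three flipped bits still decode
```

With only 32 valid words, comparing against all of them is cheap, and a scanner can run it on each of the two copies and keep whichever gives the smaller distance.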

Version info

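For QR versions 7 through 40, the version number is stored in two 6×3 blocks, one next to the top-right finder and one next to the bottom-left finder. Each block holds 18 bits: the 6-bit version number plus 12 BCH(18,6) parity bits.

For versions 1–6, the version is implicit in the QR’s overall size (17 + 4×ver modules per side, so 21 for version 1), so no version-info region is needed.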
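The size formula and the 18-bit version word can be checked with a short sketch. It assumes the side length is 17 + 4 × version (which gives 21 for version 1 and 177 for version 40) and the BCH(18,6) generator polynomial 0x1F25 from the QR specification; the function names are illustrative.

```python
VERSION_GENERATOR = 0x1F25   # x^12 + x^11 + x^10 + x^9 + x^8 + x^5 + x^2 + 1


def modules_per_side(version: int) -> int:
    if not 1 <= version <= 40:
        raise ValueError("QR versions run from 1 to 40")
    return 17 + 4 * version


def encode_version_info(version: int) -> int:
    """Return the 18-bit word: 6 version bits followed by 12 BCH parity bits."""
    if not 7 <= version <= 40:
        raise ValueError("version info exists only for versions 7-40")
    rem = version << 12
    for shift in range(5, -1, -1):               # long division over GF(2)
        if rem & (1 << (shift + 12)):
            rem ^= VERSION_GENERATOR << shift
    return (version << 12) | rem


print(modules_per_side(1), modules_per_side(40))
print(format(encode_version_info(7), "018b"))
```

Unlike the format word, the version word is not XORed with a fixed pattern, since version 0 does not exist and the word can never be all zeros.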

What scanners actually do

In reverse:

  1. Find the three finder patterns: edge-detect, look for the 1:1:3:1:1 dark/light/dark/light/dark module ratio of a finder.
  2. Calculate orientation + perspective from the finder positions.
  3. Read alignment patterns to refine the perspective (especially important for curved surfaces or oblique angles).
  4. Read format info (twice; pick the best). This tells the scanner the ECC level and which mask was applied.
  5. Read version info if it’s a v7+ QR.
  6. Sample modules at the calculated grid positions.
  7. Reverse the mask to recover the unmasked data + ECC bits.
  8. De-interleave blocks.
  9. Run Reed-Solomon decode on each block; correct errors.
  10. Concatenate data, peel off mode + length headers, decode.

That’s the whole pipeline. Most of it runs in under 100 ms on a phone.
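Step 1 is the part most unlike the encoder, so here is a rough sketch of the ratio test on a single scanline of already-thresholded pixels (1 = dark). The 50% tolerance and the function names are arbitrary choices for the example; a real scanner would also cross-check candidates vertically and combine many scanlines.

```python
def looks_like_finder(runs: list[int], tolerance: float = 0.5) -> bool:
    """True if five run lengths (dark, light, dark, light, dark) are near 1:1:3:1:1."""
    if len(runs) != 5 or min(runs) <= 0:
        return False
    module = sum(runs) / 7                       # the pattern spans 7 modules
    expected = (1, 1, 3, 1, 1)
    return all(abs(run - e * module) <= tolerance * e * module
               for run, e in zip(runs, expected))


def run_lengths(scanline: list[int]) -> list[list[int]]:
    """Collapse a row of 0/1 pixels into [value, length] runs."""
    runs = []
    for px in scanline:
        if runs and runs[-1][0] == px:
            runs[-1][1] += 1
        else:
            runs.append([px, 1])
    return runs


def finder_centres(scanline: list[int]) -> list[float]:
    """Return the x position of the centre of every finder-like window."""
    runs = run_lengths(scanline)
    hits, start = [], 0
    for i in range(len(runs)):
        window = runs[i:i + 5]
        lengths = [length for _, length in window]
        if len(window) == 5 and window[0][0] == 1 and looks_like_finder(lengths):
            hits.append(start + sum(lengths) / 2)
        start += runs[i][1]
    return hits


line = [0] * 4 + [1] * 2 + [0] * 2 + [1] * 6 + [0] * 2 + [1] * 2 + [0] * 4
print(finder_centres(line))
```

Because the test only looks at ratios, it works at any scale and, for a square finder, at any rotation of the scanline through its centre.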

Designing for scannability

A few practical takeaways:

Try the generator

The QR generator on this site implements all of the above: Reed-Solomon ECC, mask scoring, format info with BCH parity, the works. Open the page, paste a URL, and the SVG renders instantly, entirely in your browser. The “version” and “mask” shown under the QR are the values the algorithm above chose.