# Chacha20

*
*

This page covers the basics of the ChaCha20 steam cipher invented by Daniel J Bernstein. I learnt about the cipher by reading the Salsa20 paper, the ChaCha20 paper and this video.

The ChaCha20 algorithm takes 16 4-byte words as input:

- 8 words: 256-bit key
- 2 words: IV/nonce (to be used only once with a key)
- 2 words are a counter (makes the stream cipher allow for 2
^{64}random access for use cases like videos) - 4 words are constant defined in the ChaCha20 specification

Each application of the `ChaCha20`

function gives us 16 words (512 bits) of deterministic random-bit generation and can be assembled into a keystream like so:

`ks := ChaCha20(key, nonce, 0) || ChaCha20(key, nonce, 1) || ... || ChaCha20(key, nonce, n)`

It’s evident in the above formulation how the counter can be used to “seek” into up to 2^{64}parts of the keystream.

## Algorithm to generate 512 DRBG bits

Let’s see how the `ChaCha20`

function works. All the inputs are loaded into a 16 words as shown below:

constant | constant | constant | constant |
---|---|---|---|

key |
key |
key |
key |

key |
key |
key |
key |

input |
input |
input |
input |

Then, a `QUARTERROUND(a, b, c, d)`

function is applied to 4-words at a time. DJB chose 4-words to optimize for speed on various computing architectures.

```
QUARTERROUND(a, b, c, d) {
a += b; d ^= a; d <<<= 16;
c += d; b ^= c; b <<<= 12;
a += b; d ^= a; d <<<= 8;
c += d; b ^= c; b <<<= 7;
}
```

The `QUARTERROUND`

function is applied to 4 columns in the matrix, comprising one *round* and then applied to 4 diagonals in the matrix, comprising another *round*. This is repeated 10 times, totalling 20 *rounds*. Hence, ChaCha*20*. Two rounds are depicted below where `xi`

represents the i^{th} word in the matrix if it were represented linearly as 16 words.

```
QUARTERROUND(x0, x4, x8, x12)
QUARTERROUND(x1, x5, x9, x13)
QUARTERROUND(x2, x6,x10, x14)
QUARTERROUND(x3, x7,x11, x15)
QUARTERROUND(x0, x5,x10, x15)
QUARTERROUND(x1, x6,x11, x12)
QUARTERROUND(x2, x7, x8, x13)
QUARTERROUND(x3, x4, x9, x14)
```

At the end of 20 rounds, we have 512 deterministically generated pseudo-random bits based on the key material and the inputs. To generate more bits, we increment the counter, giving us 2^{64} * 64 bytes = 2^{70} bytes. **Per the recommendation in the TLS 1.3 spec, the key must be rotated after 2 ^{70} bytes of keystream has been obtained.** In practice, most AEAD specifications will codify re-keying at much smaller intervals (like the 1GB re-keying requirement in [email protected]) and the 4064 bytes re-keying requirement in [email protected]

## Characteristics

- Faster than Salsa20 which is faster than AES in rand-ctr mode.
- More secure than AES baased on the cryptanalysis available to date
- Faster than Salsa20 and even AES which the AES-NI hardware acceleration is not available. It is only marginally slower when AES-NI is available.
- Produces higher diffusion on most benchmarks compared to Salsa20 and therefore DJB speculates it gives similar level of security at a lower numebr of rounds.
- Used in TLS 1.3 authenticated encryption with the Poly1305 Carter-Wegman MAC.