Caesar Cipher - Breaking The Cipher

Decryption shift    Candidate plaintext
 0                  exxegoexsrgi
 1                  dwwdfndwrqfh
 2                  cvvcemcvqpeg
 3                  buubdlbupodf
 4                  attackatonce
 5                  zsszbjzsnmbd
 6                  yrryaiyrmlac
 ...
23                  haahjrhavujl
24                  gzzgiqgzutik
25                  fyyfhpfytshj

The Caesar cipher can be easily broken even in a ciphertext-only scenario. Two situations can be considered:

  1. an attacker knows (or guesses) that some sort of simple substitution cipher has been used, but not specifically that it is a Caesar scheme;
  2. an attacker knows that a Caesar cipher is in use, but does not know the shift value.

In the first case, the cipher can be broken using the same techniques as for a general simple substitution cipher, such as frequency analysis or pattern words. While solving, it is likely that an attacker will quickly notice the regularity in the solution and deduce that a Caesar cipher is the specific algorithm employed.
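The "regularity" an attacker would notice can be made concrete: in a Caesar cipher, every ciphertext letter sits the same number of positions after its plaintext letter. The sketch below is only illustrative; the function name `caesar_shift_of` and the idea of passing in a partially recovered mapping are assumptions made for this example, not something the text prescribes.

```python
import string
from typing import Optional

ALPHABET = string.ascii_lowercase


def caesar_shift_of(mapping: dict[str, str]) -> Optional[int]:
    """Return the common shift if a (possibly partial) cipher->plain letter
    mapping is consistent with a Caesar cipher, otherwise None.

    `mapping` is a hypothetical input: the letter pairs an attacker has
    recovered so far while solving a general substitution cipher.
    """
    shifts = {
        (ALPHABET.index(c) - ALPHABET.index(p)) % 26
        for c, p in mapping.items()
    }
    # A Caesar cipher displaces every letter by the same amount.
    return shifts.pop() if len(shifts) == 1 else None


if __name__ == "__main__":
    # Three pairs recovered from EXXEGOEXSRGI -> attackatonce.
    print(caesar_shift_of({"e": "a", "x": "t", "g": "c"}))
    # An irregular mapping is not a Caesar cipher.
    print(caesar_shift_of({"e": "a", "x": "b"}))
```

Once two or three recovered pairs agree on the same displacement, the remaining letters can be filled in directly rather than solved one by one.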

In the second case, breaking the scheme is even more straightforward. Since there are only a limited number of possible shifts (26 in English), each can be tested in turn in a brute force attack. One way to do this is to write out a snippet of the ciphertext in a table of all possible shifts, a technique sometimes known as "completing the plain component". The table above does this for the ciphertext "EXXEGOEXSRGI"; the plaintext is instantly recognisable by eye at a shift of four. Another way of viewing this method is that, under each letter of the ciphertext, the entire alphabet is written out in reverse starting at that letter. The attack can be accelerated using a set of strips prepared with the alphabet written down them in reverse order. The strips are aligned to form the ciphertext along one row, and the plaintext should appear in one of the other rows.
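The table of candidate plaintexts can be generated mechanically. The following is a minimal sketch, assuming a 26-letter English alphabet and lowercase output; the helper names `decrypt` and `all_candidates` are chosen for this example only.

```python
import string

ALPHABET = string.ascii_lowercase


def decrypt(ciphertext: str, shift: int) -> str:
    """Undo a Caesar shift of `shift` positions; non-letters pass through."""
    out = []
    for ch in ciphertext.lower():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) - shift) % 26])
        else:
            out.append(ch)
    return "".join(out)


def all_candidates(ciphertext: str) -> list[tuple[int, str]]:
    """Return (shift, candidate plaintext) for every possible shift."""
    return [(shift, decrypt(ciphertext, shift)) for shift in range(26)]


if __name__ == "__main__":
    for shift, candidate in all_candidates("EXXEGOEXSRGI"):
        print(f"{shift:2d} {candidate}")
```

Run on "EXXEGOEXSRGI", the row for shift 4 reads "attackatonce"; a person (or a dictionary check) still has to pick the meaningful row.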

Another approach is to match up the frequency distribution of the letters. By graphing the frequencies of letters in the ciphertext, and by knowing the expected distribution of those letters in the original language of the plaintext, a human can easily spot the value of the shift by looking at the displacement of particular features of the graph. This is known as frequency analysis. For example, in English the plaintext frequencies of the letters E and T (usually the most frequent) and Q and Z (typically the least frequent) are particularly distinctive. Computers can also do this by measuring how well the actual frequency distribution matches the expected distribution; for example, the chi-squared statistic can be used.
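As a rough sketch of the automated version, the code below scores each of the 26 candidate decryptions with a chi-squared statistic and picks the lowest. The frequency table holds approximate, commonly quoted percentages for English and is an assumption of this example, as are the names `chi_squared` and `guess_shift`.

```python
import string
from collections import Counter

ALPHABET = string.ascii_lowercase

# Approximate English letter frequencies in percent (assumed values,
# not taken from the text above; any reasonable table would do).
ENGLISH_FREQ = {
    "a": 8.17, "b": 1.49, "c": 2.78, "d": 4.25, "e": 12.70, "f": 2.23,
    "g": 2.02, "h": 6.09, "i": 6.97, "j": 0.15, "k": 0.77, "l": 4.03,
    "m": 2.41, "n": 6.75, "o": 7.51, "p": 1.93, "q": 0.10, "r": 5.99,
    "s": 6.33, "t": 9.06, "u": 2.76, "v": 0.98, "w": 2.36, "x": 0.15,
    "y": 1.97, "z": 0.07,
}
FREQ_TOTAL = sum(ENGLISH_FREQ.values())


def decrypt(ciphertext: str, shift: int) -> str:
    """Undo a Caesar shift of `shift` positions; non-letters pass through."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) - shift) % 26] if ch in ALPHABET else ch
        for ch in ciphertext.lower()
    )


def chi_squared(text: str) -> float:
    """Chi-squared distance between the letter counts of `text` and the
    counts expected for English text of the same length (lower is better)."""
    letters = [ch for ch in text.lower() if ch in ALPHABET]
    n = len(letters)
    if n == 0:
        return float("inf")  # nothing to score
    counts = Counter(letters)
    score = 0.0
    for letter, freq in ENGLISH_FREQ.items():
        expected = n * freq / FREQ_TOTAL
        score += (counts.get(letter, 0) - expected) ** 2 / expected
    return score


def guess_shift(ciphertext: str) -> int:
    """Return the shift whose decryption looks most like English."""
    return min(range(26), key=lambda s: chi_squared(decrypt(ciphertext, s)))
```

The statistic becomes more reliable as the ciphertext gets longer; on a handful of letters the best-scoring shift may not be the correct one, so the result is best treated as a ranking of candidates rather than a definite answer.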

For natural language plaintext, there will typically be only one plausible decryption, although for extremely short plaintexts multiple candidates are possible. For example, the ciphertext "ALIIP" could plausibly decrypt to either "dolls" or "wheel", and "AFCCP" to either "jolly" or "cheer", assuming the plaintext is in English (see also unicity distance).
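The ambiguity of short ciphertexts can be checked by trying every shift and keeping the decryptions that appear in a word list. This is a small sketch; the four-word set `WORDS` is a stand-in chosen for the example, where a real check would use a full dictionary.

```python
import string

ALPHABET = string.ascii_lowercase

# Tiny illustrative word list (an assumption of this example).
WORDS = {"dolls", "wheel", "jolly", "cheer"}


def decrypt(ciphertext: str, shift: int) -> str:
    """Undo a Caesar shift of `shift` positions; non-letters pass through."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) - shift) % 26] if ch in ALPHABET else ch
        for ch in ciphertext.lower()
    )


def plausible_decryptions(ciphertext: str, words: set[str]) -> list[tuple[int, str]]:
    """Return every (shift, plaintext) whose plaintext is a known word."""
    results = []
    for shift in range(26):
        candidate = decrypt(ciphertext, shift)
        if candidate in words:
            results.append((shift, candidate))
    return results


if __name__ == "__main__":
    print(plausible_decryptions("ALIIP", WORDS))  # [(4, 'wheel'), (23, 'dolls')]
    print(plausible_decryptions("AFCCP", WORDS))  # [(17, 'jolly'), (24, 'cheer')]
```

Each ciphertext yields two valid English words at different shifts, so without more context an attacker cannot tell which was intended.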

Multiple encryptions and decryptions provide no additional security. This is because two successive encryptions with, say, shift A and shift B are equivalent to a single encryption with shift A + B (taken modulo 26). In mathematical terms, the set of encryption operations under each possible key forms a group under composition.
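The equivalence can be verified exhaustively for a 26-letter alphabet. The sketch below assumes lowercase English letters; `encrypt` and `composition_holds` are names introduced here for illustration.

```python
import string

ALPHABET = string.ascii_lowercase


def encrypt(plaintext: str, shift: int) -> str:
    """Apply a Caesar shift of `shift` positions; non-letters pass through."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + shift) % 26] if ch in ALPHABET else ch
        for ch in plaintext.lower()
    )


def composition_holds(message: str) -> bool:
    """Check that shifting by a and then by b equals one shift by (a + b) mod 26,
    for every pair of shifts."""
    return all(
        encrypt(encrypt(message, a), b) == encrypt(message, (a + b) % 26)
        for a in range(26)
        for b in range(26)
    )


if __name__ == "__main__":
    print(composition_holds("attackatonce"))
    # Encrypting twice with shifts 3 and 1 is the same as one shift of 4.
    print(encrypt(encrypt("attackatonce", 3), 1))  # exxegoexsrgi
```

Because any chain of shifts collapses to a single shift, an attacker facing a multiply-encrypted message still has only 26 keys to try.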
