Decrypting Text

The Colossus at Bletchley

Frequency analysis

Text is sometimes encrypted by replacing each letter with another. To start deciphering such text, it is useful to get a frequency count of all the letters. The most frequent ciphertext letter may represent E, the most common letter in English, followed by T, A, O and I, whereas the least frequent letters are Q, Z and X. Common percentages in standard English are:

a     b     c     d     e     f     g     h     i     j     k     l     m
8.2   1.5   2.8   4.3   12.7  2.2   2.0   6.1   7.0   0.2   0.8   4.0   2.4
n     o     p     q     r     s     t     u     v     w     x     y     z
6.7   7.5   1.9   0.1   6.0   6.3   9.1   2.8   1.0   2.4   0.2   2.0   0.1

 

and ranked in order:

e     t     a     o     i     n     s     h     r     d     l     u     c
12.7  9.1   8.2   7.5   7.0   6.7   6.3   6.1   6.0   4.3   4.0   2.8   2.8
m     w     f     y     g     p     b     v     k     x     j     q     z
2.4   2.4   2.2   2.0   2.0   1.9   1.5   1.0   0.8   0.2   0.2   0.1   0.1
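The frequency count described above can be sketched in a few lines of Python. This is only an illustration, not the method used by this page: the function names (letter_percentages, ranked_letters, first_guess_key) are made up for the example, and the ranked alphabet is simply copied from the table above. Pairing letters by rank gives a rough first guess that usually needs correcting by hand, especially for short ciphertexts.

```python
from collections import Counter

# English letters from most to least frequent, copied from the ranked table above.
ENGLISH_RANKED = "etaoinshrdlucmwfygpbvkxjqz"


def letter_percentages(text):
    """Return {letter: percentage} for the letters a-z found in text."""
    letters = [ch for ch in text.lower() if "a" <= ch <= "z"]
    total = len(letters)
    if total == 0:
        return {}
    counts = Counter(letters)
    return {ch: 100.0 * n / total for ch, n in counts.items()}


def ranked_letters(text):
    """Letters that occur in text, most frequent first (ties broken alphabetically)."""
    pct = letter_percentages(text)
    return sorted(pct, key=lambda ch: (-pct[ch], ch))


def first_guess_key(ciphertext):
    """Pair the i-th most frequent cipher letter with the i-th most frequent English letter.

    This is only a starting point (a hypothetical helper for illustration).
    """
    return dict(zip(ranked_letters(ciphertext), ENGLISH_RANKED))
```

For example, first_guess_key(ciphertext) returns a dictionary mapping each ciphertext letter to a candidate plaintext letter, which can then be adjusted by looking at common pairs and triplets.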

 

Common pairs include the consonant pair TH and the vowel pair EA. Others, many of which are also common two-letter words, are OF, TO, IN, IT, IS, BE, AS, AT, SO, WE, HE, BY, OR, ON, DO, IF, ME, MY and UP. Common pairs of repeated letters are SS, EE, TT, FF, LL, MM and OO. Common triplets are THE, EST, FOR, AND, HIS, ENT and THA.
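Pairs, doubled letters and triplets can be counted in the same way as single letters. The sketch below is an assumption about how such a count might be done, with hypothetical function names (ngram_counts, doubled_letters). It counts sequences inside each word only, so it relies on the ciphertext keeping its word breaks; if the spaces have been removed, the whole text is treated as one word.

```python
import re
from collections import Counter


def ngram_counts(text, n):
    """Count n-letter sequences inside each word of text.

    Sequences do not span spaces, digits or punctuation.
    """
    counts = Counter()
    for word in re.findall(r"[a-z]+", text.lower()):
        for i in range(len(word) - n + 1):
            counts[word[i:i + n]] += 1
    return counts


def doubled_letters(text):
    """Count pairs made of the same letter twice, such as 'ss' or 'ee'."""
    pairs = ngram_counts(text, 2)
    return Counter({p: c for p, c in pairs.items() if p[0] == p[1]})
```

Calling ngram_counts(ciphertext, 3).most_common(5) lists the five most frequent triplets, which can be compared against THE, AND and the other common triplets listed above.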

If the results show that E followed by T are the most common letters, the ciphertext may be a transposition cipher rather than a substitution cipher, since transposition rearranges the letters without changing them. If one of the characters has a frequency of about 20%, the language could be German, since it has a high percentage of E. Italian has 3 letters with a frequency greater than 10% and 9 letters with a frequency below 1%.
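These rules of thumb can be turned into a rough check. The sketch below is a heuristic under stated assumptions, not a reliable classifier: the function name cipher_hint and the wording of its messages are invented for the example, the 20% threshold is taken directly from the paragraph above, and the result is unlikely to mean much for very short ciphertexts.

```python
from collections import Counter


def cipher_hint(ciphertext):
    """Return a rough hint based on the most frequent letters (hypothetical helper)."""
    letters = [ch for ch in ciphertext.lower() if "a" <= ch <= "z"]
    if not letters:
        return "no letters to analyse"
    counts = Counter(letters)
    # Most frequent first, ties broken alphabetically.
    ranked = sorted(counts, key=lambda ch: (-counts[ch], ch))
    top_share = 100.0 * counts[ranked[0]] / len(letters)
    if ranked[:2] == ["e", "t"]:
        # The letters keep their English frequencies, so they were probably only moved.
        return "possible transposition: e and t are already the most common letters"
    if top_share >= 20.0:
        # Threshold assumed from the text: a very dominant letter suggests another language.
        return "one letter is at 20% or more: the plaintext may not be English"
    return "possible substitution: the most common letters are not e and t"
```

The checks are applied in order, so a text whose top two letters are E and T is reported as a possible transposition even if E is unusually frequent.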

The box below contains example ciphertext. You can paste any text that you want to decipher over this example text. Press the button. Useful statistics will appear.
