Posted in Science & Nature

Cryptography: Caesar Cipher

One of the earliest known uses of cryptography can be traced back to ancient Rome. Julius Caesar was well known for his use of a type of substitution cipher dubbed the “Caesar cipher” or “Caesar shift”. The encryption is very simple: shift every letter a fixed number of places down the alphabet, wrapping around from Z back to A (the number of places is known as the key). For example, Caesar used a key of 3 to encrypt messages to his generals, so the message “ATTACK AT DAWN” would be encrypted into “DWWDFN DW GDZQ” (using the scheme a=0, b=1, c=2, d=3…, each letter’s number is increased by 3).
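The shift described above can be sketched in a few lines of Python. This is only a minimal illustration: the function name caesar_encrypt is made up for this example, and the choice to leave spaces and punctuation unchanged is an assumption rather than part of the historical cipher.

```python
def caesar_encrypt(message: str, key: int) -> str:
    """Shift each English letter `key` places down the alphabet (a=0 ... z=25)."""
    result = []
    for ch in message:
        if ch.isascii() and ch.isalpha():
            # Work relative to 'A' or 'a' so that case is preserved.
            base = ord("A") if ch.isupper() else ord("a")
            # The modulo wraps letters past Z back around to A.
            result.append(chr((ord(ch) - base + key) % 26 + base))
        else:
            # Assumption: spaces, digits and punctuation pass through untouched.
            result.append(ch)
    return "".join(result)


print(caesar_encrypt("ATTACK AT DAWN", 3))   # DWWDFN DW GDZQ
print(caesar_encrypt("DWWDFN DW GDZQ", -3))  # ATTACK AT DAWN
```

Decryption is the same operation with the key negated, which is why the second call recovers the original message.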

Although it was an effective encryption system in ancient times, cryptography has since moved on to far more secure methods. The Caesar cipher has thus been relegated to a code used by children and teenagers for basic decoding puzzles.

Due to the simplicity of the encryption, cracking the Caesar cipher is quite easy with the use of frequency analysis, pattern recognition or a brute force attack. A brute force attack can be used if the attacker knows (or suspects) that a Caesar cipher has been used. In that case, the message can be decrypted using every possible key (1, 2, 3… up to 25, as there are only 25 distinct shifts) until a message that makes sense appears.
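A brute force attack of this kind might look like the following Python sketch. The helper names caesar_shift and brute_force are invented for this example, and picking out the candidate that “makes sense” is left to the reader rather than automated.

```python
def caesar_shift(text: str, key: int) -> str:
    """Shift each English letter by `key` places; a negative key shifts backwards."""
    result = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            result.append(chr((ord(ch) - base + key) % 26 + base))
        else:
            result.append(ch)
    return "".join(result)


def brute_force(ciphertext: str) -> dict:
    """Return the candidate plaintext for every possible key from 1 to 25."""
    return {key: caesar_shift(ciphertext, -key) for key in range(1, 26)}


# Print all 25 candidates; a human reader spots the one that is real English.
for key, candidate in brute_force("DWWDFN DW GDZQ").items():
    print(key, candidate)
```

With the message from the earlier example, the line for key 3 reads “ATTACK AT DAWN” while the other 24 lines are gibberish.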

Cryptography: Frequency Analysis

A cipher is a method of encoding a message using a certain key; the encoded message itself is called the ciphertext. The most common and basic ciphers use letter substitution, where each letter stands for a different letter. For example, the message may be encoded so that each letter is replaced by the letter three places after it in the alphabet (e.g. “a” becomes “d”, “b” becomes “e” etc.). This creates a jumble of letters that appears to be indecipherable.

However, the characteristics of substitution ciphers make them among the easiest types of encryption to break. As each letter can only represent one other letter, once the key is cracked (i.e. which letter stands for which), the message and any future messages using the same key can be read. The most important tools for breaking substitution ciphers are pattern recognition and frequency analysis.

Frequency analysis relies on the fact that every language uses certain letters more often than others. In the English language, the most used letters, in order, are: E, T, A, O, I, N, S, H, R, D, L, U (in practice, the first few, such as E, T, A and O, are the most dependable, and the rest become less reliable, especially in short messages).

For example, if Eve intercepts a long, encrypted message that she suspects to be a simple substitution cipher, she will first analyse the text for the most common letter, bigram (two-letter sequence) and trigram (three-letter sequence). If she finds that I is the most common single letter, XL the most common bigram and XLI the most common trigram, she can infer with reasonable confidence that I=e, X=t and L=h (“th” and “the” are the most common bigram and trigram in English respectively). Once she substitutes these letters into the ciphertext, she will soon discover that certain patterns arise. Eve may notice words such as “thCt” and deduce that C=a, or find familiar words and fill in the blanks in the key. The discovery of each letter reveals more patterns, and this snowballing process quickly breaks the code.
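Eve's two steps, counting letter sequences and then filling in her guesses, can be sketched in Python as follows. The function names ngram_counts and apply_partial_key and the tiny ciphertext are invented for this example; a real attack would need a far longer message for the counts to mean anything.

```python
from collections import Counter


def ngram_counts(text: str, n: int) -> Counter:
    """Count n-letter sequences inside each word of the text."""
    counts = Counter()
    for word in text.upper().split():
        letters = "".join(ch for ch in word if "A" <= ch <= "Z")
        for i in range(len(letters) - n + 1):
            counts[letters[i:i + n]] += 1
    return counts


def apply_partial_key(ciphertext: str, key: dict) -> str:
    """Replace known cipher letters with lowercase guesses; leave the rest as-is."""
    return "".join(key.get(ch, ch) for ch in ciphertext)


ciphertext = "XLCX ZQ XLI"  # hypothetical intercepted fragment
print(ngram_counts(ciphertext, 2).most_common(1))

guesses = {"X": "t", "L": "h", "I": "e"}  # Eve's guesses from the example
print(apply_partial_key(ciphertext, guesses))  # thCt ZQ the
```

Writing the guessed plaintext in lowercase and the unsolved cipher letters in uppercase makes gaps such as “thCt” stand out.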

Frequency analysis is extremely useful as it can be used to attack any simple substitution cipher, even one that does not use letters. For example, in Sir Arthur Conan Doyle’s Sherlock Holmes tale The Adventure of the Dancing Men, Sherlock Holmes uses frequency analysis to interpret a cryptogram made up of a string of symbols depicting dancing men.

To address this weakness in substitution ciphers, cryptographers have devised better encryption methods such as polyalphabetic substitution, where several cipher alphabets are used in turn (e.g. chosen from a grid of shifted alphabets, also called a tabula recta).
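As a rough illustration of polyalphabetic substitution, the Python sketch below builds a tabula recta and uses a keyword to pick a different row for each letter. This particular scheme is the well-known Vigenère cipher, chosen here as an assumption because the text above does not name a specific method; the function name and the keyword “LEMON” are also just for illustration.

```python
import string

ALPHABET = string.ascii_uppercase
# Tabula recta: 26 rows, each the alphabet shifted one place further left.
TABULA_RECTA = [ALPHABET[i:] + ALPHABET[:i] for i in range(26)]


def vigenere_encrypt(message: str, keyword: str) -> str:
    """Encrypt with a repeating keyword; each keyword letter selects a table row."""
    result = []
    position = 0  # counts letters only, so spaces do not use up keyword letters
    for ch in message.upper():
        if ch in ALPHABET:
            key_letter = keyword[position % len(keyword)].upper()
            row = TABULA_RECTA[ALPHABET.index(key_letter)]
            result.append(row[ALPHABET.index(ch)])
            position += 1
        else:
            result.append(ch)
    return "".join(result)


print(vigenere_encrypt("ATTACK AT DAWN", "LEMON"))  # LXFOPV EF RNHR
```

The same plaintext letter no longer always maps to the same cipher letter (the first two Ts become X and F), which flattens the letter counts that simple frequency analysis relies on.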