So far, the three ciphers introduced could all be cracked easily using frequency analysis and the Kasiski examination. Is there a cipher that is easy to implement yet difficult for a beginner cryptanalyst to break? An extremely popular and surprisingly powerful cipher is the book cipher. Essentially, the book cipher replaces a keyword with an entire book. Instead of substituting one letter for another letter or symbol in a systematic, mathematical way (such as a fixed shift or a tabula recta), the book cipher replaces letters with numbers that point to a position in the text of a book. Because the message can realistically only be decoded by someone who has the book, it is a very secure way of enciphering a message, provided that both parties hold identical copies of the book.
There are many variations of the book cipher. The most popular type gives a page number, with the first letter on that page being the plaintext letter. A variant gives a set of three numbers for every letter: the page number, the line number and the word number (or just two: page and line, then take the first letter). Ironically, this can be less secure at times, as the groups of numbers may reveal that a book cipher is in use. Doing this for each letter also makes the enciphering and deciphering process long and arduous.
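The three-number variant can be sketched in a few lines of Python. This is only an illustration: the tiny BOOK below and the function names are invented for the example, and a real keytext would be a full printed book. Each plaintext letter is replaced by the (page, line, word) position of a word that starts with that letter.

```python
# A hypothetical two-page "book": a list of pages, each a list of lines.
BOOK = [
    ["the quick brown fox", "jumps over a lazy dog"],          # page 1
    ["some men interpret nine memos", "give way to zebras"],   # page 2
]

def build_index(book):
    """Map each letter to every (page, line, word) whose word starts with it."""
    index = {}
    for p, page in enumerate(book, start=1):
        for l, line in enumerate(page, start=1):
            for w, word in enumerate(line.split(), start=1):
                index.setdefault(word[0], []).append((p, l, w))
    return index

def encipher(plaintext, book):
    """Replace each letter with the first coordinate found for it.

    Raises KeyError if no word in the book starts with a needed letter.
    """
    index = build_index(book)
    return [index[ch][0] for ch in plaintext.lower() if ch.isalpha()]

def decipher(coords, book):
    """Look up each coordinate and take the first letter of that word."""
    return "".join(book[p - 1][l - 1].split()[w - 1][0] for p, l, w in coords)

coords = encipher("dog", BOOK)
print(coords)                  # [(1, 2, 5), (1, 2, 2), (2, 2, 1)]
print(decipher(coords, BOOK))  # dog
```

This sketch always picks the first matching coordinate, so the same letter always produces the same numbers; choosing a different matching word each time would avoid that repetition.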
A shortcut method is to refer to a whole word within a page (using the three-number coordinate method described above), which shortens the ciphertext. Although this method is much easier in practice, it poses the challenge of finding a book that includes all the words in the plaintext, which may be difficult if the code is for military or espionage purposes.
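A minimal sketch of the word-level shortcut, again using an invented miniature book and hypothetical function names, shows both the shorter ciphertext and the vocabulary problem: a message containing a word the book lacks cannot be enciphered at all.

```python
# A hypothetical two-page "book": a list of pages, each a list of lines.
BOOK = [
    ["meet me at the old bridge", "bring the map at dawn"],  # page 1
    ["the river runs north", "wait for my signal"],          # page 2
]

def build_word_index(book):
    """Map each word to the (page, line, word) of its first occurrence."""
    index = {}
    for p, page in enumerate(book, start=1):
        for l, line in enumerate(page, start=1):
            for w, word in enumerate(line.split(), start=1):
                index.setdefault(word, (p, l, w))
    return index

def encipher_words(message, book):
    """Replace each word with its coordinates; fail if a word is missing."""
    index = build_word_index(book)
    words = message.lower().split()
    missing = [word for word in words if word not in index]
    if missing:
        raise ValueError(f"words not in book: {missing}")
    return [index[word] for word in words]

def decipher_words(coords, book):
    """Look up each coordinate and return the words joined by spaces."""
    return " ".join(book[p - 1][l - 1].split()[w - 1] for p, l, w in coords)

coords = encipher_words("wait at the bridge", BOOK)
print(coords)                        # [(2, 2, 1), (1, 1, 3), (1, 1, 4), (1, 1, 6)]
print(decipher_words(coords, BOOK))  # wait at the bridge
```

Four coordinates cover a message of fifteen letters here, whereas the letter-by-letter method would need fifteen.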
Because of this, and the fact that both parties (or everyone in the ring) need identical versions of the book while not standing out too much, the most common books used are the dictionary (typically a famous version such as the Oxford Dictionary) or the Bible (again, a standard version is used). These books are not only good because they contain a massive vocabulary, but they are also inconspicuous when carried around in enemy territory.
The book cipher is a very difficult code to crack for most people without advanced cryptanalysis training. Thus, the easiest way to crack it is to deduce which book is the keytext. There are numerous ways to do this, but one way would be to cross-match the books owned by two known spies until common books are found. In the setting of spies in a foreign country, a book such as a traveller’s guide or phrasebook dictionary can be considered a likely candidate, as it can be carried around easily while containing many words. Ergo, the secret behind cracking the book cipher is less about cryptography and more about using the science of deduction.
The Kasiski examination can be used to attack polyalphabetic substitution ciphers such as the Vigenère cipher, revealing the keyword that was used to encrypt the message. Before this method was published by Friedrich Kasiski in 1863, the Vigenère cipher was considered “indecipherable” as there was no simple way to work out the encryption unless the keyword was known. With the Kasiski examination, even the Vigenère cipher is no longer safe.
The Kasiski examination is based on the fact that, if the keyword has n letters, every nth letter of the message is encoded with the same shift. Simply put, if the ciphertext is written out in rows of n letters, each column can be treated as a single monoalphabetic substitution cipher that can be broken with frequency analysis. Ergo, all the cryptanalyst needs in order to reduce the Vigenère cipher to a set of Caesar ciphers is the length of the keyword.
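To make the column idea concrete, here is a minimal sketch (function names are illustrative, and it assumes an A to Z alphabet with non-letters ignored). Encrypting a run of the letter A with the keyword KEY simply reproduces the keyword, and splitting the result into three columns shows that each column was shifted by a single keyword letter.

```python
from string import ascii_uppercase as ALPHABET

def vigenere_encrypt(plaintext, keyword):
    """Encrypt with the Vigenère cipher, ignoring anything that is not a letter."""
    letters = [c for c in plaintext.upper() if c.isalpha()]
    out = []
    for i, c in enumerate(letters):
        shift = ALPHABET.index(keyword[i % len(keyword)].upper())
        out.append(ALPHABET[(ALPHABET.index(c) + shift) % 26])
    return "".join(out)

def split_columns(ciphertext, n):
    """Collect every nth letter into its own string, one string per key position."""
    letters = [c for c in ciphertext.upper() if c.isalpha()]
    return ["".join(letters[i::n]) for i in range(n)]

ciphertext = vigenere_encrypt("AAAAAA", "KEY")
print(ciphertext)                    # KEYKEY
print(split_columns(ciphertext, 3))  # ['KK', 'EE', 'YY']
```

With real plaintext each column would contain a mixture of letters, but all of them shifted by the same amount, which is exactly what a Caesar cipher is.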
To find the length of the keyword, look for a string of text that is repeated in the ciphertext (it should be at least three letters long, as shorter repeats often occur by chance). The distance between two occurrences of the same string is likely to be a multiple of the length of the keyword. The distance is defined as the number of characters from the start of the first occurrence to the start of the second (e.g. in “abcdefxyzxyzxyzabcdef”, “abcdef” is repeated, and the second occurrence starts 15 letters after the first, so the distance is 15). The reason this works is that if a string is repeated in the plaintext and the distance between the two occurrences is a multiple of the keyword length, the same keyword letters line up with both occurrences, so the repetition appears in the ciphertext as well. If the distance is not a multiple of the keyword length, the keyword letters do not line up, and the two occurrences will be encrypted differently even though the plaintext is the same.
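The search for repeated strings is tedious by hand but easy to sketch in code. The function below is an illustrative helper, not a standard library routine: it records where every substring of a chosen length starts and reports the distances between consecutive occurrences.

```python
from collections import defaultdict

def repeat_distances(ciphertext, length=3):
    """Return {repeated substring: [distances between consecutive occurrences]}."""
    letters = "".join(c for c in ciphertext.upper() if c.isalpha())
    positions = defaultdict(list)
    for i in range(len(letters) - length + 1):
        positions[letters[i:i + length]].append(i)
    return {
        seq: [b - a for a, b in zip(pos, pos[1:])]
        for seq, pos in positions.items()
        if len(pos) > 1
    }

print(repeat_distances("abcdefxyzxyzxyzabcdef", length=6))
# {'ABCDEF': [15], 'XYZXYZ': [3]}
```

On the toy string from the example, "abcdef" is found at a distance of 15, matching the count done by hand. Measuring from start to start gives the same number as measuring from last letter to last letter.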
It is useful to record the distance between each pair of repeated strings in order to find their greatest common factor. The number that divides most of these distances (e.g. 6 is a factor of 6, 12, 18…) is most likely the length of the keyword; a few distances may not fit, because some repeats arise by coincidence. Once the length n of the keyword is found, every nth letter must have been encrypted using the same letter of the keyword. Thus, by collecting every nth letter into one string, you obtain what is essentially a Caesar cipher, which can then be attacked using frequency analysis. Once a few of these strings (starting from different positions in the ciphertext) are solved, the keyword can be revealed by checking each shift against a tabula recta (e.g. if a certain string of nth letters is found to have been shifted by 3 letters, then the corresponding letter in the keyword must be “D”, which shifts every plaintext letter by 3 in the Vigenère cipher). When the keyword is deduced, every message encrypted using that keyword can easily be decoded.
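These last steps can be sketched as three small helpers. The names are illustrative, and guess_shift rests on a loud assumption: the plaintext is English and the column is long enough that its most common letter stands for E. On short columns this crude guess will often be wrong and a fuller frequency comparison is needed.

```python
from collections import Counter
from functools import reduce
from math import gcd

def factor_votes(distances, max_len=20):
    """Count, for each candidate key length, how many distances it divides."""
    votes = Counter()
    for d in distances:
        for n in range(2, max_len + 1):
            if d % n == 0:
                votes[n] += 1
    return votes

def guess_shift(column):
    """Guess a Caesar shift by assuming the most common letter is E (assumption)."""
    top_letter = Counter(column.upper()).most_common(1)[0][0]
    return (ord(top_letter) - ord("E")) % 26

def shift_to_key_letter(shift):
    """A shift of 0 corresponds to keyword letter A, 3 to D, and so on."""
    return chr(ord("A") + shift % 26)

distances = [6, 12, 18, 30]
print(reduce(gcd, distances))         # 6
print(factor_votes(distances)[6])     # 4
print(shift_to_key_letter(guess_shift("HHHHKL")))  # D
```

Note that the small factors 2 and 3 also divide every distance in this example, so the votes alone do not single out 6. The greatest common factor, or the largest candidate among the top-scoring ones, is usually the better first guess, and the vote count is mainly useful when a coincidental repeat spoils the exact greatest common factor.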
Although the Kasiski examination appears complex, trying it out reveals how simple the process is. It is therefore worthwhile to encrypt a message using the Vigenère cipher and then try to work out the keyword using the Kasiski examination. Much like frequency analysis, it is an extremely useful tool when a secret code needs to be broken.