Encryption - Caesar cipher


This is the first in a series of articles about encryption. They have no professional purpose; they are simply a way to share a part of computer science that I like.

I will talk about the Caesar cipher: the history behind it, how to encrypt a plaintext message, and how to decrypt it.

Caesar cipher

One of the first substitution ciphers was used by Julius Caesar, the Roman general and statesman, and it is now called the Caesar cipher. The cipher is very simple but ingenious, because Caesar understood the need for encryption.

The concept of this cipher is to replace each character of the text with another one. This is called a substitution cipher. For instance, in the picture below, the plaintext message “Substitution cipher” is replaced by the encoded message “vxevwlwxwlrq flskhu”.

Encryption

When we use an algorithm to encode a plaintext message, we need a key, as described in the picture below:

Encryption works

For the Caesar cipher, the key is a number, and we simply shift each letter by that many positions in the alphabet. For instance, with a key of 2, each letter ‘A’ is replaced by ‘C’.

In the cryptogram above, we can see that the pattern of the plaintext is preserved: some characters, like ‘w’, appear often. This makes it easy to understand how the cipher works, and later in this article we will see how to break it.
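The shift described above can be sketched in a few lines of Python. This is only a minimal illustration, not the code of my project: the function name `caesar_encrypt` is hypothetical, and I assume that only the 26 ASCII letters are shifted while every other character is left as it is.

```python
import string


def caesar_encrypt(plaintext, key):
    """Shift each ASCII letter forward by `key` positions, wrapping after 'z'.

    Hypothetical helper for illustration; non-letters are copied unchanged.
    """
    result = []
    for ch in plaintext:
        if ch in string.ascii_lowercase:
            result.append(chr((ord(ch) - ord("a") + key) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            result.append(chr((ord(ch) - ord("A") + key) % 26 + ord("A")))
        else:
            # Spaces, digits and punctuation are not part of the alphabet shift.
            result.append(ch)
    return "".join(result)


print(caesar_encrypt("substitution cipher", 3))  # vxevwlwxwlrq flskhu
print(caesar_encrypt("A", 2))  # C
```

The modulo 26 is what makes the alphabet wrap around, so that with a key of 3 the letter ‘x’ becomes ‘a’.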

Executing the project

I made this multi-tool project to encrypt data into a cryptogram with different ciphers, including Caesar and Vigenère (which I will explain in another article). You can find the project at this link: https://gitlab.com/gbucchino/cryptography

To run the program, execute the bash script exec.sh:

$ ./exec.sh
Usage: [options] [crypt|decrypt] [file]
Options:
	-c: Caesar cipher
	-v: Vigenere cipher
	-t: Transposition cipher

As you can see, we need to pass several arguments to the program. The first argument is the kind of cipher we want to use, the second is whether to crypt or decrypt the file, and the last is the file that contains the plaintext message or the cryptogram.

First, create your plain text message:

$ cat test.txt
Hello World !
I am a plain text message.

Then, you can execute the program:

$ ./exec.sh -c crypt test.txt 
Your key (between 1 and 26): 3

That will generate a file called test.txt.crypt:

$ ls -la test.txt.crypt 
-rw-r--r-- 1 geoffrey geoffrey 41 Sep 25 19:42 test.txt.crypt
$ cat test.txt.crypt 
KHOOR ZRUOG !
L DP D SODLQ WHAW PHVVDJH

We have encrypted our message. We can decrypt it like this:

$ ./exec.sh -c decrypt test.txt.crypt 
Your key (between 1 and 26): 3
$ cat test.txt.crypt.decrypt 
HELLO WORLD !
I AM A PLAIN TEXT MESSAGE.
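Decrypting with a known key is the same shift in the opposite direction. Here is a minimal Python sketch of that idea, again not the project's code: the name `caesar_decrypt` is hypothetical, and I assume the cryptogram is in uppercase, as in the output above.

```python
def caesar_decrypt(cryptogram, key):
    """Shift each uppercase ASCII letter back by `key` positions.

    Hypothetical helper for illustration; other characters are copied unchanged.
    """
    out = []
    for ch in cryptogram:
        if "A" <= ch <= "Z":
            # Python's % always returns a non-negative result, so 'A' - 3 wraps to 'X'.
            out.append(chr((ord(ch) - ord("A") - key) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)


print(caesar_decrypt("KHOOR ZRUOG !", 3))  # HELLO WORLD !
```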

Breaking the code

Breaking the code means that we do not have the key to decrypt the cryptogram, so we need to analyze it. This is the role of cryptanalysis, the part of cryptology concerned with breaking cryptograms. Cryptanalysis is a set of techniques for understanding how a cipher works and trying to recover the key. It is a fundamental part of every branch of security.

Frequency analysis

A cryptanalyst can use frequency analysis to break a simple cryptogram, such as one produced by the Caesar or Vigenère cipher. Typically, frequency analysis consists of counting the number of occurrences of each letter in the cryptogram and using these counts to try to break it. Every language has some letters, or combinations of letters, that appear often; in French, for example, the letter ‘e’ is the most used. With this method, we can try to guess which plaintext letter each letter of the cryptogram stands for.

I made a project with some Python scripts for cryptanalysis: https://gitlab.com/gbucchino/cryptanalysis
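Counting letters as described above takes only a few lines of Python. The sketch below is an illustration and not the stats.py script from my project: the function name `letter_frequencies` and the sample sentence are made up for the example.

```python
from collections import Counter


def letter_frequencies(text):
    """Return (letter, relative frequency) pairs, most frequent first.

    Hypothetical helper for illustration; case is ignored and only
    alphabetic characters are counted.
    """
    letters = [ch.lower() for ch in text if ch.isalpha()]
    counts = Counter(letters)
    total = len(letters)
    # An empty text simply gives an empty list (no division happens).
    return [(letter, count / total) for letter, count in counts.most_common()]


sample = "every letter here serves the example"
for letter, freq in letter_frequencies(sample)[:3]:
    print(f"{letter}: {freq:.2f}")
```

On this sample sentence, ‘e’ comes first, which is the kind of signal frequency analysis relies on.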

Statistics

The script stats.py takes one optional argument, the name of the file to analyze; if you omit it, the script generates a Lorem Ipsum text. I included the text file poem.txt, which contains the poem by Victor Hugo called A un martyr:

$ python3 stats.py -f poem.txt

The script displays the statistics as a graph. The figure below shows the frequency of each letter in the poem:

Statistics

As we see in the figure above, the letter ‘e’ is the most used in the poem. From that, we can deduce the key of the cryptogram: if we run the same analysis on the cryptogram and identify its most used letter, we can assume that it stands for the letter ‘e’. For instance, you can encode the poem with the Caesar cipher and run stats.py to analyze the result:

$ python3 stats.py -f poem.txt.crypt

Statistics

In my example, we can see that the letter ‘i’ is the most used. With these statistics, we can try to find the key. Can you find the key I used?

Yes, the answer is 4: ‘i’ is four positions after ‘e’ in the alphabet. Easy, right?
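This deduction can also be automated. The sketch below is an assumption-laden illustration, not part of my project: the function name `guess_caesar_key` is hypothetical, and it assumes that the most frequent letter of the cryptogram stands for ‘e’, which can fail on short or unusual texts.

```python
from collections import Counter


def guess_caesar_key(cryptogram, expected_letter="e"):
    """Guess the key from the distance between the most frequent letter
    of the cryptogram and `expected_letter`.

    Hypothetical helper for illustration; only ASCII letters are counted.
    """
    letters = [ch.lower() for ch in cryptogram if ch.isascii() and ch.isalpha()]
    if not letters:
        raise ValueError("the cryptogram contains no letters")
    most_common_letter, _ = Counter(letters).most_common(1)[0]
    return (ord(most_common_letter) - ord(expected_letter)) % 26


# "every letter here serves the example" shifted by 4 positions
print(guess_caesar_key("izivc pixxiv livi wivziw xli ibeqtpi"))  # 4
```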

Brute force

In the project https://gitlab.com/gbucchino/cryptanalysis, I also made a Python script for decrypting a cryptogram encoded with a Caesar cipher.

To decode the cryptogram, the script tests every value of the key, from 1 to 26, and displays the results. The script takes one argument, the cryptogram file:

$ python3 decryptCaesar.py -f test.txt.crypt 
key: 1
['L', 'I', 'P', 'P', 'S', 'A', 'S', 'V', 'P', 'H', 'M', 'E', 'Q', 'E', 'T', 'P', 'E', 'M', 'R', 'X', 'I', 'B', 'X', 'Q', 'I', 'W', 'W', 'E', 'K', 'I']
key: 2
['K', 'H', 'O', 'O', 'R', 'Z', 'R', 'U', 'O', 'G', 'L', 'D', 'P', 'D', 'S', 'O', 'D', 'L', 'Q', 'W', 'H', 'A', 'W', 'P', 'H', 'V', 'V', 'D', 'J', 'H']
key: 3
['J', 'G', 'N', 'N', 'Q', 'Y', 'Q', 'T', 'N', 'F', 'K', 'C', 'O', 'C', 'R', 'N', 'C', 'K', 'P', 'V', 'G', 'Z', 'V', 'O', 'G', 'U', 'U', 'C', 'I', 'G']
key: 4
['I', 'F', 'M', 'M', 'P', 'X', 'P', 'S', 'M', 'E', 'J', 'B', 'N', 'B', 'Q', 'M', 'B', 'J', 'O', 'U', 'F', 'Y', 'U', 'N', 'F', 'T', 'T', 'B', 'H', 'F']
key: 5
['H', 'E', 'L', 'L', 'O', 'W', 'O', 'R', 'L', 'D', 'I', 'A', 'M', 'A', 'P', 'L', 'A', 'I', 'N', 'T', 'E', 'X', 'T', 'M', 'E', 'S', 'S', 'A', 'G', 'E']
key: 6
['G', 'D', 'K', 'K', 'N', 'V', 'N', 'Q', 'K', 'C', 'H', 'Z', 'L', 'Z', 'O', 'K', 'Z', 'H', 'M', 'S', 'D', 'W', 'S', 'L', 'D', 'R', 'R', 'Z', 'F', 'D']
key: 7
['F', 'C', 'J', 'J', 'M', 'U', 'M', 'P', 'J', 'B', 'G', 'Y', 'K', 'Y', 'N', 'J', 'Y', 'G', 'L', 'R', 'C', 'V', 'R', 'K', 'C', 'Q', 'Q', 'Y', 'E', 'C']

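In the example above, the key is 5, because it is the only value that produces a human-readable message. (Judging by the output, this cryptogram was encrypted with a key of 5, not the key of 3 used earlier.)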
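The brute-force idea fits in a short Python function. This is a minimal sketch and not the decryptCaesar.py script itself: the name `brute_force_caesar` is hypothetical, I assume an uppercase cryptogram, and I only try keys 1 to 25 because a key of 26 would give back the cryptogram unchanged.

```python
def brute_force_caesar(cryptogram):
    """Return a dict mapping every candidate key to its decryption.

    Hypothetical helper for illustration; only uppercase ASCII letters
    are shifted, other characters are copied unchanged.
    """
    candidates = {}
    for key in range(1, 26):
        candidates[key] = "".join(
            chr((ord(ch) - ord("A") - key) % 26 + ord("A")) if "A" <= ch <= "Z" else ch
            for ch in cryptogram
        )
    return candidates


# "HELLO WORLD" encrypted with a key of 5
for key, text in brute_force_caesar("MJQQT BTWQI").items():
    print(f"key: {key} -> {text}")
```

The function does not decide which candidate is the right one; as above, a human (or a frequency check) still has to pick the readable message.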

In conclusion

Nowadays, breaking a Caesar cipher is easy. But in the Roman era, using a cryptogram to hide your message and keep it from being read by the enemy was a great idea, and we can understand why Caesar was a great leader (that is just my point of view).

I hope you liked this article about the Caesar cipher. In the future, I will write more articles about the encryption used in ancient times.