6 Feb 2010

Break This Code!

uxq eai hiaswlrj wvttfdc oi 
wqmvxl rdiy esykhtqjt fhzq hhi 
soerbcn sxfeyby ugd ziypu tnbbz 
ol mm nmcsq


Don't understand the gibberish above? That's what this whole post is about, fortunately. You'll find out what it means later.

Ciphers have been used for thousands of years, even by Julius Caesar. He used a very simple method that most of us probably know already, and it is named after him: the Caesar cipher, or Caesar shift. You take the normal alphabet and simply shift it a few letters to get your cipher! For example:

Plain:  ABCDEFGHIJKLMNOPQRSTUVWXYZ
Cipher: EFGHIJKLMNOPQRSTUVWXYZABCD

In this case, HEAD in plain text would translate to LIEH in cipher text. Pretty simple! Similarly, the cipher text FVMRK QSVI EQQS would mean BRING MORE AMMO. (If I didn't make any mistakes)
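
If you'd rather let a computer do the shifting, here is a minimal Python sketch of the idea. The function name `caesar` and the choice to leave spaces untouched are my own assumptions, not part of any standard library:

```python
import string

ALPHABET = string.ascii_uppercase

def caesar(text, shift):
    """Shift every letter by `shift` places; leave everything else alone."""
    out = []
    for ch in text.upper():
        if ch in ALPHABET:
            # Wrap around the end of the alphabet with % 26
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)  # spaces and punctuation pass through unchanged
    return "".join(out)

print(caesar("HEAD", 4))              # LIEH
print(caesar("FVMRK QSVI EQQS", -4))  # BRING MORE AMMO
```

Decoding is just the same shift in the opposite direction, which is why the second call uses -4.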

However, most people wouldn't be stupid enough to neatly space out all the words; they can lump everything together like FVMRKQSVIEQQS (and you're supposed to be smart enough to figure out what BRINGMOREAMMO means), or even worse, create fixed-length letter "blocks" (like 3-letter groups): FVM RKQ SVI EQQ S. This adds a challenge for whoever's trying to decode the intercepted message.
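
The regrouping trick is easy to sketch in Python too. `regroup` is just a name I made up for illustration:

```python
def regroup(text, size):
    """Drop the spaces, then split the letters into blocks of `size`."""
    letters = text.replace(" ", "")
    blocks = [letters[i:i + size] for i in range(0, len(letters), size)]
    return " ".join(blocks)

print(regroup("FVMRK QSVI EQQS", 3))  # FVM RKQ SVI EQQ S
```

The last block is simply whatever letters are left over, so it can be shorter than the rest.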

There's a variation on this cipher: you decide on a keyword (or keywords) to use, e.g. SINGAPOREAN. That is used for the first few letters of the cipher alphabet, after removing repeated letters like the second A and N. So you have this:

Plain:  ABCDEFGHIJKLMNOPQRSTUVWXYZ
Cipher: SINGAPORE


...Okay, I know I should have just used SINGAPORE anyway...

Then you just allocate the rest of the cipher letters in alphabetical order:

Plain:  ABCDEFGHIJKLMNOPQRSTUVWXYZ
Cipher: SINGAPOREBCDFHJKLMQTUVWXYZ


...but then you find that from T onwards, the plain and cipher alphabets match! So, you may need to shift the leftover letters, here by starting them from Q instead:

Plain:  ABCDEFGHIJKLMNOPQRSTUVWXYZ
Cipher: SINGAPOREQTUVWXYZBCDFHJKLM


NOW you're ready to use the cipher:

Cipher Text: DBL FCEWO DREC DX YSCC WXDAC GFBEWO UACCXWC
Plain Text:  TRY USING THIS TO PASS NOTES DURING LESSONS
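
Here is a rough Python sketch of building the keyword alphabet described above. The function names and the `start` parameter (the leftover letter to begin filling from) are assumptions of mine, chosen to reproduce the two cipher alphabets shown earlier:

```python
import string

ALPHABET = string.ascii_uppercase

def keyword_alphabet(keyword, start="A"):
    """Keyword first (repeats removed), then the unused letters from `start`."""
    # dict.fromkeys keeps only the first occurrence of each letter, in order
    head = "".join(dict.fromkeys(keyword.upper()))
    rest = [c for c in ALPHABET if c not in head]
    later = [c for c in rest if c >= start]    # leftovers from `start` onwards
    earlier = [c for c in rest if c < start]   # leftovers that wrap around
    return head + "".join(later + earlier)

def keyword_encrypt(text, cipher_alphabet):
    """Swap each plain letter for the cipher letter in the same position."""
    return text.upper().translate(str.maketrans(ALPHABET, cipher_alphabet))

print(keyword_alphabet("SINGAPOREAN"))             # SINGAPOREBCDFHJKLMQTUVWXYZ
shifted = keyword_alphabet("SINGAPOREAN", start="Q")
print(shifted)                                     # SINGAPOREQTUVWXYZBCDFHJKLM
print(keyword_encrypt("TRY USING THIS TO PASS NOTES DURING LESSONS", shifted))
```

To decode, build the same cipher alphabet and translate the other way round, i.e. `str.maketrans(cipher_alphabet, ALPHABET)`.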


Of course, there is one smart way of breaking this cipher (and some others too), and that is frequency analysis. Some letters of the English alphabet are used more frequently than others, so a cipher letter or symbol that pops up quite often might give a clue to what it stands for. Going from most frequent to least, here is one commonly quoted ordering of the letters:

ETAOINSHRDLCUMWFGYPBVKJXQZ


So chances are that someone will notice that the letter C was used quite frequently in the previous message. Since it also appears in pairs (CC), he can guess that C might stand for S. Even if it's just one letter, it's still helpful.
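
Counting letters by hand gets boring fast, so here is a small Python sketch that does the counting for the message above. The helper names are made up for this post:

```python
from collections import Counter

def letter_counts(message):
    """How many times each letter appears (spaces are ignored)."""
    return Counter(ch for ch in message if ch.isalpha())

def doubled_letters(message):
    """How many times each letter appears twice in a row."""
    pairs = zip(message, message[1:])
    return Counter(a for a, b in pairs if a == b and a.isalpha())

message = "DBL FCEWO DREC DX YSCC WXDAC GFBEWO UACCXWC"
print(letter_counts(message).most_common(1))  # [('C', 8)]
print(doubled_letters(message))               # Counter({'C': 2})
```

A message this short won't follow the frequency list exactly, so treat the counts as hints rather than answers.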

Now for a more advanced cipher called the Vigenere cipher. It is simple to use, but looks BIG:

  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
A a b c d e f g h i j k l m n o p q r s t u v w x y z
B b c d e f g h i j k l m n o p q r s t u v w x y z a
C c d e f g h i j k l m n o p q r s t u v w x y z a b
D d e f g h i j k l m n o p q r s t u v w x y z a b c
E e f g h i j k l m n o p q r s t u v w x y z a b c d
F f g h i j k l m n o p q r s t u v w x y z a b c d e
G g h i j k l m n o p q r s t u v w x y z a b c d e f
H h i j k l m n o p q r s t u v w x y z a b c d e f g
I i j k l m n o p q r s t u v w x y z a b c d e f g h
J j k l m n o p q r s t u v w x y z a b c d e f g h i
K k l m n o p q r s t u v w x y z a b c d e f g h i j
L l m n o p q r s t u v w x y z a b c d e f g h i j k
M m n o p q r s t u v w x y z a b c d e f g h i j k l
N n o p q r s t u v w x y z a b c d e f g h i j k l m
O o p q r s t u v w x y z a b c d e f g h i j k l m n
P p q r s t u v w x y z a b c d e f g h i j k l m n o
Q q r s t u v w x y z a b c d e f g h i j k l m n o p
R r s t u v w x y z a b c d e f g h i j k l m n o p q
S s t u v w x y z a b c d e f g h i j k l m n o p q r
T t u v w x y z a b c d e f g h i j k l m n o p q r s 
U u v w x y z a b c d e f g h i j k l m n o p q r s t
V v w x y z a b c d e f g h i j k l m n o p q r s t u
W w x y z a b c d e f g h i j k l m n o p q r s t u v
X x y z a b c d e f g h i j k l m n o p q r s t u v w 
Y y z a b c d e f g h i j k l m n o p q r s t u v w x
Z z a b c d e f g h i j k l m n o p q r s t u v w x y


Scared yet? Actually, all it does is encode a message using more than one Caesar shift. Let's say you want to send this message:

CIPHERS ARE SUPER AWESOME


You decide on a keyword(s) to use. The longer and more complex the better. Let's use DECRYPTION. Then you write it next to the plain text, aligning the letters, and repeating it until the end:

Plain: CIPHERS ARE SUPER AWESOME
KEY:   DECRYPT ION DECRY PTIONDE


Now for the first letter. Start by looking at row D (taken from the key):

  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
D d e f g h i j k l m n o p q r s t u v w x y z a b c


Then look at column C (taken from the plain text), which shows an F. So the C in plain text would be encrypted as F.

For the second letter, we have KEY E and Plain I, so at row E column I, we have the letter M. So the I in plain text would be encrypted as M. Going by this same procedure, the whole message gets encrypted into this rubbish:

FMRYCGL IFR VYRVP PPMGBPI


See how when SUPER is encrypted as VYRVP, the V actually stands for two different letters (S and E)? This is called polyalphabetic substitution, and it can give code breakers a headache if the keyword is unknown.
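
Looking things up in the big table is the same as adding the positions of the key letter and the plain letter (with A = 0) and wrapping around after Z. Here is a minimal Python sketch of that, handy for double-checking a hand encryption. The function name and the `decrypt` flag are my own, and I'm assuming the key skips over spaces, as in the alignment shown earlier:

```python
import string

ALPHABET = string.ascii_uppercase

def vigenere(text, key, decrypt=False):
    """Apply the Vigenere table; the key only advances on letters."""
    sign = -1 if decrypt else 1
    out = []
    i = 0  # position within the repeating key
    for ch in text.upper():
        if ch in ALPHABET:
            k = ALPHABET.index(key[i % len(key)].upper())  # row of the table
            out.append(ALPHABET[(ALPHABET.index(ch) + sign * k) % 26])
            i += 1
        else:
            out.append(ch)  # spaces do not use up a key letter
    return "".join(out)

print(vigenere("C", "D"))  # F (row D, column C)
print(vigenere("I", "E"))  # M (row E, column I)

secret = vigenere("CIPHERS ARE SUPER AWESOME", "DECRYPTION")
print(secret)
print(vigenere(secret, "DECRYPTION", decrypt=True))  # CIPHERS ARE SUPER AWESOME
```

Decrypting just subtracts the key letter instead of adding it.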

Even then, the code makers can still add more mayhem to it by:

- Adding extra "words" in it:
FMRYCGL AIBH IFR WGCSXYH VYRVP AWQOOXP PPMGBPI


- Changing the spacing:
FMRY CGLI FRVY RVPP PMGB PI


- Eliminating the spaces:
FMRYCGLIFRVYRVPPPMGBPI


- Reversing the message:
IPBGMPP PVRYV RFI LGCYRMF


...and many other tricks up their sleeves.

Remember the cryptic message at the start? Now let's see if you can decode it. For convenience, I've put the message here again below, so you don't have to keep scrolling up and down for reference. Here are a few other tips:

- Look out for short words. They could stand for IN, OF, AT, I, A, WE etc. (But this won't help if the spaces aren't in their proper places, or not even there!)

- Remember frequency analysis?

- Watch out for repeated pairs. They could represent SS, CC, LL, OO, EE etc.

- All three of the tips above become far less useful if a Vigenere shift was used, which it was in this message, so sorry :(

I actually used a Caesar shift, then a Vigenere shift after that, to give this encrypted message. Good luck!

uxq eai hiaswlrj wvttfdc oi 
wqmvxl rdiy esykhtqjt fhzq hhi 
soerbcn sxfeyby ugd ziypu tnbbz 
ol mm nmcsq

Clue to the Vigenere shift keyword: If nautical nonsense be something you wish
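
Once you have a guess for the keyword, one possible way to test it is to undo the Vigenere step and then simply try all 26 Caesar shifts and see if any line looks like English. This is only a sketch under a few assumptions: the Caesar step is a plain shift of the whole alphabet, the key skips over spaces, and "KEYWORD" below is a placeholder, not the answer:

```python
import string

ALPHABET = string.ascii_uppercase

def unvigenere(text, key):
    """Subtract the key letters from the text; the key advances on letters only."""
    out, i = [], 0
    for ch in text.upper():
        if ch in ALPHABET:
            k = ALPHABET.index(key[i % len(key)])
            out.append(ALPHABET[(ALPHABET.index(ch) - k) % 26])
            i += 1
        else:
            out.append(ch)
    return "".join(out)

def candidates(ciphertext, keyword):
    """Undo the Vigenere step, then list all 26 possible Caesar shifts."""
    stage_one = unvigenere(ciphertext, keyword.upper())
    # Undoing a Caesar shift of s is the same as a one-letter Vigenere key
    return [unvigenere(stage_one, ALPHABET[s]) for s in range(26)]

# "KEYWORD" is a placeholder guess; swap in your own
for line in candidates("uxq eai hiaswlrj wvttfdc oi", "KEYWORD"):
    print(line)
```

If none of the 26 lines reads like English, the keyword guess is probably wrong.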