RC4 Encryption Algorithm for VBA and VBScript

Collapse
X
 
  • Time
  • Show
Clear All
new posts
  • Rabbit
    Recognized Expert MVP
    • Jan 2007
    • 12517

    RC4 Encryption Algorithm for VBA and VBScript

    09/22/2015 Update: A bug was found in the code. The code block has been updated with the fixed code.

    Description
    RC4 is one of the most widely used ciphers in the world. It is used in WEP, WPA, SSL, BitTorrent, PDF, etc. It is one of the simplest to understand and implement. It is also one of the fastest algorithms available.

    What is RC4
    It is a stream cipher, which means that it encrypts a stream of data byte by byte as opposed to a block cipher that encrypts groups of bytes at a time, usually with the inclusion of byte shifting. This usually means that block ciphers are more secure than stream ciphers. One of the weaknesses of a stream cipher, more so than in a block cipher, is that the layout of the data is unchanged. Therefore, if you know the layout of the data, you can flip a bit to change the meaning of the message. For example, if you know the message format is "Deposit: ###", then you can essentially flip one of the bits at byte 10 and hope to change 100 to 900. What is needed then, is a way to authenticate a message. Usually, a cryptographic hash function is used for this purpose. In essence, this message authentication code allows you to verify the sender and the message.

    It is a secret key algorithm, which means that it uses a key that should only be known by the intended sender and recipient. A major problem with this is the secure transmission of the key. Public key cryptography solves this by using so called "hard" math where calculating in one direction is easy while solving in the opposite direction takes much much longer. One example is multiplication vs factorization. Multiplying two numbers is easy but attempting to find the two numbers used to get that number takes a long time. This type of "hard" math allows you to transmit public information that the sender and recipient uses to compute a key without having to transmit a key. So why isn't public key cryptography used exclusively? It is a lot slower than secret key cryptograpy.

    General Weaknesses of Cryptography
    A weakness of cryptography is that they can be hacked using brute force. This means that a message can stay a secret only for as long as it takes a computer to try every password until it finds the one you used. Each byte that you add to a password means that it will take roughly 256 times longer to crack your password. So the question is, how long do you need that message to stay a secret?

    Now, that isn't actually a weakness of the algorithm per se. An actual weakness is that many algorithms are subject to mathematical analysis that may reveal what key was used to encrypt the data. Given enough encrypted data using the same or similar keys will result in a crack quicker than it would take using brute force. One way to mitigate this is the use of a nonce, initialization vector, or salt. They are basically random bits that are used with the key so that even though you are using the same key, each message is different because the random bits in effect change the key that is used.

    This is why you should choose a longer password and why you should change your passwords often.

    How RC4 Works
    RC4 encrypts a stream of data by generating a pseudorandom stream of bytes from a keystream. This output is XORed with the message.

    To generate the inital keystream, you start out with a 256 item array filled with the sequence 0-255. Looping 256 times, each byte in the key is used to calculate two numbers between 0 and 255. The two numbers at those indexes are then swapped.

    With this, the keystream is generated by calculating two indexes, swapping the numbers at those indexes, adding those two numbers mod 256 and then returning the number at that index.

    This number is then XORed with one byte of the message. This continues until the end of the message is reached.

    Specific Weaknesses of RC4
    Mathematical analysis has shown that the first few bytes of the keystream follow a pattern that could reveal the key or keystream that was used to create it. It is suggested that they first few bytes of the keystream should be discarded to reduce this relationship. The standard is 768 bytes but the recommended value is 3072 bytes.

    I mention the following weakness because it could be a major weakness in the algorithm but I do not understand enough about the attack to say anything worthwhile about it. Souradyuti Paul and Bart Preneel gave a proof of a prior weakness posited by Itsik Mantin and Adi Shamir about combinatorial analysis of the cipher stream. In their paper, they discuss the weakness and offer a modification to the algorithm that should mitigate it.

    Sample Implementation of RC4
    This is a function that works in many VB implementations . It even works in a Visual Basic Script and was, in fact, coded specifically for VBScript. But it should be directly portable to VBA. It is an inlined version because calling functions take a lot of overhead. One thing I did to make it more secure is to drop the first 3072 bytes of the keystream. Encryption and Decryption is done using the same function. I validated the pre-drop version against test values.
    Code:
    Function RunRC4(sMessage, strKey)
    	Dim kLen, x, y, i, j, temp
    	Dim s(256), k(256)
    
    	'Init keystream
    	klen = Len(strKey)
    	For i = 0 To 255
    		s(i) = i
    		k(i) = Asc(Mid(strKey, (i Mod klen) + 1, 1))
    	Next
    
    	j = 0
    	For i = 0 To 255
    		j = (j + k(i) + s(i)) Mod 256
    		temp = s(i)
    		s(i) = s(j)
    		s(j) = temp
    	Next
    
    	x = 0
    	y = 0
    
    	'Drop n bytes from keystream
    	For i = 1 To 3072
    		x = (x + 1) Mod 256
    		y = (y + s(x)) Mod 256
    		temp = s(x)
    		s(x) = s(y)
    		s(y) = temp
    	Next
    	
    	'Encode/Decode
    	For i = 1 To Len(sMessage)
    		x = (x + 1) Mod 256
    		y = (y + s(x)) Mod 256
    		temp = s(x)
    		s(x) = s(y)
    		s(y) = temp
    		
    		RunRC4 = RunRC4 & (s((s(x) + s(y)) Mod 256) Xor Asc(Mid(sMessage, i, 1))) & ","
    	Next
    End Function
    ** Edit **
    Download an example to play with.
    Last edited by NeoPa; Sep 25 '15, 04:51 PM. Reason: Added a link to the example attachment.
  • mshmyob
    Recognized Expert Contributor
    • Jan 2008
    • 903

    #2
    Thanks Rabbit. This looks interesting and I look forward to fooling with it.

    cheers,

    Comment

    • ADezii
      Recognized Expert Expert
      • Apr 2006
      • 8834

      #3
      @Rabbit - By any chance, did you break the Enigma Code during WWII?(LOL). On the serious side, and out of utter curiosity, why was that Encryption Algorithm almost impossible to break?

      Comment

      • mshmyob
        Recognized Expert Contributor
        • Jan 2008
        • 903

        #4
        @ADezii -
        One of the most interesting books I have read is called "The CODE BOOK - The Science of Secrecy from Ancient Egypt to Quantum Cryptography" by Simon Singh.

        I highly recommend it.

        There is quite a bit on the Enigma machine but basically it used a "scrambler" to encrypt each letter but the thing that made it more difficult was that this "scrambler" would rotate 1/26 (26 letters) of a rotation after each letter was entered. This in essence caused the cipher alphabet to change after each encryption (after each letter was encrypted) and therefore even if you encrpted the letter "S" twice in a row, the letter "S" would never come out encrypted the same. It was therefore hard to find repeating patterns without knowing the original key.

        This setup of the scrambler defined six cipher alphabets. The flaw being if a letter was repeated 6 times the scrambler would return to its original position thereby creating a pattern for that specific letter.

        A subsequent version of the Enigma used 2 scramblers giving 676 cipher alphabets.

        An improved version had 3 scramblers giving 17,576 cipher alphabets. A reflector was also implemented (too long to explain it function).

        But you can see that without another Enigma machine and trying to decipher by brain power alone would take forever.

        cheers,


        cheers,

        Comment

        • ADezii
          Recognized Expert Expert
          • Apr 2006
          • 8834

          #5
          @mshmyob - Thanks for the explanation.

          Comment

          • Rabbit
            Recognized Expert MVP
            • Jan 2007
            • 12517

            #6
            Indeed it was a difficult system to crack but the Polish had it cracked before the start of WW2.

            Comment

            • mshmyob
              Recognized Expert Contributor
              • Jan 2008
              • 903

              #7
              Through their spy network Poland managed to build an Enigma machine. Again through their spy network they discovered that they were the next target of the Nazis and called a secret meeting with England and gave them the Enigma machine so the Nazis wouldn't get a hold of it.

              Most people give England credit for duplicating the Enigma machine but it was actually Poland. England for the first few years used brute strength (trial and error - not an easy task in of itself) to guess the keys to decipher the encrypted German codes but without the Enigma machine itself the keys would have been useless.

              As the young kids today would say "History rocks"

              cheers,

              Comment

              • NeoPa
                Recognized Expert Moderator MVP
                • Oct 2006
                • 32661

                #8
                Originally posted by Rabbit
                Rabbit:
                Indeed it was a difficult system to crack but the Polish had it cracked before the start of WW2.
                My understanding is that two Polish chaps were able to get hold of an early enigma machine in the early 1930s. The Polish determined the mechanics of it and passed replicas on to both Britain and France prior to the start of WWII. I must admit that my readings on the matter before today indicated that this was as far as the Poles got, and that the mathematical work was mainly done by British mathematicians, but that seems now to be contested by the Poles. My understanding of the writing of history leads me to believe the claim, but I believe the cracking of subsequent versions of the code (and particularly the Naval version which remained unbroken until very near the end of the war), as well as the techniques developed to enable rebreaking the code on an almost daily basis, were managed by the British Intelligence Services based at Bletchley Park. Interestingly though, the three main Polish mathematicians who were originally responsible for breaking the code, were working at Bletchley throughout much of the war. Their contribution, as well as the quite incredible decision in 1939 for the Poles to share their current information on the cipher machines themselves, was absolutely critical in enabling so much of the German coded traffic to be broken throughout the war. Without this contribution alone, and with the fourth largest allied force in Europe during the war this was certainly not their only contribution to the victory, it seems clear that Nazi Germany would have been ultimately victorious in the conflict. The more I learn of the Polish contributions, the more impressed I am.

                Comment

                • Rabbit
                  Recognized Expert MVP
                  • Jan 2007
                  • 12517

                  #9
                  The Polish didn't have an original Enigma, but instead, they recreated one from information they obtained through a German spy. They were also able to obtain a table of daily keys. But largely, it was because of poor operating procedures that allowed the Polish to make the first breakthroughs in cracking the Enigma. The German Navy used much more stringent procedures and it wasn't until Alan Turing developed the Bombe (a machine used to reduce the number of possible "passwords" for a particular ciphertext) that the German Naval Enigma was cracked. At some point during the war, the Germans switched from a 3-rotor Enigma to a 4-rotor Enigma and there was a long period of time when the Allies were unable to decipher the messages. But that too was eventually cracked. Some people say that had the Germans used correct operating procedures, the Allies would never have cracked the Enigma.

                  Comment

                  • NeoPa
                    Recognized Expert Moderator MVP
                    • Oct 2006
                    • 32661

                    #10
                    Originally posted by Rabbit
                    Rabbit:
                    The Polish didn't have an original Enigma, but instead, they recreated one from information they obtained through a German spy.
                    That doesn't fit with my original recollection, but I must admit that I first heard of this many, many years ago and have probably misremembered the details with the passing of the decades. I expect you're quite right.

                    I believe the cracking of the naval codes was largely down to the promoting of Admiral Doenitz away from direct command of the U-Boats. He took charge of the whole navy, but the U-Boat fleets, who to that point had maintained procedure very strictly, started to become slacker. To be honest, although I seem to remember that point, and that the actual capture of a naval Enigma Machine from a surface craft also had an enormous effect at some time, I don't remember the details clearly enough to paint a true picture.

                    There were alternating periods during the war in the Atlantic, where each side took and maintained a semblance of superiority over the other for a while. The reasons for this were many, but some of the times were directly related to Enigma developments of one form or another, both on the part of the Allies and the Axis.

                    Comment

                    • Rabbit
                      Recognized Expert MVP
                      • Jan 2007
                      • 12517

                      #11
                      There were commercial versions of the Enigmas before the war started. I believe the Polish had one of the commercial variants but the military Enigma differed from the commercial version.

                      Comment

                      • NeoPa
                        Recognized Expert Moderator MVP
                        • Oct 2006
                        • 32661

                        #12
                        BTW. The code was interesting too. I'm incorporating it (a variation) into my projects going forward. It's better than my simple 'NOT based' alternative.

                        Comment

                        • Rabbit
                          Recognized Expert MVP
                          • Jan 2007
                          • 12517

                          #13
                          Three dozen lines for encryption is simple. I like it for its ease of use and its simple algorithm. You could even save some more code space by not inlining the functions. If you want a more secure algorithm, you should take a look at the AES thread I posted. Not so simple lol, it's 400 lines of code.

                          Comment

                          • NeoPa
                            Recognized Expert Moderator MVP
                            • Oct 2006
                            • 32661

                            #14
                            I may do that.

                            In the mean time, I incorporated the (amended) routine into a simple spreadsheet that I've attached. I encapsulated the function call with some code to convert the result to a Hex string after encryption, to avoid any dodgy characters in the sheet, but it's fundamentally just a spreadsheet to illustrate what it does.

                            If you like, I'll add the attachment to your OP, but only if you're happy with it. Let me know.
                            Attached Files

                            Comment

                            • Rabbit
                              Recognized Expert MVP
                              • Jan 2007
                              • 12517

                              #15
                              Looks good, thanks!

                              Comment

                              Working...