Overview of MD5 Algorithm
Hash algorithms are important components in many cryptographic applications and security protocol suites. MD5, which stands for Message Digest algorithm 5, is a widely used cryptographic hash function. The idea behind the algorithm is to take arbitrary data (text or binary) as input and generate a fixed-size “hash value” as output. The input can be of any length, but the size of the output hash value is always the same.
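MD5 was designed by the well-known cryptographer Ronald Rivest in 1991. In 2004, serious flaws were found in MD5: researchers demonstrated a practical method for finding collisions. As a result, MD5 is no longer considered suitable for applications that depend on collision resistance.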
The MD5 algorithm is a cryptographic algorithm that takes an input of arbitrary length and produces a message digest that is 128 bits long. The digest is sometimes also called the “hash” or “fingerprint” of the input. MD5 is used in many situations where a potentially long message needs to be processed and/or compared quickly. The most common application is the creation and verification of digital signatures.
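As a minimal sketch of the fixed-size output described above, the following uses the `hashlib` module from Python's standard library. The helper name `md5_hex` is only illustrative, and the sketch assumes MD5 is enabled in the local Python build.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of `data` as 32 hexadecimal characters."""
    return hashlib.md5(data).hexdigest()


# Inputs of very different lengths all produce a 128-bit (16-byte) digest.
for message in (b"", b"abc", b"a" * 1_000_000):
    digest = hashlib.md5(message).digest()
    print(len(message), "bytes in ->", len(digest) * 8, "bits out:", digest.hex())
```

The digests of the empty string and of "abc" can be checked against the test suite published in RFC 1321.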
Properties of a Message-Digest Algorithm
Designers of a message-digest algorithm try to make it satisfy the following properties:
- It should be one-way. Given only the message digest, it is hard to recover the original message.
- Given an input and its digest, it is hard to find a different input message that produces the same digest.
- It should be collision-resistant. It is computationally infeasible to find any two distinct messages that produce the same message digest. This is not the same as the second property, because here the attacker is free to choose both messages. It is therefore easier to attack this property than the second one.
- The message digest should be pseudo-random.
When all of the above properties are satisfied, we call the algorithm a collision-resistant message-digest algorithm. It is unknown whether a collision-resistant message-digest algorithm can exist at all.
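The pseudo-randomness property can be illustrated informally by counting how many digest bits change when the input changes slightly. The sketch below is an illustration only, not a test of the property. The function name `digest_bit_difference` is hypothetical. For a digest that behaves randomly, roughly half of the 128 bits would be expected to differ.

```python
import hashlib


def digest_bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions in which the MD5 digests of `a` and `b` differ."""
    da = int.from_bytes(hashlib.md5(a).digest(), "big")
    db = int.from_bytes(hashlib.md5(b).digest(), "big")
    # XOR leaves a 1 wherever the two digests disagree.
    return bin(da ^ db).count("1")


# Two inputs that differ only in their final character.
print(digest_bit_difference(b"hello world", b"hello worle"), "of 128 bits differ")
```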
How MD5 works
Preparing the input
The MD5 algorithm first pads the input so that it can be divided into blocks of 512 bits each. A single 1 bit is appended to the message, followed by as many 0 bits as are needed to bring the length to 64 bits short of a multiple of 512. The final 64 bits record the length of the original input in bits. Padding is always applied, even if the input is already a multiple of 512 bits long.
Next, each block is divided into 16 words of 32 bits each. These are denoted as M0 … M15.
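The preparation step can be sketched as follows. The sketch assumes the input is a whole number of bytes and follows the padding rules of RFC 1321: a 1 bit, then 0 bits, then the 64-bit length, with words read low-order byte first. The function names `md5_pad` and `split_blocks` are illustrative.

```python
import struct


def md5_pad(message: bytes) -> bytes:
    """Pad `message` to a multiple of 64 bytes (512 bits)."""
    bit_length = (len(message) * 8) % (1 << 64)
    padded = message + b"\x80"  # a single 1 bit followed by seven 0 bits
    # Add zero bytes until the length is 56 bytes modulo 64.
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    # Append the original length in bits as a 64-bit little-endian integer.
    padded += struct.pack("<Q", bit_length)
    return padded


def split_blocks(padded: bytes) -> list[tuple[int, ...]]:
    """Split padded input into blocks of sixteen 32-bit words M0 ... M15."""
    return [
        struct.unpack("<16I", padded[i:i + 64])
        for i in range(0, len(padded), 64)
    ]


blocks = split_blocks(md5_pad(b"abc"))
print(len(blocks), "block;", "M0 =", hex(blocks[0][0]), "M14 =", blocks[0][14])
```

For the three-byte input "abc", everything fits in one block, and M14 holds the message length of 24 bits.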
MD5 helper functions
MD5 uses a buffer that is made up of four words that are each 32 bits long. These words are called A, B, C and D. They are initialized as follows (bytes are shown low-order first, so word A has the value 0x67452301):
word A: 01 23 45 67
word B: 89 ab cd ef
word C: fe dc ba 98
word D: 76 54 32 10
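The sketch below turns the byte listings above into 32-bit integers and writes out the four auxiliary functions F, G, H and I. The definitions of the functions are taken from RFC 1321 and are not spelled out in this text. The names `INITIAL_WORDS` and `MASK32` are illustrative.

```python
import struct

MASK32 = 0xFFFFFFFF  # keeps results within 32 bits

# The byte listings above, read low-order byte first (little-endian).
INITIAL_BYTES = {
    "A": bytes.fromhex("01234567"),
    "B": bytes.fromhex("89abcdef"),
    "C": bytes.fromhex("fedcba98"),
    "D": bytes.fromhex("76543210"),
}
INITIAL_WORDS = {
    name: struct.unpack("<I", raw)[0] for name, raw in INITIAL_BYTES.items()
}


def F(x: int, y: int, z: int) -> int:
    """Round 1: if x then y else z, bit by bit."""
    return (x & y) | (~x & MASK32 & z)


def G(x: int, y: int, z: int) -> int:
    """Round 2: if z then x else y, bit by bit."""
    return (x & z) | (y & ~z & MASK32)


def H(x: int, y: int, z: int) -> int:
    """Round 3: bitwise parity of x, y and z."""
    return x ^ y ^ z


def I(x: int, y: int, z: int) -> int:
    """Round 4: y XOR (x OR NOT z)."""
    return y ^ (x | (~z & MASK32))


for name, value in INITIAL_WORDS.items():
    print(f"word {name} = 0x{value:08x}")
```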
Processing the blocks
The contents of the four buffer words (A, B, C and D) are now mixed with the words of the input, using the four auxiliary functions (F, G, H and I). There are four rounds, each consisting of 16 basic operations, and each round uses one of the auxiliary functions. One operation is illustrated in the figure below.
Figure: MD5 Function
The figure shows how the auxiliary function F is applied to the four buffer words (A, B, C and D), using message word Mi and constant Ki. The item “<<<s” denotes a left rotation (circular left shift) by s bits.
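After all four rounds have been performed on a block, the results are added to the values that A, B, C and D held before that block was processed. Once every block has been processed, the buffers A, B, C and D, written out starting with the low-order byte of A, contain the MD5 digest of the original input.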
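Putting the steps together, a compact from-scratch sketch of the whole procedure might look like the following. It assumes byte-aligned input and takes the rotation amounts, the sine-derived constants Ki and the message-word order from RFC 1321, none of which are listed in this text. It is meant for study rather than production use, where a library routine such as `hashlib.md5` is the better choice. The name `md5_sketch` is illustrative.

```python
import hashlib
import math
import struct

MASK32 = 0xFFFFFFFF
# Rotation amounts: four values per round, each set used four times.
SHIFTS = ([7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 +
          [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4)
# Constants K[i] = floor(|sin(i + 1)| * 2**32), with i + 1 in radians.
K = [int(abs(math.sin(i + 1)) * 2**32) & MASK32 for i in range(64)]


def rotl(x: int, s: int) -> int:
    """Rotate the 32-bit value x left by s bits."""
    return ((x << s) | (x >> (32 - s))) & MASK32


def md5_sketch(message: bytes) -> str:
    a0, b0, c0, d0 = 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476
    # Padding: a 1 bit, zeros, then the 64-bit length in bits.
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    padded += struct.pack("<Q", (len(message) * 8) % 2**64)
    for offset in range(0, len(padded), 64):
        m = struct.unpack("<16I", padded[offset:offset + 64])
        a, b, c, d = a0, b0, c0, d0
        for i in range(64):
            if i < 16:    # round 1 uses F
                f, g = (b & c) | (~b & d), i
            elif i < 32:  # round 2 uses G
                f, g = (d & b) | (~d & c), (5 * i + 1) % 16
            elif i < 48:  # round 3 uses H
                f, g = b ^ c ^ d, (3 * i + 5) % 16
            else:         # round 4 uses I
                f, g = c ^ (b | ~d), (7 * i) % 16
            f = (f + a + K[i] + m[g]) & MASK32
            a, d, c, b = d, c, b, (b + rotl(f, SHIFTS[i])) & MASK32
        # Add this block's result to the running buffer values.
        a0, b0, c0, d0 = [(x + y) & MASK32
                          for x, y in zip((a0, b0, c0, d0), (a, b, c, d))]
    # The digest is A, B, C, D written low-order byte first.
    return struct.pack("<4I", a0, b0, c0, d0).hex()


# Cross-check the sketch against the standard library.
assert md5_sketch(b"abc") == hashlib.md5(b"abc").hexdigest()
print(md5_sketch(b"abc"))
```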
Application of MD5
Message-digest algorithms are mainly used to implement digital signatures. In this case, all of the above properties are required. However, the requirements vary from one application to another: an application may rely on some or all of the properties of a message-digest algorithm (MDA). For example, some applications use only the one-way property of an MDA. Because of its pseudo-randomness, an MDA is also used as part of mechanisms for random number generation.