UUHash


UUHash is a hash algorithm employed by clients on the FastTrack network. It is employed for its ability to hash very large files in a very short period of time, even on older computers. However, this is achieved by only hashing a fraction of the file. This weakness makes it trivial to create a hash collision, allowing large sections to be completely altered without altering the checksum.
This method is used by Kazaa. The weakness of UUHash is exploited by anti-p2p agencies to corrupt downloads.

How it works

The UUHash is a 160-bit string that is usually Base64-encoded for presentation. It is a concatenation of an MD5 hash and a CRC-32 sum of selected chunks of the file.
The first 307,200 bytes of the file are MD5-hashed. The 32 bit little endian integer value smallhash is initialized to 0.
If the file is strictly larger than one chunk size, a series of chunks at file offsets of 2n MiB and one chunk right at the end of the file are hashed using a CRC-32. The last chunk of the power-of-two series ends strictly more than one chunk size before the end of the file, i.e. there is always at least one unread byte between the last two chunks. The end-of-file chunk may be shorter than one chunk size; it starts at or after one chunk size into the file. The CRC is initialized using smallhash and stored into smallhash.
So, for example:
Finally, the bitwise complement of smallhash is XORed together with the file size in bytes. The 160-bit UUHash is now the concatenation of the 128-bit MD5 hash and the final 32-bit smallhash value.

Test Vectors

Given below are hashes for strings of various lengths containing only 0x00 or 0xFF bytes, generated by sig2dat.
Note here that all strings that have a complete MD5 chunk have the same 128-bit prefix because their first chunks are the same. For files that have the same number of chunks the CRC part differs only because of the included file length. For files up to 300 KiB, the file length can be extracted from the last four bytes of the hash; smallhash is ~0.
InputBase64Hexadecimal
0 bytes1B2M2Y8AsgTpgAmY7PhCfv////8=D41D8CD98F00B204E9800998ECF8427E-FFFFFFFF
0x00, 1 bytek7iFrf4NoInN9jSQT9Wfcf7///8=93B885ADFE0DA089CDF634904FD59F71-FEFFFFFF
0xFF, 1 byteAFlP1PQrpD/BygQnoFdilf7///8=00594FD4F42BA43FC1CA0427A0576295-FEFFFFFF
0x00, 2 bytesxBA/Ei0nZ3ydsUTK4TlKZv3///8=C4103F122D27677C9DB144CAE1394A66-FDFFFFFF
0xFF, 2 bytesqyoNKN5rd//dbHKv6tCZq/3///8=AB2A0D28DE6B77FFDD6C72AFEAD099AB-FDFFFFFF
0x00, 307199 bytes YK6+Fj6S4MGzEC9H9Bn3gQBQ+/8=60AEBE163E92E0C1B3102F47F419F781-0050FBFF
0xFF, 307199 bytes I+QujFtxa9pBOt5X6NMGsgBQ+/8=23E42E8C5B716BDA413ADE57E8D306B2-0050FBFF
0x00, 307200 bytes kK7e2ZIs+JRup4WGNUk3JP9P+/8=90AEDED9922CF8946EA7858635493724-FF4FFBFF
0xFF, 307200 bytes oBSYynx6vdDeUWtP5N4mAv9P+/8=A01498CA7C7ABDD0DE516B4FE4DE2602-FF4FFBFF
0x00, 307201 bytes kK7e2ZIs+JRup4WGNUk3JHOg+S0=90AEDED9922CF8946EA7858635493724-73A0F92D
0xFF, 307201 bytes oBSYynx6vdDeUWtP5N4mAv5P+wA=A01498CA7C7ABDD0DE516B4FE4DE2602-FE4FFB00
0x00, 614399 bytes kK7e2ZIs+JRup4WGNUk3JHCHqBQ=90AEDED9922CF8946EA7858635493724-7087A814
0xFF, 614399 bytes oBSYynx6vdDeUWtP5N4mAqgX6Xs=A01498CA7C7ABDD0DE516B4FE4DE2602-A817E97B
0x00, 614400 bytes kK7e2ZIs+JRup4WGNUk3JGlfGn0=90AEDED9922CF8946EA7858635493724-695F1A7D
0xFF, 614400 bytes oBSYynx6vdDeUWtP5N4mApKrf9g=A01498CA7C7ABDD0DE516B4FE4DE2602-92AB7FD8
0x00, 614401 bytes kK7e2ZIs+JRup4WGNUk3JGhfGn0=90AEDED9922CF8946EA7858635493724-685F1A7D
0xFF, 614401 bytes oBSYynx6vdDeUWtP5N4mApOrf9g=A01498CA7C7ABDD0DE516B4FE4DE2602-93AB7FD8
0x00, 614402 bytes kK7e2ZIs+JRup4WGNUk3JGtfGn0=90AEDED9922CF8946EA7858635493724-6B5F1A7D
0xFF, 614402 bytes oBSYynx6vdDeUWtP5N4mApCrf9g=A01498CA7C7ABDD0DE516B4FE4DE2602-90AB7FD8
0x00, 16777216 bytes kK7e2ZIs+JRup4WGNUk3JN/b8qg=90AEDED9922CF8946EA7858635493724-DFDBF2A8
0xFF, 16777216 bytes oBSYynx6vdDeUWtP5N4mAt0YF2Y=A01498CA7C7ABDD0DE516B4FE4DE2602-DD181766
0x00, 17084416 bytes kK7e2ZIs+JRup4WGNUk3JN9r9qg=90AEDED9922CF8946EA7858635493724-DF6BF6A8
0xFF, 17084416 bytes oBSYynx6vdDeUWtP5N4mAt2oE2Y=A01498CA7C7ABDD0DE516B4FE4DE2602-DDA81366
0x00, 17084417 bytes kK7e2ZIs+JRup4WGNUk3JN5r9qg=90AEDED9922CF8946EA7858635493724-DE6BF6A8
0xFF, 17084417 bytes oBSYynx6vdDeUWtP5N4mAtyoE2Y=A01498CA7C7ABDD0DE516B4FE4DE2602-DCA81366
0x00, 17391616 bytes kK7e2ZIs+JRup4WGNUk3JN+7+6g=90AEDED9922CF8946EA7858635493724-DFBBFBA8
0xFF, 17391616 bytes oBSYynx6vdDeUWtP5N4mAt14HmY=A01498CA7C7ABDD0DE516B4FE4DE2602-DD781E66
0x00, 17391617 bytes kK7e2ZIs+JRup4WGNUk3JNzVMqw=90AEDED9922CF8946EA7858635493724-DCD532AC
0xFF, 17391617 bytes oBSYynx6vdDeUWtP5N4mAgS1uWk=A01498CA7C7ABDD0DE516B4FE4DE2602-04B5B969
0x00, 17391618 bytes kK7e2ZIs+JRup4WGNUk3JN/VMqw=90AEDED9922CF8946EA7858635493724-DFD532AC
0xFF, 17391618 bytes oBSYynx6vdDeUWtP5N4mAge1uWk=A01498CA7C7ABDD0DE516B4FE4DE2602-07B5B969

Sig2Dat

The name UUHash derives from the utility which creates URIs referencing files on Kazaa. These URIs are of the form:
sig2dat://|File: surprise.mp3|Length:5845871Bytes|UUHash:=1LDYkHDl65OprVz37xN1VSo9b00=
Not considering the fact that this URI format is not RFC compliant, UUHash refers to the Base64-encoding of the hash and not the hash itself.