One article said: A message digest (MD5) is a compact digital signature for an arbitrarily long stream of binary data. An ideal message digest algorithm would never generate the same signature for two different sets of input, but achieving such theoretical perfection would require a message digest as long as the input file.
Another article said: Given an input file and its corresponding message digest, it should be computationally infeasible to find another file with the same message digest value.
I've been looking into MD5 checks, and I have a question:
Couldn't you compress a file down to something ridiculously small by storing its MD5 checksum plus a few bytes from various key points within the file? Using those 'key bytes', wouldn't you be able to piece together the rest of the file by using the MD5?
Consider it as a game of minesweeper. Using the numbers and what you already know, you could tell what's in the other boxes, piecing together more bytes until you have the whole file. Would that work, and if not, then why not?
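Here is a rough sketch of the idea, assuming Python's standard hashlib module. The names shrink and rebuild are made up for illustration, and this is only a toy: it drops a couple of bytes, keeps the rest along with the MD5, and then guesses the missing bytes until the checksum matches.

```python
import hashlib
from itertools import product

def shrink(data, dropped):
    """Keep the MD5 digest plus every byte except those at the dropped positions."""
    dropped_set = set(dropped)
    digest = hashlib.md5(data).digest()
    kept = bytes(b for i, b in enumerate(data) if i not in dropped_set)
    return digest, kept, len(data)

def rebuild(digest, kept, length, dropped):
    """Brute-force the dropped bytes until the MD5 of the candidate matches."""
    holes = sorted(set(dropped))
    buf = bytearray(length)
    kept_iter = iter(kept)
    # Put the known 'key bytes' back where they belong.
    for i in range(length):
        if i not in holes:
            buf[i] = next(kept_iter)
    # Try every possible combination of values for the missing bytes.
    for guess in product(range(256), repeat=len(holes)):
        for pos, val in zip(holes, guess):
            buf[pos] = val
        if hashlib.md5(buf).digest() == digest:
            return bytes(buf)  # first match; not guaranteed to be the original
    return None

original = b"Sonic Robo Blast 2"
digest, kept, length = shrink(original, [3, 10])
restored = rebuild(digest, kept, length, [3, 10])
print(restored == original)
```

Each dropped byte multiplies the number of guesses by 256, so two missing bytes already mean up to 65,536 MD5 runs. Note too that the sketch only saves 2 bytes while adding a 16-byte digest.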
Sure, it would be slow. Compression would take a day and decompression wouldn't be so hot either, but if I'm right, we could have a full installation package of SRB2 under 2 megs or so.
*coughcoughsavesbandwidthcoughcough*
Gosh, I wish I knew more C.