The most interesting aspect of the identification of similar subsections of files in peer-to-peer networks is the use of a combination of size and conceptual markers plus hashing, which is called Rabin fingerprinting. That's clever and a useful trick to know. The rest (including the hashing, actually) is pretty much how anyone would have implemented ...
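To make the trick concrete, here is a minimal sketch of content-defined chunking, which is how I understand the combination of size limits, content markers and hashing. It rests on several assumptions of mine. It uses a simple modular polynomial rolling hash as a stand-in for a true Rabin fingerprint, which would use polynomial arithmetic over GF(2). The constants (`WINDOW`, `MASK`, `MIN_SIZE`, `MAX_SIZE`) and the function names are made up for illustration. A chunk ends wherever the hash of the last few bytes matches a marker pattern, subject to a minimum and maximum chunk size, so boundaries follow the content rather than fixed offsets.

```python
import hashlib

# All constants below are illustrative assumptions, not values from any real system.
WINDOW = 48            # bytes covered by the rolling hash
BASE = 257             # multiplier of the polynomial rolling hash
MOD = (1 << 61) - 1    # large prime modulus
MASK = (1 << 11) - 1   # marker: low 11 bits zero, roughly one hit per 2 KiB
MIN_SIZE = 512         # never cut a chunk shorter than this
MAX_SIZE = 8192        # always cut once a chunk reaches this size


def chunk_boundaries(data: bytes):
    """Yield (start, end) offsets of content-defined chunks of data."""
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window
    start = 0
    h = 0
    for i, byte in enumerate(data):
        pos = i - start  # offset of this byte inside the current chunk
        if pos >= WINDOW:
            # Slide the window: drop the contribution of the oldest byte.
            h = (h - data[i - WINDOW] * top) % MOD
        h = (h * BASE + byte) % MOD
        size = pos + 1
        # Cut at a content marker (once the chunk is big enough) or at the size cap.
        if (size >= MIN_SIZE and (h & MASK) == 0) or size >= MAX_SIZE:
            yield start, i + 1
            start = i + 1
            h = 0
    if start < len(data):
        yield start, len(data)  # trailing partial chunk


def chunk_digests(data: bytes):
    """Return a SHA-256 hex digest per chunk, usable as a chunk identity."""
    return [hashlib.sha256(data[s:e]).hexdigest() for s, e in chunk_boundaries(data)]
```

Two peers can then compare the outputs of `chunk_digests` for their files. Because the cut points depend on the bytes themselves, inserting a few bytes near the start of a file typically changes only the first chunk or two, and the later chunks still hash to the same values.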