finding common subsections: _rck

_rck_

Apr 12, 2007 23:05

The most interesting aspect of identifying similar subsections of files in peer-to-peer networks is the way the subsections are delimited: a combination of size and content-based markers plus hashing, a technique called Rabin fingerprinting.

That's clever and a useful trick to know. The rest (including the hashing, actually) is pretty much how anyone would have implemented it.
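For the curious, here is a minimal sketch of the trick as I understand it, not the code from any particular system. A rolling hash over a small window picks the chunk boundaries, so a boundary depends on the nearby bytes rather than on the offset in the file, and each chunk is then hashed so two files can be compared by their sets of chunk hashes. Real Rabin fingerprinting does polynomial arithmetic over GF(2); the plain polynomial rolling hash below is a simpler stand-in, and every constant and function name here is my own assumption.

```python
import hashlib

# All values below are illustrative assumptions, not taken from any real system.
WINDOW = 16           # bytes covered by the rolling hash
BASE = 257            # polynomial base
MOD = (1 << 61) - 1   # large prime modulus
MASK = (1 << 6) - 1   # boundary when the low 6 bits of the hash are zero
MIN_SIZE = 16         # size limits keep chunks from getting too small or too large
MAX_SIZE = 256


def chunks(data: bytes):
    """Split data at content-defined boundaries, bounded by MIN_SIZE and MAX_SIZE."""
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the byte that leaves the window
    out, start, h = [], 0, 0
    for i, b in enumerate(data):
        if i - start >= WINDOW:
            # Slide the window: drop the oldest byte before adding the new one.
            h = (h - data[i - WINDOW] * top) % MOD
        h = (h * BASE + b) % MOD
        size = i - start + 1
        if (size >= MIN_SIZE and (h & MASK) == 0) or size >= MAX_SIZE:
            out.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        out.append(data[start:])  # trailing partial chunk
    return out


def fingerprints(data: bytes):
    """Hash every chunk; two files share a subsection if they share a hash."""
    return {hashlib.sha1(c).hexdigest() for c in chunks(data)}


def shared_fraction(a: bytes, b: bytes) -> float:
    """Fraction of distinct chunk hashes that the two inputs have in common."""
    fa, fb = fingerprints(a), fingerprints(b)
    return len(fa & fb) / max(1, len(fa | fb))
```

Calling `shared_fraction(a, b)` on two versions of the same file gives a rough idea of how much they overlap. Because boundaries follow the content, inserting a few bytes near the start of a file should only disturb the chunks around the insertion, whereas fixed-size blocks would shift every block after it.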

That it causes an improvement is a sad statement about the unequal distribution of network power in the current world.

technology, software, algorithms