Shingleprinting code for estimating document similarity
This package implements the Broder et al. method for approximating the Jaccard metric between two text files. The Jaccard metric measures the similarity of two sets of features by computing the cardinality of the…
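As a rough illustration of the idea (not this package's actual code), the sketch below builds word shingles from two strings, computes the exact Jaccard similarity as the size of the intersection divided by the size of the union, and then approximates it with a simple min-hash signature. All function names (shingles, jaccard, minhash_signature, estimate_jaccard), the shingle width, and the use of seeded SHA-1 hashes are assumptions made for this example only.

```python
import hashlib


def shingles(text, w=3):
    """Return the set of contiguous w-word shingles in text (hypothetical helper)."""
    words = text.lower().split()
    if len(words) < w:
        # Short texts yield a single shingle, or nothing if the text is empty.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + w]) for i in range(len(words) - w + 1)}


def jaccard(a, b):
    """Exact Jaccard similarity |A & B| / |A | B| of two sets."""
    if not a and not b:
        return 1.0  # convention chosen here: two empty sets count as identical
    return len(a & b) / len(a | b)


def _hash(shingle, seed):
    """64-bit hash of a shingle under a given seed (assumed hash family)."""
    data = (str(seed) + "\x00" + " ".join(shingle)).encode("utf-8")
    return int.from_bytes(hashlib.sha1(data).digest()[:8], "big")


def minhash_signature(shingle_set, num_hashes=64):
    """Smallest hash value per seed; expects a non-empty shingle set."""
    return [min(_hash(s, seed) for s in shingle_set)
            for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of signature positions that agree, an estimate of the Jaccard metric."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


if __name__ == "__main__":
    a = shingles("the quick brown fox jumps over the lazy dog")
    b = shingles("the quick brown fox jumps over the lazy cat")
    print("exact:", jaccard(a, b))
    print("estimate:", estimate_jaccard(minhash_signature(a),
                                        minhash_signature(b)))
```

For the two sample sentences the exact value is 0.75 (6 shared shingles out of 8 distinct ones). The min-hash estimate only approaches that value as num_hashes grows, which is the trade-off the approximation makes: fixed-size signatures in exchange for some sampling error.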