Note: a newer version (0.1.5) of this package is available.
textreuse (version 0.1.0)
Detect Text Reuse and Document Similarity
Description
Tools for measuring similarity among documents and detecting
passages which have been reused. Implements shingled n-gram, skip n-gram,
and other tokenizers; similarity/dissimilarity functions; pairwise
comparisons; minhash and locality sensitive hashing algorithms; and a
version of the Smith-Waterman local alignment algorithm suitable for
natural language.
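
To make the first two ideas in the description concrete, here is a minimal sketch of shingled word n-grams and a Jaccard similarity score between two short documents. It is written in plain Python for illustration only and is not the package's R API. The function names `shingled_ngrams` and `jaccard_similarity`, the regex-based word splitting, and the default of n = 3 are assumptions made for this example.

```python
import re


def shingled_ngrams(text, n=3):
    """Return the set of overlapping word n-grams (shingles) in text.

    Assumption for this sketch: words are lowercased runs of
    alphanumeric characters; punctuation is discarded.
    """
    words = re.findall(r"\w+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard_similarity(a, b):
    """Size of the intersection divided by the size of the union."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(a | b)


doc_a = "the quick brown fox jumps over the lazy dog"
doc_b = "the quick brown fox leaps over the lazy dog"

shingles_a = shingled_ngrams(doc_a)
shingles_b = shingled_ngrams(doc_b)
score = jaccard_similarity(shingles_a, shingles_b)

print(sorted(shingles_a & shingles_b))  # shingles the two documents share
print(round(score, 3))                  # 4 shared / 10 total -> 0.4
```

Changing a single word alters three of the seven trigrams in each document, which is why the score drops to 0.4 even though the sentences are nearly identical. Larger values of n make the measure stricter, smaller values make it more forgiving.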