semantic-sh is a SimHash implementation to detect and group similar texts by taking power of word vectors and transformer-based language models such as BERT.
The Tidelift Subscription provides access to a continuously curated stream of human-researched and maintainer-verified data on open source packages and their licenses, releases, vulnerabilities, and development practices.