snowball-extractor

Snowball: Extracting Relations from Large Plain-Text Collections


Keywords
nlp, semantic, relationship, extraction, bootstrapping, emnlp, tf-idf, information-extraction, relationship-extraction, semi-supervised-learning
License
GPL-3.0
Install
pip install snowball-extractor==1.0.5

Documentation

This is my own implementation of the Snowball system for bootstrapping relationship instances. You can find more details here:

A sample file of sentences with the named entities already tagged is available for download. It contains 1 million sentences taken from the New York Times articles that are part of the English Gigaword Collection.

NOTE: see the description of BREDS to understand how to provide a tagged document collection and seeds in order to set up the bootstrapping of relationship instances with Snowball. Both systems have a similar setup.
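
To give a rough idea of what working with a tagged sentence can look like, here is a minimal sketch. It assumes entities are marked inline with XML-like tags such as <ORG>...</ORG> and <LOC>...</LOC>; that format, the function name extract_pairs, and the example sentence are illustrative assumptions and are not taken from this package. The BREDS description remains the reference for the actual input format.

```python
import re

# Assumption: entities are tagged inline as <TYPE>text</TYPE>.
# This is an illustrative format, not confirmed by this README.
ENTITY_RE = re.compile(r"<([A-Z]+)>(.+?)</\1>")


def extract_pairs(sentence):
    """Return one tuple per pair of adjacent tagged entities:
    (entity1, type1, text_between, entity2, type2)."""
    matches = list(ENTITY_RE.finditer(sentence))
    pairs = []
    for left, right in zip(matches, matches[1:]):
        # The words between two entities are the context a
        # bootstrapping system would compare against its patterns.
        between = sentence[left.end():right.start()].strip()
        pairs.append(
            (left.group(2), left.group(1), between,
             right.group(2), right.group(1))
        )
    return pairs


# Hypothetical example sentence
example = "<ORG>Google</ORG> is based in <LOC>Mountain View</LOC>."
print(extract_pairs(example))
```

This hypothetical helper only pairs up neighbouring entities and keeps the text between them; it does not perform any pattern generation or scoring.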