cort is a coreference resolution toolkit. It consists of two parts: the coreference resolution component implements a framework for coreference resolution based on latent variables, which allows you to rapidly devise approaches to coreference resolution, while the error analysis component provides extensive functionality for analyzing and visualizing errors made by coreference resolution systems.
If you are interested in running cort with the search space pruning described in Moosavi and Strube (2016), check out Nafise Moosavi's fork of cort.
If you have any questions or comments, drop me an e-mail at firstname.lastname@example.org.
cort is available on PyPI. You can install it via

    pip install cort
Dependencies (automatically installed by pip) are nltk, numpy, matplotlib, mmh3, PyStanfordDependencies, cython, future, jpype and beautifulsoup. It ships with stanford_corenlp_pywrapper and [the reference implementation of the CoNLL scorer](https://github.com/conll/reference-coreference-scorers).
cort is written for use on Linux with Python 3.3+. While cort also runs under Python 2.7, I strongly recommend running cort with Python 3, since the Python 3 version is much more efficient.
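After installation, a quick sanity check of the environment can save some debugging time. The following is a minimal sketch, not part of cort itself: the helper names `python_is_recommended` and `missing_modules` are hypothetical, and the import names in `ASSUMED_MODULES` are my assumption of how the listed dependencies are imported (package names on PyPI and importable module names can differ). It needs Python 3.4+ because it relies on `importlib.util.find_spec`.

```python
import importlib.util
import sys

# Assumed import names for the dependencies listed above. These are
# guesses at the importable module names, not taken from cort's docs;
# adjust them if your installation differs.
ASSUMED_MODULES = [
    "nltk", "numpy", "matplotlib", "mmh3", "StanfordDependencies",
    "Cython", "future", "jpype", "bs4",
]


def python_is_recommended(version_info=sys.version_info):
    """Return True if the given version is Python 3.3 or newer."""
    return tuple(version_info[:2]) >= (3, 3)


def missing_modules(module_names):
    """Return the top-level module names that cannot be located."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    if not python_is_recommended():
        print("Python 3.3+ is recommended for cort.")
    for name in missing_modules(ASSUMED_MODULES):
        print("Not importable:", name)
```

If the script prints nothing, the interpreter meets the recommended version and all assumed modules can be located. A name reported as not importable may simply mean the assumed import name is wrong, so treat the output as a hint rather than a verdict.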
Nafise Sadat Moosavi and Michael Strube (2016). Search space pruning: A simple solution for better coreference resolvers. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego, Cal., 12-17 June 2016, pages 1005-1011.
Sebastian Martschat and Michael Strube (2015). Latent Structures for
Coreference Resolution. Transactions of the Association for
Computational Linguistics, 3, pages 405-418.
Sebastian Martschat, Patrick Claus and Michael Strube (2015). Plug Latent
Structures and Play Coreference Resolution. In Proceedings of
ACL-IJCNLP 2015 System Demonstrations, Beijing, China,
26-31 July 2015, pages 61-66.
Sebastian Martschat, Thierry Göckel and Michael Strube (2015). Analyzing and
Visualizing Coreference Resolution Errors. In Proceedings of the 2015
Conference of the North American Chapter of the Association for Computational
Linguistics: Demonstrations, Denver, Colorado, USA, 31 May-5 June 2015,
Sebastian Martschat and Michael Strube (2014). Recall Error Analysis for
Coreference Resolution. In Proceedings of the 2014 Conference on Empirical
Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25-29 October
2014, pages 2070-2081.
Sebastian Martschat (2013). Multigraph Clustering for Unsupervised
Coreference Resolution. In Proceedings of the Student Research Workshop
at the 51st Annual Meeting of the Association for Computational Linguistics,
Sofia, Bulgaria, 5-7 August 2013, pages 81-88.
If you use the error analysis component in your research, please cite the EMNLP'14 paper. If you use the coreference component in your research, please cite the TACL paper. If you use the multigraph system, please cite the ACL'13-SRW paper.
Wednesday, 4 November 2015
Added support for numeric features. Because the feature representation changed, the models changed as well, so I have updated the downloadable models.
Friday, 9 October 2015
Now supports label-dependent cost functions.
Tuesday, 15 September 2015
Monday, 27 July 2015
Can now perform coreference resolution on raw text.
Tuesday, 21 July 2015
Updated to status of TACL paper.
Wednesday, 3 June 2015
Improvements to visualization (mention highlighting and scrolling).
Monday, 1 June 2015
Fixed a bug in mention highlighting for visualization.
Sunday, 31 May 2015
Updated to status of NAACL'15 demo paper.
Wednesday, 13 May 2015
Fixed another bug in the documentation regarding format of antecedent data.
Tuesday, 3 February 2015
Fixed a bug in the documentation: the part number in the antecedent file must include trailing 0s.
Thursday, 30 October 2014
Fixed data structure bug in documents.py. The results from the paper are not affected by this bug.
Wednesday, 22 October 2014