Typecraft Python
This repository contains an interlinear glossed text (IGT) model based on the Typecraft IGT format. It also provides a simple CLI for performing various NLP tasks, interfacing with NLTK and with other tools such as TreeTagger.
- Free software: MIT license
- Full documentation: https://typecraft_python.readthedocs.io
Installation
pip install typecraft_python
Features
- Parsing of the Typecraft XML format.
- Manipulation of the Typecraft IGT model format.
- Integration with NLTK.
- Integration with TreeTagger.
- A CLI for loading, converting and manipulating raw text and Typecraft XML files.
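To make the idea of an IGT model more concrete, here is a minimal sketch of a phrase/word/morpheme hierarchy built from plain dataclasses. All class, field and function names below are hypothetical and chosen for illustration only; they are not the typecraft_python API, so consult the full documentation for the actual model classes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Morpheme:
    # Hypothetical illustration class, not part of typecraft_python.
    form: str                                          # surface form, e.g. "dog"
    glosses: List[str] = field(default_factory=list)   # gloss tags, e.g. ["PL"]


@dataclass
class Word:
    # Hypothetical illustration class, not part of typecraft_python.
    form: str
    pos: str = ""
    morphemes: List[Morpheme] = field(default_factory=list)


@dataclass
class Phrase:
    # Hypothetical illustration class, not part of typecraft_python.
    text: str
    translation: str = ""
    words: List[Word] = field(default_factory=list)


def gloss_line(phrase: Phrase) -> str:
    """Render a simple interlinear gloss line for a phrase.

    Morphemes within a word are joined with "-"; a morpheme is shown
    by its glosses (joined with ".") or by its form if it has none.
    """
    rendered_words = []
    for word in phrase.words:
        parts = [
            ".".join(m.glosses) if m.glosses else m.form
            for m in word.morphemes
        ]
        rendered_words.append("-".join(parts))
    return " ".join(rendered_words)


if __name__ == "__main__":
    phrase = Phrase(
        text="dogs bark",
        words=[
            Word("dogs", "N", [Morpheme("dog"), Morpheme("s", ["PL"])]),
            Word("bark", "V", [Morpheme("bark")]),
        ],
    )
    print(gloss_line(phrase))
```

The nesting mirrors how interlinear glosses are usually laid out: each phrase holds words, and each word holds the morphemes that carry the gloss tags.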
Usage
Usage: tpy [OPTIONS] COMMAND [ARGS]...
Options:
--help Show this message and exit.
Commands:
convert
ntexts This command lists the number of texts in a...
raw
xml
Examples
Load a raw file, tokenize and tag it, and output XML (to stdout):
$ tpy raw your_file.txt
To save the output to a file:
$ tpy raw your_file.txt -o output.xml
# or
$ tpy raw your_file.txt > output.xml
To tag using a specific tagger:
$ tpy raw your_file.txt --tagger=tree # Tags using TreeTagger
To load a Typecraft XML file and tag it:
$ tpy xml your_file.xml --tag --tagger=nltk -o tagged_output.xml
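The same commands can be driven from a Python script. The following is a minimal sketch that shells out to the `tpy` CLI using only the `raw` subcommand and the `--tagger` and `-o` flags shown above. The helper names `build_tpy_raw_command` and `run_tpy` are hypothetical, and combining `--tagger` with `-o` in one call is an assumption based on the separate examples.

```python
import shlex
import subprocess
from typing import List, Optional


def build_tpy_raw_command(input_path: str,
                          tagger: Optional[str] = None,
                          output_path: Optional[str] = None) -> List[str]:
    """Build the argument list for `tpy raw` (hypothetical helper).

    Only flags that appear in the examples above are used.
    """
    command = ["tpy", "raw", input_path]
    if tagger is not None:
        command.append(f"--tagger={tagger}")
    if output_path is not None:
        command.extend(["-o", output_path])
    return command


def run_tpy(command: List[str]) -> str:
    """Run the command and return its stdout (hypothetical helper).

    Requires `tpy` to be installed and on PATH; raises
    subprocess.CalledProcessError on a non-zero exit status.
    """
    result = subprocess.run(command, check=True,
                            capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    cmd = build_tpy_raw_command("your_file.txt", tagger="tree",
                                output_path="output.xml")
    # Print the shell-quoted command instead of running it, so the
    # sketch works even when tpy is not installed.
    print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.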
Credits
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.