xmi2conll

Simple CLI to convert any annotated document in UIMA CAS XMI to CONLL format (IOB schema support).


License
MIT
Install
pip install xmi2conll==0.1.6

Documentation

xmi2conll CLI

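To illustrate the IOB schema named in the project description, here is a minimal sketch of how character-offset annotations (as found in a UIMA CAS) can map to token-per-line CONLL output. It is not the tool's own implementation: the function name `to_iob_lines`, the sample sentence, and the `PER`/`LOC` labels are assumptions made for this example, and the files the CLI actually writes may differ in tokenization or layout.

```python
from typing import List, Tuple

Token = Tuple[str, int, int]   # (text, begin offset, end offset)
Entity = Tuple[int, int, str]  # (begin offset, end offset, label)


def to_iob_lines(tokens: List[Token], entities: List[Entity],
                 separator: str = " ") -> List[str]:
    """Tag each token B-/I-/O according to the entity span covering it.

    Hypothetical helper for illustration only; it assumes entity
    boundaries coincide with token boundaries.
    """
    lines = []
    for text, begin, end in tokens:
        tag = "O"  # default: token is outside any entity
        for ent_begin, ent_end, label in entities:
            if begin >= ent_begin and end <= ent_end:
                # First token of the span gets B-, the following ones I-
                prefix = "B-" if begin == ent_begin else "I-"
                tag = prefix + label
                break
        lines.append(f"{text}{separator}{tag}")
    return lines


# Hypothetical sample data: "Marie Curie lived in Paris"
tokens = [("Marie", 0, 5), ("Curie", 6, 11), ("lived", 12, 17),
          ("in", 18, 20), ("Paris", 21, 26)]
entities = [(0, 11, "PER"), (21, 26, "LOC")]

print("\n".join(to_iob_lines(tokens, entities)))
# Marie B-PER
# Curie I-PER
# lived O
# in O
# Paris B-LOC
```

Passing `separator="\t"` yields the tab-separated variant of the same lines.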

Installation:

Start by creating and activating a new environment with virtualenv:

virtualenv --python=/usr/bin/python3.8 venv
source venv/bin/activate

Then choose one of:

  • Easy way (use pip):
pip install xmi2conll
  • Dev install:
git clone https://github.com/Lucaterre/xmi2conll
cd xmi2conll
pip install -r requirements.txt

Usage:

With the pip install, run:

x2c --help

Or, with the dev install, run:

python x2c.py --help
Usage: x2c.py [OPTIONS] INPUT_XMI TYPESYSTEM

  XMI to CONLL Converter CLI © 2022 - @Lucaterre

  INPUT_XMI (str): XMI file path or directory path that contains XMI for batch
  processing.

  TYPESYSTEM (str): Typesystem.xml path.

Options:
  -o, --output TEXT               output path that contains new conll, 
                                  if it not specify ./output/ is auto created.
                                  [default: ./output/]
  -tn, --type_name_annotations TEXT
                                  type name of the annotations  [default: de.t
                                  udarmstadt.ukp.dkpro.core.api.ner.type.Named
                                  Entity]
  -s, --conll_separator TEXT      Defines a separator in CONLL between mention
                                  and label; only 'space' or 'tab' are accepted [default:
                                  space]
  -h, --header BOOLEAN            show or hide title of CLI  [default: True]
  --help                          Show this message and exit.
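
For scripted use, the options listed above can be assembled into a command line from Python. The following is a minimal sketch, assuming the pip install has put `x2c` on the PATH; the helper names `build_x2c_command` and `run_x2c` and the file paths are hypothetical, and only the flags shown in the help text are used.

```python
import subprocess
from typing import List, Optional


def build_x2c_command(input_xmi: str, typesystem: str,
                      output: Optional[str] = None,
                      type_name: Optional[str] = None,
                      separator: Optional[str] = None,
                      header: Optional[bool] = None) -> List[str]:
    """Build an argument list following 'x2c [OPTIONS] INPUT_XMI TYPESYSTEM'.

    Hypothetical helper; options left as None fall back to the CLI defaults.
    """
    options: List[str] = []
    if output is not None:
        options += ["--output", output]
    if type_name is not None:
        options += ["--type_name_annotations", type_name]
    if separator is not None:
        # The help text states that only 'space' or 'tab' are accepted
        if separator not in ("space", "tab"):
            raise ValueError("separator must be 'space' or 'tab'")
        options += ["--conll_separator", separator]
    if header is not None:
        # --header takes a BOOLEAN value according to the help text
        options += ["--header", str(header)]
    return ["x2c", *options, input_xmi, typesystem]


def run_x2c(*args, **kwargs) -> None:
    """Run the converter; raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_x2c_command(*args, **kwargs), check=True)


# Hypothetical paths, shown only to illustrate the resulting command
print(" ".join(build_x2c_command("corpus/", "TypeSystem.xml",
                                 output="conll_out/", separator="tab")))
```

Since INPUT_XMI may be either a single file or a directory, the same call covers batch processing by passing a directory path.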

Citation:

@misc{xmi2conll-cli,
    author = "Lucas Terriel",
    title = {xmi2conll, a cli to convert any annotated document in UIMA CAS XMI to CONLL format (IOB schema support)},
    howpublished = {\url{https://github.com/Lucaterre/xmi2conll}},
    year = {2022}
}

License:

This tool is distributed under the MIT license.