AutoScribe-tools

Set of tools for AutoScribe.


License
MIT
Install
pip install AutoScribe-tools==0.0.19

Documentation

AutoScribe-tools


Set of tools used in the development of AutoScribe.

DrugExtractor & OrganicChemExtractor

(TODO documentation - finished implementation May 7th 2018)

RxNorm-parser

For retrieving all drug names and unique identifiers from the RxNorm RXNCONSO.RRF file.

Quickstart

from parsers.RxnConsoFile import parseRxnConsoFile

data = list(parseRxnConsoFile("<path-to-RXNCONSO.RRF>"))
print(data[0].name) # drug name of the first element of this table.
print(data[0].group) # unique group number of this first element.
print(data[0].uniqueNum) # unique id of this specific element (should be different for every row).

print(data[-1].name) # drug name of the last element.

print(data[0].dataSource.name) # name of the data source for this element.
print(data[0].termType.name) # name of the term type for this element.
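
Since uniqueNum is described as differing per row while group is a group number, several rows presumably share one group. A minimal sketch of collecting names per group follows. The helper names_by_group and the sample rows are hypothetical and not part of AutoScribe-tools; the SimpleNamespace objects only stand in for parsed elements, so in real use the list returned by parseRxnConsoFile would be passed instead.

```python
from collections import defaultdict
from types import SimpleNamespace


def names_by_group(rows):
    """Map each group value to the list of names that share it."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row.group].append(row.name)
    return dict(grouped)


# Made-up stand-in rows exposing only the attributes used above.
rows = [
    SimpleNamespace(name="drug A", group="1", uniqueNum="10"),
    SimpleNamespace(name="drug A 10 mg tablet", group="1", uniqueNum="11"),
    SimpleNamespace(name="drug B", group="2", uniqueNum="12X"),
]

groups = names_by_group(rows)
print(groups)
```

The helper relies only on attribute access, so it should work on any iterable of objects that expose name and group.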

Fields

The following fields are accessible on each element:

data[i].name
data[i].group
data[i].uniqueNum # note: this is a string because some uniqueNum values are not integers.

data[i].termType.abbr
data[i].termType.name
data[i].termType.description
data[i].termType.example
# note: the term types SU, PT and DP are not defined
# on <https://www.nlm.nih.gov/research/umls/rxnorm/overview.html>.
# They therefore only have the abbr field; the other fields are empty strings.

data[i].dataSource.name
data[i].dataSource.abbr
# note: the RXNORM data source only has an abbreviation (its name is an empty string).
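
Because some term types carry only an abbreviation, code that reads termType.name may want to check for the empty string first. The sketch below counts rows per abbreviation and lists the abbreviations without a name. The function count_term_types and the sample rows are hypothetical; the nested SimpleNamespace objects merely mimic the termType fields listed above.

```python
from collections import Counter
from types import SimpleNamespace


def count_term_types(rows):
    """Return (rows per term type abbr, sorted abbrs whose name is empty)."""
    counts = Counter(row.termType.abbr for row in rows)
    undefined = sorted({row.termType.abbr for row in rows if not row.termType.name})
    return counts, undefined


def term_type(abbr, name=""):
    # Stand-in for a parsed term type; description and example are left empty.
    return SimpleNamespace(abbr=abbr, name=name, description="", example="")


# Made-up stand-in rows for illustration only.
rows = [
    SimpleNamespace(name="drug A", termType=term_type("IN", "Ingredient")),
    SimpleNamespace(name="drug B", termType=term_type("IN", "Ingredient")),
    SimpleNamespace(name="drug C", termType=term_type("SU")),
]

counts, undefined = count_term_types(rows)
print(counts["IN"], undefined)
```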

ChvFile

For retrieving the UMLS concept relations.

Quickstart

from parsers.ChvFile import parseChvFile

data = list(parseChvFile("<path-to-CHV_concepts_terms.tsv>"))

print(data[0].cui) # concept unique id
print(data[0].chvTerm) # CHV term name
print(data[0].chvConceptId) # CHV term unique concept ID

Fields

The following fields are accessible on each element:

# See the following link for detailed explanation:
# <https://www.nlm.nih.gov/research/umls/sourcereleasedocs/current/CHV/sourcerepresentation.html>

# string
data[i].cui
data[i].chvTerm
data[i].umlsName
data[i].chvName
data[i].explanation

# boolean
data[i].chvPreferred
data[i].umlsPreferred
data[i].disparaged

# float (may be -1.0)
data[i].freqScore
data[i].contextScore
data[i].cuiScore
data[i].comboScore
data[i].comboScoreNoTopWords

# int
data[i].chvStrId
data[i].chvConceptId
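
The float scores may be -1.0, so comparisons on them probably need to skip that value. The following sketch picks, for each cui, the non-disparaged term with the highest comboScore. It assumes that -1.0 marks a missing score, which the list above does not state explicitly. The helper best_term_per_cui, the constant MISSING_SCORE and the sample rows are hypothetical and not part of AutoScribe-tools.

```python
from types import SimpleNamespace

MISSING_SCORE = -1.0  # assumption: -1.0 means "no score available"


def best_term_per_cui(rows):
    """Return {cui: chvTerm} for the highest known comboScore per concept."""
    best = {}
    for row in rows:
        # Skip disparaged terms and rows without a usable score.
        if row.disparaged or row.comboScore == MISSING_SCORE:
            continue
        current = best.get(row.cui)
        if current is None or row.comboScore > current.comboScore:
            best[row.cui] = row
    return {cui: row.chvTerm for cui, row in best.items()}


# Made-up stand-in rows exposing only the attributes used above.
rows = [
    SimpleNamespace(cui="C0000001", chvTerm="term one", comboScore=0.9, disparaged=False),
    SimpleNamespace(cui="C0000001", chvTerm="term one variant", comboScore=0.4, disparaged=False),
    SimpleNamespace(cui="C0000002", chvTerm="unscored term", comboScore=-1.0, disparaged=False),
    SimpleNamespace(cui="C0000003", chvTerm="disparaged term", comboScore=0.8, disparaged=True),
]

result = best_term_per_cui(rows)
print(result)
```

In real use the rows would come from list(parseChvFile(...)); concepts whose only terms are unscored or disparaged are simply absent from the result.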