AutoScibe-tools
Set of tools used in the development of AutoScibe.
DrugExtractor & OrganicChemExtractor
(TODO documentation - finished implementation May 7th 2018)
RxNorm-parser
Retrieves all drug names and unique identifier data from the RxNorm RXNCONSO.RRF file.
Quickstart
from parsers.RxnConsoFile import parseRxnConsoFile
data = list(parseRxnConsoFile("<path-to-RXNCONSO.RRF>"))
print(data[0].name) # drug name of the first element of this table.
print(data[0].group) # unique group number of this first element.
print(data[0].uniqueNum) # unique id of this specific element (should be different for every row).
print(data[-1].name) # drug name of the last element.
print(data[0].dataSource.name) # name of the data source for this element.
print(data[0].termType.name) # name of the term type for this element.
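Since several rows can share the same group number, it is often handy to collect every name under its group. The following is a minimal sketch, not part of the parser itself: `names_by_group` is a hypothetical helper, and the rows are stand-in objects that only mimic the `name`, `group` and `uniqueNum` attributes shown above (the integer type used for `group` here is an assumption). In practice you would pass `list(parseRxnConsoFile(...))` instead.

```python
from collections import defaultdict
from types import SimpleNamespace


def names_by_group(rows):
    """Map each group number to the list of names that share it."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row.group].append(row.name)
    return dict(grouped)


# Stand-in rows with placeholder values (not real RxNorm data).
rows = [
    SimpleNamespace(name="example drug A", group=1, uniqueNum="10"),
    SimpleNamespace(name="example drug A tablet", group=1, uniqueNum="11"),
    SimpleNamespace(name="example drug B", group=2, uniqueNum="12"),
]

for group, names in names_by_group(rows).items():
    print(group, names)
```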
Fields
The following features are accessible per element:
data[i].name
data[i].group
data[i].uniqueNum # note: this is a string because some uniqueNum values are not integers.
data[i].termType.abbr
data[i].termType.name
data[i].termType.description
data[i].termType.example
# note: some term types (SU, PT and DP) are not defined
# on <https://www.nlm.nih.gov/research/umls/rxnorm/overview.html>.
# They therefore only have the abbr field set; the other fields are empty strings.
data[i].dataSource.name
data[i].dataSource.abbr
# note: the RXNORM data source only has an abbreviation (its name is an empty string).
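Because some term types and the RXNORM data source carry an empty name, code that prints a label may want to fall back to the abbreviation. A minimal sketch of that fallback follows; `display_label` is a hypothetical helper, and the objects are stand-ins that only mimic the `name` and `abbr` attributes listed above (the example values are placeholders, not output from the parser).

```python
from types import SimpleNamespace


def display_label(entry):
    """Return entry.name, or entry.abbr when the name is an empty string."""
    return entry.name or entry.abbr


# Stand-in objects shaped like data[i].termType and data[i].dataSource.
defined_term_type = SimpleNamespace(abbr="IN", name="Ingredient")
undefined_term_type = SimpleNamespace(abbr="SU", name="")
rxnorm_source = SimpleNamespace(abbr="RXNORM", name="")

print(display_label(defined_term_type))
print(display_label(undefined_term_type))
print(display_label(rxnorm_source))
```

With real data the same helper could be called as `display_label(data[i].termType)` or `display_label(data[i].dataSource)`.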
ChvFile
Retrieves the UMLS concept relations from the CHV concepts and terms file (CHV_concepts_terms.tsv).
Quickstart
from parsers.ChvFile import parseChvFile
data = list(parseChvFile("<path-to-CHV_concepts_terms.tsv>"))
print(data[0].cui) # concept unique id
print(data[0].chvTerm) # CHV term name
print(data[0].chvConceptId) # CHV term unique concept ID
Fields
The following features are accessible per element:
# See the following link for detailed explanation:
# <https://www.nlm.nih.gov/research/umls/sourcereleasedocs/current/CHV/sourcerepresentation.html>
# string
data[i].cui
data[i].chvTerm
data[i].umlsName
data[i].chvName
data[i].explanation
# boolean
data[i].chvPreferred
data[i].umlsPreferred
data[i].disparaged
# float (may be -1.0)
data[i].freqScore
data[i].contextScore
data[i].cuiScore
data[i].comboScore
data[i].comboScoreNoTopWords
# int
data[i].chvStrId
data[i].chvConceptId
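The float scores may be -1.0, so downstream code probably should not compare them blindly. The sketch below assumes that -1.0 marks a score that is not available; that reading is an assumption, not something stated by the parser. `score_or_none` and `usable_terms` are hypothetical helpers, and the rows are stand-ins that only mimic a few of the attributes listed above.

```python
from types import SimpleNamespace

MISSING_SCORE = -1.0  # assumption: -1.0 means "no score available"


def score_or_none(value):
    """Convert the -1.0 placeholder to None; return other scores unchanged."""
    return None if value == MISSING_SCORE else value


def usable_terms(rows, min_combo_score=0.0):
    """Return chvTerm for rows that are not disparaged and whose comboScore
    is present and at least min_combo_score."""
    selected = []
    for row in rows:
        score = score_or_none(row.comboScore)
        if row.disparaged or score is None:
            continue
        if score >= min_combo_score:
            selected.append(row.chvTerm)
    return selected


# Stand-in rows with placeholder values (not real CHV data).
rows = [
    SimpleNamespace(chvTerm="term a", disparaged=False, comboScore=0.8),
    SimpleNamespace(chvTerm="term b", disparaged=False, comboScore=-1.0),
    SimpleNamespace(chvTerm="term c", disparaged=True, comboScore=0.9),
    SimpleNamespace(chvTerm="term d", disparaged=False, comboScore=0.2),
]

print(usable_terms(rows, min_combo_score=0.5))
```

Converting the placeholder to `None` first keeps a missing score from being silently treated as a very low one when thresholds are applied.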