transmart-loader

Python library for loading data to TranSMART using transmart-copy


Keywords
python, transmart, transmart-copy
License
MIT
Install
pip install transmart-loader==1.4.0

Documentation

TranSMART loader


This package contains classes that represent the core domain objects stored in the TranSMART platform, an open source data sharing and analytics platform for translational biomedical research.

It also provides a utility that writes such objects to tab-separated files that can be loaded into a TranSMART database using the transmart-copy tool.

⚠️ Note: this is a development version. Issues can be reported at https://github.com/thehyve/python_transmart_loader/issues.

Installation and usage

To install transmart-loader from PyPI, run:

pip install transmart-loader

or from sources:

git clone https://github.com/thehyve/python_transmart_loader.git
cd python_transmart_loader
pip install .
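
To confirm which version ended up in the active environment, a check along the following lines may help. This is a minimal sketch using only the standard library: the helper name installed_version is illustrative and not part of transmart-loader, and importlib.metadata requires Python 3.8 or newer.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


version = installed_version("transmart-loader")
print(version if version else "transmart-loader is not installed")
```

The lookup uses the distribution name as published on PyPI (transmart-loader), not the import name (transmart_loader).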

Usage

Define a TranSMART data collection using the classes in transmart_loader/transmart.py, for example:

from datetime import date

from transmart_loader.transmart import (
    Concept, NumericalValue, Observation, Patient, Study, TrialVisit,
    ValueType, Visit)

# Create the dimension elements
age_concept = Concept('test:age', 'Age', '\\Test\\age', ValueType.Numeric)
concepts = [age_concept]
studies = [Study('test', 'Test study')]
trial_visits = [TrialVisit(studies[0], 'Week 1', 'Week', 1)]
patients = [Patient('SUBJ0', 'male', [])]
visits = [Visit(patients[0], 'visit1', None, None, None, None, None, None, [])]
# Create the observations
observations = [
    Observation(patients[0], age_concept, visits[0], trial_visits[0],
                date(2019, 3, 28), None, NumericalValue(28))]

Create a hierarchical ontology for the concepts. For example, to build the following structure:

└ Ontology
  └ Age

from transmart_loader.transmart import ConceptNode, TreeNode

# Create an ontology with one top node and a concept node
top_node = TreeNode('Ontology')
top_node.add_child(ConceptNode(concepts[0]))
ontology = [top_node]
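
Deeper hierarchies follow the same pattern: each intermediate node is attached to its parent, and the concept leaves are attached last. The sketch below illustrates only that nesting logic with a stand-in class. SketchNode, render, and the "Demographics" folder are hypothetical names for illustration and are not part of transmart_loader; with the library itself, the same shape would presumably be built from TreeNode and ConceptNode as shown above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SketchNode:
    """Stand-in for an ontology node; NOT the library's TreeNode class."""
    name: str
    children: List["SketchNode"] = field(default_factory=list)

    def add_child(self, child: "SketchNode") -> None:
        self.children.append(child)


def render(node: SketchNode, depth: int = 0) -> List[str]:
    """Return one line per node, indented two spaces per nesting level."""
    lines = ["  " * depth + "└ " + node.name]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# Build a three-level hierarchy: top node, intermediate folder, leaf.
top = SketchNode("Ontology")
demographics = SketchNode("Demographics")  # hypothetical intermediate node
demographics.add_child(SketchNode("Age"))
top.add_child(demographics)

print("\n".join(render(top)))
```

Running the sketch prints the three node names, one per line, each indented two spaces further than its parent.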

Write the data collection to a format that can be loaded using transmart-copy:

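from tempfile import mkdtemp

# Assumed module path for the writer; see examples/data_collection.py
# for the exact imports.
from transmart_loader.copy_writer import TransmartCopyWriter
from transmart_loader.transmart import DataCollection

collection = DataCollection(concepts, [], [], studies,
                            trial_visits, visits, ontology, patients, observations)

# Write collection to a temporary directory
# The generated files can be loaded into TranSMART with transmart-copy.
output_dir = mkdtemp()
copy_writer = TransmartCopyWriter(output_dir)
copy_writer.write_collection(collection)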
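
Before loading, it can be useful to glance at what was written. The following is a minimal sketch using only the standard library. It makes no assumption about the file names or directory layout that the writer produces: it simply walks a directory and reads every file as tab-separated text. The helper name preview_tsv_files, the UTF-8 encoding, and the example.tsv demo file are assumptions made for illustration.

```python
import csv
from itertools import islice
from pathlib import Path
from tempfile import mkdtemp
from typing import Dict, List


def preview_tsv_files(directory: str,
                      max_rows: int = 3) -> Dict[str, List[List[str]]]:
    """Map each file's relative path to its first few tab-separated rows."""
    root = Path(directory)
    preview: Dict[str, List[List[str]]] = {}
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        with path.open(newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle, delimiter="\t")
            # Read at most max_rows rows so large files stay cheap to inspect.
            preview[path.relative_to(root).as_posix()] = list(
                islice(reader, max_rows))
    return preview


# Demo on a made-up file; in practice, pass the writer's output directory.
demo_dir = mkdtemp()
(Path(demo_dir) / "example.tsv").write_text("id\tname\n1\tAge\n",
                                            encoding="utf-8")
print(preview_tsv_files(demo_dir))
```

Passing output_dir from the snippet above instead of demo_dir would preview the files generated for the collection.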

Check examples/data_collection.py for a complete example.

Usage examples can be found in these projects:

  • fhir2transmart: a tool that translates core HL7 FHIR resources to the TranSMART data model.
  • claml2transmart: a tool that translates ontologies in ClaML format (e.g., ICD-10, available from DIMDI) to TranSMART ontologies.
  • csr2transmart: a custom data transformation and loading pipeline for a Dutch center for pediatric oncology.
  • transmart-hyper-dicer: a tool that reads a selection of data from a TranSMART instance using its REST API and loads it into another TranSMART instance.

Documentation

Full documentation of the package is available at Read the Docs.

Development

For a quick reference on software development, we refer to the software guide checklist.

Python versions

This package is tested with Python versions >= 3.7.

Package management and dependencies

This project uses pip for installing dependencies and package management.

  • Dependencies should be added to setup.py in the install_requires list.

Testing and code coverage

  • Tests are in the tests folder.
  • The tests folder contains:
    • A test that checks whether files for transmart-copy are generated for fake data (file: test_transmart_loader)
    • A test that checks whether your code conforms to the Python style guide (PEP 8) (file: test_lint.py)
  • The testing framework used is pytest.
  • Tests can be run with: python setup.py test

Documentation

  • Documentation should be put in the docs folder.
  • To generate HTML documentation, run: python setup.py build_sphinx

Coding style conventions and code quality

  • Check your code style with prospector.
  • You may need to run pip install .[dev] first to install the required dependencies.

License

Copyright (c) 2019 The Hyve B.V.

The TranSMART loader is licensed under the MIT License. See the file LICENSE.

Credits

This project was funded by the German Ministry of Education and Research (BMBF) as part of the project DIFUTURE - Data Integration for Future Medicine within the German Medical Informatics Initiative (grant no. 01ZZ1804D).

This package was created with Cookiecutter and the NLeSC/python-template.