UConnMLHI_UKBiobankProject

Installation

Installation via pip (Not done yet)

For local installation

    $ pip install UConnMLHI_UKBiobankProject --user

For global installation

    $ sudo pip install UConnMLHI_UKBiobankProject

Manual Installation

    $ git clone https://github.com/xinyuwang1209/UConnMLHI_UKBiobankProject
    $ cd UConnMLHI_UKBiobankProject

For local installation

    $ python setup.py install --user

For global installation

    $ sudo python setup.py install

Usage

Load the module

    from UConnMLHI_UKBiobankProject import UConnMLHI_UKBiobankProject as UKB

Data_Process

dta2csv: convert dta file into csv format

Set return_df = True, if you want to get the dataframe file beside saving it in csv

    df = UKB.Data_Process.dta2csv('/path/to/file',return_df=True)

Set return_df = False if you only want to save the data into csv format

    UKB.Data_Process.dta2csv('/path/to/file',return_df)

ICD_parser: Select Disease IDs from ICD9/ICD10 using keywords or nodes

File 'coding87.tsv' is the hierarchical tree-structured dictionary for ICD9
- https://biobank.ctsu.ox.ac.uk/crystal/coding.cgi?id=87&nl=1
File 'coding19.tsv' is the hierarchical tree-structured dictionary for ICD10
- https://biobank.ctsu.ox.ac.uk/crystal/coding.cgi?id=19&nl=1
For example, to get all diseases in ICD10 with keyword 'alzheimer' plus all diseases in node 6 (the category of all the diseases of the nervous system):

    ICDs, keyword_dict, node_dict = UKB.Data_Process.ICD_parser('/path/to/coding19.tsv',keywords=['alzheimer'],nodes=[6])

select_healthy_subjects: Select healthy subject using disease IDs from ICD_parser

For example, to get the subjects who does not have alzheimer or any other nervous system diseases:

  df = pd.read_csv('Data_Diagnosis.csv', low_memory=False)

  icd = 10
  ICDs, _， _ = UKB.Data_Process.ICD_parser('/path/to/coding19.tsv',keywords=['alzheimer'],nodes=[6])
  healthy_subjects, unhealthy_subjects = UKB.Data_Process.select_healthy_subjects(df,ICDs,icd=icd)

UConnMLHI-UKBiobankProject
Release 0.0.5

Release 0.0.5

0.0.5

0.0.4

0.0.3

0.0.2

0.0.1

Documentation

UConnMLHI_UKBiobankProject

Installation

Installation via pip (Not done yet)

Manual Installation

Usage

Load the module

Data_Process

dta2csv: convert dta file into csv format

ICD_parser: Select Disease IDs from ICD9/ICD10 using keywords or nodes

select_healthy_subjects: Select healthy subject using disease IDs from ICD_parser

Stats

Development practices

Releases

Contributors

UConnMLHI-UKBiobankProject Release 0.0.5

Release 0.0.5 Toggle Dropdown 0.0.5 0.0.4 0.0.3 0.0.2 0.0.1

Documentation

UConnMLHI_UKBiobankProject

Installation

Installation via pip (Not done yet)

Manual Installation

Usage

Load the module

Data_Process

dta2csv: convert dta file into csv format

ICD_parser: Select Disease IDs from ICD9/ICD10 using keywords or nodes

select_healthy_subjects: Select healthy subject using disease IDs from ICD_parser

Stats

Development practices

Releases

Contributors

UConnMLHI-UKBiobankProject
Release 0.0.5

Release 0.0.5

0.0.5

0.0.4

0.0.3

0.0.2

0.0.1