edc-analytics

Build analytical tables for clinicedc/edc projects


Keywords
django, analytics, pandas, data, collection, clinicedc, clinical, trials
Licenses
xpp/MIT-feh
Install
pip install edc-analytics==0.1.1

Build analytic tables from EDC data

Read your data into a dataframe, for example an EDC screening table:

from django_pandas.io import read_frame  # assumed source of read_frame

qs_screening = SubjectScreening.objects.all()
df = read_frame(qs_screening)

Convert all numerics to pandas numerics:

import pandas as pd

cols = [
    "age_in_years",
    "dia_blood_pressure_avg",
    "fbg_value",
    "hba1c_value",
    "ogtt_value",
    "sys_blood_pressure_avg",
]
df[cols] = df[cols].apply(pd.to_numeric)
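As a minimal sketch of what this conversion does, assume the numeric columns arrive as strings. The toy values and the reduced column list below are invented for illustration and are not EDC data:

```python
import pandas as pd

# Toy frame standing in for the screening data; values are illustrative only.
df = pd.DataFrame(
    {
        "age_in_years": ["34", "51"],
        "fbg_value": ["5.6", "7.1"],
        "gender": ["F", "M"],
    }
)
cols = ["age_in_years", "fbg_value"]

# Apply pd.to_numeric column by column; columns not listed are left untouched.
df[cols] = df[cols].apply(pd.to_numeric)

# After conversion, arithmetic works on the listed columns.
total_age = df["age_in_years"].sum()
```

If some values might not parse as numbers, pd.to_numeric also accepts errors="coerce", which turns them into NaN instead of raising. Whether that is appropriate depends on your data.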

Pass the dataframe to each Table class:

gender_tbl = GenderTable(main_df=df)
age_tbl = AgeTable(main_df=df)
bp_table = BpTable(main_df=df)

In each Table instance:

  • data_df is the supporting dataframe.
  • table_df is the dataframe to display. Its first 5 columns ("Characteristic", "Statistic", "F", "M", "All") hold the formatted data. The remaining columns hold the underlying statistics from which the formatted values in "F", "M" and "All" are derived.

Each table_df, such as gender_tbl.table_df above, is a plain dataframe, so you can combine several of them with pd.concat() into a single table_df:

table_df = pd.concat(
    [gender_tbl.table_df, age_tbl.table_df, bp_table.table_df]
)

Show just the first 5 columns:

table_df.iloc[:, :5]
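The concatenate-then-slice step can be tried without an EDC project. This sketch uses two hand-built dataframes as stand-ins for gender_tbl.table_df and age_tbl.table_df. The row values and the "extra_stat" column name are hypothetical and not taken from edc-analytics:

```python
import pandas as pd

display_cols = ["Characteristic", "Statistic", "F", "M", "All"]
all_cols = display_cols + ["extra_stat"]  # "extra_stat" is a made-up column

# Stand-ins for two table_df dataframes; values are illustrative only.
gender_df = pd.DataFrame(
    [["Gender", "n (%)", "6 (60%)", "4 (40%)", "10", 10.0]], columns=all_cols
)
age_df = pd.DataFrame(
    [["Age", "Median", "40", "42", "41", 41.0]], columns=all_cols
)

# Stack the tables row-wise; ignore_index=True renumbers the rows 0..n-1.
table_df = pd.concat([gender_df, age_df], ignore_index=True)

# Keep only the first 5 (display) columns by position.
display_df = table_df.iloc[:, :5]
```

Here ignore_index=True is an optional choice to get a clean row index. The snippet above it in this README keeps the original indexes.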

Like any dataframe, you can export to csv:

path = "my/path/to/csv/folder/table_df.csv"
table_df.to_csv(path_or_buf=path, encoding="utf-8", index=False, sep="|")
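Because the file is pipe-delimited, it has to be read back with the same separator. A small round-trip sketch follows, using an in-memory buffer in place of a real path and a toy dataframe whose values are illustrative only:

```python
import io

import pandas as pd

# Toy stand-in for table_df; values are illustrative only.
table_df = pd.DataFrame(
    {"Characteristic": ["Age"], "Statistic": ["Median"], "All": ["41"]}
)

# Write pipe-delimited text to a buffer instead of a file on disk.
buf = io.StringIO()
table_df.to_csv(path_or_buf=buf, index=False, sep="|")
csv_text = buf.getvalue()

# Read it back with the same separator; dtype=str keeps formatted values as text.
restored = pd.read_csv(io.StringIO(csv_text), sep="|", dtype=str)
```

Reading with dtype=str is one way to keep already-formatted cells such as "41" from being re-parsed as numbers.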