Python package for data analytics.


License
BSD-2-Clause
Install
pip install PtDa==0.1.5

Documentation

PtDa

The package provides:

  • WOE calculation
  • IV calculation
  • Numeric and categorical check
  • etc

How to get it?

Binary installers for the latest released version are available from the Python Package Index (PyPI).

# with pip, from PyPI
pip install ptda

The source code is hosted on GitHub:

https://github.com/luckyp71/ptda

Dependencies

  • pandas
  • NumPy
  • SciPy

Example

The following example shows how to use ptda with the UCI Credit Card dataset.

Load Libraries and Data

[Image: load_lib_data]

Check Target Variable Name

Note that the target variable must be named target. In the UCI Credit Card dataset used in this example, the target variable is already named target, so no changes are needed.

[Image: check_target_var_name]
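For a dataset whose label column has a different name, the rename can be done in pandas before calling ptda. The snippet below is a minimal sketch on a toy frame; the column name default_flag and the values are invented for illustration and do not come from the UCI data.

```python
import pandas as pd

# Hypothetical toy data; "default_flag" is an invented label column name.
df = pd.DataFrame(
    {
        "LIMIT_BAL": [20000, 120000, 90000],
        "default_flag": [1, 0, 0],
    }
)

# Rename the label column to "target" so later steps can find it.
df = df.rename(columns={"default_flag": "target"})

print(list(df.columns))
```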

Numeric and Categorical Variable Check

This method returns a dataframe with numeric_var and categorical_var fields. These fields indicate whether each feature/variable is numeric or categorical: 1 for yes, 0 for no.

How does it work?
What if we have a categorical feature with many unique values, say 15?

The cn_df method has one optional argument, n_bin. If a categorical feature/variable has many unique values, pass that unique-value count as n_bin to cn_df (the default n_bin is 10).

[Image: num_cat_check]
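To make the role of n_bin concrete, here is a rough standalone sketch of one way such a check could work. It is not ptda's implementation: the function name classify_columns and the rule it applies (a column counts as categorical when it is non-numeric or has at most n_bin unique values) are assumptions made for illustration only.

```python
import pandas as pd


def classify_columns(df: pd.DataFrame, n_bin: int = 10) -> pd.DataFrame:
    """Flag each column as numeric or categorical (hypothetical rule)."""
    rows = []
    for name in df.columns:
        n_unique = df[name].nunique(dropna=True)
        # Assumed rule: numeric dtype AND more than n_bin unique values.
        is_numeric = pd.api.types.is_numeric_dtype(df[name]) and n_unique > n_bin
        rows.append(
            {
                "variable": name,
                "numeric_var": int(is_numeric),
                "categorical_var": int(not is_numeric),
            }
        )
    return pd.DataFrame(rows)


# Toy data: "code" is a numerically coded category with 15 unique values.
df = pd.DataFrame(
    {
        "age": list(range(20, 40)),
        "code": [i % 15 for i in range(20)],
        "city": ["Jakarta", "Bandung"] * 10,
    }
)

print(classify_columns(df))            # default n_bin=10: "code" is flagged numeric
print(classify_columns(df, n_bin=15))  # n_bin=15: "code" is flagged categorical
```

Under this assumed rule, raising n_bin to the unique-value count of the coded feature is what moves it from numeric to categorical, which matches the behaviour described above.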

WOE and IV Calculation

woe_iv is a method that calculates WOE and IV and generates a dataframe containing both.

[Image: woe_iv_calculation]
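For readers unfamiliar with the two measures, the sketch below computes WOE and IV with plain pandas and NumPy for a single categorical feature. It is not ptda's woe_iv: the function name woe_iv_table is hypothetical, and it assumes one common convention, WOE = ln(share of non-events / share of events) per category and IV = sum of (share of non-events - share of events) * WOE, with target equal to 1 marking an event. It also assumes every category contains at least one event and one non-event, so no smoothing is applied.

```python
import numpy as np
import pandas as pd


def woe_iv_table(df: pd.DataFrame, feature: str, target: str = "target") -> pd.DataFrame:
    """Per-category WOE and IV contributions (hypothetical helper)."""
    table = df.groupby(feature)[target].agg(total="count", bad="sum")
    table["good"] = table["total"] - table["bad"]

    # Share of all non-events and of all events that fall in each category.
    dist_good = table["good"] / table["good"].sum()
    dist_bad = table["bad"] / table["bad"].sum()

    table["woe"] = np.log(dist_good / dist_bad)
    table["iv"] = (dist_good - dist_bad) * table["woe"]
    return table.reset_index()


# Toy data: category A has 1 event in 4 rows, category B has 3 events in 4 rows.
df = pd.DataFrame(
    {
        "education": ["A"] * 4 + ["B"] * 4,
        "target": [1, 0, 0, 0, 1, 1, 1, 0],
    }
)

result = woe_iv_table(df, "education")
total_iv = result["iv"].sum()
print(result)
print("IV:", total_iv)
```

The feature's IV is the sum of the per-category iv column. Other sources define WOE with the ratio inverted, which flips the sign of WOE but leaves IV unchanged.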

[Image: iv_result]