pd_multiprocessing
pd_multiprocessing provides a simple, parallelized function for applying a user-defined function row-wise to a pandas DataFrame.
Requirements
- pandas 0.22.0+
- pytest 3.4.1+
Documentation
Building the documentation requires the following packages:
- Sphinx
- sphinx_rtd_theme
- m2r
Installation
Install pd_multiprocessing with pip:

    pip install pd-multiprocessing
Usage
Typical usage looks like this:
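
    import pandas as pd
    from pd_multiprocessing.map import df_map

    # Row-wise function: receives one row and returns the modified row.
    def twotimes(row):
        row['col2'] = row['col1'] * 2
        return row

    # The guard keeps worker processes from re-running this block
    # on platforms where multiprocessing starts fresh interpreters.
    if __name__ == '__main__':
        df = pd.DataFrame.from_dict({'col1': range(100)})
        print(df_map(twotimes, df))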
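
For readers curious about what a parallel row-wise apply involves, here is a minimal sketch that uses only pandas and the standard library. It is not the pd_multiprocessing implementation. The helpers `split_frame`, `apply_chunk`, and `parallel_row_map` are hypothetical names introduced for illustration, and the chunk-per-worker strategy is an assumption about one reasonable approach.

```python
import math
from multiprocessing import Pool

import pandas as pd


def twotimes(row):
    # Same row-wise function as in the usage example.
    row['col2'] = row['col1'] * 2
    return row


def split_frame(df, n_chunks):
    """Split df into at most n_chunks consecutive blocks of rows."""
    size = max(1, math.ceil(len(df) / n_chunks))
    return [df.iloc[start:start + size] for start in range(0, len(df), size)]


def apply_chunk(args):
    """Apply func to every row of one chunk (runs inside a worker)."""
    func, chunk = args
    return chunk.apply(func, axis=1)


def parallel_row_map(func, df, n_workers=2):
    """Hypothetical helper: one chunk per worker, results concatenated."""
    chunks = split_frame(df, n_workers)
    if not chunks:
        # Nothing to distribute for an empty frame.
        return df.copy()
    with Pool(n_workers) as pool:
        # pool.map preserves chunk order, so the row order is kept.
        results = pool.map(apply_chunk, [(func, chunk) for chunk in chunks])
    return pd.concat(results)


if __name__ == '__main__':
    df = pd.DataFrame({'col1': range(100)})
    print(parallel_row_map(twotimes, df).head())
```

In this sketch the function passed in must be defined at module level so that it can be pickled and sent to the worker processes. For small DataFrames the cost of starting processes and transferring data can outweigh the gain from parallelism.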