timeseries-feature-engineering

License
Apache-2.0
Install
pip install timeseries-feature-engineering==0.0.1

Documentation

Time Series Feature Engineering

Time series feature generator.

Install

pip install timeseries_feature_engineering

How to use

Add Date Parts

import pandas as pd

# The None entry shows how a missing date is handled.
df = pd.DataFrame({'date': ['2019-12-04', None, '2019-11-15', '2019-10-24']})
df = add_datepart(df, 'date')
df.head()
   Year    Month  Week  Day   Dayofweek  Dayofyear  Is_month_end  Is_month_start  Is_quarter_end  Is_quarter_start  Is_year_end  Is_year_start  Elapsed
0  2019.0  12.0   49.0  4.0   2.0        338.0      False         False           False           False             False        False          1575417600
1  NaN     NaN    NaN   NaN   NaN        NaN        False         False           False           False             False        False          None
2  2019.0  11.0   46.0  15.0  4.0        319.0      False         False           False           False             False        False          1573776000
3  2019.0  10.0   43.0  24.0  3.0        297.0      False         False           False           False             False        False          1571875200
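
To make the columns above easier to interpret, here is a minimal sketch that derives several of them with plain pandas. It is not the implementation of add_datepart; the variable names (dates, parts, epoch) are illustrative assumptions, and only a subset of the columns is reproduced.

```python
import pandas as pd

df = pd.DataFrame({'date': ['2019-12-04', None, '2019-11-15', '2019-10-24']})
dates = pd.to_datetime(df['date'])  # the None entry becomes NaT

# Assumption: each date part maps onto the pandas .dt accessor of the same name.
parts = pd.DataFrame({
    'Year': dates.dt.year,
    'Month': dates.dt.month,
    'Week': dates.dt.isocalendar().week,  # ISO week number
    'Day': dates.dt.day,
    'Dayofweek': dates.dt.dayofweek,      # Monday is 0
    'Dayofyear': dates.dt.dayofyear,
})

# Elapsed as whole seconds since the Unix epoch; missing dates stay missing.
epoch = pd.Timestamp('1970-01-01')
parts['Elapsed'] = (dates - epoch) // pd.Timedelta(seconds=1)

print(parts)
```

The numeric parts come back as floats because the missing date forces a NaN into each column, which matches the 2019.0 and 12.0 style of the output above.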

Add Moving Average Features

With weighted average.

Recency is an important factor in a time series: values closer to the current date carry more information, so the weighted average gives recent values larger weights.

import numpy as np

df = pd.DataFrame({
    'date': pd.date_range('2019-12-01', '2019-12-10'),
    'sales': np.random.randint(100, 500, size=10)
})
# weighted=True weights recent values more heavily within each window
df = add_moving_average_features(df, 'sales', windows=[3, 5], weighted=True)
df.head(10)
   date        sales  sales_3p_MA  sales_5p_MA
0  2019-12-01  155    NaN          NaN
1  2019-12-02  437    NaN          NaN
2  2019-12-03  361    352.000000   NaN
3  2019-12-04  356    371.166667   NaN
4  2019-12-05  490    423.833333   399.066667
5  2019-12-06  222    333.666667   353.133333
6  2019-12-07  197    254.166667   294.400000
7  2019-12-08  390    297.666667   316.000000
8  2019-12-09  159    242.333333   258.666667
9  2019-12-10  470    353.000000   318.133333
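
The weighted values above are consistent with linear weights 1, 2, ..., n inside each window, with the newest value weighted most. That weighting is inferred from the numbers shown, not taken from the library source. A minimal sketch under that assumption, using a hypothetical helper named weighted_moving_average:

```python
import numpy as np
import pandas as pd

# Sales values copied from the weighted example output above.
sales = pd.Series([155, 437, 361, 356, 490, 222, 197, 390, 159, 470])


def weighted_moving_average(values: pd.Series, window: int) -> pd.Series:
    """Linearly weighted moving average (assumed weights 1..window)."""
    weights = np.arange(1, window + 1)  # the oldest value gets 1, the newest gets `window`
    return values.rolling(window).apply(
        lambda w: np.dot(w, weights) / weights.sum(), raw=True
    )


wma3 = weighted_moving_average(sales, 3)
wma5 = weighted_moving_average(sales, 5)
sma3 = sales.rolling(3).mean()  # plain moving average, for comparison

print(pd.DataFrame({'sales': sales, 'wma3': wma3, 'wma5': wma5, 'sma3': sma3}))
```

For the first full 3-period window the weighted result is (155*1 + 437*2 + 361*3) / 6 = 352.0, while the plain mean of the same three values is about 317.67.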

Without weighted average.

df = pd.DataFrame({
    'date': pd.date_range('2019-12-01', '2019-12-10'),
    'sales': np.random.randint(100, 500, size=10)
})
# weighted=False requests a plain (unweighted) moving average
df = add_moving_average_features(df, 'sales', windows=[3, 5], weighted=False)
df.head(10)
   date        sales  sales_3p_MA  sales_5p_MA
0  2019-12-01  167    NaN          NaN
1  2019-12-02  458    NaN          NaN
2  2019-12-03  260    310.500000   NaN
3  2019-12-04  174    250.000000   NaN
4  2019-12-05  392    297.333333   301.266667
5  2019-12-06  401    360.166667   338.200000
6  2019-12-07  460    429.000000   379.200000
7  2019-12-08  381    410.666667   393.733333
8  2019-12-09  349    378.166667   389.533333
9  2019-12-10  365    362.333333   379.000000

Add Expanding Features

df = pd.DataFrame({
    'date': pd.date_range('2019-12-01', '2019-12-10'), 
    'sales': np.random.randint(100, 500, size=10)
})
df = add_expanding_features(df, 'sales', period=3)
df.head()
   date        sales  sales_3p_expanding
0  2019-12-01  178    NaN
1  2019-12-02  398    NaN
2  2019-12-03  399    325.0
3  2019-12-04  385    340.0
4  2019-12-05  136    299.2
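
The expanding column above matches a running mean over all values seen so far, reported only once at least 3 observations are available. A minimal sketch with plain pandas, assuming that reading of period=3 (the library's internals may differ):

```python
import pandas as pd

# Sales values copied from the expanding example output above.
sales = pd.Series([178, 398, 399, 385, 136])

# Expanding mean: every row averages all values up to and including itself.
# min_periods=3 leaves the first two rows as NaN.
expanding_mean = sales.expanding(min_periods=3).mean()

print(expanding_mean)
```

Unlike a moving average, the window never drops old values, so each new row has less influence on the result than the one before it.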

Add Trend Features

df = pd.DataFrame({
    'date': pd.date_range('2019-12-01', '2019-12-10'), 
    'sales': np.random.randint(100, 500, size=10)
})
df = add_trend_features(df, 'sales', windows=[3,7])
df.head(10)
   date        sales  sales_3p_trend  sales_7p_trend
0  2019-12-01  237    0.000000        0.000000
1  2019-12-02  388    0.000000        0.000000
2  2019-12-03  384    0.000000        0.000000
3  2019-12-04  498    87.000000       0.000000
4  2019-12-05  275    -37.666667      0.000000
5  2019-12-06  382    -0.666667       0.000000
6  2019-12-07  132    -122.000000     0.000000
7  2019-12-08  337    20.666667       14.285714
8  2019-12-09  496    38.000000       15.428571
9  2019-12-10  216    28.000000       -24.000000
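
The trend values above are consistent with the mean of the last n day-over-day changes, with rows that lack enough history filled with 0. This reading is inferred from the numbers shown rather than from the library source. A minimal sketch under that assumption, using a hypothetical helper named trend:

```python
import pandas as pd

# Sales values copied from the trend example output above.
sales = pd.Series([237, 388, 384, 498, 275, 382, 132, 337, 496, 216])


def trend(values: pd.Series, window: int) -> pd.Series:
    """Mean first difference over the last `window` steps (assumed definition)."""
    changes = values.diff()                   # change from the previous row
    return changes.rolling(window).mean().fillna(0)


trend3 = trend(sales, 3)
trend7 = trend(sales, 7)

print(pd.DataFrame({'sales': sales, 'trend3': trend3, 'trend7': trend7}))
```

Because the differences telescope, each value equals (current value - value n rows earlier) / n; for example, (498 - 237) / 3 = 87.0 at row 3.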