lowess-grouped

Apply groupwise lowess smoothing to a dataframe


Keywords
data-analysis, data-science, loess, lowess, python, savitzky-golay-filter, smoothing, statistics
License
MIT
Install
pip install lowess-grouped==0.0.7

Documentation

Lowess Grouped

Smooth data for each category using the lowess (aka loess) algorithm. You can use this code for all forms of data that should be smoothed independently by group:

Figure 1: Smoothed temperature data for each region

Usage

Install the package (Python 3.8 or higher):

pip install lowess-grouped

Import the function lowess_grouped and call it with your dataframe df. The parameter frac controls the strength of the smoothing; as in statsmodels, it is a value between 0 and 1, and larger values give a smoother curve:

from lowess_grouped.lowess_grouped import lowess_grouped

df_smoothed = lowess_grouped(df,
                             x_name="year",
                             y_name="temperature_anomaly",
                             group_name="region_name",
                             frac=0.05)

For a detailed example, refer to the notebook temperature-example.ipynb.
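The call above passes column names, which suggests a long-format dataframe with one row per observation and one column each for x, y, and the group label. The following sketch shows such an input; the values are made up for illustration and are not real measurements, and the exact input requirements of lowess_grouped are an assumption here.

```python
import pandas as pd

# Made-up illustrative values, not real measurements.
df = pd.DataFrame({
    "year": [2000, 2001, 2002, 2000, 2001, 2002],
    "temperature_anomaly": [0.41, 0.55, 0.63, 0.12, 0.20, 0.31],
    "region_name": ["Europe", "Europe", "Europe", "Africa", "Africa", "Africa"],
})

# Sanity check: every group should have its own run of x-values.
rows_per_group = df.groupby("region_name").size()
print(rows_per_group)
```

A dataframe shaped like this can then be passed as df in the call shown above.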

Test cases

Tests are defined in the folder tests. To run them manually, follow these steps:

  1. Download the source code from GitHub.

  2. Install the package locally by executing the following command in the project folder:

    pip install -e .

    You might need to upgrade pip for this to work:

    pip install --upgrade pip

  3. Run the tests:

    python ./tests/test_lowess_grouped.py -v

Motivation

Smoothing data can make plots more readable. One commonly used method is lowess/loess, sometimes also referred to as a Savitzky–Golay filter.

The lowess function in statsmodels treats all the data it receives as a single series. Applied to an entire dataframe, it therefore gives undesirable results when you need independent smoothing for multiple groups (e.g., temperature data by region).

This package was developed to address this limitation. It uses statsmodels internally, which is why some parameters have the same names. Feel free to use the code as inspiration.
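The difference between pooled and groupwise smoothing can be reproduced with statsmodels directly. The following is a minimal sketch under stated assumptions, not the package's actual implementation: the helper name smooth_by_group, the "_smooth" column suffix, and the synthetic two-region data are all hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd
from statsmodels.nonparametric.smoothers_lowess import lowess


def smooth_by_group(df, x_name, y_name, group_name, frac=0.5):
    """Hypothetical helper: run statsmodels lowess separately for each group."""
    pieces = []
    for _, group in df.groupby(group_name, sort=False):
        group = group.sort_values(x_name)
        # lowess takes y (endog) first, then x (exog). With
        # return_sorted=False it returns only the fitted y-values,
        # in the same order as the input rows.
        group[y_name + "_smooth"] = lowess(
            group[y_name].to_numpy(),
            group[x_name].to_numpy(),
            frac=frac,
            return_sorted=False,
        )
        pieces.append(group)
    return pd.concat(pieces, ignore_index=True)


# Synthetic data: two regions at very different levels (around 0 and around 5).
years = np.arange(2000, 2020)
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "year": np.tile(years, 2),
    "region_name": np.repeat(["north", "south"], len(years)),
    "temperature_anomaly": np.concatenate([
        0.0 + rng.normal(0, 0.1, len(years)),
        5.0 + rng.normal(0, 0.1, len(years)),
    ]),
})

# Groupwise smoothing keeps each region at its own level.
per_group = smooth_by_group(df, "year", "temperature_anomaly", "region_name")

# Pooled smoothing treats both regions as one series and mixes their levels.
pooled = lowess(
    df["temperature_anomaly"].to_numpy(),
    df["year"].to_numpy(),
    frac=0.5,
    return_sorted=False,
)

print(per_group.groupby("region_name")["temperature_anomaly_smooth"].mean())
print(pooled.mean())
```

In this sketch the per-group curves stay close to their own region's level, while the pooled curve ends up between the two regions and describes neither of them.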

Attribution

This project builds upon the lowess function from statsmodels. The temperature data used in the example notebook and test cases is from Berkeley Earth and is licensed under Creative Commons BY-NC 4.0 International.