
An OpenCL implementation of Kernel Density Estimation

pip install kde-ocl==0.1.0


This repository implements Gaussian Kernel Density Estimation using OpenCL to achieve significant performance gains.

The Python interface is based on Scipy's gaussian_kde class, so it should be easy to replace the CPU implementation, gaussian_kde, with the OpenCL implementation in this repository, gaussian_kde_ocl.
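For reference, the quantity being estimated can be written in a few lines of NumPy. This is only a minimal CPU sketch, not the OpenCL code in this repository: the function name gaussian_kde_pdf is hypothetical, and it assumes Scott's rule for the bandwidth (the default in Scipy's gaussian_kde) and data laid out as one instance per row.

```python
import numpy as np

def gaussian_kde_pdf(train, test):
    """Reference Gaussian KDE (hypothetical helper, not part of kde_ocl).

    train: array of shape (n, d), one instance per row.
    test:  array of shape (m, d) or a single point of shape (d,).
    """
    train = np.asarray(train, dtype=float)
    test = np.atleast_2d(np.asarray(test, dtype=float))
    n, d = train.shape
    # Assumption: Scott's rule bandwidth factor, as in Scipy's default.
    factor = n ** (-1.0 / (d + 4))
    # Kernel covariance: sample covariance scaled by the squared factor.
    cov = np.atleast_2d(np.cov(train, rowvar=False)) * factor ** 2
    inv_cov = np.linalg.inv(cov)
    # Normalization: n * sqrt(det(2 * pi * cov)).
    norm = n * np.sqrt(np.linalg.det(2 * np.pi * cov))
    # Pairwise differences between test and training points, shape (m, n, d).
    diff = test[:, None, :] - train[None, :, :]
    # Squared Mahalanobis distance for every (test, train) pair.
    maha = np.einsum("mnd,de,mne->mn", diff, inv_cov, diff)
    return np.exp(-0.5 * maha).sum(axis=1) / norm

rng = np.random.default_rng(0)
train = rng.multivariate_normal([0, 0], [[1, 0], [0, 1]], 1000)
test = np.array([[0.0, 0.0], [3.0, 3.0]])
pdf = gaussian_kde_pdf(train, test)
print(pdf)  # the density near the origin is higher than at (3, 3)
```

The sketch materializes all pairwise differences at once, so its memory use grows with the product of training and test instances; it is meant to illustrate the computation, not to be fast.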

Example Code

import numpy as np
from kde_ocl import gaussian_kde_ocl

# Generate dummy training data (10000 instances of 2D data)
train = np.random.multivariate_normal([0,0], [[1,0],[0,1]], 10000)
# Generate dummy test data (100 instances of 2D data)
test = np.random.multivariate_normal([0,0], [[1,0],[0,1]], 100)

# Train the KDE model
kde = gaussian_kde_ocl(train)

# Get the pdf of each test point. This is equivalent to kde.pdf(test)
pdf = kde(test)

# Get the logpdf of each test point
logpdf = kde.logpdf(test)

The interface is mostly the same as Scipy's gaussian_kde, but the axis order is changed. For example, Scipy's gaussian_kde interprets a numpy array of shape (10000, 2) as two instances of 10000 dimensions. In gaussian_kde_ocl, the same array is interpreted as 10000 instances of 2 dimensions. This change makes it easier to work with pandas dataframes:

import pandas as pd
import numpy as np
from kde_ocl import gaussian_kde_ocl

# Create pandas dataframe 
a = np.random.normal(0, 1, 5000)
b = np.random.normal(3.2, np.sqrt(1.8), 5000)
data = pd.DataFrame({'a': a, 'b': b})

# Train KDE model
kde = gaussian_kde_ocl(data.values)

# Evaluate one point
logpdf = kde.logpdf([1.1, 2.3])
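Since the two interfaces differ only in axis order, moving data from one to the other amounts to a transpose. The sketch below uses only NumPy to show both layouts; the variable names are illustrative, and the constructor calls are left as comments because they assume both libraries are installed.

```python
import numpy as np

rng = np.random.default_rng(0)

# 10000 instances of 2D data, one instance per row (gaussian_kde_ocl layout).
rows_as_instances = rng.normal(size=(10000, 2))

# Scipy's gaussian_kde expects one instance per column, so transpose.
columns_as_instances = rows_as_instances.T

print(rows_as_instances.shape)     # (10000, 2)
print(columns_as_instances.shape)  # (2, 10000)

# kde_ocl = gaussian_kde_ocl(rows_as_instances)
# kde_scipy = scipy.stats.gaussian_kde(columns_as_instances)
```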


This is a comparison of gaussian_kde_ocl and Scipy's gaussian_kde with 2D data and the following configuration:

  • CPU: Intel i7-6700K.
  • GPU: AMD RX 460.
  • Python 3.7.3
  • Ubuntu 16.04

pdf() method

Training instances / Test instances    gaussian_kde_ocl.pdf()     gaussian_kde.pdf()          Speedup
100,000 / 1,000                        218.6474 ± 1.5901 ms       1,911.0764 ± 50.8762 ms     8.74x
1,000 / 10,000,000                     18.8643 ± 0.07322 s        237.3429 ± 1.1765 s         12.58x
100 / 10,000                           4.4533 ± 0.7297 ms         18.0684 ± 0.3302 ms         4.46x

logpdf() method

Training instances / Test instances    gaussian_kde_ocl.logpdf()  gaussian_kde.logpdf()       Speedup
100,000 / 1,000                        261.1466 ± 6.3932 ms       6,798.4730 ± 420.2878 ms    26.03x
1,000 / 10,000,000                     36.3143 ± 0.02916 s        MemoryError                 NA
100 / 10,000                           8.827 ± 0.7442 ms          34.1114 ± 1.3060 ms         3.86x

Current Limitations

  • Only C order (the default) numpy arrays can be used as training/test datasets.
  • Only Gaussian kernels are implemented.
  • OpenCL device is selected automatically.
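Arrays that are not in C order (for example, Fortran-ordered arrays or transposed views) can be converted before being passed to the library. The following is a minimal sketch using NumPy's ascontiguousarray; the helper name as_c_order is hypothetical and is not part of kde_ocl.

```python
import numpy as np

def as_c_order(data):
    """Return `data` as a C-contiguous array, copying only if needed."""
    return np.ascontiguousarray(data)

# A Fortran-ordered array, which would not satisfy the C order requirement.
fortran = np.asfortranarray(np.arange(6, dtype=np.float64).reshape(3, 2))
converted = as_c_order(fortran)

print(fortran.flags["C_CONTIGUOUS"])    # False
print(converted.flags["C_CONTIGUOUS"])  # True
```

The values are unchanged by the conversion; only the memory layout differs.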


The library is compatible with Python 2 and 3. It is currently tested on Ubuntu 16.04, but should work on other operating systems with OpenCL GPU support.

Python Dependencies

The project has the following Python dependencies:

cffi numpy six

You can install them with:

pip install cffi numpy six


The Rust compiler must be installed on the system. Check out https://www.rust-lang.org/tools/install for more information.

The default Rust toolchain is used to compile the library, so make sure the installed Rust toolchain (32 or 64 bits) matches the Python interpreter (32 or 64 bits).
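If it is unclear whether the Python interpreter is a 32-bit or 64-bit build, it can be checked from Python itself. This is a small sketch using only the standard library:

```python
import platform
import struct

# Size of a C pointer in bits: 64 on a 64-bit interpreter, 32 on a 32-bit one.
bits = struct.calcsize("P") * 8
print("Python interpreter: %d-bit (%s)" % (bits, platform.machine()))
```

The result can then be compared with the default host reported by rustup show (for example, a host triple starting with x86_64 indicates a 64-bit toolchain).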


The GPU drivers that enable OpenCL should be installed.


Use pip:

pip install kde_ocl

Alternatively, clone the repository and use the setup script:

python setup.py install


Tests are run using pytest and require scipy to compare gaussian_kde_ocl with Scipy's gaussian_kde. Install both with:

pip install pytest scipy

Run the tests with:

pytest

To run the benchmarks, pytest-benchmark is needed:

pip install pytest-benchmark

Then, execute the tests with benchmarks enabled:

pytest --times

To run only the OpenCL benchmarks:

pytest --times-ocl

To run only Scipy's gaussian_kde benchmarks:

pytest --times-scipy