Python toolbox to create adversarial examples that fool neural networks

pip install foolbox==1.8.0



Foolbox is a Python toolbox to create adversarial examples that fool neural networks. It requires Python, NumPy and SciPy.


pip install foolbox

We test using Python 2.7, 3.5 and 3.6. Other Python versions might work as well. We recommend using Python 3!


Documentation is available on Read the Docs.

Our paper describing Foolbox is on arXiv (arXiv:1707.04131).


import foolbox
import keras
import numpy as np
from keras.applications.resnet50 import ResNet50

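# instantiate model
keras.backend.set_learning_phase(0)  # inference mode, so layers such as batch normalization behave deterministically
kmodel = ResNet50(weights='imagenet')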
preprocessing = (np.array([104, 116, 123]), 1)
fmodel = foolbox.models.KerasModel(kmodel, bounds=(0, 255), preprocessing=preprocessing)

# get source image and label
image, label = foolbox.utils.imagenet_example()

# apply attack on source image
# ::-1 reverses the color channels, because Keras ResNet50 expects BGR instead of RGB
attack = foolbox.attacks.FGSM(fmodel)
adversarial = attack(image[:, :, ::-1], label)
# if the attack fails, adversarial will be None and a warning will be printed
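
As a rough illustration of the `preprocessing` tuple and the channel reversal used above, here is a NumPy-only sketch. It assumes the `(mean, std)` pair is applied as `(x - mean) / std` before the image reaches the wrapped model. The helper name `apply_preprocessing` is hypothetical and not part of Foolbox.

```python
import numpy as np


def apply_preprocessing(image, preprocessing):
    """Hypothetical helper: subtract the mean, then divide by the std."""
    mean, std = preprocessing
    return (image - mean) / std


# A tiny 2x2 "image" in RGB order with values in [0, 255] (made-up data).
rgb = np.array(
    [[[255, 0, 0], [0, 255, 0]],
     [[0, 0, 255], [10, 20, 30]]],
    dtype=np.float32,
)

# ::-1 on the last axis reverses the channels: RGB -> BGR.
bgr = rgb[:, :, ::-1]

# Per-channel means in BGR order, as in the example above.
# A std of 1 leaves the scale unchanged.
preprocessing = (np.array([104, 116, 123]), 1)
model_input = apply_preprocessing(bgr, preprocessing)

print(bgr[1, 1])          # channel order is now B, G, R
print(model_input[1, 1])  # each channel shifted by its mean
```

Because the preprocessing lives in the model wrapper, the attack itself can keep working on images in the original `bounds=(0, 255)` range.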

For more examples, have a look at the documentation.
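
To give an intuition for what a gradient-sign attack such as FGSM does, here is a minimal NumPy sketch on a toy linear softmax classifier. It is not Foolbox's implementation: the model, the data and the function names (`loss_gradient`, `fgsm_sketch`) are made up for illustration, and the epsilon search is only loosely modeled on the idea of trying increasing step sizes until the prediction changes.

```python
import numpy as np


def softmax(logits):
    shifted = logits - logits.max()  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()


def loss_gradient(weights, bias, x, label):
    """Gradient of the cross-entropy loss w.r.t. the input of a linear softmax model."""
    probs = softmax(weights @ x + bias)
    probs[label] -= 1.0  # d(loss)/d(logits) = softmax - one_hot(label)
    return weights.T @ probs


def fgsm_sketch(weights, bias, x, label, bounds=(0.0, 255.0), epsilons=100, max_epsilon=1.0):
    """Return the first misclassified perturbed input, or None if none is found."""
    low, high = bounds
    direction = np.sign(loss_gradient(weights, bias, x, label))
    # Try increasingly large steps, expressed as a fraction of the value range.
    for epsilon in np.linspace(0.0, max_epsilon, epsilons + 1)[1:]:
        candidate = np.clip(x + epsilon * (high - low) * direction, low, high)
        if np.argmax(weights @ candidate + bias) != label:
            return candidate
    return None


# Toy 2-class model on a 4-"pixel" input (all values are made up).
weights = np.array([[0.02, -0.01, 0.03, 0.00],
                    [-0.01, 0.02, -0.02, 0.01]])
bias = np.zeros(2)
x = np.array([200.0, 50.0, 180.0, 30.0])
label = int(np.argmax(weights @ x + bias))

adversarial = fgsm_sketch(weights, bias, x, label)
if adversarial is None:
    print("no adversarial found")
else:
    print("largest per-pixel change:", np.abs(adversarial - x).max())
```

Returning `None` on failure mirrors the behavior described in the example above, so callers should check the result before using it.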

Finally, the result can be plotted like this:

# if you use Jupyter notebooks
%matplotlib inline

import matplotlib.pyplot as plt


plt.subplot(1, 3, 1)
plt.imshow(image / 255)  # division by 255 to convert [0, 255] to [0, 1]

plt.subplot(1, 3, 2)
plt.imshow(adversarial[:, :, ::-1] / 255)  # ::-1 to convert BGR to RGB

plt.subplot(1, 3, 3)
difference = adversarial[:, :, ::-1] - image
plt.imshow(difference / abs(difference).max() * 0.2 + 0.5)
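
The last line rescales the signed difference so that it can be shown as an image: dividing by the largest absolute value maps it to [-1, 1], and `* 0.2 + 0.5` then maps it to [0.3, 0.7] around mid-gray. The sketch below wraps this in a hypothetical helper, `rescale_difference`, and adds a guard for the case where both images are identical. Neither the helper nor the guard is part of Foolbox.

```python
import numpy as np


def rescale_difference(difference, contrast=0.2):
    """Map a signed difference image into [0.5 - contrast, 0.5 + contrast]."""
    peak = np.abs(difference).max()
    if peak == 0:
        # Identical images: return uniform mid-gray instead of dividing by zero.
        return np.full_like(difference, 0.5, dtype=np.float64)
    return difference / peak * contrast + 0.5


# Made-up difference values for illustration.
difference = np.array([[-4.0, 0.0], [2.0, 4.0]])
scaled = rescale_difference(difference)
print(scaled.min(), scaled.max())
```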

Interfaces for a range of other deep learning packages such as TensorFlow, PyTorch, Theano, Lasagne and MXNet are available, for example:

model = foolbox.models.TensorFlowModel(images, logits, bounds=(0, 255))
model = foolbox.models.PyTorchModel(torchmodel, bounds=(0, 255), num_classes=1000)
# etc.

Different adversarial criteria, such as Top-k misclassification, a specific target class, or target probability values for the original class or the target class, can be passed to the attack, for example:

criterion = foolbox.criteria.TargetClass(22)
attack = foolbox.attacks.LBFGSAttack(fmodel, criterion)
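
Conceptually, a criterion is a rule that decides from the model's predictions whether an input counts as adversarial. The following NumPy sketch illustrates three such rules as plain functions. The function names are hypothetical and the code is not Foolbox's implementation; in Foolbox the corresponding rules are provided as classes in `foolbox.criteria`.

```python
import numpy as np


def is_misclassified(predictions, label):
    """Adversarial if the top-1 class differs from the original label."""
    return int(np.argmax(predictions)) != label


def is_target_class(predictions, target):
    """Adversarial only if the top-1 class equals the chosen target class."""
    return int(np.argmax(predictions)) == target


def is_topk_misclassified(predictions, label, k):
    """Adversarial if the original label is not among the k highest-scoring classes."""
    top_k = np.argsort(predictions)[-k:]
    return label not in top_k


predictions = np.array([0.1, 2.5, 0.3, 1.9])  # made-up logits for 4 classes
label = 3                                      # original class of the input

print(is_misclassified(predictions, label))
print(is_target_class(predictions, target=1))
print(is_topk_misclassified(predictions, label, k=2))
```

A stricter criterion, such as a specific target class, generally makes the attack's job harder than plain misclassification, since fewer perturbed inputs satisfy it.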

Feature requests and bug reports

We welcome feature requests and bug reports. Just create a new issue on GitHub.

Questions & FAQ

Depending on the nature of your question, feel free to post it as an issue on GitHub or as a question on Stack Overflow using the foolbox tag. We will try to monitor that tag, but if you don't get an answer, don't hesitate to contact us.

Before you post a question, please check our FAQ and our documentation on Read the Docs.

Contributions welcome

Foolbox is a work in progress and any input is welcome.

In particular, we encourage users of deep learning frameworks for which we do not yet have built-in support, e.g. Caffe, Caffe2 or CNTK, to contribute the necessary wrappers. Don't hesitate to contact us if we can be of any help.

Moreover, attack developers are encouraged to share their reference implementation using Foolbox so that it will be available to everyone.


If you find Foolbox useful for your scientific work, please consider citing it in resulting publications:

@article{rauber2017foolbox,
  title={Foolbox: A Python toolbox to benchmark the robustness of machine learning models},
  author={Rauber, Jonas and Brendel, Wieland and Bethge, Matthias},
  journal={arXiv preprint arXiv:1707.04131},
  year={2017}
}

You can find the paper on arXiv (arXiv:1707.04131).


You might want to have a look at our recently announced Robust Vision Benchmark.