gpumonitor

GPU Monitoring Callbacks for TensorFlow and PyTorch Lightning


Keywords
gpu-monitoring, pytorch-lightning, tensorflow
License
MIT
Install
pip install gpumonitor==0.1.2

Documentation

gpumonitor reports GPU usage statistics while your scripts and training runs execute, either from a standalone monitor or through TensorFlow and PyTorch Lightning callbacks.

Installation

Install the package with pip:

pip install gpumonitor

Getting started

Option 1: In your scripts

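import gpumonitor

# Start monitoring; delay is the interval between two polls
monitor = gpumonitor.GPUStatMonitor(delay=1)

# Your instructions here
# [...]

# Stop monitoring and print the averaged statistics for each GPU
monitor.stop()
monitor.display_average_stats_per_gpu()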

The monitor keeps a running average of the GPU statistics. To discard the accumulated average and start fresh, reset the monitor:

import gpumonitor

monitor = gpumonitor.GPUStatMonitor(delay=1)

# Your instructions here
# [...]

monitor.display_average_stats_per_gpu()
monitor.reset()

# Some other instructions
# [...]

monitor.display_average_stats_per_gpu()
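
To make the averaging and reset behaviour concrete, here is a minimal sketch of a polling monitor that runs in a background thread. It is not gpumonitor's implementation: the class name `AveragingMonitor`, the `sample_fn` parameter, and the `averages()` method are hypothetical, and the sampled values are stand-ins for real GPU readings.

```python
import threading
import time


class AveragingMonitor:
    """Hypothetical polling monitor; illustrative only, not gpumonitor's code."""

    def __init__(self, sample_fn, delay=1.0):
        self.sample_fn = sample_fn  # callable returning {stat_name: number}
        self.delay = delay          # seconds to wait between two samples
        self._lock = threading.Lock()
        self._stop_event = threading.Event()
        self._sums = {}
        self._count = 0
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # Take a sample, then wait `delay` seconds, until stop() is called.
        while not self._stop_event.is_set():
            sample = self.sample_fn()
            with self._lock:
                for key, value in sample.items():
                    self._sums[key] = self._sums.get(key, 0.0) + value
                self._count += 1
            self._stop_event.wait(self.delay)

    def averages(self):
        # Mean of every stat since construction or the last reset().
        with self._lock:
            if self._count == 0:
                return {}
            return {k: total / self._count for k, total in self._sums.items()}

    def reset(self):
        # Drop the accumulated sums so the average starts fresh.
        with self._lock:
            self._sums = {}
            self._count = 0

    def stop(self):
        self._stop_event.set()
        self._thread.join()


# Usage with a fake sampler standing in for a GPU query
monitor = AveragingMonitor(lambda: {"utilization": 50.0}, delay=0.01)
time.sleep(0.05)
monitor.stop()
print(monitor.averages())
```

Using an `Event` for both the stop flag and the wait lets `stop()` return promptly instead of waiting out a full `delay`.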

Option 2: Callbacks

Add the callback for your framework to the training loop.

For TensorFlow:

from gpumonitor.callbacks.tf import TFGpuMonitorCallback

model.fit(x, y, callbacks=[TFGpuMonitorCallback(delay=0.5)])

For PyTorch Lightning:

import pytorch_lightning as pl
from gpumonitor.callbacks.lightning import PyTorchGpuMonitorCallback

trainer = pl.Trainer(callbacks=[PyTorchGpuMonitorCallback(delay=0.5)])
trainer.fit(model)
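
The general pattern behind such callbacks can be sketched without either framework. The example below assumes a monitor is started when an epoch begins and stopped and displayed when it ends; the hooks gpumonitor actually uses may differ. `FakeMonitor`, `EpochMonitorCallback`, and `fit` are hypothetical names introduced only for illustration.

```python
class FakeMonitor:
    """Hypothetical stub so the sketch runs without a GPU."""

    def __init__(self, delay):
        self.delay = delay
        self.stopped = False

    def stop(self):
        self.stopped = True

    def display_average_stats_per_gpu(self):
        print(f"(average stats, polled every {self.delay}s)")


class EpochMonitorCallback:
    """Hypothetical callback that creates one monitor per epoch."""

    def __init__(self, delay=0.5, monitor_factory=FakeMonitor):
        self.delay = delay
        self.monitor_factory = monitor_factory
        self.monitor = None

    def on_epoch_begin(self):
        # A new monitor per epoch means each report covers one epoch only.
        self.monitor = self.monitor_factory(self.delay)

    def on_epoch_end(self):
        self.monitor.stop()
        self.monitor.display_average_stats_per_gpu()


def fit(epochs, callbacks):
    """Toy training loop that only fires the callback hooks."""
    for _ in range(epochs):
        for callback in callbacks:
            callback.on_epoch_begin()
        # ... training steps would run here ...
        for callback in callbacks:
            callback.on_epoch_end()


fit(epochs=2, callbacks=[EpochMonitorCallback(delay=0.5)])
```

Passing the monitor class in as `monitor_factory` keeps the callback testable on machines without a GPU.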

Display Format

You can customize the display format with the gpustat options. For example, power consumption in watts and fan speed can be shown. To see which options you can change, refer to the gpustat documentation.

Sources

  • Built on top of gpustat
  • The separate monitoring thread loop is adapted from GPUtil