gpulink
A library and command-line tool for monitoring NVIDIA GPU stats.
gpulink uses pynvml - a Python wrapper for
the NVIDIA Management Library (NVML).
Current status
⚠ gpulink is in a very early state - breaking changes between versions are possible!
Requirements
gpulink requires the NVIDIA Management Library (NVML) to be installed. NVML is shipped together with nvidia-smi.
Installation
Installation using pip
To install gpulink using pip, the package installer for Python, run:
pip install gpulink
Using from source
gpulink can also be used from source. For this, perform the following steps to create a Python environment and to install the requirements:
- Create an environment:
python -m venv env
- Activate the environment (Windows; on Linux or macOS run source env/bin/activate instead):
.\env\Scripts\Activate
- Install requirements:
pip install -r requirements.txt
Command-line usage
gpulink can either be imported as a library or can be used from the command line:
Usage: GPU-Link: Monitor NVIDIA GPUs [OPTIONS] COMMAND [ARGS]...
Options:
  --version  Show the version and exit.
  --help     Show this message and exit.
Commands:
  record   Record GPU properties.
  sensors  Fetch and print the GPU sensor status.
Examples
- View GPU sensor status:
gpulink sensors
╒═══════╤══════════════════╤═════════════════════╤═════════════╤═════════════════╤═══════════════╤═══════════════════╕
│   GPU │ Name             │ Memory [MB]         │   Temp [°C] │   Fan speed [%] │ Clock [MHz]   │   Power Usage [W] │
╞═══════╪══════════════════╪═════════════════════╪═════════════╪═════════════════╪═══════════════╪═══════════════════╡
│     0 │ NVIDIA TITAN RTX │ 1809 / 25769 (7.0%) │          34 │              41 │ Graph.: 173   │            26.583 │
│       │                  │                     │             │                 │ Memory: 403   │                   │
│       │                  │                     │             │                 │ SM: 173       │                   │
│       │                  │                     │             │                 │ Video: 540    │                   │
╘═══════╧══════════════════╧═════════════════════╧═════════════╧═════════════════╧═══════════════╧═══════════════════╛
- Watch GPU sensor status:
gpulink sensors -w
- Record the memory usage over time, generate a plot and save it as a png image:
gpulink record -o memory.png memory
╒═════╤══════════════════╤══════════════════════╕
│ GPU │ Name             │ Memory used [MB]     │
├─────┼──────────────────┼──────────────────────┤
│ 0   │ NVIDIA TITAN RTX │ minimum: 1584.754688 │
│     │                  │ maximum: 2204.585984 │
╘═════╧══════════════════╧══════════════════════╛
Duration: 2.500 [s]
Sampling rate: 300.000 [Hz]
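The summary lines printed by the record command can be reproduced from raw samples. The sketch below is not gpulink code: it assumes, purely for illustration, that a recording boils down to a list of timestamps in seconds and a list of memory readings in MB, and it takes the sampling rate to be the number of sample intervals divided by the duration. The function name summarize and the sample values are hypothetical.

```python
def summarize(timestamps, values):
    """Return min, max, duration [s] and mean sampling rate [Hz] of a recording."""
    if len(timestamps) != len(values) or len(timestamps) < 2:
        raise ValueError("need at least two samples with matching timestamps")
    duration = timestamps[-1] - timestamps[0]
    if duration <= 0:
        raise ValueError("timestamps must be increasing")
    # N samples enclose N - 1 intervals, so divide intervals by duration.
    rate = (len(timestamps) - 1) / duration
    return {
        "minimum": min(values),
        "maximum": max(values),
        "duration": duration,
        "rate": rate,
    }


# Hypothetical data: 751 samples spaced 1/300 s apart cover 2.5 s.
timestamps = [i / 300 for i in range(751)]
values = [1584.75 + (i % 100) for i in range(751)]

summary = summarize(timestamps, values)
print(f"minimum: {summary['minimum']}")
print(f"maximum: {summary['maximum']}")
print(f"Duration: {summary['duration']:.3f} [s]")
print(f"Sampling rate: {summary['rate']:.3f} [Hz]")
```

Whether gpulink itself derives the sampling rate this way is an assumption; the sketch only shows how the printed quantities relate to each other.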
Library usage
gpulink can easily be used within applications. Just import gpulink and create a DeviceCtx. This context manages device access and provides an API for fetching GPU properties (see API example):
import gpulink as gpu
with gpu.DeviceCtx() as ctx:
    print(f"Available GPUs: {ctx.gpus.names}")
    memory_information = ctx.get_memory_info(gpus=ctx.gpus.ids)
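For orientation, the sketch below shows roughly what a context like DeviceCtx has to manage when NVML is accessed through pynvml directly: initialise the library, query each device, and shut the library down again. It is not gpulink's implementation. The helper name list_gpu_memory is hypothetical, and the function returns an empty list when pynvml or the NVML shared library is unavailable, so it also runs on machines without an NVIDIA GPU.

```python
try:
    import pynvml
except ImportError:  # pynvml is optional for this sketch
    pynvml = None


def list_gpu_memory():
    """Return (index, name, used MB, total MB) per GPU, or [] if NVML is unavailable."""
    if pynvml is None:
        return []
    try:
        pynvml.nvmlInit()
    except pynvml.NVMLError:
        # Raised e.g. when the NVML shared library cannot be found.
        return []
    try:
        result = []
        for index in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(index)
            name = pynvml.nvmlDeviceGetName(handle)
            # Depending on the pynvml version, the name is bytes or str.
            if isinstance(name, bytes):
                name = name.decode()
            memory = pynvml.nvmlDeviceGetMemoryInfo(handle)  # values in bytes
            # Assumption: 1 MB = 10**6 bytes, which would turn 24 GiB into 25769 MB.
            result.append((index, name, memory.used / 1e6, memory.total / 1e6))
        return result
    finally:
        # Always release NVML, even if a query fails.
        pynvml.nvmlShutdown()


for index, name, used, total in list_gpu_memory():
    print(f"GPU {index} ({name}): {used:.0f} / {total:.0f} MB")
```

Wrapping the init/shutdown pair in a context manager, as DeviceCtx does, spares callers from writing the try/finally themselves.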
Recording data
gpulink provides a Recorder class for recording GPU properties. For simple instantiation use one of the provided factory methods, e.g.:
recorder = gpu.Recorder.create_memory_recorder(ctx, ctx.gpus.ids)
Afterwards a recording can be performed in one of three ways:
Option 1: Using the start and stop methods (see Basic example):
recorder.start()
... # Do some GPU stuff
recorder.stop(auto_join=True)
Option 2: Using a context manager (see Context-Manager example):
with recorder:
    ... # Do some GPU stuff
Option 3: Using a decorator (see Decorator example):
@record(factory=gpu.Recorder.create_memory_recorder)
def my_gpu_function():
    ... # Do some GPU stuff
my_gpu_function()
Once a recording is finished its data can be accessed:
recording = recorder.get_recording()
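To make the three recording styles concrete without a GPU, here is a minimal stdlib-only sketch of how one recorder object can support all of them. SketchRecorder, its sample_fn parameter and the record_with decorator are hypothetical stand-ins, not gpulink's classes; a background thread polls a callable until a stop is requested.

```python
import functools
import threading
import time


class SketchRecorder:
    """Hypothetical recorder: polls sample_fn on a background thread."""

    def __init__(self, sample_fn, interval=0.001):
        self._sample_fn = sample_fn
        self._interval = interval
        self._stop_event = threading.Event()
        self._thread = None
        self.samples = []

    def _run(self):
        while True:
            self.samples.append((time.monotonic(), self._sample_fn()))
            # wait() returns True as soon as stop() sets the event.
            if self._stop_event.wait(self._interval):
                break

    def start(self):
        self.samples = []
        self._stop_event.clear()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def stop(self, auto_join=True):
        self._stop_event.set()
        if auto_join and self._thread is not None:
            self._thread.join()

    # Context-manager support simply delegates to start/stop.
    def __enter__(self):
        self.start()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.stop()
        return False  # do not swallow exceptions


def record_with(recorder):
    """Hypothetical decorator: record while the wrapped function runs."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with recorder:
                return func(*args, **kwargs)
        return wrapper
    return decorator


recorder = SketchRecorder(sample_fn=lambda: 42)

recorder.start()  # Option 1: explicit start/stop
time.sleep(0.02)
recorder.stop()
explicit_count = len(recorder.samples)

with recorder:  # Option 2: context manager
    time.sleep(0.02)
context_count = len(recorder.samples)


@record_with(recorder)  # Option 3: decorator
def work():
    time.sleep(0.02)
    return "done"


result = work()
decorator_count = len(recorder.samples)
```

The context manager and the decorator are thin layers over start and stop, which is why all three styles yield the same kind of recording.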
Plotting data
gpulink provides a Plot class for visualizing recordings using matplotlib:
from pathlib import Path
# Generate the plot
plot = gpu.Plot(recording)
# Display the plot
plot.plot()
# Save the plot as an image
plot.save(Path("memory.png"))
# The generated Figure and Axis can also be accessed directly
figure, axis = plot.generate_graph()
Unit testing
When using gpulink inside unit tests, create a device mock or use an existing one, e.g. DeviceMock. To create a custom mock class, derive it from BaseDevice. Then, when creating a DeviceCtx, provide the mock as follows:
import gpulink as gpu
with gpu.DeviceCtx(device=DeviceMock) as ctx:
    ...
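The underlying idea, injecting a fake device so that tests never touch real hardware, can be sketched without gpulink installed. All names below (FakeDevice, SketchCtx, memory_used_mb) are hypothetical and do not reflect the actual BaseDevice interface; the point is only that the context receives the device class as a parameter.

```python
import unittest


class FakeDevice:
    """Hypothetical mock: returns fixed readings instead of querying NVML."""

    def setup(self):
        self.ready = True

    def shutdown(self):
        self.ready = False

    def memory_used_mb(self, gpu_id):
        return 1809.0


class SketchCtx:
    """Hypothetical context that instantiates whatever device class it is given."""

    def __init__(self, device):
        self._device_cls = device
        self.device = None

    def __enter__(self):
        self.device = self._device_cls()
        self.device.setup()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.device.shutdown()
        return False  # do not swallow exceptions

    def get_memory_used(self, gpu_id):
        return self.device.memory_used_mb(gpu_id)


class MemoryTest(unittest.TestCase):
    def test_reads_from_mock(self):
        with SketchCtx(device=FakeDevice) as ctx:
            self.assertEqual(ctx.get_memory_used(0), 1809.0)
        # The context must release the device on exit.
        self.assertFalse(ctx.device.ready)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(MemoryTest)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the mock returns fixed values, assertions in such tests stay deterministic regardless of the machine they run on.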
Troubleshooting
- If you get the error message below, please ensure that the NVIDIA Management Library is installed on your system by typing nvidia-smi --version into a terminal:
pynvml.nvml.NVMLError_LibraryNotFound: NVML Shared Library Not Found
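The nvidia-smi check can also be run from Python before importing gpulink. This is a small sketch using only the standard library; the helper name nvidia_smi_available is hypothetical, and a working nvidia-smi is only a hint that NVML is installed, not a guarantee.

```python
import shutil
import subprocess


def nvidia_smi_available(timeout=10):
    """Return True if nvidia-smi is on the PATH and exits successfully."""
    executable = shutil.which("nvidia-smi")
    if executable is None:
        return False
    try:
        completed = subprocess.run(
            [executable, "--version"],
            capture_output=True,
            timeout=timeout,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return completed.returncode == 0


if not nvidia_smi_available():
    print("nvidia-smi not found or not working; NVML is probably missing.")
```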
Planned features
- Live-plotting of GPU stats
Changelog
- 0.4.0
  - Recording arbitrary GPU stats (clock, fan-speed, memory, power-usage, temp)
  - Display GPU name and power usage within sensors command
  - Replaced argparse library by click
  - Aborting a watch or recording command can be done by pressing any key instead of ctrl+c
- 0.4.1
  - Fix error when calling nvmlDeviceGetName in pynvml version 11.5.0
- 0.5.0
  - Add context-manager-based recording
  - Add decorator-based recording
- 0.6.0
  - Remove PlotOptions class
  - Fix imports and update unit tests