jupyter-probe

A package to monitor, manage, declare and analyse notebook resource usage in Jupyter environments


License
MIT
Install
pip install jupyter-probe==0.1.4

Documentation

Jupyter Probe

Jupyterprobe is a Python package to monitor, manage, declare and analyse notebook resource usage in Jupyter environments.

Compatibility

Jupyterprobe works on Linux and OSX. A wide variety of Jupyter environment flavours and configurations are supported:

  • Jupyter Notebook, Jupyter Lab
  • Authentication: None, Token, Password
  • Hosted at: localhost, remote
  • GPU: also supported (requires the NVIDIA Management Library (NVML) / CUDA toolkit).

Jupyterprobe currently works with Python 3.6.1 or later.

Installing

Install with pip or your favorite PyPI package manager.

pip install jupyter-probe

(For common installation problems, see the Troubleshooting section.)

Usage

Note: All of these commands are meant to be run from within your Jupyter notebooks.

Define Probe

First, define a Probe object using the host and port of your Jupyter server.

from jupyterprobe import Probe
host = 'localhost'
port = 8888
pb = Probe(host, port)

If your Jupyter environment is password-authenticated, you can additionally pass the password argument:

pb = Probe(host, port, password='hobbit')

Monitor

To monitor resource usage of all notebooks in your session, call Monitor.

pb.Monitor()

The top 5 results are shown, sorted by memory usage. To see more, pass top_n as an argument:

pb.Monitor(10)

Declare Experiment

To declare ownership of the current notebook, call Declare:

pb.Declare(owner='Gandalf', priority='10', project='Ring')

This will save the declaration for your current notebook in ~/.jupyterprobe/experiment_register.json.
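
Because the register is a plain JSON file, it can be inspected with the standard library. The following is a minimal sketch: load_register is a hypothetical helper, not part of Jupyterprobe, and since the internal layout of the file is not described here, the code only loads it and makes no assumption about its keys.

```python
import json
from pathlib import Path

# Path as stated above; the helper below is hypothetical, not a Jupyterprobe API.
REGISTER_PATH = Path.home() / ".jupyterprobe" / "experiment_register.json"


def load_register(path=REGISTER_PATH):
    """Return the parsed register, or an empty dict if the file is missing."""
    path = Path(path)
    if not path.exists():
        return {}
    with path.open() as fh:
        return json.load(fh)


if __name__ == "__main__":
    # Pretty-print whatever declarations have been saved so far.
    print(json.dumps(load_register(), indent=2))
```

Returning an empty dict when the file is missing keeps the helper safe to call before any declaration has been made.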

Monitor Team

By default, Monitor only shows the resource usage and PID of your notebooks. To see ownership and project related data, use MonitorTeam or Monitor(team=True).

pb.MonitorTeam()

This will show details based on your declarations as well as declarations from other teammates using the same Jupyter session.

Custom Usage Analytics

If you want to do your own analytics on notebook usage, you can get a pandas DataFrame of all the results through pb.results.

Some more useful methods that can be called on a Probe object:

get_results_by_PID(PID) : get all results for the notebook matching the given PID

get_results_by_name(name) : get all results for the notebook matching the given name. Returns multiple notebooks if the name isn't unique

get_path_by_PID(PID) : get the absolute path of the notebook matching the given PID

get_path_by_name(name) : get the absolute path of the notebook matching the given name. Returns multiple paths if the name isn't unique

Troubleshooting

Can't install psutil, Python.h not found: You need to install the Python development headers. On Debian-based systems such as Ubuntu, run sudo apt-get install python3.x-dev, where python3.x refers to your Python version.

INFO: GPU not found on your system, but you actually have a GPU: Install py3nvml and run

import py3nvml
py3nvml.py3nvml.nvmlInit()

If this call throws an error, your NVIDIA libraries are most probably missing.

Could not detect any jupyter servers from your kernel: This can happen if you have multiple IPython kernels and have installed jupyterprobe in only one of them. The most reliable fix is to uninstall and reinstall the jupyter and jupyterlab packages.

python3.x -m pip uninstall jupyter jupyterlab
python3.x -m pip install jupyter jupyterlab

If this doesn't work either, try removing the IPython kernel, installing it again, and then repeating the steps above.

jupyter kernelspec list
jupyter kernelspec remove <kernel-name>
python3.x -m pip install ipykernel
python3.x -m ipykernel install --user
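
To confirm which interpreter the current kernel is running and whether jupyterprobe is importable from it, a small check along these lines can help. It is a sketch using only the standard library; is_importable is a hypothetical helper name, not part of Jupyterprobe.

```python
import importlib.util
import sys


def is_importable(module_name):
    """Return True if this interpreter can locate module_name."""
    return importlib.util.find_spec(module_name) is not None


# sys.executable is the Python interpreter backing the current kernel.
print("Kernel interpreter:", sys.executable)
for name in ("jupyterprobe", "notebook", "jupyterlab"):
    print(name, "importable:", is_importable(name))
```

If jupyterprobe is reported as not importable, it was most likely installed for a different interpreter than the one printed on the first line.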

Additionally, to check if your kernel can find notebook servers, run

from notebook import notebookapp
servers = list(notebookapp.list_running_servers())
print(servers)

If your kernel is correctly set up, you should see Jupyter server information. If you get an error or an empty list, try the steps above.
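
On installations built on Jupyter Server, the notebook.notebookapp import may itself fail even though a server is running. The sketch below tries both locations; it assumes that jupyter_server.serverapp exposes a compatible list_running_servers function, and find_running_servers is a hypothetical helper name, not part of Jupyterprobe.

```python
def find_running_servers():
    """Return a list of running Jupyter server info dicts (possibly empty)."""
    try:
        # Classic Notebook location, as used in the check above.
        from notebook.notebookapp import list_running_servers
    except ImportError:
        try:
            # Assumption: Jupyter Server based installs provide the same helper here.
            from jupyter_server.serverapp import list_running_servers
        except ImportError:
            # Neither package is importable from this kernel.
            return []
    return list(list_running_servers())


for server in find_running_servers():
    # Read "url" defensively; the exact keys can vary between server versions.
    print(server.get("url"))
```

An empty result here means either that no server was found or that neither package is importable from this kernel, so it is worth checking the imports separately if nothing is printed.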

Issues and Contributing

The project is still in active development. If you run into an error or want to request a feature, feel free to open an issue. If you want to contribute, a PR is always welcome.

The nvidia-smi conundrum

nvidia-smi is a great command to check GPU usage and state for your system. However, if you run it inside a container, it reports GPU memory usage correctly but process IDs and names might not come up. This happens because nvidia-smi relies on the hardware-level NVML library, which exposes PIDs as they are defined on the host system. Inside the container the same processes have different PIDs, so the process names can't be resolved. As a result, nvidia-smi doesn't yield process-level information.

Jupyterprobe solves this issue by mapping host PIDs back to container PIDs. Currently, this only works on Unix-based systems that expose process information in /proc.
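
How Jupyterprobe performs this mapping is not described here. As an illustration of the kind of information /proc can hold, Linux 4.1 and later list a process's PID in each nested PID namespace on the NSpid line of /proc/<pid>/status, outermost first, when /proc is read from the outer (host) namespace. The sketch below parses that line; parse_nspid and outer_to_inner are hypothetical helpers, the sample text is made up, and this is not necessarily the library's approach.

```python
def parse_nspid(status_text):
    """Return the PIDs on the NSpid line of a /proc/<pid>/status dump.

    The list runs from the outermost visible PID namespace to the innermost.
    It is empty if the line is absent (for example on kernels older than 4.1).
    """
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(field) for field in line.split()[1:]]
    return []


def outer_to_inner(status_texts):
    """Map outermost PID -> innermost PID for processes in a nested namespace."""
    mapping = {}
    for text in status_texts:
        pids = parse_nspid(text)
        if len(pids) > 1:  # more than one entry means a nested PID namespace
            mapping[pids[0]] = pids[-1]
    return mapping


# Illustrative sample, not real output: PID 40123 outside, PID 17 inside.
sample = "Name:\tpython\nPid:\t40123\nNSpid:\t40123\t17\n"
print(outer_to_inner([sample]))  # {40123: 17}
```

From inside a container the outer PIDs are generally not visible on this line, so a tool running in the container needs some other source for the host-side PID.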