scs4onnx
Simple Constant value Shrink for ONNX: a very simple tool that reduces the overall size of an ONNX model by aggregating duplicate constant values as much as possible.
Key concept
- The entire graph is scanned for Constant values; when identical constant tensors are found, they are aggregated into a single shared constant tensor.
- Scalar values are ignored.
- Variables are ignored.
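
The aggregation idea above can be sketched with plain NumPy. This is only an illustration of the concept, not the tool's actual implementation: the function name `aggregate_constants`, the dict-of-arrays input, and the reading of "scalar" as a 0-dimensional array are all assumptions made for this example.

```python
import hashlib

import numpy as np


def aggregate_constants(constants):
    """Return {duplicate_name: canonical_name} for identical constant tensors.

    `constants` is a hypothetical {name: np.ndarray} mapping standing in for
    the Constant values found while scanning a graph.
    """
    first_seen = {}  # content key -> name of the first tensor with that content
    remap = {}       # duplicate name -> name of the tensor it can share
    for name, value in constants.items():
        arr = np.asarray(value)
        if arr.ndim == 0:
            continue  # assumption: "scalar" means a 0-d array, left untouched
        # Two tensors are treated as identical only if dtype, shape and raw
        # bytes all match.
        digest = hashlib.sha256(arr.tobytes()).hexdigest()
        key = (arr.dtype.str, arr.shape, digest)
        if key in first_seen:
            remap[name] = first_seen[key]
        else:
            first_seen[key] = name
    return remap


constants = {
    "w1": np.ones((2, 3), dtype=np.float32),
    "w2": np.ones((2, 3), dtype=np.float32),  # duplicate of w1
    "w3": np.ones((3, 2), dtype=np.float32),  # same bytes, different shape
    "s1": np.array(1.0, dtype=np.float32),    # scalar, ignored
    "s2": np.array(1.0, dtype=np.float32),    # scalar, ignored
}
remap = aggregate_constants(constants)
print(remap)
```

In a real graph, every node input that refers to a key of `remap` would then be rewired to the canonical tensor, and the duplicate would be dropped.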
1. Setup
### optional: put the pip user-level bin directory on PATH
$ echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bashrc \
&& source ~/.bashrc
### run
$ pip install -U onnx \
&& python3 -m pip install -U onnx_graphsurgeon --index-url https://pypi.ngc.nvidia.com \
&& pip install -U scs4onnx
2. Usage
$ scs4onnx -h
usage: scs4onnx [-h] [--mode {shrink,npy}] input_onnx_file_path output_onnx_file_path

positional arguments:
  input_onnx_file_path
                        Input onnx file path.
  output_onnx_file_path
                        Output onnx file path.

optional arguments:
  -h, --help            show this help message and exit
  --mode {shrink,npy}   Constant Value Compression Mode.
                        shrink: Share constant values inside the model as much as possible.
                                The model size is slightly larger because
                                some shared constant values remain inside the model,
                                but performance is maximized.
                        npy:    Outputs constant values used repeatedly in the model to an
                                external file .npy. Instead of the smallest model body size,
                                the file loading overhead is greater.
                        Default: shrink
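
The trade-off behind the `npy` mode can be illustrated with plain NumPy: a constant written to an external `.npy` file no longer takes space in the model body, but it has to be read back from disk before use. The snippet below is a minimal sketch of that round trip only. The file name `shared_constant.npy` is hypothetical, and nothing here reflects how scs4onnx actually names, writes, or loads its files.

```python
import os
import tempfile

import numpy as np

# Stand-in for a constant tensor that is used repeatedly in a model.
shared = np.arange(12, dtype=np.float32).reshape(3, 4)

with tempfile.TemporaryDirectory() as tmp:
    npy_path = os.path.join(tmp, "shared_constant.npy")  # hypothetical name
    np.save(npy_path, shared)           # externalize the constant once
    file_size = os.path.getsize(npy_path)
    restored = np.load(npy_path)        # extra file read needed at load time

# The external copy preserves values, shape and dtype.
roundtrip_ok = np.array_equal(shared, restored) and restored.dtype == shared.dtype
print(roundtrip_ok, restored.shape, file_size >= shared.nbytes)
```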
3. CLI Execution
$ scs4onnx input.onnx output.onnx --mode=shrink
4. In-script Execution
from scs4onnx import shrinking
shrunk_graph, npy_file_paths = shrinking('input.onnx', 'output.onnx', mode='npy')