usls


usls is a Rust library integrated with ONNXRuntime that provides a collection of state-of-the-art models for Computer Vision and Vision-Language tasks, including:


Supported Models

| Model | Task / Type | Example | CUDA f32 | CUDA f16 | TensorRT f32 | TensorRT f16 |
|---|---|---|---|---|---|---|
| YOLOv5 | Classification, Object Detection, Instance Segmentation | demo | βœ… | βœ… | βœ… | βœ… |
| YOLOv6 | Object Detection | demo | βœ… | βœ… | βœ… | βœ… |
| YOLOv7 | Object Detection | demo | βœ… | βœ… | βœ… | βœ… |
| YOLOv8 | Object Detection, Instance Segmentation, Classification, Oriented Object Detection, Keypoint Detection | demo | βœ… | βœ… | βœ… | βœ… |
| YOLOv9 | Object Detection | demo | βœ… | βœ… | βœ… | βœ… |
| YOLOv11 | Object Detection, Instance Segmentation, Classification, Oriented Object Detection, Keypoint Detection | demo | βœ… | βœ… | βœ… | βœ… |
| RTDETR | Object Detection | demo | βœ… | βœ… | βœ… | βœ… |
| FastSAM | Instance Segmentation | demo | βœ… | βœ… | βœ… | βœ… |
| SAM | Segment Anything | demo | βœ… | βœ… | | |
| SAM2 | Segment Anything | demo | βœ… | βœ… | | |
| MobileSAM | Segment Anything | demo | βœ… | βœ… | | |
| EdgeSAM | Segment Anything | demo | βœ… | βœ… | | |
| SAM-HQ | Segment Anything | demo | βœ… | βœ… | | |
| YOLO-World | Object Detection | demo | βœ… | βœ… | βœ… | βœ… |
| DINOv2 | Vision Self-Supervised | demo | βœ… | βœ… | βœ… | βœ… |
| CLIP | Vision-Language | demo | βœ… | βœ… | βœ… Visual / ❌ Textual | βœ… Visual / ❌ Textual |
| BLIP | Vision-Language | demo | βœ… | βœ… | βœ… Visual / ❌ Textual | βœ… Visual / ❌ Textual |
| DB | Text Detection | demo | βœ… | βœ… | βœ… | βœ… |
| SVTR | Text Recognition | demo | βœ… | βœ… | βœ… | βœ… |
| RTMO | Keypoint Detection | demo | βœ… | βœ… | ❌ | ❌ |
| YOLOPv2 | Panoptic Driving Perception | demo | βœ… | βœ… | βœ… | βœ… |
| Depth-Anything v1 & v2 | Monocular Depth Estimation | demo | βœ… | βœ… | ❌ | ❌ |
| MODNet | Image Matting | demo | βœ… | βœ… | βœ… | βœ… |
| GroundingDINO | Open-Set Detection With Language | demo | βœ… | βœ… | | |
| Sapiens | Body Part Segmentation | demo | βœ… | βœ… | | |
| Florence2 | A Variety of Vision Tasks | demo | βœ… | βœ… | | |
| DepthPro | Monocular Depth Estimation | demo | βœ… | βœ… | | |

⛳️ ONNXRuntime Linking

You have two options for linking the ONNXRuntime library:
  • Option 1: Manual Linking

    • For detailed setup instructions, refer to the ORT documentation.

    • For Linux or macOS Users:

      • Download the ONNX Runtime package from the Releases page.
      • Set up the library path by exporting the ORT_DYLIB_PATH environment variable:
        export ORT_DYLIB_PATH=/path/to/onnxruntime/lib/libonnxruntime.so.1.19.0
  • Option 2: Automatic Download

    Just use --features auto

    cargo run -r --example yolo --features auto
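
Whichever option you choose, the execution provider is then selected per model through Options. Below is a minimal sketch of that step; with_cuda is an assumption made by analogy with the with_trt(0) call in the full YOLO example further down, and the model path is a placeholder.

    use usls::{models::YOLO, Options, Vision};

    fn main() -> anyhow::Result<()> {
        // Select an accelerated execution provider explicitly.
        // NOTE: with_cuda(0) is assumed here, mirroring the
        // with_trt(0) call shown in the YOLO example below.
        let options = Options::new()
            .with_cuda(0) // CUDA EP on device 0 (assumed API)
            .with_model("yolo/v8-m-dyn.onnx")?;
        let _model = YOLO::new(options)?;
        Ok(())
    }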

🎈 Demo

cargo run -r --example yolo   # or another example: blip, clip, yolop, svtr, db, ...

πŸ₯‚ Integrate Into Your Own Project

  • Add usls as a dependency to your project's Cargo.toml

    cargo add usls

    Or use a specific commit:

    [dependencies]
    usls = { git = "https://github.com/jamjamjon/usls", rev = "commit-sha" }
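
    To also enable the automatic ONNXRuntime download from the linking section, add the auto feature:

    cargo add usls --features auto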
  • Follow the pipeline

    • Build a model with the provided models and Options
    • Load images, videos, and streams with DataLoader
    • Run inference
    • Retrieve inference results from Vec<Y>
    • Annotate inference results with Annotator
    • Display images and write them to video with Viewer

    example code
    use usls::{models::YOLO, Annotator, DataLoader, Options, Viewer, Vision, YOLOTask, YOLOVersion};
    
    fn main() -> anyhow::Result<()> {
        // Build model with Options
        let options = Options::new()
            .with_trt(0) // TensorRT execution provider on device 0
            .with_model("yolo/v8-m-dyn.onnx")?
            .with_yolo_version(YOLOVersion::V8) // YOLOVersion: V5, V6, V7, V8, V9, V10, RTDETR
            .with_yolo_task(YOLOTask::Detect) // YOLOTask: Classify, Detect, Pose, Segment, Obb
            .with_ixx(0, 0, (1, 2, 4).into()) // input 0, dim 0 (batch): (min, opt, max) = (1, 2, 4)
            .with_ixx(0, 2, (0, 640, 640).into()) // input 0, dim 2 (height): (min, opt, max)
            .with_ixx(0, 3, (0, 640, 640).into()) // input 0, dim 3 (width): (min, opt, max)
            .with_confs(&[0.2]); // confidence threshold(s)
        let mut model = YOLO::new(options)?;
    
        // Build DataLoader to load image(s), video, stream
        let dl = DataLoader::new(
            // "./assets/bus.jpg", // local image
            // "images/bus.jpg",  // remote image
            // "../images-folder",  // local images (from folder)
            // "../demo.mp4",  // local video
            // "http://commondatastorage.googleapis.com/gtv-videos-bucket/sample/BigBuckBunny.mp4",  // online video
            "rtsp://admin:kkasd1234@192.168.2.217:554/h264/ch1/",  // stream
        )?
        .with_batch(2)  // iterate with batch_size = 2
        .build()?;
    
        // Build annotator
        let annotator = Annotator::new()
            .with_bboxes_thickness(4)
            .with_saveout("YOLO-DataLoader");
    
        // Build viewer
        let mut viewer = Viewer::new().with_delay(10).with_scale(1.).resizable(true);
    
        // Run and annotate results
        for (xs, _) in dl {
            let ys = model.forward(&xs, false)?;
            // annotator.annotate(&xs, &ys);
            let images_plotted = annotator.plot(&xs, &ys, false)?;
    
            // show image
            viewer.imshow(&images_plotted)?;
    
            // check window state and key events
            if !viewer.is_open() || viewer.is_key_pressed(usls::Key::Escape) {
                break;
            }
    
            // write video
            viewer.write_batch(&images_plotted)?;
    
            // Retrieve inference results
            for y in ys {
                // bboxes
                if let Some(bboxes) = y.bboxes() {
                    for bbox in bboxes {
                        println!(
                            "Bbox: {}, {}, {}, {}, {}, {}",
                            bbox.xmin(),
                            bbox.ymin(),
                            bbox.xmax(),
                            bbox.ymax(),
                            bbox.confidence(),
                            bbox.id(),
                        );
                    }
                }
            }
        }
    
        // finish video write
        viewer.finish_write()?;
    
        Ok(())
    }
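
    For a quick single-image run without the streaming machinery, the same pieces compose into a few lines. This is a minimal sketch: DataLoader::try_read is an assumed helper for reading one local image; everything else reuses the APIs from the example above.

    use usls::{models::YOLO, Annotator, DataLoader, Options, Vision, YOLOTask, YOLOVersion};

    fn main() -> anyhow::Result<()> {
        // Same Options/YOLO setup as above, with defaults elsewhere
        let options = Options::new()
            .with_model("yolo/v8-m-dyn.onnx")?
            .with_yolo_version(YOLOVersion::V8)
            .with_yolo_task(YOLOTask::Detect)
            .with_confs(&[0.3]);
        let mut model = YOLO::new(options)?;

        // Read a single local image (try_read is an assumed helper)
        let xs = [DataLoader::try_read("./assets/bus.jpg")?];

        // Run inference, then annotate and save the result
        let ys = model.forward(&xs, false)?;
        let annotator = Annotator::new().with_saveout("YOLO-Single");
        annotator.annotate(&xs, &ys);
        Ok(())
    }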

πŸ“Œ License

This project is licensed under the GPL-3.0 License; see LICENSE for details.