Python framework for serving ML models

pip install multivitamin==1.4.18




Multivitamin is a Python framework built for serving computer vision (CV), natural language processing (NLP), and machine learning (ML) models. It aims to provide the serving infrastructure around a single service while leaving you free to use any Python framework for prediction.

Main Features

  • Asynchronous APIs sharing a common interface (CommAPI) for pulling requests and pushing responses
  • An interface (via the Module class) for processing images, video, text, or any form of data
  • A data model for storing the output of the modules

Getting Started

To start an asynchronous service, construct a Server object, which accepts 3 input parameters:

  • An input CommAPI, which is an abstract base class that defines the push() and pull() interface
  • An output CommAPI
  • A Module or a sequence of Modules. Module is an abstract base class that defines the interface for process(Request), process_properties(), or process_images(...)

Defining input and output CommAPIs:

from multivitamin.apis import SQSAPI, S3API

sqs_api = SQSAPI(queue_name='SQS-ObjectDetector')
s3_api = S3API(s3_bucket='od-output', s3_key='2019-03-22')

Both SQSAPI and S3API are concrete implementations of CommAPI.
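
To make the push()/pull() contract concrete, here is a minimal, self-contained sketch of an in-memory implementation. It does not import multivitamin: CommAPISketch and InMemoryAPI are hypothetical stand-ins, and the method signatures are assumptions that may differ from the real CommAPI.

```python
from abc import ABC, abstractmethod
from collections import deque


class CommAPISketch(ABC):
    """Hypothetical stand-in for CommAPI (signatures are assumed)."""

    @abstractmethod
    def pull(self, n=1):
        """Return up to n pending request payloads."""

    @abstractmethod
    def push(self, response):
        """Deliver one response payload."""


class InMemoryAPI(CommAPISketch):
    """Toy implementation backed by an in-process queue, handy for tests."""

    def __init__(self, pending=()):
        self._pending = deque(pending)
        self.pushed = []

    def pull(self, n=1):
        # Take at most n payloads from the front of the queue.
        batch = []
        while self._pending and len(batch) < n:
            batch.append(self._pending.popleft())
        return batch

    def push(self, response):
        # Record the response instead of sending it anywhere.
        self.pushed.append(response)


api = InMemoryAPI(pending=['{"url": "a.jpg"}', '{"url": "b.jpg"}'])
first = api.pull()          # one request payload
api.push({"status": "ok"})  # recorded in api.pushed
```

Because the abstract base class only fixes the two method names, a queue, a bucket, or a local directory can be swapped in without changing the code that calls them.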

Defining a Module:

For convenience, we provide several example modules (which are concrete implementations of Module) that you can import for your purposes. Let's say we want an object detector built using TensorFlow's object detection API:

from multivitamin.applications.images.detectors import TFDetector

obj_det_module = TFDetector(name="IG_obj_det", ver="1.0.0", model="models_dir/")

Constructing a Server

The Server below pulls requests from the AWS SQS queue SQS-ObjectDetector and pushes the responses to s3://

from multivitamin.server import Server

obj_det_server = Server(


If we wanted to send our responses to multiple endpoints, we could add a second output CommAPI like so:

from multivitamin.apis import HTTPAPI

http_api = HTTPAPI()

and modify the Server we created above like so:

obj_det_server = Server(

Note: HTTPAPI assumes that the Request has a field called dst_url, and it sends a POST request to that destination URL.
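
The dst_url behaviour can be illustrated with the standard library alone. This is a sketch of the idea, not HTTPAPI's implementation: build_post is a hypothetical helper, the JSON content type is an assumption, and the URL is a placeholder. The request is built but never sent.

```python
import json
import urllib.request


def build_post(request_fields, response_body):
    """Build (but do not send) a POST carrying the response to dst_url."""
    dst_url = request_fields.get("dst_url")
    if not dst_url:
        raise ValueError("request has no dst_url field")
    data = json.dumps(response_body).encode("utf-8")
    return urllib.request.Request(
        dst_url,
        data=data,
        headers={"Content-Type": "application/json"},  # assumed content type
        method="POST",
    )


req = build_post(
    {"dst_url": "http://localhost:8080/results"},  # placeholder URL
    {"status": "ok"},
)
# Sending it would be: urllib.request.urlopen(req)
```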

Chaining Modules

If we want to run a sequence of Modules, we can add a second Module. Say we have an image classifier written in PyTorch that predicts the make and model of a vehicle. A PyTorch image classifier is another example application we provide in multivitamin.applications.images:

from multivitamin.applications.images.classifiers.pyt_classifier import PYTClassifier

make_model_clf = PYTClassifier(name="make-model", ver="1.0.0", model="models/mm.pth")

set_previous_properties_of_interest is a method that tells the make_model_clf module to run its predict_images function only on predictions of car OR truck found by the previous module (the 600-class TensorFlow object detector).
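
The filtering idea can be sketched without the library. The snippet is an illustration only: the detection dictionaries, their "value" key, and the helper select_regions are hypothetical and do not reflect multivitamin's schema or the actual signature of set_previous_properties_of_interest.

```python
def select_regions(detections, properties_of_interest):
    """Keep detections that match ANY of the given property dicts (OR semantics)."""
    selected = []
    for det in detections:
        for wanted in properties_of_interest:
            # A detection matches when every key/value in `wanted` agrees.
            if all(det.get(key) == val for key, val in wanted.items()):
                selected.append(det)
                break
    return selected


# Hypothetical output of an upstream object detector.
previous = [
    {"value": "car", "confidence": 0.91},
    {"value": "person", "confidence": 0.88},
    {"value": "truck", "confidence": 0.75},
]
of_interest = [{"value": "car"}, {"value": "truck"}]

# Only these detections would be passed on to the second module.
to_classify = select_regions(previous, of_interest)
```

Filtering before the second module runs avoids spending classifier time on regions, such as people, that a make-and-model classifier cannot label meaningfully.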

And now, creating a Server:

vehicle_mm_server = Server(


Using conda:

conda install multivitamin

Using pip (note: this requires OpenCV to be already installed; we highly recommend installing with conda instead):

pip install multivitamin

Using nvidia-docker:

docker run --runtime=nvidia multivitamin:cuda9-cudnn7 /bin/bash


For API documentation and full details, see

High-level overview

Data flow:

  1. JSON request is "pulled" by a CommAPI object
  2. JSON request is used to construct a Request object
  3. Server creates a (typically) empty Response from the Request. If the Request contains a previous module's Response (for modules run in a sequence), the new Response is pre-populated with it
  4. process_request() sends the Response through all Modules
  5. Each Module appends/modifies the Response
  6. process_request() returns the Response back to the Server
  7. Server sends the Response to the output CommAPI(s) by calling their push(Response) method
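
The steps above can be sketched as a single pass through a toy pipeline. Everything here is a stand-in written for illustration: SketchRequest, SketchResponse, the two modules, ListSink, serve_once, and the "prev_outputs" key are hypothetical and are not multivitamin's classes or schema.

```python
import json
from dataclasses import dataclass, field


@dataclass
class SketchRequest:
    payload: dict


@dataclass
class SketchResponse:
    request: SketchRequest
    outputs: list = field(default_factory=list)


class UpperModule:
    def process(self, response):
        text = response.request.payload.get("text", "")
        response.outputs.append({"module": "upper", "value": text.upper()})
        return response


class LengthModule:
    def process(self, response):
        text = response.request.payload.get("text", "")
        response.outputs.append({"module": "length", "value": len(text)})
        return response


class ListSink:
    """Output endpoint that simply records what is pushed to it."""

    def __init__(self):
        self.received = []

    def push(self, response):
        self.received.append(response)


def process_request(response, modules):
    # Steps 4-6: each module appends to or modifies the response in turn.
    for module in modules:
        response = module.process(response)
    return response


def serve_once(raw_json, modules, output_apis):
    request = SketchRequest(json.loads(raw_json))   # steps 1-2
    response = SketchResponse(request)              # step 3
    # Pre-populate from an earlier stage, if the request carries one.
    response.outputs.extend(request.payload.get("prev_outputs", []))
    response = process_request(response, modules)
    for api in output_apis:                         # step 7
        api.push(response)
    return response


sink_a, sink_b = ListSink(), ListSink()
result = serve_once('{"text": "car"}', [UpperModule(), LengthModule()], [sink_a, sink_b])
```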

Repository organization:

  • data/
    • Request: data object encapsulating request JSON
    • response/
      • Response: data object encapsulating a response that reflects the schema. Contains methods for serialization and for modifying internal data
      • ResponseInternal: Python dataclasses with type checking that match the schema
  • module/
    • Module: abstract parent class that defines an interface for processing requests
    • ImagesModule: abstract child class of Module that defines an interface, process_images(...), for processing requests with images or video, and handles retrieval of media
    • PropertiesModule: abstract child class of Module that defines an interface process_properties()
  • apis/
    • CommAPI: abstract parent class that defines an interface, i.e. push() and pull()
    • SQSAPI: pulls requests from an SQS queue, pushes Responses to a queue
    • HTTPAPI: pushes Responses by posting to an HTTP endpoint (provided in the request)
    • LocalAPI: pulls requests from a local directory of JSONs, pushes Responses to a local directory
    • S3API: pulls requests from an S3 bucket of JSONs, pushes Responses to an S3 bucket
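
As an illustration of the directory-based variant listed above, the following sketch reads request JSONs from one folder and writes responses to another. LocalDirAPI is a hypothetical stand-in, not the library's LocalAPI; its constructor arguments, the file-naming scheme, and the push(name, response) signature are assumptions.

```python
import json
import tempfile
from pathlib import Path


class LocalDirAPI:
    """Hypothetical directory-backed endpoint with pull() and push()."""

    def __init__(self, pull_dir, push_dir):
        self.pull_dir = Path(pull_dir)
        self.push_dir = Path(push_dir)

    def pull(self):
        # Parse every *.json file in the pull directory, in name order.
        paths = sorted(self.pull_dir.glob("*.json"))
        return [json.loads(path.read_text()) for path in paths]

    def push(self, name, response):
        # Write one response as <push_dir>/<name>.json and return its path.
        self.push_dir.mkdir(parents=True, exist_ok=True)
        out_path = self.push_dir / f"{name}.json"
        out_path.write_text(json.dumps(response))
        return out_path


with tempfile.TemporaryDirectory() as tmp:
    in_dir, out_dir = Path(tmp, "in"), Path(tmp, "out")
    in_dir.mkdir()
    (in_dir / "req1.json").write_text('{"url": "a.jpg"}')

    local_api = LocalDirAPI(pull_dir=in_dir, push_dir=out_dir)
    pulled = local_api.pull()
    written = local_api.push("req1", {"status": "ok"})
    round_trip = json.loads(written.read_text())
```

A directory-backed endpoint like this is mainly useful for local debugging, since it needs no cloud credentials.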


To file a bug or request a feature, please file a GitHub issue. Pull requests are welcome.

The Team

Multivitamin is currently maintained by Greg Chu, Matthew Greenberg, and Javier Molina, with contributions from Divyaa Ravichandran and Rohit Annigeri, and with collaboration from Cambron Carter, Shankar Chatterjee, Nandakishore Puttashamachar, Nishita Sant, and Iris Fu.