
9. Model export

Depending on the final use of your model, you might need to export it from the default training framework format (a PyTorch checkpoint) to a format more suitable for inference (e.g. ONNX, TensorRT, TorchScript or TorchServe).

Create and assign a model/export issue. Assigning this issue will automatically create an associated branch. Remember to link it to the corresponding milestone.

The concrete export format you need depends on the nature of your project. Reference implementations of some commonly used formats are included as part of the template.


TorchServe

TorchServe is a flexible and easy-to-use tool for serving PyTorch models. It is a very simple way to expose a trained model behind a REST API endpoint for inference.

A script for exporting a trained model to TorchServe format is included as part of the template.

Assuming you are working on the associated branch, you can add a new stage to export the model by editing dvc.yaml:

Adding an export stage

vars:
  - configs/pipeline.yml

stages:

  # ...

  model-export-to_torchserve:
    cmd: python src/stages/model/export/to_torchserve.py
      --config_file results/experiment/config.py
      --checkpoint_file results/experiment/best_model.pth
      --output_folder results/model/export
      --model_name $MODEL_NAME
    deps:
      - src/stages/model/export/to_torchserve.py
      - results/experiment
    outs:
      - results/model/export

Replace $MODEL_NAME with your desired output name; a $MODEL_NAME.mar file will then be created under the output_folder. This file can be used to register the model in a running TorchServe service, as sketched below.
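Once this stage is added, running dvc repro will execute it and track results/model/export as a DVC output. For illustration, the snippet below is a minimal sketch of the registration step, assuming TorchServe is already running with its default management port (8081) and that the .mar file has been copied into its model store; the host and model name are placeholders to adapt to your project:

# Sketch: register the exported .mar through TorchServe's management API.
# The host, model name and worker count below are placeholders/assumptions.
import requests

MANAGEMENT_API = "http://localhost:8081"   # assumption: default TorchServe management port
MODEL_NAME = "my_model"                    # placeholder: use the name chosen in dvc.yaml

response = requests.post(
    f"{MANAGEMENT_API}/models",
    params={
        "url": f"{MODEL_NAME}.mar",   # the .mar must be inside TorchServe's model store
        "model_name": MODEL_NAME,
        "initial_workers": 1,         # start one worker so the model can serve requests
        "synchronous": "true",        # block until the worker is ready
    },
)
response.raise_for_status()
print(response.json())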

Abstract

A short guide with the steps to deploy a model in TorchServe with Docker is available here.

Test TorchServe

The template includes a test that verifies the to_torchserve.py script and the deployment of the model with TorchServe in a Docker container.

This test is a simple skeleton to which you should add further checks, depending on what the model expects to receive and what it is expected to return.

To make it more useful, it is recommended to add more asserts that check everything works correctly: for example, make a prediction by sending an image or other expected input and verify that the format of the returned result is as expected.
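As an illustration, the following sketch shows the kind of check that could be added, assuming an image model served through TorchServe's default inference port (8080); the file path, model name and expected response keys are placeholders:

# Hypothetical extra check for the TorchServe test: send an image to the
# inference API and verify the shape of the response. Port, file path and
# expected fields are assumptions to adapt to your handler.
import requests

INFERENCE_API = "http://localhost:8080"    # assumption: default TorchServe inference port
MODEL_NAME = "my_model"                    # placeholder

with open("tests/data/sample.jpg", "rb") as f:   # placeholder test image
    response = requests.post(f"{INFERENCE_API}/predictions/{MODEL_NAME}", data=f)

assert response.status_code == 200
prediction = response.json()
# Adapt these asserts to the contract of your handler:
assert isinstance(prediction, dict)
assert "scores" in prediction          # placeholder key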

Info

This test has to be executed locally; it is not executed in the CI workflow. It is skipped if no experiment has been launched yet, since it uses the artifact_uri of the last launched experiment, which is read from results/experiment.json and exposed to the test through the get_artifact_uri fixture.
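For reference, here is a minimal sketch of how such a fixture could look; the actual get_artifact_uri implementation in the template may differ:

# Sketch of a fixture along the lines described above. It skips the test
# when no experiment has been launched yet.
import json
from pathlib import Path

import pytest

EXPERIMENT_FILE = Path("results/experiment.json")


@pytest.fixture
def get_artifact_uri():
    if not EXPERIMENT_FILE.exists():
        pytest.skip("No experiment has been launched yet")
    with EXPERIMENT_FILE.open() as f:
        experiment = json.load(f)
    return experiment["artifact_uri"]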

Caution

A waiting time of 10 seconds is specified before making the calls to TorchServe (SLEEP_TIME = 10). However, making predictions also requires waiting for the model to load inside the Docker container, which can take longer depending on the model and on what the .mar contains (for instance, libraries packaged inside the .mar may need to be loaded). If this takes more than 10 seconds, the test will fail, so this time must be adjusted.
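Instead of simply enlarging the fixed sleep, one option is to poll TorchServe until the model reports a READY worker. The sketch below assumes the default management port, a placeholder model name and an arbitrary timeout:

# Sketch: poll the TorchServe management API until the model has a READY
# worker, instead of relying on a fixed SLEEP_TIME. Port, model name and
# timeout are assumptions.
import time

import requests

MANAGEMENT_API = "http://localhost:8081"   # assumption: default management port
MODEL_NAME = "my_model"                    # placeholder
TIMEOUT_SECONDS = 120                      # adjust to your model's load time


def wait_until_ready():
    deadline = time.time() + TIMEOUT_SECONDS
    while time.time() < deadline:
        try:
            info = requests.get(f"{MANAGEMENT_API}/models/{MODEL_NAME}").json()
            workers = info[0].get("workers", [])
            if any(worker.get("status") == "READY" for worker in workers):
                return
        except (requests.ConnectionError, ValueError, KeyError, IndexError):
            pass                           # container or model not up yet
        time.sleep(2)
    raise TimeoutError(f"{MODEL_NAME} did not become READY within {TIMEOUT_SECONDS}s")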

PyDeepStream

PyDeepStream is a Python library with callbacks to DeepStream. In order to run it you need a TensorRT-converted model, i.e. a .engine file.

To do this, see the README.md in the model_conversion/to_trt/ folder.
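As a quick sanity check after conversion, the sketch below (assuming the tensorrt Python package is installed and using a placeholder engine path) verifies that the generated .engine file can be deserialized:

# Sketch: confirm a converted TensorRT engine deserializes correctly.
# The engine path is a placeholder; requires the tensorrt package and a GPU.
import tensorrt as trt

ENGINE_PATH = "results/model/export/model.engine"   # placeholder path, adapt to your project

logger = trt.Logger(trt.Logger.WARNING)
runtime = trt.Runtime(logger)
with open(ENGINE_PATH, "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())
assert engine is not None, "Failed to deserialize the TensorRT engine"
print("Engine deserialized successfully")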