9. Model export
Depending on the final use of your model, you might need to export it from the default training framework format (a PyTorch checkpoint) to a format more suitable for inference (e.g. onnx, tensorrt, torchscript or torchserve).
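For orientation, the simplest of these exports (ONNX) usually boils down to a single `torch.onnx.export` call. The sketch below is only illustrative: it uses a torchvision model as a stand-in for your trained network, and the input shape, output path and opset version are assumptions to adapt to your case.

```python
import torch
import torchvision

# Stand-in model for illustration; in practice you would build your own network
# and load its trained weights (e.g. results/experiment/best_model.pth).
model = torchvision.models.resnet18(weights=None)
model.eval()

# Dummy input with the shape the model expects (here: one 3x224x224 RGB image).
dummy_input = torch.randn(1, 3, 224, 224)

torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",          # placeholder output path
    input_names=["input"],
    output_names=["output"],
    opset_version=17,
)
```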
Create and assign a model/export issue. The assignment of this issue will automatically create an associated branch. Remember to link it to the corresponding milestone.
The concrete export format that you will need depends on the nature of your project. Reference implementations of some commonly used formats are included as part of the template.
TorchServe
TorchServe is a flexible and easy-to-use tool for serving PyTorch models. It is a simple way to expose a trained model through a REST API endpoint for inference.
A script for exporting a trained model to TorchServe format is included as part of the template.
Assuming you are working on the associated branch, you can add a new stage to export the model by editing dvc.yaml:
Adding an export stage
```yaml
vars:
  - configs/pipeline.yml

stages:
  # ...
  model-export-to_torchserve:
    cmd: python src/stages/model/export/to_torchserve.py
      --config_file results/experiment/config.py
      --checkpoint_file results/experiment/best_model.pth
      --output_folder results/model/export
      --model_name $MODEL_NAME
    deps:
      - src/stages/model/export/to_torchserve.py
      - results/experiment
    outs:
      - results/model/export
```
Replace $MODEL_NAME with your desired output name; a $MODEL_NAME.mar file will then be created under output_folder. This file can be used to register the model in a running TorchServe service.
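For reference, registering the generated .mar file in an already running TorchServe instance can be done through its management API (port 8081 by default). The host, model name and worker count below are placeholders, and the .mar file is assumed to be available in TorchServe's model store.

```python
import requests

# TorchServe management API (default port 8081); host and model name are placeholders.
MANAGEMENT_URL = "http://localhost:8081"

# Register the .mar file and start one initial worker for it.
response = requests.post(
    f"{MANAGEMENT_URL}/models",
    params={"url": "my_model.mar", "initial_workers": 1},
)
response.raise_for_status()
print(response.json())

# Describe the registered model to check its status.
status = requests.get(f"{MANAGEMENT_URL}/models/my_model")
print(status.json())
```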
Abstract
A short guide describing the steps to deploy a model in TorchServe with Docker is available here.
Test TorchServe
The template includes a test that verifies the to_torchserve.py script and the model deployment with TorchServe in a Docker container.
This test is a simple skeleton to which you should add further checks, depending on what the model expects to receive and what it is expected to return.
To make it more useful, it is recommended to add assertions that exercise the service end to end, for example requesting a prediction with an image or other expected input data and verifying that the format of the returned result is as expected.
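For example, an additional check could send a sample image to the inference API (port 8080 by default) and assert on the structure of the response. The model name, image path and expected response keys below are hypothetical and depend on your handler's input/output contract.

```python
import requests

# Hypothetical model name and sample image; adapt to your handler.
INFERENCE_URL = "http://localhost:8080/predictions/my_model"

with open("tests/data/sample.jpg", "rb") as f:
    response = requests.post(INFERENCE_URL, data=f.read())

assert response.status_code == 200
prediction = response.json()
# Hypothetical output format: a dict with a label and a confidence score.
assert isinstance(prediction, dict)
assert "label" in prediction and "score" in prediction
```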
Info
This test must be executed locally; it is not run in the CI workflow. It is skipped if no experiment has been launched yet, because it relies on the artifact_uri of the last launched experiment, which is read from results/experiment.json and made available to the test through the get_artifact_uri fixture.
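As an illustration only (not the template's actual implementation), such a fixture could look roughly like the sketch below, assuming results/experiment.json contains an artifact_uri key.

```python
import json
from pathlib import Path

import pytest

EXPERIMENT_FILE = Path("results/experiment.json")


@pytest.fixture
def get_artifact_uri() -> str:
    """Return the artifact_uri of the last launched experiment, or skip the test."""
    if not EXPERIMENT_FILE.exists():
        pytest.skip("No experiment has been launched yet")
    with EXPERIMENT_FILE.open() as f:
        experiment = json.load(f)
    return experiment["artifact_uri"]
```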
Caution
A waiting time of 10 seconds is applied before calling TorchServe (SLEEP_TIME = 10). However, before predictions can be made the model must also finish loading inside the Docker container, and this may take longer depending on the model and on the contents of the .mar file (for example, libraries bundled within the .mar may need to be loaded). If loading takes more than 10 seconds, the test will fail and this wait time must be increased.
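One way to make the wait more robust, instead of increasing SLEEP_TIME blindly, is to poll the TorchServe management API until the model's worker reports READY. This is only a sketch; the model name, host and timeout are assumptions.

```python
import time

import requests

# Hypothetical model name and timeout; the management API listens on port 8081 by default.
MANAGEMENT_URL = "http://localhost:8081"
MODEL_NAME = "my_model"
TIMEOUT_S = 120


def wait_until_ready() -> None:
    """Poll TorchServe until at least one worker of the model is READY."""
    deadline = time.time() + TIMEOUT_S
    while time.time() < deadline:
        try:
            info = requests.get(f"{MANAGEMENT_URL}/models/{MODEL_NAME}", timeout=5).json()
            workers = info[0].get("workers", [])
            if any(w.get("status") == "READY" for w in workers):
                return
        except (requests.RequestException, ValueError, KeyError, IndexError):
            pass  # service not up yet or unexpected payload; keep polling
        time.sleep(2)
    raise TimeoutError(f"Model {MODEL_NAME} did not become READY in {TIMEOUT_S}s")
```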
PyDeepStream
PyDeepStream is a Python library with callbacks to DeepStream.
In order to run it you need a model converted to TensorRT, i.e. a .engine file.
For this conversion, see the README.md in the model_conversion/to_trt/ folder.
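For orientation only (the template's own procedure is the one described in that README), a conversion from ONNX to a serialized .engine with the TensorRT Python API (TensorRT 8.x) looks roughly like this; paths are placeholders.

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)


def onnx_to_engine(onnx_path: str, engine_path: str) -> None:
    """Parse an ONNX model and serialize a TensorRT engine to disk."""
    builder = trt.Builder(TRT_LOGGER)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    )
    parser = trt.OnnxParser(network, TRT_LOGGER)

    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            errors = [str(parser.get_error(i)) for i in range(parser.num_errors)]
            raise RuntimeError(f"Failed to parse ONNX model: {errors}")

    config = builder.create_builder_config()
    config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 30)  # 1 GiB

    serialized_engine = builder.build_serialized_network(network, config)
    with open(engine_path, "wb") as f:
        f.write(serialized_engine)


onnx_to_engine("model.onnx", "model.engine")  # placeholder paths
```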