Deploy models
You can use VESSL to quickly deploy your models into production so that external applications can call them through APIs. You register a model through the SDK and then deploy it from the web UI in one click.
Register a model using the SDK
A model file cannot be deployed on its own: we need to provide instructions on how to set up the server, handle requests, and send responses. This step is called registering a model.
There are two ways to register a model. One is to use an existing model, meaning a VESSL model already exists and has a model file stored in it. The other is to train a model from scratch and register it. Both options are explained below.
Register an existing model
In most cases, you will already have a trained model and its file ready, either through VESSL Run or in your local environment. After creating a model, you need to register it using the SDK. The example below shows how to do so.
First, we redefine the layers of the torch model. (This assumes we saved only the state_dict, that is, the model’s parameters. If you saved the model’s layers as well, you do not have to redefine the layers.)
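As a minimal sketch of this step, the snippet below saves only a state_dict and then rebuilds the layers before loading it. The Net class, its layer sizes, and the in-memory buffer (standing in for a file such as model.pt) are hypothetical and only illustrate the pattern; substitute your own architecture and file path.

```python
import io

import torch
from torch import nn


class Net(nn.Module):
    """Hypothetical two-layer network; replace with your own architecture."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(3, 8)
        self.fc2 = nn.Linear(8, 2)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))


# Training side: only the parameters are saved, not the class definition.
buffer = io.BytesIO()  # stands in for a file on disk
torch.save(Net().state_dict(), buffer)

# Serving side: rebuild the layers first, then load the parameters into them.
buffer.seek(0)
model = Net()
model.load_state_dict(torch.load(buffer))
model.eval()  # switch to inference mode before serving
```

Because a state_dict holds tensors keyed by layer name, loading fails if the redefined layers do not match the names and shapes used during training.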
Then, we define a MyRunner class that inherits from vessl.RunnerBase and provides instructions for how to serve our model. You can read more about each method here.
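The sketch below illustrates the kind of request flow a runner describes. It is written in plain Python so it runs without VESSL or PyTorch installed; in a real script the class would inherit from vessl.RunnerBase. The method names load_model and predict, the use of static methods, and the order in which the hooks are chained are assumptions made for illustration (preprocess_data and postprocess_data are the hook names this page mentions), so check the SDK reference for the exact interface.

```python
class MyRunner:  # in a VESSL script: class MyRunner(vessl.RunnerBase)
    """Illustrative runner: load a model, then preprocess -> predict -> postprocess."""

    @staticmethod
    def load_model(props, artifacts):
        # Assumption: called once when the server starts. A real runner would
        # rebuild the network and load its saved parameters; a toy linear
        # scorer stands in for the model here.
        weights = [0.5, -1.0, 2.0]
        return lambda rows: [sum(w * x for w, x in zip(weights, row)) for row in rows]

    @staticmethod
    def preprocess_data(data):
        # Turn the decoded request body into the structure the model expects.
        return [[float(x) for x in row] for row in data]

    @staticmethod
    def predict(model, data):
        # Run inference on the preprocessed batch.
        return model(data)

    @staticmethod
    def postprocess_data(data):
        # Wrap the raw output in a JSON-serializable response.
        return {"result": data}


def handle_request(runner, model, body):
    """Hypothetical helper: chain the three per-request hooks."""
    return runner.postprocess_data(runner.predict(model, runner.preprocess_data(body)))


model = MyRunner.load_model(props=None, artifacts=None)
response = handle_request(MyRunner, model, [[1, 2, 3]])
```

Running the hooks locally like this is a cheap way to confirm that the input and output formats are what you expect before registering the runner.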
Finally, we register the model using vessl.register_model. We specify the repository name and model number, pass MyRunner as the runner class to use for the service, and list any requirements to install.
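A registration call along these lines might look like the sketch below. The keyword names (repository_name, model_number, runner_cls, requirements) are assumptions inferred from the description above rather than a verified signature, and the wrapper function is hypothetical. The sketch also assumes the SDK is already installed and configured for your organization.

```python
def register_existing_model(runner_cls, repository_name, model_number, requirements):
    """Hypothetical wrapper around vessl.register_model for an existing model."""
    # Imported lazily so this module can be loaded without the SDK installed.
    import vessl

    vessl.register_model(
        repository_name=repository_name,  # assumed keyword: model repository name
        model_number=model_number,        # assumed keyword: number of the existing model
        runner_cls=runner_cls,            # assumed keyword: the runner class, e.g. MyRunner
        requirements=requirements,        # assumed keyword: packages to install, e.g. ["torch"]
    )
```

For example, register_existing_model(MyRunner, "my-repository", 1, ["torch"]) would register model number 1 of a repository named my-repository, where both values are placeholders.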
After executing the script, you should see that two files have been generated: vessl.manifest.yaml, which stores metadata, and vessl.runner.pkl, which stores the runner binary. Your model has been registered and is ready for service.
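To confirm that registration produced these files, a small check such as the following can help. It assumes the files are written to the directory you pass in (the current working directory by default), which this page does not state explicitly; the helper name is hypothetical.

```python
from pathlib import Path

# File names taken from the description above.
EXPECTED_FILES = ("vessl.manifest.yaml", "vessl.runner.pkl")


def missing_registration_files(directory=".", expected=EXPECTED_FILES):
    """Return the expected registration files that are absent from `directory`."""
    root = Path(directory)
    return [name for name in expected if not (root / name).is_file()]
```

An empty list means every expected file is present. When registering a model from scratch, you can pass a longer tuple of expected names.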
Register a model from scratch
In some cases, you will want to train the model and register it within one script. You can use vessl.register_model to register a new model as well:
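The following sketch shows one way the call might look when the model has just been trained in the same script. The keyword names, the use of model_number=None to request a new model, and the model_instance argument are all assumptions for illustration, not a verified signature. The wrapper function is hypothetical, and the SDK is assumed to be installed and configured.

```python
def register_new_model(runner_cls, model, repository_name, requirements):
    """Hypothetical wrapper that registers a freshly trained model."""
    # Imported lazily so this module can be loaded without the SDK installed.
    import vessl

    vessl.register_model(
        repository_name=repository_name,  # assumed keyword: model repository name
        model_number=None,                # assumption: None requests a new model
        runner_cls=runner_cls,            # assumed keyword: the runner class
        model_instance=model,             # assumption: in-memory model to serialize
        requirements=requirements,        # assumed keyword: packages to install
    )
```

The difference from registering an existing model is that the trained object itself is handed to the SDK, which is why an extra file for the model appears after the script runs.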
After executing the script, you should see that three files have been generated: vessl.manifest.yaml, which stores metadata, vessl.runner.pkl, which stores the runner binary, and vessl.model.pkl, which stores the trained model. Your model has been registered and is ready for service.
PyTorch models
If you are using PyTorch, there is an easier way to register your model. You only need to define preprocess_data and postprocess_data, and both are optional; the other methods are autogenerated.
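As a rough idea of what these two hooks might contain, the sketch below uses plain Python lists so it runs without PyTorch. The input and output formats are hypothetical. In a real PyTorch setup, preprocess_data would typically finish by wrapping the rows in torch.tensor, and postprocess_data would start by converting the output tensor with .tolist().

```python
def preprocess_data(data):
    """Convert the decoded request body into a batch of float rows."""
    # Accept either a single row ([1, 2]) or a batch ([[1, 2], [3, 4]]).
    rows = data if data and isinstance(data[0], (list, tuple)) else [data]
    return [[float(x) for x in row] for row in rows]


def postprocess_data(data):
    """Convert raw class scores into a JSON-serializable response."""
    best = max(range(len(data)), key=lambda i: data[i])  # index of the top score
    return {"label": best, "score": float(data[best])}
```

Keeping both hooks as small pure functions makes them easy to test locally before registering the model.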
Deploy a registered model
You can deploy your model with VESSL Service. For detailed instructions, please refer to the VESSL Service documentation.
Once you have deployed your model with VESSL Service, you can make predictions by sending HTTP requests to the service endpoint. As in the example request, use the POST method and pass your authentication token as a header. Pass your input data in the format you specified in your runner when you registered the model. You should receive a response with the prediction.
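A request of this shape can be built with the Python standard library as sketched below. The endpoint URL, the token, and the header name X-AUTH-KEY are placeholders and assumptions; use the endpoint, token, and header name shown on your service's page. The payload is JSON-encoded here on the assumption that your runner expects JSON.

```python
import json
import urllib.request


def build_prediction_request(endpoint, token, payload, auth_header="X-AUTH-KEY"):
    """Build (but do not send) a POST request carrying the token and input data."""
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),  # request body as JSON bytes
        headers={auth_header: token, "Content-Type": "application/json"},
        method="POST",
    )


# Placeholder endpoint and token; replace both with your service's values.
request = build_prediction_request(
    "https://your-service-endpoint.example", "YOUR-AUTHENTICATION-TOKEN", [[1.0, 2.0, 3.0]]
)
# To send it: urllib.request.urlopen(request).read()
```

Building the request separately from sending it lets you inspect the method, headers, and body before anything goes over the network.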