VESSL AI Documentation

Deploy your AI models efficiently using VESSL Service, which supports both provisioned and serverless deployment methods. This guide will walk you through the steps to create a new service using the Command Line Interface (CLI) and the web console, with how to enable Serverless Mode.

Before deploying a service, ensure that VESSL Service is enabled in the cluster settings. Refer to the Endpoint Configuration Guide for details on how to enable VESSL Service and configure endpoints properly.

Deploy a new service using CLI

Using YAML file

To deploy a service using the CLI, you’ll first need to define your service configuration in a YAML file. This YAML-based configuration allows you to deploy services programmatically. If you were using this feature as beta and want to migrate to new version, please refer to migration guide.

Example YAML configuration:

name: vessl-test-service
message: vessl-yaml-revision
image: quay.io/vessl-ai/torch:2.3.1-cuda12.1-r5
resources:
  cluster: vessl-oci-sanjose
  preset: gpu-l4-small-spot
run:
  - |
    pip install vllm
    python -m vllm.entrypoints.openai.api_server --model $MODEL_NAME --max-model-len 4096 --disable-log-requests --api-key token-123
env:
  MODEL_NAME: mistralai/Mistral-7B-Instruct-v0.2
ports:
  - port: 8000
service:
  autoscaling:
    max: 2
    metric: cpu
    min: 1
    target: 50
  expose: 8000

Steps to deploy using CLI:

Create or edit your YAML file to define the service configuration.
Deploy the service by running the following command in your terminal. Replace [your-yaml-file].yaml with the path to your YAML file.:
```
vessl service create -f [your-yaml-file].yaml
```
If you want to deploy a serverless mode, make sure to append --serverless flag.
```
vessl service create -f [your-serverless-yaml-file.yaml] --serverless
```
If deploying to a NodePort cluster, specify the —port option to expose the service on a specific port.
```
vessl service create -f [your-yaml-file].yaml --port=30010
```

Using VESSL Hub templates

VESSL Hub provides service templates for rapid service deployment.

Steps to deploy using VESSL Hub templates:

Get the key of the template you want to use from the VESSL Hub

vessl hub list -t service

example output:

Service Tasks:
      llama-3-chat-api
      gemma-2-chat-api
      tgi-service
      vllm-service

Start your service by vessl service create --from-hub=[template-key].

Example:

vessl service create --from-hub=llama-3-chat-api

Deploy a new service using web console

Deploying through the web console is user-friendly and suitable for those who prefer a graphical interface over command line operations. The interactive demo below will guide you to through the process.

Provisioned Mode — Steps to create a new service in the web console

The explanations of each field are as follows:

Serverless Mode — Steps to create a new service in the web console:

The explanations of each field are as follows:

Overview

Get information from the Overview Dashboard.

Get Started

Compute

Resource

Admin

Private Hub

Pricing

Create a new service

Deploy a new service using CLI

Using YAML file

Example YAML configuration:

Steps to deploy using CLI:

Using VESSL Hub templates

Steps to deploy using VESSL Hub templates:

Deploy a new service using web console

Provisioned Mode — Steps to create a new service in the web console

Serverless Mode — Steps to create a new service in the web console:

Overview

Get Started

Compute

Resource

Admin

Private Hub

Pricing

​Deploy a new service using CLI

​Using YAML file

​Example YAML configuration:

​Steps to deploy using CLI:

​Using VESSL Hub templates

​Steps to deploy using VESSL Hub templates:

​Deploy a new service using web console

​Provisioned Mode — Steps to create a new service in the web console

​Serverless Mode — Steps to create a new service in the web console:

Overview

Deploy a new service using CLI

Using YAML file

Example YAML configuration:

Steps to deploy using CLI:

Using VESSL Hub templates

Steps to deploy using VESSL Hub templates:

Deploy a new service using web console

Provisioned Mode — Steps to create a new service in the web console

Serverless Mode — Steps to create a new service in the web console: