Deploy your AI models efficiently using VESSL Service, which supports both provisioned and serverless deployment methods. This guide walks you through creating a new service using the Command Line Interface (CLI) and the web console, including how to enable Serverless Mode.

Deploy a new service using CLI

Using YAML file

To deploy a service using the CLI, you’ll first need to define your service configuration in a YAML file. This YAML-based configuration allows you to deploy services programmatically.

If you were using this feature in beta and want to migrate to the new version, please refer to the migration guide.

Example YAML configuration:

name: vessl-test-service
message: vessl-yaml-revision
image: quay.io/vessl-ai/torch:2.3.1-cuda12.1-r5
resources:
  cluster: vessl-gcp-oregon
  preset: gpu-l4-small-spot
run:
  - |
    pip install vllm
    python -m vllm.entrypoints.openai.api_server --model $MODEL_NAME --max-model-len 4096 --disable-log-requests --api-key token-123
env:
  MODEL_NAME: mistralai/Mistral-7B-Instruct-v0.2
ports:
  - port: 8000
service:
  autoscaling:
    max: 2
    metric: cpu
    min: 1
    target: 50
  expose: 8000
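
In this example, the run block installs vLLM and starts its OpenAI-compatible API server, serving the model named in the MODEL_NAME environment variable on port 8000. The service block exposes that port and autoscales between one and two replicas, targeting 50% average CPU utilization.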

Steps to deploy using CLI:

  1. Create or edit your YAML file to define the service configuration.
  2. Deploy the service by running the following command in your terminal, replacing [your-yaml-file].yaml with the path to your YAML file:
    vessl service create -f [your-yaml-file].yaml
    
    If you want to deploy in serverless mode, append the --serverless flag:
    vessl service create -f [your-serverless-yaml-file].yaml --serverless
    
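Once the service is running, you can verify the vLLM endpoint with an OpenAI-compatible request. The sketch below is a minimal example, assuming your service is reachable at a URL shown here as the hypothetical placeholder [your-service-endpoint], and using the token-123 API key set in the YAML above:

    # Hypothetical endpoint placeholder; use the URL assigned to your service.
    curl https://[your-service-endpoint]/v1/chat/completions \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer token-123" \
      -d '{
        "model": "mistralai/Mistral-7B-Instruct-v0.2",
        "messages": [{"role": "user", "content": "Hello!"}]
      }'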

Using VESSL Hub templates

VESSL Hub provides templates for rapid service deployment.

Steps to deploy using VESSL Hub templates:

  1. Get the key of the template you want to use from the VESSL Hub:
    vessl hub list -t service

Example output:

Service Tasks:
      llama-3-chat-api
      gemma-2-chat-api
      tgi-service
      vllm-service
  2. Start your service with vessl service create --from-hub=[template-key].

Example:

vessl service create --from-hub=llama-3-chat-api
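
Here, llama-3-chat-api is one of the template keys returned by vessl hub list -t service above.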

Deploy a new service using web console

Deploying through the web console is user-friendly and suitable for those who prefer a graphical interface over command-line operations. The interactive demos below will guide you through the process.

Provisioned Mode — Steps to create a new service in the web console

Serverless Mode — Steps to create a new service in the web console