Deploying machine learning (ML) models in production environments often requires meticulous planning to ensure smooth operation, high availability, and the ability to handle fluctuating demands. VESSL Serve offers two modes to cater to different needs: Provisioned and Serverless.

Provisioned Mode

In Provisioned mode, VESSL Serve provides a robust platform for deploying models developed within VESSL, as well as your own custom models, as inference servers. This mode suits teams that want direct control over their deployment environment, with features such as:

  • Activity Tracking: Monitor logs, system metrics, and model performance metrics.
  • Auto-Scaling: Automatically adjust the number of running replicas based on resource usage to absorb spikes in demand.
  • Traffic Management: Split traffic for canary testing and roll out new model versions gradually, without downtime.
  • Operational Control: Extensive customization through YAML configurations for those who need precise control over their deployments.
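To make the auto-scaling and traffic-management features above concrete, a provisioned deployment might be described in YAML along these lines. Note that the field names below are illustrative assumptions, not the exact VESSL Serve schema; consult the official YAML reference for the authoritative format.

```yaml
# Illustrative sketch only -- field names are assumptions, not the official schema.
name: my-model-service                # hypothetical service name
image: quay.io/my-org/my-model:latest # hypothetical container image
resources:
  cpu: "2"
  memory: 8Gi
autoscaling:
  min: 1        # keep at least one replica running
  max: 5        # scale out to at most five replicas
  metric: cpu   # scale on CPU utilization
  target: 60    # add replicas when average CPU exceeds 60%
traffic:
  - revision: 2 # canary: send 10% of requests to the new revision
    weight: 10
  - revision: 1 # keep 90% on the stable revision
    weight: 90
ports:
  - port: 8000
    type: http
```

Raising the canary revision's weight in steps (10% → 50% → 100%) is the usual pattern for rolling out a new model version without downtime.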

Serverless Mode

The Serverless mode simplifies deployments by abstracting away the underlying server management, allowing you to focus solely on model deployment and scaling. It’s particularly beneficial for teams without deep backend management expertise or those seeking cost efficiency:

  • Automatic Scaling: Scale your models in real-time based on workload demands.
  • Cost Efficiency: Only pay for the resources you use with a pay-as-you-go pricing model.
  • Simplified Deployment: Minimal configuration needed, making it accessible regardless of technical background.
  • High Availability and Resilience: Built-in mechanisms to ensure models are always operational and resilient to failures.
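Because the platform manages servers and scaling for you, a serverless deployment needs far less configuration than a provisioned one. The sketch below is again a hedged illustration with assumed field names, not the official schema:

```yaml
# Illustrative sketch only -- field names are assumptions, not the official schema.
name: my-serverless-model
mode: serverless                  # platform provisions and scales infrastructure
model:
  repository: my-org/my-model     # hypothetical model reference
  version: 3
port: 8000
# No resource, replica, or scaling settings are declared: capacity scales with
# request volume (down to zero when idle), and billing follows usage.
```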

Both modes of VESSL Serve are designed to make the deployment of ML services reliable, adaptable, and capable of managing varying workloads efficiently. Whether you choose the granular control of Provisioned mode or the streamlined simplicity of Serverless mode, VESSL Serve facilitates the easy rollout and scaling of your AI models.