Deploying a server to host ML models in production requires careful planning to ensure the models run smoothly, stay available, and scale with demand. This can be particularly challenging for ML engineers or small backend teams who may not be deeply familiar with complex backend setups.

VESSL Serve is an essential tool for deploying models developed in VESSL, or even your own custom models, as inference servers. It handles common operational concerns such as:

  • Keeping track of activity such as logs, system metrics, and model performance metrics
  • Automatically scaling server replicas up or down based on resource usage
  • Splitting traffic between model versions for easier canary testing
  • Rolling out a new model version to production without downtime
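
To make the canary-testing idea concrete, here is a minimal, generic sketch of weighted traffic splitting between two model versions. This is illustrative only and does not use the VESSL API; the version names and weights are hypothetical.

```python
import random

def pick_version(weights):
    """Choose a model version by weighted random selection.

    `weights` maps version names to traffic shares, e.g.
    {"v1": 90, "v2": 10} routes roughly 10% of requests to the canary.
    """
    versions = list(weights)
    return random.choices(versions, weights=[weights[v] for v in versions], k=1)[0]

# Simulate routing 10,000 requests with a 90/10 split.
counts = {"v1": 0, "v2": 0}
for _ in range(10_000):
    counts[pick_version({"v1": 90, "v2": 10})] += 1
```

In practice a serving platform applies this kind of split at the load-balancer level, so you can watch the canary's error rate and latency on a small slice of traffic before promoting it.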

VESSL Serve simplifies the process of setting up ML services that are reliable, adaptable, and able to handle varying workloads.