Serve & deploy with VESSL Service Serverless mode
1. Select the cluster (oci) vessl-oci-sanjose.
2. Select the resource preset (GPU) gpu-a10-small. This means that we will be using one NVIDIA A10 GPU and a 24GB RAM instance.
3. Set the container image to vllm/vllm-openai:v0.10.0.
4. Add an HTTP port, 8000, and name it vllm.
5. Add an environment variable MODEL_ID with the value microsoft/Phi-4-mini-reasoning.
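Once the service is up, the vllm/vllm-openai image serves an OpenAI-compatible API on the HTTP port configured above. The sketch below queries the chat completions endpoint using only the standard library; the `VLLM_ENDPOINT` URL is a placeholder you should replace with the endpoint URL VESSL assigns to your service.

```python
import json
import os
import urllib.request

# Placeholder: set VLLM_ENDPOINT to the URL VESSL assigns to your service.
BASE_URL = os.environ.get("VLLM_ENDPOINT", "https://YOUR-SERVICE-ENDPOINT")


def build_chat_request(prompt: str) -> dict:
    """Build an OpenAI-compatible chat completion payload for the vLLM server."""
    return {
        "model": "microsoft/Phi-4-mini-reasoning",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 512,
        "temperature": 0.2,
    }


def query(prompt: str) -> str:
    """POST to the vLLM OpenAI-compatible endpoint and return the reply text."""
    req = urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Only send a real request when an endpoint has been configured.
if __name__ == "__main__" and "VLLM_ENDPOINT" in os.environ:
    print(query("Solve 12 * 13 step by step."))
```

Note that the `model` field must match the model vLLM is serving, which here is the value of the MODEL_ID environment variable set in step 5.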