> ## Documentation Index
> Fetch the complete documentation index at: https://docs.vessl.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Create a new service

Deploy your AI models efficiently using VESSL Service, which supports both **provisioned** and **serverless** deployment methods. This guide will walk you through the steps to create a new service using the **Command Line Interface (CLI)** and the **web console**, with how to enable Serverless Mode.

<Tip>Before deploying a service, ensure that **VESSL Service is enabled** in the cluster settings. Refer to the **[Endpoint Configuration Guide](https://docs.vessl.ai/guides/clusters/endpoint)** for details on how to enable **VESSL Service** and configure endpoints properly.</Tip>

## Deploy a new service using CLI

### Using YAML file

To deploy a service using the CLI, you'll first need to define your service configuration in a YAML file. This YAML-based configuration allows you to deploy services programmatically.

If you were using this feature as beta and want to migrate to new version, please refer to [migration guide](/guides/serve/migration).

#### Example YAML configuration:

```yaml theme={null}
name: vessl-test-service
message: vessl-yaml-revision
image: quay.io/vessl-ai/torch:2.3.1-cuda12.1-r5
resources:
  cluster: vessl-oci-sanjose
  preset: gpu-l4-small-spot
run:
  - |
    pip install vllm
    python -m vllm.entrypoints.openai.api_server --model $MODEL_NAME --max-model-len 4096 --disable-log-requests --api-key token-123
env:
  MODEL_NAME: mistralai/Mistral-7B-Instruct-v0.2
ports:
  - port: 8000
service:
  autoscaling:
    max: 2
    metric: cpu
    min: 1
    target: 50
  expose: 8000
```

#### Steps to deploy using CLI:

1. **Create or edit your YAML file** to define the service configuration.
2. **Deploy the service** by running the following command in your terminal. Replace `[your-yaml-file].yaml` with the path to your YAML file.:
   ```bash theme={null}
   vessl service create -f [your-yaml-file].yaml
   ```
   If you want to deploy a **serverless mode**, make sure to append `--serverless` flag.
   ```bash theme={null}
   vessl service create -f [your-serverless-yaml-file.yaml] --serverless
   ```
   If deploying to a NodePort cluster, specify the --port option to expose the service on a specific port.
   ```bash theme={null}
   vessl service create -f [your-yaml-file].yaml --port=30010
   ```

### Using VESSL Hub templates

[VESSL Hub](https://app.vessl.ai/hub) provides service templates for rapid service deployment.

#### Steps to deploy using VESSL Hub templates:

1. Get the key of the template you want to use from the VESSL Hub

```
vessl hub list -t service
```

example output:

```
Service Tasks:
      llama-3-chat-api
      gemma-2-chat-api
      tgi-service
      vllm-service
```

2. Start your service by `vessl service create --from-hub=[template-key]`.

Example:

```
vessl service create --from-hub=llama-3-chat-api
```

## Deploy a new service using web console

Deploying through the web console is user-friendly and suitable for those who prefer a graphical interface over command line operations. The interactive demo below will guide you to through the process.

### Provisioned Mode -- Steps to create a new service in the web console

<div style={{ marginBottom: '120px', position: 'relative', paddingTop: '400px' }}>
  <iframe src="https://demo.arcade.software/2OOfE3J52V0dVOO9CyW6?embed&embed_mobile=inline&embed_desktop=inline&show_copy_link=true" frameBorder="0" loading="lazy" webkitAllowFullScreen="" mozAllowFullScreen="" title="Dashboards" style={{ position: 'absolute', top: '0px', left: '0px', width: '100%', height: '500px', colorScheme: 'light' }} />
</div>

<Accordion title="The explanations of each field are as follows:">
  * **Initialize this revision with:**: Select initialization method.
    * **Template from VESSL hub**: Use a template from the VESSL Hub.
    * **Recent revision configuration**: Select the configuration of the recent revisions.
    * **YAML file**: Upload a YAML file or paste its content to initialize the revision.
  * **Message**: Enter a message for the revision.
  * **Resources**: Select the compute resources and container image you want to use for the Service.
    * **Resource**: Select the compute resources you want to use for the Service.
    * **Container image**: The Docker image to use for the Revision.
  * **Task**:
    * **Volumes**: Import or mount code, data
    * **Command**: The command to run inside the container. This is similar to running a command in the terminal on your computer.
    * **Port**: The port to expose from the container. For example, if you're using a BentoML model server, you'll want to expose port 3000 and use the HTTP protocol to access the service endpoint.
  * **Monitoring**: Enable monitoring to track default system metrics from service workers.
  * **Healthcheck**: Check API health using the specified port and path.
  * **Autoscaling**: Set autoscaling strategy for the revision.
    * **Target Metric**: The metric to use for autoscaling - cpu, memory, nvidia.com/GPU, requests.
    * **Target Value**: The target value for the metric.
    * **Min value**: The minimum number of replicas.
    * **Max value**: The maximum number of replicas.
  * **Variables**: Environment variables or secret variables to inject into the container.
</Accordion>

### Serverless Mode -- Steps to create a new service in the web console:

<div style={{ marginBottom: '120px', position: 'relative', paddingTop: '400px' }}>
  <iframe src="https://demo.arcade.software/vaxqw6edYVTpLjtbjN0o?embed&embed_mobile=inline&embed_desktop=inline&show_copy_link=true" frameBorder="0" loading="lazy" webkitAllowFullScreen="" mozAllowFullScreen="" title="Dashboards" style={{ position: 'absolute', top: '0px', left: '0px', width: '100%', height: '500px', colorScheme: 'light' }} />
</div>

<Accordion title="The explanations of each field are as follows:">
  * **Resources**: Select the compute resources and container image you want to use for the Service.
    * **Resource**: Select the compute resources you want to use for the Service. Custom resource specs cannot be set in Serverless mode. If you need to use a custom resource spec, please contact our sales team.
    * **Container image**: The Docker image to use for the Revision.
  * **Task**:
    * **Command**: The command to run inside the container. This is similar to running a command in the terminal on your computer.
    * **Port**: The port to expose from the container. For example, if you're using a BentoML model server, you'll want to expose port 3000 and use the HTTP protocol to access the service endpoint. You can open only one port in Serverless mode.
  * **Advanced Settings**: Set additional configurations.
    * **Variables**: Environment variables or secret variables to inject into the container.
</Accordion>

<CardGroup cols={1}>
  <Card icon="book" title="Overview" href="overview">
    Get information from the Overview Dashboard.
  </Card>
</CardGroup>
