This example deploys a simple web app for image generation. You will learn how you can set up an interactive workload for inference — mounting models from Hugging Face and opening up a port for user inputs. For a more in-depth guide, refer to our blog post.

Try it on VESSL Hub

Try out the Quickstart example with a single click on VESSL Hub.

See the final code

See the completed YAML file and final code for this example.
Note: to save your credits, remember to Terminate the workload when you are done to stop and end the run.

What you will do

  • Host a GPU-accelerated web app built with Gradio
  • Mount model checkpoints from Hugging Face
  • Open up a port to an interactive workload for inference

Writing the YAML file

Let’s fill in the image-generation.yaml file.

1. Spin up an interactive workload

We already learned how to launch an interactive workload in our previous guide. Let's copy and paste the YAML we wrote for notebook.yaml.
name: Image Generation Playground
description: An interactive web app for image generation
resources:
  cluster: vessl-oci-sanjose
  preset: gpu-a10-small
image: quay.io/vessl-ai/torch:2.3.1-cuda12.1-r5
interactive:
  jupyter:
    idle_timeout: 120m
  max_runtime: 24h
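
Once the workload is up, you can open the Jupyter interface and confirm that the GPU is visible. A minimal sanity check with PyTorch, run from a notebook cell or the terminal (the device name shown is an example for the gpu-a10-small preset):

import torch

print(torch.cuda.is_available())      # should print True on a GPU preset
print(torch.cuda.get_device_name(0))  # e.g. "NVIDIA A10"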

2. Import code and model

Let's mount a GitHub repo and import a model checkpoint from Hugging Face. We already learned how to mount a codebase in our Quickstart guide. VESSL comes with a native integration with Hugging Face, so you can import models and datasets simply by referencing the link to the Hugging Face repository. Under import, let's create a working directory /model/ and import the model.
name: Image Generation Playground
description: An interactive web app for image generation
resources:
  cluster: vessl-oci-sanjose
  preset: gpu-a10-small
image: quay.io/vessl-ai/torch:2.3.1-cuda12.1-r5
import:
  /code/:
    git:
      url: https://github.com/vessl-ai/examples
      ref: main
  /model/: hf://huggingface.co/Qwen/Qwen-Image
interactive:
  jupyter:
    idle_timeout: 120m
  max_runtime: 24h
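
With the workload running, the checkpoint is available on the filesystem at /model/. As a quick illustration, here is a minimal sketch of loading it with diffusers; this assumes a recent diffusers build that supports Qwen-Image, and the pipeline class is resolved automatically:

import torch
from diffusers import DiffusionPipeline

# /model is populated by the hf:// import above, so from_pretrained
# can read it like any local checkpoint directory
pipe = DiffusionPipeline.from_pretrained("/model", torch_dtype=torch.bfloat16)
pipe.to("cuda")

image = pipe("a watercolor painting of a lighthouse at dawn").images[0]
image.save("sample.png")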

3. Open up a port for inference

The ports key exposes workload ports where VESSL listens for HTTP requests. This means you can interact with the remote workload, sending an input query and receiving a generated image, through port 7860 in this case.
name: Image Generation Playground
description: An interactive web app for image generation
resources:
  cluster: vessl-oci-sanjose
  preset: gpu-a10-small
image: quay.io/vessl-ai/torch:2.3.1-cuda12.1-r5
import:
  /code/:
    git:
      url: https://github.com/vessl-ai/examples
      ref: main
  /model/: hf://huggingface.co/Qwen/Qwen-Image
interactive:
  jupyter:
    idle_timeout: 120m
  max_runtime: 24h
ports:
  - name: gradio
    type: http
    port: 7860

4. Write the run commands

Let's install additional Python dependencies and finally run our Python file, app.py.
name: Image Generation Playground
description: An interactive web app for image generation
resources:
  cluster: vessl-oci-sanjose
  preset: gpu-a10-small
image: quay.io/vessl-ai/torch:2.3.1-cuda12.1-r5
import:
  /code/:
    git:
      url: https://github.com/vessl-ai/examples
      ref: main
  /model/: hf://huggingface.co/Qwen/Qwen-Image
run:
  - command: |-
      pip install --upgrade accelerate gradio protobuf sentencepiece torch torchvision transformers git+https://github.com/huggingface/diffusers
      python app.py
    workdir: /code/runs/qwen-image
interactive:
  max_runtime: 24h
  jupyter:
    idle_timeout: 120m
ports:
  - name: gradio
    type: http
    port: 7860
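
For reference, here is a minimal sketch of what a Gradio app like app.py could look like. The actual script lives in the examples repo under runs/qwen-image, so treat the code below as illustrative rather than the shipped implementation:

import gradio as gr
import torch
from diffusers import DiffusionPipeline

# Load the checkpoint mounted at /model/ (see the import section above)
pipe = DiffusionPipeline.from_pretrained("/model", torch_dtype=torch.bfloat16)
pipe.to("cuda")

def generate(prompt: str):
    # Run one text-to-image generation and return a PIL image
    return pipe(prompt).images[0]

demo = gr.Interface(
    fn=generate,
    inputs=gr.Textbox(label="Prompt"),
    outputs=gr.Image(label="Generated image"),
)

# Bind to 0.0.0.0 so VESSL can route external traffic into the container;
# the port must match the ports entry in the YAML (7860)
demo.launch(server_name="0.0.0.0", server_port=7860)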

Running the app

Once again, running the workload will take you to the workload Summary page.
vessl run create -f image-generation.yaml
Under Endpoints, click the gradio link to launch the app.
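
You can also query the endpoint programmatically. Here is a sketch using the gradio_client package, with a placeholder URL; copy the real endpoint URL from the Endpoints section of the Summary page:

from gradio_client import Client

# Placeholder: replace with your actual gradio endpoint URL
client = Client("https://<your-endpoint-url>")

# For a gr.Interface app, the default API route is /predict
result = client.predict("a watercolor painting of a lighthouse", api_name="/predict")
print(result)  # local path to the downloaded image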

Using our web interface

You can repeat the same process on the web. Head over to your Organization, select a project, and create a New run.

What’s next?

See how VESSL takes care of the infrastructural challenges of fine-tuning a large language model with a custom dataset.

Phi-4 Fine-tuning

Fine-tune Phi-4 with counselling datasets

Phi-4-mini-reasoning deployment

Serve & deploy vLLM-accelerated Phi-4-mini-reasoning

Enable Serverless Mode

Deploy with VESSL Service Serverless mode