This example deploys a simple web app for image generation. You will learn how to set up an interactive workload for inference by mounting models from Hugging Face and opening up a port for user inputs. For a more in-depth guide, refer to our blog post.
Note that if you want to save your credits, remember to Terminate your runs when you are finished.

What you will do

  • Host a GPU-accelerated web app built with Gradio
  • Mount model checkpoints from Hugging Face
  • Open up a port to an interactive workload for inference

Writing the YAML file

Let’s fill in the image-generation.yaml file.

Step 1: Spin up an interactive workload

We already learned how to launch an interactive workload in our previous guide. Let’s copy and paste the YAML we wrote for notebook.yaml.
name: Image Generation Playground
description: An interactive web app for image generation
resources:
  cluster: vessl-oci-sanjose
  preset: gpu-a10-small
image: quay.io/vessl-ai/torch:2.3.1-cuda12.1-r5
interactive:
  jupyter:
    idle_timeout: 120m
  max_runtime: 24h

Step 2: Import code and model

Let’s mount a GitHub repo and import a model checkpoint from Hugging Face. We already learned how to mount a codebase in our Quickstart guide. VESSL comes with a native integration with Hugging Face, so you can import models and datasets simply by referencing the link to the Hugging Face repository. Under import, let’s create a working directory /model/ and import the model.
name: Image Generation Playground
description: An interactive web app for image generation
resources:
  cluster: vessl-oci-sanjose
  preset: gpu-a10-small
image: quay.io/vessl-ai/torch:2.3.1-cuda12.1-r5
import:
  /code/:
    git:
      url: https://github.com/vessl-ai/examples
      ref: main
  /model/: hf://huggingface.co/Qwen/Qwen-Image
interactive:
  jupyter:
    idle_timeout: 120m
  max_runtime: 24h
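
For reference, here is a sketch of how code inside the workload could use the mounted checkpoint. This is an illustration rather than the example repo’s code; loading from the local /model/ path and the bfloat16 dtype are assumptions.

import torch
from diffusers import DiffusionPipeline

# The hf:// import above materializes the checkpoint under /model/,
# so the pipeline can load from the local path instead of the Hub.
pipe = DiffusionPipeline.from_pretrained("/model", torch_dtype=torch.bfloat16)
pipe.to("cuda")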

Step 3: Open up a port for inference

The ports key exposes the workload ports on which VESSL listens for HTTP requests. This means you will be able to interact with the remote workload, sending an input query and receiving a generated image, through port 7860 in this case.
name: Image Generation Playground
description: An interactive web app for image generation
resources:
  cluster: vessl-oci-sanjose
  preset: gpu-a10-small
image: quay.io/vessl-ai/torch:2.3.1-cuda12.1-r5
import:
  /code/:
    git:
      url: https://github.com/vessl-ai/examples
      ref: main
  /model/: hf://huggingface.co/Qwen/Qwen-Image
interactive:
  jupyter:
    idle_timeout: 120m
  max_runtime: 24h
ports:
  - name: gradio
    type: http
    port: 7860
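
For the endpoint to receive traffic, the app itself must listen on this port on all interfaces. With Gradio, that typically means launching along these lines (demo here is a hypothetical Interface or Blocks object):

# Bind to 0.0.0.0:7860 so VESSL’s endpoint can reach the app.
demo.launch(server_name="0.0.0.0", server_port=7860)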

Step 4: Write the run commands

Let’s install the additional Python dependencies and finally run our Python file app.py. A minimal sketch of what such an app could look like follows the YAML below.
name: Image Generation Playground
description: An interactive web app for image generation
resources:
  cluster: vessl-oci-sanjose
  preset: gpu-a10-small
image: quay.io/vessl-ai/torch:2.3.1-cuda12.1-r5
import:
  /code/:
    git:
      url: https://github.com/vessl-ai/examples
      ref: main
  /model/: hf://huggingface.co/Qwen/Qwen-Image
run:
  - command: |-
      pip install --upgrade accelerate gradio protobuf sentencepiece torch torchvision transformers git+https://github.com/huggingface/diffusers
      python app.py
    workdir: /code/runs/qwen-image
interactive:
  max_runtime: 24h
  jupyter:
    idle_timeout: 120m
ports:
  - name: gradio
    type: http
    port: 7860
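
For reference, a minimal sketch of what an app.py along these lines could look like. This is not the code in the example repo; the prompt-only interface and the local /model/ load are assumptions.

import torch
import gradio as gr
from diffusers import DiffusionPipeline

# Load the checkpoint mounted at /model/ by the import block above.
pipe = DiffusionPipeline.from_pretrained("/model", torch_dtype=torch.bfloat16)
pipe.to("cuda")

def generate(prompt: str):
    # Run the diffusion pipeline and return the first generated image.
    return pipe(prompt=prompt).images[0]

demo = gr.Interface(
    fn=generate,
    inputs=gr.Textbox(label="Prompt"),
    outputs=gr.Image(label="Generated image"),
    title="Image Generation Playground",
)

# Bind to 0.0.0.0:7860 to match the ports entry in the YAML.
demo.launch(server_name="0.0.0.0", server_port=7860)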

Running the app

Once again, creating the run will take you to the workload Summary page.
vessl run create -f image-generation.yaml
Under Endpoints, click the gradio link to launch the app.

Using our web interface

You can repeat the same process on the web. Head over to your Organization, select a project, and create a New run.

What’s next?

See how VESSL takes care of the infrastructural challenges of fine-tuning a large language model with a custom dataset.