Creating an Experiment

To create an experiment, first specify a few options such as cluster, resource, image, and start command. Here is an explanation of the config options.

Cluster & Resource (Required)

You can run your experiment on either VESSL's managed cluster or your custom cluster. Start by selecting a cluster.
Managed Cluster
Once you select VESSL's managed cluster, you can view the list of available resources in the dropdown menu.
You also have the option to use spot instances.
Check out the full list of resource types and corresponding prices:
Custom Cluster
Your custom cluster can be either on-premise or on-cloud. For on-premise clusters, you can specify the processor type and resource requirements. The experiment is then assigned automatically to an available node that meets those requirements.

Distribution Mode (Optional)

You have an option to use multi-node distributed training. The default option is single-node training.

Image (Required)

Select the Docker image that the experiment container will use. You can either use a managed image provided by VESSL or your own custom image.
Managed Image
Managed images are pre-pulled images provided by VESSL. You can find the available image tags at VESSL's Amazon ECR Public Gallery.
Custom Image
You can pull your own custom images from either Docker Hub or Amazon ECR.
Public Images
To pull an image from a public Docker registry, enter the image URL. The example below demonstrates pulling the official TensorFlow development GPU image from Docker Hub.
Private Images
To pull an image from a private Docker registry, first integrate your credentials in the organization settings.
Then check the private image checkbox, fill in the image URL, and select the credential.

Start Command (Required)

Specify the start command to run in the experiment container. Write the command with its command-line arguments just as you would in a terminal. You can run multiple commands by joining them with the && operator or by putting each one on a new line.
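The two ways of combining commands can behave differently when a step fails. The sketch below illustrates this, assuming the start command is interpreted by a POSIX-style shell without `set -e` (this page does not state which shell the container uses). The functions `install_deps` and `train` are hypothetical stand-ins for real steps.

```shell
#!/bin/sh
# Hypothetical stand-ins for real steps such as installing dependencies and training.
install_deps() { echo "installing"; return 1; }   # simulates a failing step
train() { echo "training"; }

# Joined with &&: train runs only if install_deps succeeds.
chained() {
  install_deps && train
}

# Separated by a new line: train runs even though install_deps failed.
sequential() {
  install_deps
  train
}
```

Under these assumptions, calling `chained` stops after the failing step, while `sequential` continues on to `train`. Joining with && is usually the safer choice when later commands depend on earlier ones.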

Volume (Optional)

You can mount the project, dataset, and files to the experiment container.
Learn more about volume mount on the following page:


Hyperparameters

You can set hyperparameters as key-value pairs. Each hyperparameter is automatically added to the container as an environment variable with the given key and value. A typical experiment includes hyperparameters such as learning_rate and optimizer.
You can also use them at runtime by passing them as arguments in the start command, as follows. Here train.py is a placeholder for your own training script.
python train.py \
  --learning-rate $learning_rate \
  --optimizer $optimizer
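Because hyperparameters reach the container as environment variables, a start command can also supply a fallback for any key that was not set. The following is a minimal sketch, assuming a POSIX-style shell. The script name `train.py`, the helper `build_command`, and the default values are illustrative assumptions, not part of VESSL.

```shell
#!/bin/sh
# Build the training command from hyperparameter environment variables.
# train.py and the fallback values (0.001, adam) are hypothetical.
build_command() {
  lr="${learning_rate:-0.001}"   # use $learning_rate if set, else 0.001
  opt="${optimizer:-adam}"       # use $optimizer if set, else adam
  echo "python train.py --learning-rate $lr --optimizer $opt"
}
```

The `${name:-default}` expansion leaves values set through the hyperparameter form untouched and only fills in those that are missing or empty.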

Termination Protection

Enabling the termination protection option keeps an experiment idle after it finishes running, so you can still access the container of the finished experiment.