VESSL AI Documentation

Using volumes in workspaces

Volumes let you access datasets, code repositories, and storage systems directly from your workspace. This guide covers how to configure volumes and use them effectively.

Volume types

Import volumes

Import volumes download data from external sources into your workspace container at startup. The data is copied into the specified directory and becomes available immediately when your workspace starts. Common use cases:

Loading datasets for analysis or model training
Pulling the latest code from a Git repository
Downloading pre-trained models from model registries
Accessing files from cloud storage for processing

Supported sources:

Git repositories (GitHub, GitLab, Bitbucket)
VESSL Dataset and Model Registry
Hugging Face datasets and models
VESSL Storage volumes
AWS S3 and Google Cloud Storage
External storage systems

Mount volumes

Mount volumes provide persistent, real-time access to external storage systems. Unlike import volumes, mount volumes reflect changes made to the source in real time and don’t consume additional disk space in your workspace. Common use cases:

Working with large datasets that exceed workspace disk limits
Sharing data across multiple workspaces or team members
Accessing frequently updated data sources
Integrating with existing data pipelines

Supported sources:

VESSL Storage volumes
AWS S3 (through S3 FUSE)
Google Cloud Storage (through GCS FUSE)
Network File System (NFS) for custom clusters
Host path storage for on-premises setups

Configuring volumes

Through the Web Console

When creating a new workspace, you can configure volumes in the workspace creation form:

Navigate to the Volumes section during workspace creation
Click Add Volume to configure a new volume
Select the volume type (Import or Mount)
Choose the source (Dataset, Storage, Git repository, etc.)
Specify the target path where the volume should be accessible in your workspace

Using VESSL CLI

You can also create workspaces with volumes using the VESSL CLI:

# Create workspace with imported dataset
vessl workspace create \
  --name "my-workspace" \
  --cluster "my-cluster" \
  --import "/data:vessl-dataset://my-org/my-dataset"

# Create workspace with mounted storage
vessl workspace create \
  --name "my-workspace" \
  --cluster "my-cluster" \
  --mount "/shared:volume://vessl-storage/shared-data"

Best practices

Choosing between import and mount

Use import volumes when:

You need a snapshot of data at a specific point in time
You are working with relatively small datasets (under 10 GB)
You want to ensure data consistency throughout your workspace session
Network connectivity to the source might be intermittent

Use mount volumes when:

You are working with large datasets that exceed workspace disk capacity
You need real-time access to frequently updated data
You are sharing data across multiple workspaces or users
You are integrating with external data pipelines that update source data
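
The guidance above can be condensed into a small decision helper. The following is a minimal sketch, not part of the VESSL CLI: the function name `choose_volume_type` and its arguments are hypothetical, sizes are whole gigabytes, and the 10 GB cutoff is the rule of thumb from the list above rather than a hard limit.

```shell
# Hypothetical helper: suggest a volume type from the guidance in this section.
# Arguments (whole numbers in GB unless noted):
#   $1 dataset size
#   $2 free workspace disk space
#   $3 "yes" if the source data changes while the workspace is running
choose_volume_type() {
  size_gb=$1
  free_gb=$2
  live_updates=$3

  if [ "$live_updates" = "yes" ]; then
    echo "mount"     # mount volumes reflect changes made to the source
  elif [ "$size_gb" -ge "$free_gb" ]; then
    echo "mount"     # an import would not fit on the workspace disk
  elif [ "$size_gb" -lt 10 ]; then
    echo "import"    # small, stable snapshot copied at startup
  else
    echo "either"    # fits on disk but is above the 10 GB rule of thumb
  fi
}
```

For example, `choose_volume_type 5 100 no` prints `import`, while `choose_volume_type 500 100 no` prints `mount`. The `either` case is left as a judgment call because the guidance does not cover mid-sized, stable datasets.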

Volume paths and organization

Use descriptive paths: Choose clear, descriptive mount points like /data/datasets or /code/project
Avoid system directories: Don’t mount volumes to system directories like /bin, /usr, or /etc
Leverage /root persistence: Remember that /root is automatically persistent, so you can store temporary files and configurations there
Organize by purpose: Group related volumes together (e.g., /data/ for datasets, /models/ for pre-trained models)
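
A target path can be screened against these rules before it is used. This is a sketch under stated assumptions: `check_target_path` is a hypothetical name, the rejected directories are only the three named above plus `/` itself (an addition of this sketch), and the list should be extended to match your image.

```shell
# Hypothetical helper: accept absolute target paths outside system directories.
# Prints "ok" and returns 0 on success; prints a reason and returns 1 otherwise.
check_target_path() {
  case "$1" in
    /|/bin|/bin/*|/usr|/usr/*|/etc|/etc/*)
      echo "reject: system directory"
      return 1
      ;;
    /*)
      echo "ok"
      return 0
      ;;
    *)
      echo "reject: not an absolute path"
      return 1
      ;;
  esac
}
```

With this sketch, `check_target_path /data/datasets` prints `ok`, and `check_target_path /etc/ssl` is rejected. Patterns such as `/bin/*` match only on a path-segment boundary, so a directory like `/binary` is still accepted.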

Performance considerations

Mount for large data: Use mount volumes for datasets larger than your workspace disk allocation
Import for speed: Import volumes provide faster access since data is local to the workspace
Network location matters: Choose storage locations close to your compute cluster for optimal performance

Common workflows

Data science workflow

# Import code repository
/code: git://github.com/myorg/ml-project

# Mount large dataset
/data/raw: volume://vessl-storage/raw-dataset

# Import pre-trained model
/models/pretrained: vessl-model://myorg/bert-base/v1

# Use /root for experiments and outputs (automatically persistent)
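
The listing above pairs each target path with a source. As a rough sketch, the helper below turns such a listing into flags in the `target:source` form shown in the CLI section. It is an illustration only: `volume_flags` is a hypothetical name, and treating every `volume://` source as a mount and every other source as an import is an assumption drawn from the examples in this guide, since VESSL Storage volumes can also be imported.

```shell
# Hypothetical helper: read "target: source" lines on stdin and print one
# --import or --mount flag per volume. Blank lines and "#" comments are skipped.
volume_flags() {
  while IFS= read -r line; do
    case "$line" in
      ''|'#'*) continue ;;
    esac
    target=${line%%: *}   # text before the first ": "
    src=${line#*: }       # text after the first ": "
    case "$src" in
      # Assumption: volume:// sources are mounted, everything else is imported.
      volume://*) printf '%s "%s:%s"\n' --mount "$target" "$src" ;;
      *)          printf '%s "%s:%s"\n' --import "$target" "$src" ;;
    esac
  done
}
```

Saving the listing to a file and running `volume_flags < volumes.txt` (the file name is arbitrary) prints the flags one per line, ready to review before adding them to a `vessl workspace create` command.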

Collaborative development

# Shared codebase
/project: git://github.com/team/shared-project

# Shared datasets
/datasets: volume://vessl-storage/team-datasets

# Personal workspace for experiments
# (Use /root for personal files and configurations)

Model development and training

# Training data
/data/train: vessl-dataset://myorg/training-data

# Validation data
/data/val: vessl-dataset://myorg/validation-data

# Model checkpoints (shared storage for team access)
/checkpoints: volume://vessl-storage/model-checkpoints

Troubleshooting

Volume mount failures

If a volume fails to mount:

Check permissions: Ensure your organization has access to the specified storage
Verify paths: Confirm the source path exists and is accessible
Review credentials: For external storage, verify integration credentials are valid
Check cluster connectivity: Ensure your cluster can reach the external storage system
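
If the workspace starts but the data is not where you expect it, a quick check from a terminal inside the workspace can help narrow down which of the causes above applies. The sketch below only inspects the target directory and does not query VESSL; `check_volume` is a hypothetical name.

```shell
# Hypothetical helper: report whether a volume's target path looks usable.
check_volume() {
  target=$1
  if [ ! -d "$target" ]; then
    echo "missing: $target is not a directory"
    return 1
  fi
  if [ ! -r "$target" ] || [ ! -x "$target" ]; then
    echo "denied: cannot read $target"
    return 1
  fi
  # An empty directory often means the source path was wrong or nothing was attached.
  if [ -z "$(ls -A "$target" 2>/dev/null)" ]; then
    echo "empty: $target has no entries"
    return 1
  fi
  echo "ok: $target"
}
```

A `missing` or `empty` result points toward the source path or the volume configuration, while `denied` points toward permissions or credentials. These are hints for where to look first, not a diagnosis.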

Performance issues

If you experience slow data access:

Use the appropriate volume type: Choose mount or import based on your use case
Check network connectivity: Ensure good connectivity between cluster and storage
Optimize data location: Use storage systems geographically close to your cluster
Monitor resource usage: Check if workspace resources are sufficient for your workload
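
To put a number on slow data access, you can time a full read of the same file from a mounted path and from a local copy. The helper below is a coarse sketch: `time_read` is a hypothetical name, it uses whole-second resolution, and repeated reads of the same file may be served from the operating system's cache, so the first run is the most informative.

```shell
# Hypothetical helper: print how many whole seconds a full sequential read takes.
time_read() {
  start=$(date +%s)
  cat "$1" > /dev/null || return 1
  end=$(date +%s)
  echo $((end - start))
}
```

Comparing `time_read` on a large file under a mount volume with the same file copied to local disk shows whether storage access, rather than compute, is the bottleneck.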

Storage limitations

Remember these important limitations:

Disk space: Import volumes consume workspace disk space
Persistence: Only the /root directory persists across workspace restarts
Custom clusters: Some volume types may have limitations on custom clusters
Network requirements: External storage requires appropriate network access
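
Because import volumes consume workspace disk space, it can be worth checking the available space before relying on a large import. The following sketch uses the standard `df` and `awk` tools; `fits_on_disk` is a hypothetical name, and sizes are given in kilobytes to match the output of `df -Pk`.

```shell
# Hypothetical helper: succeed if the filesystem holding a path has at least
# the requested number of kilobytes available.
#   $1 required size in KB
#   $2 path on the filesystem to check
fits_on_disk() {
  needed_kb=$1
  target=$2
  # df -Pk prints a header line, then one data line; column 4 is available KB.
  avail_kb=$(df -Pk "$target" 2>/dev/null | awk 'NR == 2 { print $4 }')
  [ -n "$avail_kb" ] && [ "$avail_kb" -ge "$needed_kb" ]
}

# Example: 10 GB is 10 * 1024 * 1024 = 10485760 KB.
# fits_on_disk 10485760 / && echo "a 10 GB import should fit"
```

If the check fails, a mount volume avoids the copy entirely.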

Need help with storage setup?

Learn more about VESSL’s storage system and how to configure different storage types.
