Experiment API

read_experiment

vessl.read_experiment(
experiment_number: int, **kwargs
)
Read an experiment in the default organization/project. To override the default organization/project, pass organization_name or project_name as **kwargs.
Args
  • experiment_number (int) : Experiment number.
Example
vessl.read_experiment(
experiment_number=23,
)
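
Every function on this page accepts the same organization_name / project_name overrides through **kwargs, so one option is to bind them once instead of repeating them on each call. The following is a minimal sketch under stated assumptions, not part of the SDK: with_scope is a hypothetical helper, and fake_read_experiment is a stand-in for vessl.read_experiment so the snippet runs without credentials.

```python
from functools import partial
from typing import Any, Callable


def with_scope(
    api_call: Callable[..., Any], organization_name: str, project_name: str
) -> Callable[..., Any]:
    """Bind organization_name/project_name so they are sent as **kwargs on every call."""
    return partial(
        api_call,
        organization_name=organization_name,
        project_name=project_name,
    )


# Hypothetical stand-in for vessl.read_experiment; it only echoes its arguments.
def fake_read_experiment(experiment_number: int, **kwargs: Any) -> dict:
    return {"experiment_number": experiment_number, **kwargs}


# "my-org" and "my-project" are placeholder names, not real resources.
read_in_scope = with_scope(fake_read_experiment, "my-org", "my-project")
result = read_in_scope(experiment_number=23)
```

With the real SDK, vessl.read_experiment (or any other function on this page) would presumably be passed in place of the stand-in.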

list_experiments

vessl.list_experiments(
statuses: List[str] = None, **kwargs
)
List experiments in the default organization/project. To override the default organization/project, pass organization_name or project_name as **kwargs.
Args
  • statuses (List[str]) : A list of status filters. Defaults to None.
Example
vessl.list_experiments(
statuses=["completed"]
)

create_experiment

vessl.create_experiment(
cluster_name: str, start_command: str, cluster_node_names: List[str] = None,
kernel_resource_spec_name: str = None, processor_type: str = None,
cpu_limit: float = None, memory_limit: str = None, gpu_type: str = None,
gpu_limit: int = None, kernel_image_url: str = None, *, message: str = None,
termination_protection: bool = False, hyperparameters: List[str] = None,
dataset_mounts: List[str] = None, model_mounts: List[str] = None,
git_ref_mounts: List[str] = None, git_diff_mount: str = None,
local_files: List[str] = None, upload_local_git_diff: bool = False,
archive_file_mount: str = None, root_volume_size: str = None, working_dir: str = None,
output_dir: str = MOUNT_PATH_OUTPUT, worker_count: int = 1,
framework_type: str = None, **kwargs
)
Create an experiment in the default organization/project. To override the default organization/project, pass organization_name or project_name as **kwargs. You can also configure git info by passing git_branch or git_ref as **kwargs. Pass use_git_diff=True to run the experiment with uncommitted changes, and pass use_git_diff_untracked=True to also include untracked changes (only valid if use_git_diff is set).
Args
  • cluster_name (str) : Cluster name (must be specified before other options).
  • cluster_node_names (List[str]) : Node names. The experiment will run on one of these nodes. Defaults to None (all).
  • start_command (str) : Start command to execute in experiment container.
  • kernel_resource_spec_name (str) : Resource type to run an experiment (for managed cluster only). Defaults to None.
  • cpu_limit (float) : Number of vCPUs (for custom cluster only). Defaults to None.
  • memory_limit (str) : Memory limit in GiB (for custom cluster only). Defaults to None.
  • gpu_type (str) : GPU type (for custom cluster only). Defaults to None.
  • gpu_limit (int) : Number of GPU cores (for custom cluster only). Defaults to None.
  • kernel_image_url (str) : Kernel docker image URL. Defaults to None.
  • message (str) : Message. Defaults to None.
  • termination_protection (bool) : True if termination protection is enabled, False otherwise. Defaults to False.
  • hyperparameters (List[str]) : A list of hyperparameters. Defaults to None.
  • dataset_mounts (List[str]) : A list of dataset mounts. Defaults to None.
  • model_mounts (List[str]) : A list of model mounts. Defaults to None.
  • git_ref_mounts (List[str]) : A list of git repository mounts. Defaults to None.
  • git_diff_mount (str) : Git diff mount. Defaults to None.
  • local_files (List[str]) : A list of local files to upload. Defaults to None.
  • upload_local_git_diff (bool) : True to upload the local git diff, False otherwise. Defaults to False.
  • archive_file_mount (str) : Local archive file mount. Defaults to None.
  • root_volume_size (str) : Root volume size. Defaults to None.
  • working_dir (str) : Working directory path. Defaults to None.
  • output_dir (str) : Output directory path. Defaults to "/output/".
  • worker_count (int) : Number of workers (for distributed experiment only). Defaults to 1.
  • framework_type (str) : Specify "pytorch" or "tensorflow" (for distributed experiment only). Defaults to None.
  • processor_type (str) : "cpu" or "gpu" (for custom cluster only). Defaults to None.
Example
vessl.create_experiment(
cluster_name="aws-apne2-prod1",
kernel_resource_spec_name="v1.cpu-4.mem-13",
kernel_image_url="public.ecr.aws/vessl/kernels:py36.full-cpu",
dataset_mounts=["/input/:mnist"],
start_command="pip install -r requirements.txt && python main.py",
)
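
The argument list above splits resource options into two groups: kernel_resource_spec_name for managed clusters, and processor_type, cpu_limit, memory_limit, gpu_type and gpu_limit for custom clusters. As a rough sketch of keeping the two groups apart before calling vessl.create_experiment, the hypothetical helper below builds only the keyword arguments that apply. Rejecting mixed arguments is an assumption made for this illustration; the SDK may handle that case differently.

```python
from typing import Any, Dict, Optional


def build_resource_kwargs(
    managed: bool,
    kernel_resource_spec_name: Optional[str] = None,
    processor_type: Optional[str] = None,
    cpu_limit: Optional[float] = None,
    memory_limit: Optional[str] = None,
    gpu_type: Optional[str] = None,
    gpu_limit: Optional[int] = None,
) -> Dict[str, Any]:
    """Return only the resource arguments documented for the chosen cluster kind."""
    custom_only = {
        "processor_type": processor_type,
        "cpu_limit": cpu_limit,
        "memory_limit": memory_limit,
        "gpu_type": gpu_type,
        "gpu_limit": gpu_limit,
    }
    # Keep only the custom-cluster options the caller actually set.
    custom_set = {key: value for key, value in custom_only.items() if value is not None}

    if managed:
        # Assumption: treat custom-cluster options on a managed cluster as a mistake.
        if custom_set:
            raise ValueError(f"custom-cluster options given: {sorted(custom_set)}")
        if kernel_resource_spec_name is None:
            return {}
        return {"kernel_resource_spec_name": kernel_resource_spec_name}

    # Assumption: a resource spec name is meaningless on a custom cluster.
    if kernel_resource_spec_name is not None:
        raise ValueError("kernel_resource_spec_name is for managed clusters only")
    return custom_set


managed_kwargs = build_resource_kwargs(True, kernel_resource_spec_name="v1.cpu-4.mem-13")
custom_kwargs = build_resource_kwargs(False, processor_type="gpu", cpu_limit=4.0, gpu_limit=1)
```

The resulting dictionary could then be unpacked into the call, for example vessl.create_experiment(cluster_name=..., start_command=..., **managed_kwargs).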

list_experiment_logs

vessl.list_experiment_logs(
experiment_number: int, tail: int = 200, worker_number: int = 0, after: int = 0,
**kwargs
)
List experiment logs in the default organization/project. To override the default organization/project, pass organization_name or project_name as **kwargs.
Args
  • experiment_number (int) : Experiment number.
  • tail (int) : The number of lines to display from the end. Display all if -1. Defaults to 200.
  • worker_number (int) : Override default worker number (for distributed experiments only). Defaults to 0.
  • after (int) : The number of lines to display from the start. Defaults to 0.
Example
vessl.list_experiment_logs(
experiment_number=23,
)
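
To make the documented tail behavior concrete (the last N lines, or everything when tail is -1), here is a small local illustration on a plain list of strings. tail_lines is a hypothetical function written for this example; it does not call the SDK, and treating 0 or other negative values as "no lines" is an assumption of the sketch.

```python
from typing import List


def tail_lines(lines: List[str], tail: int = 200) -> List[str]:
    """Mimic the documented semantics: last `tail` lines, or all lines if tail == -1."""
    if tail == -1:
        return list(lines)
    if tail <= 0:
        # Assumption: zero or other negative values select nothing.
        return []
    return lines[-tail:]


logs = [f"line {i}" for i in range(500)]
default_view = tail_lines(logs)       # last 200 lines, matching the default tail=200
full_view = tail_lines(logs, tail=-1)  # every line
short_view = tail_lines(logs, tail=3)  # last three lines
```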

list_experiment_output_files

vessl.list_experiment_output_files(
experiment_number: int, need_download_url: bool = False, recursive: bool = True,
worker_number: int = 0, **kwargs
)
List experiment output files in the default organization/project. To override the default organization/project, pass organization_name or project_name as **kwargs.
Args
  • experiment_number (int) : Experiment number.
  • need_download_url (bool) : True if you need a download URL, False otherwise. Defaults to False.
  • recursive (bool) : True to list files recursively, False otherwise. Defaults to True.
  • worker_number (int) : Override default worker number (for distributed experiments only). Defaults to 0.
Example
vessl.list_experiment_output_files(
experiment_number=23,
)

download_experiment_output_files

vessl.download_experiment_output_files(
experiment_number: int, dest_path: str = os.path.join(os.getcwd(), 'output'),
worker_number: int = 0, **kwargs
)
Download experiment output files in the default organization/project. To override the default organization/project, pass organization_name or project_name as **kwargs.
Args
  • experiment_number (int) : Experiment number.
  • dest_path (str) : Local download path. Defaults to "./output".
  • worker_number (int) : Override default worker number (for distributed experiments only). Defaults to 0.
Example
vessl.download_experiment_output_files(
experiment_number=23,
)
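
One Python detail is worth keeping in mind here. If the dest_path default is written in the SDK source literally as shown in the signature, Python evaluates it once, when the function is defined, so it reflects the working directory at import time rather than at call time. The sketch below demonstrates this with download_stub, a hypothetical stand-in that copies only the shape of the default argument; it is not the SDK function.

```python
import os
import tempfile


# Hypothetical stand-in mirroring only the default-argument shape shown above.
def download_stub(dest_path: str = os.path.join(os.getcwd(), "output")) -> str:
    return dest_path


cwd_at_definition = os.getcwd()

with tempfile.TemporaryDirectory() as tmp:
    os.chdir(tmp)
    try:
        # The default was computed at definition time, so it ignores the chdir.
        implicit = download_stub()
        # Passing the path explicitly picks up the current directory.
        explicit = download_stub(dest_path=os.path.join(os.getcwd(), "output"))
    finally:
        # Return to the original directory before the temp dir is removed.
        os.chdir(cwd_at_definition)
```

If a script changes directories after importing the SDK, passing dest_path explicitly avoids any surprise about where the files land.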

upload_experiment_output_files

vessl.upload_experiment_output_files(
experiment_number: int, path: str, **kwargs
)
Upload experiment output files in the default organization/project. To override the default organization/project, pass organization_name or project_name as **kwargs.
Args
  • experiment_number (int) : Experiment number.
  • path (str) : Source path.
Example
vessl.upload_experiment_output_files(
experiment_number=23,
path="output",
)

terminate_experiment

vessl.terminate_experiment(
experiment_number: int, **kwargs
)
Terminate an experiment in the default organization/project. To override the default organization/project, pass organization_name or project_name as **kwargs.
Args
  • experiment_number (int) : Experiment number.
Example
vessl.terminate_experiment(
experiment_number=23,
)

delete_experiment

vessl.delete_experiment(
experiment_number: int, **kwargs
)
Delete an experiment in the default organization/project. To override the default organization/project, pass organization_name or project_name as **kwargs.
Args
  • experiment_number (int) : Experiment number.
Example
vessl.delete_experiment(
experiment_number=23,
)