read_experiment

vessl.read_experiment(
   experiment_number: int, **kwargs
)

Read experiment in the default organization/project. If you want to override the default organization/project, then pass organization_name or project_name as **kwargs.

Args

  • experiment_number (int) : Experiment number.

Example

vessl.read_experiment(
    experiment_number=23,
)
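The organization/project override described above can be sketched as follows. This is a minimal illustration, not part of the SDK: `scope_kwargs` is a hypothetical helper and "my-project" is a placeholder name. The actual SDK call is left commented out because it needs a configured VESSL client.

```python
from typing import Dict, Optional


def scope_kwargs(
    organization_name: Optional[str] = None,
    project_name: Optional[str] = None,
) -> Dict[str, str]:
    """Hypothetical helper: keep only the overrides that were actually given."""
    candidates = {
        "organization_name": organization_name,
        "project_name": project_name,
    }
    # Omitting a key entirely lets the SDK fall back to its configured default.
    return {key: value for key, value in candidates.items() if value is not None}


overrides = scope_kwargs(project_name="my-project")  # placeholder project name
print(overrides)

# With a configured client, the overrides would be forwarded as **kwargs:
# import vessl
# vessl.read_experiment(experiment_number=23, **overrides)
```

The same dictionary can be passed to any of the functions in this section, since they all accept organization_name and project_name as **kwargs.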

list_experiments

vessl.list_experiments(
   statuses: List[str] = None, **kwargs
)

List experiments in the default organization/project. If you want to override the default organization/project, then pass organization_name or project_name as **kwargs.

Args

  • statuses (List[str]) : A list of status filters. Defaults to None.

Example

vessl.list_experiments(
    statuses=["completed"]
)

create_experiment

vessl.create_experiment(
   cluster_name: str, start_command: str, cluster_node_names: List[str] = None,
   kernel_resource_spec_name: str = None, processor_type: str = None,
   cpu_limit: float = None, memory_limit: str = None, gpu_type: str = None,
   gpu_limit: int = None, kernel_image_url: str = None,
   docker_credentials_id: Optional[int] = None, *, message: str = None,
   termination_protection: bool = False, hyperparameters: List[str] = None,
   secrets: List[str] = None, dataset_mounts: List[str] = None,
   model_mounts: List[str] = None, git_ref_mounts: List[str] = None,
   git_diff_mount: str = None, local_files: List[str] = None,
   use_vesslignore: bool = True, upload_local_git_diff: bool = False,
   archive_file_mount: str = None, object_storage_mounts: List[str] = None,
   root_volume_size: str = None, working_dir: str = None,
   output_dir: str = MOUNT_PATH_OUTPUT, worker_count: int = 1,
   framework_type: str = None, service_account: str = '', **kwargs
)

Create experiment in the default organization/project. If you want to override the default organization/project, then pass organization_name or project_name as **kwargs. You can also configure git info by passing git_branch or git_ref as **kwargs. Pass use_git_diff=True to run the experiment with uncommitted changes, and pass use_git_diff_untracked=True to also include untracked changes (only valid if use_git_diff is set).

Args

  • cluster_name (str) : Cluster name (must be specified before other options).
  • cluster_node_names (List[str]) : Node names. The experiment will run on one of these nodes. Defaults to None (all).
  • start_command (str) : Start command to execute in experiment container.
  • kernel_resource_spec_name (str) : Resource type to run an experiment (for managed cluster only). Defaults to None.
  • cpu_limit (float) : Number of vCPUs (for custom cluster only). Defaults to None.
  • memory_limit (str) : Memory limit in GiB (for custom cluster only). Defaults to None.
  • gpu_type (str) : GPU type (for custom cluster only). Defaults to None.
  • gpu_limit (int) : Number of GPU cores (for custom cluster only). Defaults to None.
  • kernel_image_url (str) : Kernel docker image URL. Defaults to None.
  • docker_credentials_id (int) : Docker credential id. Defaults to None.
  • message (str) : Message. Defaults to None.
  • termination_protection (bool) : True if termination protection is enabled, False otherwise. Defaults to False.
  • hyperparameters (List[str]) : A list of hyperparameters. Defaults to None.
  • secrets (List[str]) : A list of secrets in form “KEY=VALUE”. Defaults to None.
  • dataset_mounts (List[str]) : A list of dataset mounts. Defaults to None.
  • model_mounts (List[str]) : A list of model mounts. Defaults to None.
  • git_ref_mounts (List[str]) : A list of git repository mounts. Defaults to None.
  • git_diff_mount (str) : Git diff mount. Defaults to None.
  • local_files (List[str]) : A list of local files to upload. Defaults to None.
  • use_vesslignore (bool) : True if local files matching glob patterns in .vesslignore files should be ignored. Patterns apply relative to the directory containing that .vesslignore file. Defaults to True.
  • upload_local_git_diff (bool) : True to upload the local git diff, False otherwise. Defaults to False.
  • archive_file_mount (str) : Local archive file mounts. Defaults to None.
  • object_storage_mounts (List[str]) : Object storage mounts. Defaults to None.
  • root_volume_size (str) : Root volume size. Defaults to None.
  • working_dir (str) : Working directory path. Defaults to None.
  • output_dir (str) : Output directory path. Defaults to “/output/”.
  • worker_count (int) : Number of workers (for distributed experiment only). Defaults to 1.
  • framework_type (str) : Specify “pytorch” or “tensorflow” (for distributed experiment only). Defaults to None.
  • service_account (str) : Service account name. Defaults to "".
  • processor_type (str) : cpu or gpu (for custom cluster only). Defaults to None.

Example

vessl.create_experiment(
    cluster_name="aws-apne2",
    kernel_resource_spec_name="v1.cpu-4.mem-13",
    kernel_image_url="public.ecr.aws/vessl/kernels:py36.full-cpu",
    dataset_mounts=["/input/:mnist"],
    start_command="pip install -r requirements.txt && python main.py",
)
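The secrets argument expects strings in “KEY=VALUE” form. As a rough sketch, a mapping can be converted into that form before the call. `format_secrets` is a hypothetical helper, not an SDK function, and the secret name and value are placeholders; the remaining arguments reuse the values from the example above.

```python
from typing import Dict, List


def format_secrets(secrets: Dict[str, str]) -> List[str]:
    """Hypothetical helper: turn a mapping into "KEY=VALUE" strings."""
    formatted = []
    for key, value in secrets.items():
        # A key containing "=" would make the KEY=VALUE split ambiguous.
        if "=" in key:
            raise ValueError(f"secret name must not contain '=': {key!r}")
        formatted.append(f"{key}={value}")
    return formatted


experiment_kwargs = {
    "cluster_name": "aws-apne2",
    "kernel_resource_spec_name": "v1.cpu-4.mem-13",
    "kernel_image_url": "public.ecr.aws/vessl/kernels:py36.full-cpu",
    "start_command": "python main.py",
    "secrets": format_secrets({"API_TOKEN": "abc123"}),  # placeholder secret
}
print(experiment_kwargs["secrets"])

# With a configured client, the arguments would be passed through unchanged:
# import vessl
# vessl.create_experiment(**experiment_kwargs)
```

Collecting the arguments in a dictionary first keeps long calls readable and makes it easy to log or review the configuration before submitting.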

list_experiment_logs

vessl.list_experiment_logs(
   experiment_number: int, tail: int = 200, worker_number: int = 0, after: int = 0,
   **kwargs
)

List experiment logs in the default organization/project. If you want to override the default organization/project, then pass organization_name or project_name as **kwargs.

Args

  • experiment_number (int) : Experiment number.
  • tail (int) : The number of lines to display from the end. Display all if -1. Defaults to 200.
  • worker_number (int) : Override default worker number (for distributed experiments only). Defaults to 0.
  • after (int) : The number of starting lines to display from the start. Defaults to 0.

Example

vessl.list_experiment_logs(
    experiment_number=23,
)
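Since tail=-1 requests every line, a small wrapper can make that intent explicit. The sketch below only builds the arguments; `log_request` and its `last_lines` parameter are hypothetical names introduced for illustration, and the SDK call itself is commented out because it requires a configured VESSL client.

```python
from typing import Any, Dict, Optional


def log_request(
    experiment_number: int,
    last_lines: Optional[int] = 200,
    worker_number: int = 0,
) -> Dict[str, Any]:
    """Hypothetical helper: None means "all lines", which maps to tail=-1."""
    if last_lines is not None and last_lines < 0:
        raise ValueError("last_lines must be non-negative or None")
    return {
        "experiment_number": experiment_number,
        "tail": -1 if last_lines is None else last_lines,
        "worker_number": worker_number,
    }


all_logs = log_request(23, last_lines=None)
print(all_logs)

# import vessl
# vessl.list_experiment_logs(**all_logs)
```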

list_experiment_output_files

vessl.list_experiment_output_files(
   experiment_number: int, need_download_url: bool = False, recursive: bool = True,
   worker_number: int = 0, **kwargs
)

List experiment output files in the default organization/project. If you want to override the default organization/project, then pass organization_name or project_name as **kwargs.

Args

  • experiment_number (int) : Experiment number.
  • need_download_url (bool) : True if you need a download URL, False otherwise. Defaults to False.
  • recursive (bool) : True to list files recursively, False otherwise. Defaults to True.
  • worker_number (int) : Override default worker number (for distributed experiments only). Defaults to 0.

Example

vessl.list_experiment_output_files(
    experiment_number=23,
)

download_experiment_output_files

vessl.download_experiment_output_files(
   experiment_number: int, dest_path: str = os.path.join(os.getcwd(), 'output'),
   worker_number: int = 0, **kwargs
)

Download experiment output files in the default organization/project. If you want to override the default organization/project, then pass organization_name or project_name as **kwargs.

Args

  • experiment_number (int) : Experiment number.
  • dest_path (str) : Local download path. Defaults to “./output”.
  • worker_number (int) : Override default worker number (for distributed experiments only). Defaults to 0.

Example

vessl.download_experiment_output_files(
    experiment_number=23,
)
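The default dest_path is written as os.path.join(os.getcwd(), 'output'). Python evaluates default argument expressions once, when the function is defined, so if the SDK implements the signature literally as shown (an assumption), the default reflects the working directory at import time rather than at call time. A cautious sketch is to compute the path at call time and pass it explicitly; `resolve_dest_path` is a hypothetical helper, not part of the SDK.

```python
import os


def resolve_dest_path(subdir: str = "output") -> str:
    """Hypothetical helper: build the download path from the current directory."""
    # os.getcwd() runs on every call, so a later os.chdir() is respected.
    return os.path.join(os.getcwd(), subdir)


dest_path = resolve_dest_path()
print(dest_path)

# import vessl
# vessl.download_experiment_output_files(
#     experiment_number=23,
#     dest_path=dest_path,
# )
```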

upload_experiment_output_files

vessl.upload_experiment_output_files(
   experiment_number: int, path: str, **kwargs
)

Upload experiment output files in the default organization/project. If you want to override the default organization/project, then pass organization_name or project_name as **kwargs.

Args

  • experiment_number (int) : Experiment number.
  • path (str) : Source path.

Example

vessl.upload_experiment_output_files(
    experiment_number=23,
    path="output",
)

terminate_experiment

vessl.terminate_experiment(
   experiment_number: int, **kwargs
)

Terminate experiment in the default organization/project. If you want to override the default organization/project, then pass organization_name or project_name as **kwargs.

Args

  • experiment_number (int) : Experiment number.

Example

vessl.terminate_experiment(
    experiment_number=23,
)

delete_experiment

vessl.delete_experiment(
   experiment_number: int, **kwargs
)

Delete experiment in the default organization/project. If you want to override the default organization/project, then pass organization_name or project_name as **kwargs.

Args

  • experiment_number (int) : Experiment number.

Example

vessl.delete_experiment(
    experiment_number=23,
)