Sweep
read_sweep
vessl.read_sweep(
sweep_name: str, **kwargs
)
Read sweep in the default organization/project. If you want to override the default organization/project, then pass organization_name or project_name as **kwargs.
Args
sweep_name
(str) : Sweep name.
Example
vessl.read_sweep(
    sweep_name="pitch-lord",
)
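The organization/project override described above can be kept in one place with a small helper. This is a minimal sketch and not part of the VESSL SDK: `scope_kwargs` is a hypothetical helper, and "my-org" and "my-project" are placeholder names.

```python
from typing import Dict, Optional


def scope_kwargs(
    organization_name: Optional[str] = None,
    project_name: Optional[str] = None,
) -> Dict[str, str]:
    """Build the optional **kwargs that override the default organization/project.

    Hypothetical helper (not part of the vessl package). Values left as None
    are omitted so that the configured defaults still apply.
    """
    candidates = {
        "organization_name": organization_name,
        "project_name": project_name,
    }
    return {key: value for key, value in candidates.items() if value is not None}


# Placeholder names; replace them with your own organization and project.
overrides = scope_kwargs(organization_name="my-org", project_name="my-project")
print(overrides)

# Assumed usage with the SDK (requires an installed and configured vessl):
# import vessl
# vessl.read_sweep(sweep_name="pitch-lord", **overrides)
```

The same dictionary can be passed to any function on this page that accepts **kwargs.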
list_sweeps
vessl.list_sweeps(
**kwargs
)
List sweeps in the default organization/project. If you want to override the default organization/project, then pass organization_name or project_name as **kwargs.
Example
vessl.list_sweeps()
create_sweep
vessl.create_sweep(
name: str, algorithm: str, parameters: List[SweepParameter], cluster_name: str,
command: str, objective: SweepObjective = None, max_experiment_count: int = None,
parallel_experiment_count: int = None, max_failed_experiment_count: int = None,
resource_spec_name: str = None, processor_type: str = None, cpu_limit: float = None,
memory_limit: str = None, gpu_type: str = 'Any', gpu_limit: int = None,
image_url: str = None, *, early_stopping_name: str = None,
early_stopping_settings: List[Tuple[str, str]] = None, message: str = None,
hyperparameters: List[Tuple[str, str]] = None, dataset_mounts: List[str] = None,
git_ref_mounts: List[str] = None, git_diff_mount: str = None,
archive_file_mount: str = None, object_storage_mount: str = None,
root_volume_size: str = None, working_dir: str = None,
output_dir: str = MOUNT_PATH_OUTPUT, **kwargs
)
Create sweep in the default organization/project. If you want to override the default organization/project, then pass organization_name or project_name as **kwargs. Pass use_git_diff=True if you want to run experiments with uncommitted changes, and pass use_git_diff_untracked=True if you also want to include untracked changes (only valid if use_git_diff is set).
Args
name
(str) : Name.
objective
(Optional[vessl.SweepObjective]) : A sweep objective including goal, metric, and type.
max_experiment_count
(Optional[int]) : The maximum number of experiments to run. Required unless the algorithm is grid search.
parallel_experiment_count
(Optional[int]) : The number of experiments to run in parallel. Default: 1.
max_failed_experiment_count
(Optional[int]) : The maximum number of experiments allowed to fail. Default: 1.
algorithm
(str) : Parameter suggestion algorithm. grid, random, or bayesian.
parameters
(List[vessl.SweepParameter]) : A list of parameters to search.
- SweepParameter
  - name(str): The name of the hyperparameter to search.
  - type(str): int, double, or categorical.
  - range(SweepParameterRange): Search range.
    - list(List[str]): A list of values to try. If list is given, min, max and step will be ignored.
    - min(str): The minimum value of the search range (inclusive).
    - max(str): The maximum value of the search range (inclusive).
    - step(Optional[str]): If provided, the values are limited to min + n*step.
cluster_name
(str) : Cluster name (must be specified before other options).
command
(str) : Start command to execute in the experiment container.
resource_spec_name
(str) : Resource type to run an experiment (for managed cluster only). Defaults to None.
cpu_limit
(float) : Number of vCPUs (for custom cluster only). Defaults to None.
memory_limit
(str) : Memory limit (for custom cluster only). Defaults to None. Example: "100Gi", "500Mi"
gpu_type
(str) : GPU type (name) (for custom cluster only). Defaults to "Any".
processor_type
(str) : cpu or gpu (for custom cluster only). Defaults to None.
gpu_limit
(int) : Number of GPU cores (for custom cluster only). Defaults to None.
image_url
(str) : Kernel docker image URL. Defaults to None.
early_stopping_name
(str) : Early stopping algorithm name. Defaults to None.
early_stopping_settings
(List[Tuple[str, str]]) : Early stopping algorithm settings. Defaults to None.
message
(str) : Message. Defaults to None.
hyperparameters
(List[Tuple[str, str]]) : A list of fixed hyperparameters. Defaults to None.
dataset_mounts
(List[str]) : A list of dataset mounts. Defaults to None.
git_ref_mounts
(List[str]) : A list of git repository mounts. Defaults to None.
git_diff_mount
(str) : Git diff mounts. Defaults to None.
archive_file_mount
(str) : Local archive file mounts. Defaults to None.
object_storage_mount
(str) : Object storage mounts. Defaults to None.
root_volume_size
(str) : Root volume size. Defaults to None.
working_dir
(str) : Working directory path. Defaults to None.
output_dir
(str) : Output directory path. Defaults to "/output/".
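The range semantics in Args (inclusive min and max, values limited to min + n*step) can be illustrated with a short sketch. This is an illustration of the documented rule, not VESSL's implementation: `expand_range` and `grid_size` are hypothetical helpers, a step of 1 when step is omitted is an assumption, and the idea that a grid search covers every combination is likewise an assumption.

```python
from math import prod
from typing import Dict, List, Optional


def expand_range(minimum: str, maximum: str, step: Optional[str] = None) -> List[int]:
    """Enumerate min + n*step for an int parameter, both bounds inclusive.

    Hypothetical helper. Values arrive as strings, mirroring SweepParameterRange.
    Assumption: a missing step is treated as 1.
    """
    lo, hi = int(minimum), int(maximum)
    stride = int(step) if step is not None else 1
    if stride <= 0:
        raise ValueError("step must be a positive integer")
    if lo > hi:
        raise ValueError("min must not exceed max")
    # range() excludes its stop value, so add 1 to keep max inclusive.
    return list(range(lo, hi + 1, stride))


def grid_size(candidates: Dict[str, List]) -> int:
    """Number of combinations if every value of every parameter is paired."""
    return prod(len(values) for values in candidates.values())


candidates = {
    "optimizer": ["adam", "sgd", "adadelta"],
    "batch_size": expand_range("64", "256", "8"),
}
print(len(candidates["batch_size"]))  # 25 values: 64, 72, ..., 256
print(grid_size(candidates))  # 3 * 25 = 75 combinations
```

A count like this can help when choosing max_experiment_count for the random or bayesian algorithms.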
Example
sweep_objective = vessl.SweepObjective(
    type="maximize",
    goal="0.99",
    metric="val_accuracy",
)
parameters = [
    vessl.SweepParameter(
        name="optimizer",
        type="categorical",
        range=vessl.SweepParameterRange(
            list=["adam", "sgd", "adadelta"]
        )
    ),
    vessl.SweepParameter(
        name="batch_size",
        type="int",
        range=vessl.SweepParameterRange(
            max="256",
            min="64",
            step="8",
        )
    )
]
# Custom Cluster
vessl.create_sweep(
    name="example-sweep-name",
    objective=sweep_objective,
    max_experiment_count=4,
    parallel_experiment_count=2,
    max_failed_experiment_count=2,
    algorithm="random",
    parameters=parameters,
    dataset_mounts=["/input:mnist"],
    cluster_name="dgx-cluster",
    processor_type="gpu",
    gpu_type="NVIDIA-A100-SXM4-80GB",
    gpu_limit=2,
    cpu_limit=30,
    memory_limit="100Gi",
    # Keyword names follow the create_sweep signature above.
    image_url="public.ecr.aws/vessl/kernels:py36.full-cpu",
    command="pip install -r requirements.txt && python main.py",
)
# VESSL-Managed Cluster
vessl.create_sweep(
    name="example-sweep-name",
    objective=sweep_objective,
    max_experiment_count=4,
    parallel_experiment_count=2,
    max_failed_experiment_count=2,
    algorithm="random",
    parameters=parameters,
    dataset_mounts=["/input:mnist"],
    cluster_name="aws-apne2",
    # Managed clusters take a resource spec instead of explicit CPU/GPU limits.
    resource_spec_name="v1.cpu-4.mem-13",
    image_url="public.ecr.aws/vessl/kernels:py36.full-cpu",
    command="pip install -r requirements.txt && python main.py",
)
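Several constraints stated for create_sweep (the allowed algorithm names, max_experiment_count being required unless grid search, and use_git_diff_untracked only being valid together with use_git_diff) can be checked locally before submitting. The following is a minimal sketch under those stated rules: `check_sweep_options` is a hypothetical helper, and the SDK or server may validate differently.

```python
from typing import Optional

# Algorithm names as listed in the Args section.
ALGORITHMS = ("grid", "random", "bayesian")


def check_sweep_options(
    algorithm: str,
    max_experiment_count: Optional[int] = None,
    use_git_diff: bool = False,
    use_git_diff_untracked: bool = False,
) -> None:
    """Raise ValueError if the options contradict the documented rules.

    Hypothetical pre-flight check; it does not call the VESSL API.
    """
    if algorithm not in ALGORITHMS:
        raise ValueError(f"algorithm must be one of {ALGORITHMS}, got {algorithm!r}")
    if algorithm != "grid" and max_experiment_count is None:
        raise ValueError("max_experiment_count is required unless algorithm is 'grid'")
    if use_git_diff_untracked and not use_git_diff:
        raise ValueError("use_git_diff_untracked is only valid if use_git_diff is set")


# Mirrors the options used in the example above; passes silently.
check_sweep_options(algorithm="random", max_experiment_count=4)
```

Failing early like this avoids a round trip to the server for mistakes that are visible from the arguments alone.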
terminate_sweep
vessl.terminate_sweep(
sweep_name: str, **kwargs
)
Terminate sweep in the default organization/project. If you want to override the default organization/project, then pass organization_name or project_name as **kwargs.
Args
sweep_name
(str) : Sweep name.
Example
vessl.terminate_sweep(
    sweep_name="pitch-lord",
)
list_sweep_logs
vessl.list_sweep_logs(
sweep_name: str, tail: int = 200, **kwargs
)
List sweep logs in the default organization/project. If you want to override the default organization/project, then pass organization_name or project_name as **kwargs.
Args
sweep_name
(str) : Sweep name.
tail
(int) : The number of lines to display from the end. Display all if -1. Defaults to 200.
Example
vessl.list_sweep_logs(
    sweep_name="pitch-lord",
)
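The tail semantics described in Args (the last N lines, or everything when tail is -1) can be sketched on a plain list. This only illustrates the documented behaviour: `tail_lines` is a hypothetical helper, the log lines are made up, and treating other negative values as an error is an assumption.

```python
from typing import List


def tail_lines(lines: List[str], tail: int = 200) -> List[str]:
    """Return the last `tail` lines, or all lines when tail is -1.

    Hypothetical helper illustrating the documented `tail` argument.
    """
    if tail == -1:
        return list(lines)
    if tail < 0:
        # Assumption: negative values other than -1 are rejected.
        raise ValueError("tail must be -1 or a non-negative integer")
    if tail == 0:
        # lines[-0:] would return everything, so handle zero explicitly.
        return []
    return list(lines[-tail:])


# Made-up log lines for illustration only.
logs = [f"line {i}" for i in range(1, 6)]
print(tail_lines(logs, tail=2))  # ['line 4', 'line 5']
print(len(tail_lines(logs, tail=-1)))  # 5
```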
get_best_sweep_experiment
vessl.get_best_sweep_experiment(
sweep_name: str, **kwargs
)
Read sweep and return the best experiment info in the default organization/project. If you want to override the default organization/project, then pass organization_name or project_name as **kwargs.
Args
sweep_name
(str) : Sweep name.
Example
vessl.get_best_sweep_experiment(
    sweep_name="pitch-lord",
)
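What counts as "best" presumably depends on the sweep objective (type and metric) set at creation. The sketch below shows that idea on plain dictionaries. Everything here is an assumption for illustration: `pick_best` is a hypothetical helper, "minimize" is assumed to be the counterpart of the "maximize" type shown earlier, and the result records are made up and do not reflect the structure returned by the SDK.

```python
from typing import Dict, List


def pick_best(results: List[Dict], metric: str, objective_type: str) -> Dict:
    """Select the record with the best metric value for the given objective type.

    Hypothetical helper; does not call the VESSL API.
    """
    if not results:
        raise ValueError("no experiment results to compare")
    if objective_type == "maximize":
        return max(results, key=lambda record: record[metric])
    if objective_type == "minimize":
        return min(results, key=lambda record: record[metric])
    raise ValueError(f"unknown objective type: {objective_type!r}")


# Made-up records for illustration only.
results = [
    {"experiment": "a", "val_accuracy": 0.91},
    {"experiment": "b", "val_accuracy": 0.97},
    {"experiment": "c", "val_accuracy": 0.94},
]
best = pick_best(results, metric="val_accuracy", objective_type="maximize")
print(best["experiment"])  # b
```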