Dataset API

read_dataset

vessl.read_dataset(
    dataset_name: str, **kwargs
)
Read a dataset in the default organization. If you want to override the default organization, then pass organization_name as **kwargs.
Args
  • dataset_name (str) : Dataset name.
Example
vessl.read_dataset(
    dataset_name="mnist",
)
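The organization override described above applies to every call on this page, so it can be convenient to wrap it once. The following is a minimal sketch, not part of the SDK: `call_with_org` is a hypothetical helper, `fake_read_dataset` is a stand-in for `vessl.read_dataset` so the snippet runs without credentials, and `"my-org"` is a placeholder organization name.

```python
from typing import Any, Callable, Optional


def call_with_org(
    fn: Callable[..., Any],
    organization_name: Optional[str] = None,
    **params: Any,
) -> Any:
    """Call fn, adding organization_name only when one is given.

    Hypothetical helper: it relies only on the documented behavior that
    organization_name is passed through **kwargs.
    """
    if organization_name is not None:
        params["organization_name"] = organization_name
    return fn(**params)


# Stand-in for vessl.read_dataset (assumption: used here only for illustration).
def fake_read_dataset(dataset_name: str, **kwargs: Any) -> dict:
    return {"dataset_name": dataset_name, **kwargs}


# Uses the default organization: no organization_name is forwarded.
default_call = call_with_org(fake_read_dataset, dataset_name="mnist")

# Overrides the default organization with a placeholder name.
override_call = call_with_org(
    fake_read_dataset, organization_name="my-org", dataset_name="mnist"
)
```

With the real SDK, the same wrapper would presumably be called as `call_with_org(vessl.read_dataset, organization_name="my-org", dataset_name="mnist")`.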

read_dataset_version

vessl.read_dataset_version(
    dataset_id: int, dataset_version_hash: str, **kwargs
)
Read a specific version of a dataset in the default organization. If you want to override the default organization, then pass organization_name as **kwargs.
Args
  • dataset_id (int) : Dataset id.
  • dataset_version_hash (str) : Dataset version hash.
Example
vessl.read_dataset_version(
    dataset_id=1,
    dataset_version_hash="hash123"
)

list_datasets

vessl.list_datasets(
    **kwargs
)
List datasets in the default organization. If you want to override the default organization, then pass organization_name as **kwargs.
Example
vessl.list_datasets()

create_dataset

vessl.create_dataset(
    dataset_name: str, description: str = None, is_version_enabled: bool = False,
    is_public: bool = False, external_path: str = None, aws_role_arn: str = None,
    version_path: str = None, **kwargs
)
Create a dataset in the default organization. If you want to override the default organization, then pass organization_name as **kwargs.
Args
  • dataset_name (str) : Dataset name.
  • description (str) : Dataset description. Defaults to None.
  • is_version_enabled (bool) : True if dataset versioning is enabled, False otherwise. Defaults to False.
  • is_public (bool) : True if the dataset is sourced from a public bucket, False otherwise. Defaults to False.
  • external_path (str) : AWS S3 or Google Cloud Storage bucket URL. Defaults to None.
  • aws_role_arn (str) : AWS Role ARN to access S3. Defaults to None.
  • version_path (str) : Versioning bucket path. Defaults to None.
Example
vessl.create_dataset(
    dataset_name="mnist",
    is_public=True,
    external_path="s3://savvihub-public-apne2/mnist"
)
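Because external_path must point at an AWS S3 or Google Cloud Storage bucket, it can help to check the URL before making the call. The sketch below is a hypothetical pre-flight helper, not SDK behavior. It assumes bucket URLs use the `s3://` (as in the example above) or `gs://` scheme; the SDK itself may accept other forms.

```python
from typing import Any, Dict, Optional

# Assumption: bucket URLs start with one of these schemes.
SUPPORTED_SCHEMES = ("s3://", "gs://")


def build_create_dataset_kwargs(
    dataset_name: str,
    external_path: Optional[str] = None,
    is_public: bool = False,
    aws_role_arn: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for create_dataset, rejecting unknown URL schemes."""
    if external_path is not None and not external_path.startswith(SUPPORTED_SCHEMES):
        raise ValueError(
            f"external_path must start with one of {SUPPORTED_SCHEMES}: {external_path!r}"
        )
    kwargs: Dict[str, Any] = {"dataset_name": dataset_name, "is_public": is_public}
    # Only forward optional arguments that were actually provided.
    if external_path is not None:
        kwargs["external_path"] = external_path
    if aws_role_arn is not None:
        kwargs["aws_role_arn"] = aws_role_arn
    return kwargs


mnist_kwargs = build_create_dataset_kwargs(
    dataset_name="mnist",
    is_public=True,
    external_path="s3://savvihub-public-apne2/mnist",
)
```

The resulting dictionary could then be unpacked into the call, for example `vessl.create_dataset(**mnist_kwargs)`.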

list_dataset_volume_files

vessl.list_dataset_volume_files(
    dataset_name: str, need_download_url: bool = False, path: str = '',
    recursive: bool = False, **kwargs
)
List dataset volume files in the default organization. If you want to override the default organization, then pass organization_name as **kwargs.
Args
  • dataset_name (str) : Dataset name.
  • need_download_url (bool) : True if you need a download URL, False otherwise. Defaults to False.
  • path (str) : Directory path to list. Defaults to the root ("").
  • recursive (bool) : True to list files recursively, False otherwise. Defaults to False.
Example
vessl.list_dataset_volume_files(
    dataset_name="mnist",
    recursive=True,
)

upload_dataset_volume_file

vessl.upload_dataset_volume_file(
    dataset_name: str, source_path: str, dest_path: str, **kwargs
)
Upload a file to the dataset. If you want to override the default organization, then pass organization_name as **kwargs.
Args
  • dataset_name (str) : Dataset name.
  • source_path (str) : Local source path.
  • dest_path (str) : Destination path within the dataset.
Example
vessl.upload_dataset_volume_file(
    dataset_name="mnist",
    source_path="test.csv",
    dest_path="train",
)
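To upload a whole local directory, one option is to walk it and issue one upload call per file. The following is a rough sketch under stated assumptions: `plan_directory_upload` and `upload_directory` are hypothetical helpers, dest_path is treated as a destination directory (as the example above suggests), and dataset paths are assumed to use forward slashes. The upload function is passed in so the logic can be tried without credentials.

```python
import os
import posixpath
from typing import Any, Callable, Dict, List


def plan_directory_upload(
    dataset_name: str, local_dir: str, dest_dir: str
) -> List[Dict[str, str]]:
    """Return one keyword-argument dict per file found under local_dir."""
    calls: List[Dict[str, str]] = []
    for root, dirs, files in os.walk(local_dir):
        dirs.sort()  # make the traversal order deterministic
        rel_root = os.path.relpath(root, local_dir)
        # Assumption: dataset paths use "/" regardless of the local OS.
        parts = [] if rel_root == os.curdir else rel_root.split(os.sep)
        target_dir = posixpath.join(dest_dir, *parts)
        for name in sorted(files):
            calls.append(
                {
                    "dataset_name": dataset_name,
                    "source_path": os.path.join(root, name),
                    "dest_path": target_dir,
                }
            )
    return calls


def upload_directory(
    upload_fn: Callable[..., Any], dataset_name: str, local_dir: str, dest_dir: str
) -> int:
    """Call upload_fn once per file and return the number of uploads issued."""
    calls = plan_directory_upload(dataset_name, local_dir, dest_dir)
    for kwargs in calls:
        upload_fn(**kwargs)
    return len(calls)
```

With the real SDK, this would presumably be invoked as `upload_directory(vessl.upload_dataset_volume_file, "mnist", "data", "train")`, where `"data"` is a placeholder local directory.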

download_dataset_volume_file

vessl.download_dataset_volume_file(
    dataset_name: str, source_path: str, dest_path: str, **kwargs
)
Download a file from the dataset. If you want to override the default organization, then pass organization_name as **kwargs.
Args
  • dataset_name (str) : Dataset name.
  • source_path (str) : Source path within the dataset.
  • dest_path (str) : Local destination path.
Example
vessl.download_dataset_volume_file(
    dataset_name="mnist",
    source_path="train/test.csv",
    dest_path=".",
)

copy_dataset_volume_file

vessl.copy_dataset_volume_file(
    dataset_name: str, source_path: str, dest_path: str, recursive: bool = False,
    **kwargs
)
Copy files within the same dataset. Note that this is not supported for externally sourced datasets such as S3 or GCS. If you want to override the default organization, then pass organization_name as **kwargs.
Args
  • dataset_name (str) : Dataset name.
  • source_path (str) : Source path within the dataset.
  • dest_path (str) : Destination path within the dataset.
  • recursive (bool) : True if the source path is a directory, False otherwise. Defaults to False.
Example
vessl.copy_dataset_volume_file(
    dataset_name="mnist",
    source_path="train/test.csv",
    dest_path="test/test.csv",
)

delete_dataset_volume_file

vessl.delete_dataset_volume_file(
    dataset_name: str, path: str, recursive: bool = False, **kwargs
)
Delete a file from the dataset volume. Note that this is not supported for externally sourced datasets such as S3 or GCS. If you want to override the default organization, then pass organization_name as **kwargs.
Args
  • dataset_name (str) : Dataset name.
  • path (str) : File path.
  • recursive (bool) : True if the path is a directory, False otherwise. Defaults to False.
Example
vessl.delete_dataset_volume_file(
    dataset_name="mnist",
    path="train/test.csv",
)
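The page documents no move or rename call, but one can be approximated by combining copy and delete. The sketch below is a hypothetical helper, not part of the SDK. It assumes the copy call raises an exception on failure, so the source is deleted only after a successful copy. Like the two calls it wraps, it would not work for externally sourced datasets.

```python
from typing import Any, Callable


def move_dataset_file(
    copy_fn: Callable[..., Any],
    delete_fn: Callable[..., Any],
    dataset_name: str,
    source_path: str,
    dest_path: str,
    recursive: bool = False,
    **kwargs: Any,
) -> None:
    """Copy source_path to dest_path within a dataset, then delete source_path."""
    copy_fn(
        dataset_name=dataset_name,
        source_path=source_path,
        dest_path=dest_path,
        recursive=recursive,
        **kwargs,
    )
    # Reached only if copy_fn did not raise (assumption: failures raise).
    delete_fn(
        dataset_name=dataset_name,
        path=source_path,
        recursive=recursive,
        **kwargs,
    )


# Stand-ins for the SDK calls so the sketch runs without credentials.
log = []
move_dataset_file(
    lambda **kw: log.append(("copy", kw)),
    lambda **kw: log.append(("delete", kw)),
    dataset_name="mnist",
    source_path="train/test.csv",
    dest_path="test/test.csv",
)
```

With the real SDK, the first two arguments would presumably be `vessl.copy_dataset_volume_file` and `vessl.delete_dataset_volume_file`; any extra keyword such as organization_name is forwarded to both calls.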