Dataset
read_dataset
vessl.read_dataset(
dataset_name: str, **kwargs
)
Read a dataset in the default organization. To override the default
organization, pass organization_name as a keyword argument (via **kwargs).
Args
dataset_name
(str) : Dataset name.
Example
vessl.read_dataset(
dataset_name="mnist",
)
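The organization override works the same way for every function on this page, so it can be wrapped once. The following is a minimal sketch, not part of the SDK: call_in_org is a hypothetical helper, and it assumes only what the text above states, namely that organization_name is accepted as a keyword argument.

```python
from typing import Any, Callable, Optional


def call_in_org(
    fn: Callable[..., Any],
    organization_name: Optional[str] = None,
    **params: Any,
) -> Any:
    """Call fn(**params), adding organization_name only when one is given.

    call_in_org is a hypothetical helper, not a VESSL function. fn is meant
    to be an SDK function such as vessl.read_dataset.
    """
    if organization_name is not None:
        # Forwarded as a plain keyword argument, as the text above describes.
        params["organization_name"] = organization_name
    return fn(**params)
```

With the SDK installed, call_in_org(vessl.read_dataset, organization_name="my-org", dataset_name="mnist") would forward the override, where "my-org" is a placeholder name. Omitting organization_name leaves the default organization in effect.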
read_dataset_version
vessl.read_dataset_version(
dataset_id: int, dataset_version_hash: str, **kwargs
)
Read a specific version of a dataset in the default organization. To
override the default organization, pass organization_name as a keyword
argument (via **kwargs).
Args
dataset_id
(int) : Dataset id.
dataset_version_hash
(str) : Dataset version hash.
Example
vessl.read_dataset_version(
dataset_id=1,
dataset_version_hash="hash123"
)
list_datasets
vessl.list_datasets(
**kwargs
)
List datasets in the default organization. To override the default
organization, pass organization_name as a keyword argument (via **kwargs).
Example
vessl.list_datasets()
create_dataset
vessl.create_dataset(
dataset_name: str, description: str = None, is_version_enabled: bool = False,
is_public: bool = False, external_path: str = None, aws_role_arn: str = None,
version_path: str = None, **kwargs
)
Create a dataset in the default organization. To override the default
organization, pass organization_name as a keyword argument (via **kwargs).
Args
dataset_name
(str) : Dataset name.
description
(str) : Dataset description. Defaults to None.
is_version_enabled
(bool) : True if dataset versioning is enabled, False otherwise. Defaults to False.
is_public
(bool) : True if the dataset is sourced from a public bucket, False otherwise. Defaults to False.
external_path
(str) : AWS S3 or Google Cloud Storage bucket URL. Defaults to None.
aws_role_arn
(str) : AWS Role ARN to access S3. Defaults to None.
version_path
(str) : Versioning bucket path. Defaults to None.
Example
vessl.create_dataset(
dataset_name="mnist",
is_public=True,
external_path="s3://savvihub-public-apne2/mnist"
)
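Because external_path has to be a bucket URL, it can help to check it before calling create_dataset. The sketch below is an illustration only: build_create_dataset_args is a hypothetical helper, and the accepted schemes are an assumption (s3:// as in the example above, gs:// as the usual Google Cloud Storage scheme), since this page does not list them.

```python
from typing import Any, Dict, Optional
from urllib.parse import urlparse

# Assumption: S3 URLs start with s3:// (as in the example above) and
# Google Cloud Storage URLs with gs://. This page does not list the
# schemes the SDK actually accepts.
ALLOWED_SCHEMES = ("s3", "gs")


def build_create_dataset_args(
    dataset_name: str,
    external_path: Optional[str] = None,
    is_public: bool = False,
    aws_role_arn: Optional[str] = None,
) -> Dict[str, Any]:
    """Return keyword arguments for create_dataset after a basic URL check.

    Hypothetical helper, not part of the SDK.
    """
    args: Dict[str, Any] = {"dataset_name": dataset_name, "is_public": is_public}
    if external_path is not None:
        scheme = urlparse(external_path).scheme
        if scheme not in ALLOWED_SCHEMES:
            raise ValueError(f"unsupported external_path scheme: {scheme!r}")
        args["external_path"] = external_path
    if aws_role_arn is not None:
        args["aws_role_arn"] = aws_role_arn
    return args
```

The result can then be unpacked into the call, for example vessl.create_dataset(**build_create_dataset_args("mnist", external_path="s3://savvihub-public-apne2/mnist", is_public=True)).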
list_dataset_volume_files
vessl.list_dataset_volume_files(
dataset_name: str, need_download_url: bool = False, path: str = '',
recursive: bool = False, **kwargs
)
List dataset volume files in the default organization. To override the
default organization, pass organization_name as a keyword argument (via
**kwargs).
Args
dataset_name
(str) : Dataset name.
need_download_url
(bool) : True if you need a download URL, False otherwise. Defaults to False.
path
(str) : Directory path to list. Defaults to the root ("").
recursive
(bool) : True to list files recursively, False otherwise. Defaults to False.
Example
vessl.list_dataset_volume_files(
dataset_name="mnist",
recursive=True,
)
upload_dataset_volume_file
vessl.upload_dataset_volume_file(
dataset_name: str, source_path: str, dest_path: str, **kwargs
)
Upload a file to the dataset. To override the default organization, pass
organization_name as a keyword argument (via **kwargs).
Args
dataset_name
(str) : Dataset name.
source_path
(str) : Local source path.
dest_path
(str) : Destination path within the dataset.
Example
vessl.upload_dataset_volume_file(
dataset_name="mnist",
source_path="test.csv",
dest_path="train",
)
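To upload a whole local directory, one option is to walk it and make one upload call per file. The sketch below rests on two assumptions that this page does not confirm: that dest_path names the destination directory (as the example above suggests), and that one call per file is needed. plan_uploads and upload_dir are hypothetical helpers, and upload_fn stands in for vessl.upload_dataset_volume_file.

```python
from pathlib import Path, PurePosixPath
from typing import Any, Callable, List, Tuple


def plan_uploads(local_dir: str, dest_root: str) -> List[Tuple[str, str]]:
    """Return (source_path, dest_path) pairs for every file under local_dir.

    Assumption: dest_path is the destination directory inside the dataset,
    so files in subdirectories map to matching subdirectories of dest_root.
    """
    root = Path(local_dir)
    pairs = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Directory of the file relative to local_dir ("." at top level).
            rel_parent = path.relative_to(root).parent
            dest_dir = PurePosixPath(dest_root, *rel_parent.parts)
            pairs.append((str(path), str(dest_dir)))
    return pairs


def upload_dir(
    upload_fn: Callable[..., Any],
    dataset_name: str,
    local_dir: str,
    dest_root: str,
) -> int:
    """Call upload_fn once per file and return the number of files sent."""
    pairs = plan_uploads(local_dir, dest_root)
    for source_path, dest_path in pairs:
        upload_fn(
            dataset_name=dataset_name,
            source_path=source_path,
            dest_path=dest_path,
        )
    return len(pairs)
```

With the SDK installed, upload_dir(vessl.upload_dataset_volume_file, "mnist", "data", "train") would send every file under the local data directory. Calling plan_uploads first shows the planned pairs without uploading anything.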
download_dataset_volume_file
vessl.download_dataset_volume_file(
dataset_name: str, source_path: str, dest_path: str, **kwargs
)
Download a file from the dataset. To override the default organization,
pass organization_name as a keyword argument (via **kwargs).
Args
dataset_name
(str) : Dataset name.
source_path
(str) : Source path within the dataset.
dest_path
(str) : Local destination path.
Example
vessl.download_dataset_volume_file(
dataset_name="mnist",
source_path="train/test.csv",
dest_path=".",
)
copy_dataset_volume_file
vessl.copy_dataset_volume_file(
dataset_name: str, source_path: str, dest_path: str, **kwargs
)
Copy files within the same dataset. Note that this is not supported for
externally sourced datasets such as S3 or GCS. To override the default
organization, pass organization_name as a keyword argument (via **kwargs).
Args
dataset_name
(str) : Dataset name.
source_path
(str) : Source path within the dataset.
dest_path
(str) : Destination path within the dataset.
Example
vessl.copy_dataset_volume_file(
dataset_name="mnist",
source_path="train/test.csv",
dest_path="test/test.csv",
)
delete_dataset_volume_file
vessl.delete_dataset_volume_file(
dataset_name: str, path: str, **kwargs
)
Delete a dataset volume file. Note that this is not supported for
externally sourced datasets such as S3 or GCS. To override the default
organization, pass organization_name as a keyword argument (via **kwargs).
Args
dataset_name
(str) : Dataset name.
path
(str) : File path.
Example
vessl.delete_dataset_volume_file(
dataset_name="mnist",
path="train/test.csv",
)
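This page documents no move or rename function, but one can be approximated by copying and then deleting. The sketch below is an illustration only: move_file is a hypothetical helper, copy_fn and delete_fn stand in for vessl.copy_dataset_volume_file and vessl.delete_dataset_volume_file, and it assumes the copy raises an exception on failure, which this page does not confirm. The same restriction on externally sourced datasets would apply.

```python
from typing import Any, Callable


def move_file(
    copy_fn: Callable[..., Any],
    delete_fn: Callable[..., Any],
    dataset_name: str,
    source_path: str,
    dest_path: str,
) -> None:
    """Copy source_path to dest_path, then delete source_path.

    Hypothetical helper, not part of the SDK. Assumption: copy_fn raises
    on failure, so the delete only runs after a successful copy.
    """
    if source_path == dest_path:
        # Deleting the source here would delete the only copy.
        raise ValueError("source_path and dest_path must differ")
    copy_fn(dataset_name=dataset_name, source_path=source_path, dest_path=dest_path)
    delete_fn(dataset_name=dataset_name, path=source_path)
```

With the SDK installed, move_file(vessl.copy_dataset_volume_file, vessl.delete_dataset_volume_file, "mnist", "train/test.csv", "test/test.csv") would make the two calls in order.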