TFDS CLI is a command-line tool that provides various commands to easily work with TensorFlow Datasets.
Copyright 2020 The TensorFlow Datasets Authors, Licensed under the Apache License, Version 2.0
%%capture
# Disable logs on TF import
%env TF_CPP_MIN_LOG_LEVEL=1
The CLI tool is installed with tensorflow-datasets (or tfds-nightly).
!pip install -q tfds-nightly
!tfds --version
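Outside a notebook, the same check can be scripted. The sketch below is only an illustration: the helper name tfds_version is hypothetical (it is not part of TFDS), and it assumes the tfds executable lands on PATH once the package is installed.

```python
import shutil
import subprocess
from typing import Optional


def tfds_version() -> Optional[str]:
    """Return the output of `tfds --version`, or None if the CLI is not on PATH."""
    # Hypothetical helper for illustration; not part of TFDS.
    if shutil.which("tfds") is None:
        return None
    result = subprocess.run(
        ["tfds", "--version"], capture_output=True, text=True, check=False
    )
    # Fall back to stderr in case the version is not printed to stdout.
    return (result.stdout or result.stderr).strip()


if __name__ == "__main__":
    version = tfds_version()
    if version is None:
        print("tfds CLI not found; install tensorflow-datasets or tfds-nightly")
    else:
        print(version)
```

Returning None instead of raising lets a setup script decide whether a missing CLI is an error.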
For the list of all CLI commands:
!tfds --help
tfds new: Implementing a new Dataset
This command will help you kickstart writing your new Python dataset by creating a <dataset_name>/ directory containing default implementation files.
Usage:
!tfds new my_dataset
tfds new my_dataset will create:
ls -1 my_dataset/
An optional flag --data_format can be used to generate format-specific dataset builders (e.g., conll). If no data format is given, it will generate a template for a standard tfds.core.GeneratorBasedBuilder.
Refer to the documentation for details on the available format-specific dataset builders.
See our writing dataset guide for more info.
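When scaffolding datasets from a script rather than a notebook cell, the command line can be assembled programmatically. This is a minimal sketch: tfds_new_command is a hypothetical helper, not part of TFDS, and it only uses the --data_format flag and the conll value mentioned above.

```python
import shlex
from typing import List, Optional


def tfds_new_command(dataset_name: str, data_format: Optional[str] = None) -> List[str]:
    """Build the argument list for `tfds new` (hypothetical helper, not part of TFDS)."""
    command = ["tfds", "new", dataset_name]
    if data_format is not None:
        # --data_format requests a format-specific builder template, e.g. "conll".
        command += ["--data_format", data_format]
    return command


if __name__ == "__main__":
    # Print the commands instead of running them, so nothing is created on disk.
    print(shlex.join(tfds_new_command("my_dataset")))
    print(shlex.join(tfds_new_command("my_dataset", data_format="conll")))
```

The returned list can be passed to subprocess.run(command, check=True) to actually create the directory.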
Available options:
!tfds new --help
tfds build: Download and prepare a dataset
Use tfds build <my_dataset> to generate a new dataset. <my_dataset> can be:
A path to a dataset/ folder or dataset.py file (empty for current directory):
tfds build datasets/my_dataset/
cd datasets/my_dataset/ && tfds build
cd datasets/my_dataset/ && tfds build my_dataset
cd datasets/my_dataset/ && tfds build my_dataset.py
A registered dataset:
tfds build mnist
tfds build my_dataset --imports my_project.datasets
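The invocation forms above can also be driven from Python, for example in a build script. The sketch below is illustrative only: tfds_build_command is a hypothetical helper, not part of TFDS. It assumes --imports names a module that must be imported so that the dataset is registered before the build starts.

```python
import shlex
from typing import List, Optional


def tfds_build_command(target: Optional[str] = None, imports: Optional[str] = None) -> List[str]:
    """Build the argument list for `tfds build` (hypothetical helper, not part of TFDS)."""
    command = ["tfds", "build"]
    if target:
        # A dataset folder, a dataset .py file, or a registered dataset name.
        # Leaving it out makes the CLI use the current directory.
        command.append(target)
    if imports:
        # Assumption: the module is imported so that the dataset gets registered.
        command += ["--imports", imports]
    return command


if __name__ == "__main__":
    examples = [
        tfds_build_command("datasets/my_dataset/"),
        tfds_build_command(),
        tfds_build_command("mnist"),
        tfds_build_command("my_dataset", imports="my_project.datasets"),
    ]
    # Print the commands rather than running them; a real build downloads data.
    for command in examples:
        print(shlex.join(command))
```

To mirror cd datasets/my_dataset/ && tfds build, pass the argument list to subprocess.run together with cwd="datasets/my_dataset/".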
Note: tfds build has useful flags to help with prototyping and debugging. See the Debug & tests section below.
Available options:
!tfds build --help