So far we have only considered converting a Python class into a ZnTrack Node.
Whilst ZnTrack classes are the more powerful tool, a lightweight alternative is to wrap a Python function with @zntrack.nodify to gain access to a subset of the available ZnTrack tools.
from zntrack import config
# When using ZnTrack we can write our code inside a Jupyter notebook.
# We can make use of this functionality by setting the `nb_name` config as follows:
config.nb_name = "07_functions.ipynb"
from zntrack.utils import cwd_temp_dir
temp_dir = cwd_temp_dir()
!git init
!dvc init
Initialized empty Git repository in /tmp/tmp7_5qd1yx/.git/
Initialized DVC repository.

You can now commit the changes to git.

DVC has enabled anonymous aggregate usage analytics.
Read the analytics documentation (and how to opt-out) here:
<https://dvc.org/doc/user-guide/analytics>

What's next?
------------
- Check out the documentation: <https://dvc.org/doc>
- Get help and share ideas: <https://dvc.org/chat>
- Star us on GitHub: <https://github.com/iterative/dvc>
In the following example we will create an output file and write some parameters to it.
from zntrack import nodify, NodeConfig
import pathlib
@nodify(outs=pathlib.Path("outs.txt"), params={"text": "Lorem Ipsum"})
def write_text(cfg: NodeConfig):
    cfg.outs.write_text(cfg.params.text)
The @nodify decorator allows us to define all available DVC run options, such as outs or deps, together with a parameter dictionary.
The params are cast into a DotDict, which allows us to access them either via cfg.params["text"] or directly via cfg.params.text.
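To illustrate what the two access styles mean, here is a minimal sketch of a dictionary that also supports attribute access. The class name AttrDict is hypothetical and this is not ZnTrack's DotDict implementation; it only mimics the behaviour described above using plain Python.

```python
class AttrDict(dict):
    """Hypothetical stand-in for a dot-accessible dict (not ZnTrack's DotDict)."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails,
        # so regular dict methods such as .keys() keep working.
        try:
            return self[name]
        except KeyError as err:
            raise AttributeError(name) from err


params = AttrDict({"text": "Lorem Ipsum"})

# Both access styles return the same value.
print(params["text"])
print(params.text)
```

Both print statements show the same string, which is the behaviour we rely on when writing cfg.params.text inside a nodified function.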
Calling the decorated function only creates the Node for us; it does not execute the function itself. We can circumvent that by telling DVC to run the function via run=True.
cfg = write_text(run=True)
2022-02-22 13:53:33,564 (WARNING): Jupyter support is an experimental feature! Please save your notebook before running this command! Submit issues to https://github.com/zincware/ZnTrack.
2022-02-22 13:53:37,149 (WARNING): Running DVC command: 'dvc run -n write_text ...'
cfg.outs.read_text()
'Lorem Ipsum'
This also allows us to build DAGs by adding the output files as dependencies.
@nodify(
    deps=pathlib.Path("outs.txt"),
    outs=[pathlib.Path("part_1.txt"), pathlib.Path("part_2.txt")],
)
def split_text(cfg: NodeConfig):
    text = cfg.deps.read_text()
    for text_part, outs_file in zip(text.split(" "), cfg.outs):
        outs_file.write_text(text_part)
_ = split_text(run=True)
2022-02-22 13:53:44,346 (WARNING): Running DVC command: 'dvc run -n split_text ...'
print(pathlib.Path("part_1.txt").read_text())
print(pathlib.Path("part_2.txt").read_text())
Lorem
Ipsum
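The way shared file names link the two Nodes into a DAG can be sketched with the standard library alone. The stages dictionary below is a hypothetical description of the two functions above, not something ZnTrack or DVC exposes, and graphlib requires Python 3.9 or newer.

```python
from graphlib import TopologicalSorter

# Hypothetical summary of the two Nodes: name -> dependency and output files.
stages = {
    "write_text": {"deps": [], "outs": ["outs.txt"]},
    "split_text": {"deps": ["outs.txt"], "outs": ["part_1.txt", "part_2.txt"]},
}

# Map every output file to the stage that produces it.
producers = {out: name for name, stage in stages.items() for out in stage["outs"]}

# A stage depends on whichever stage produces one of its dependency files.
graph = {
    name: {producers[dep] for dep in stage["deps"] if dep in producers}
    for name, stage in stages.items()
}

# Predecessors come first in the resulting execution order.
order = list(TopologicalSorter(graph).static_order())
print(order)
```

Because split_text lists outs.txt as a dependency and write_text lists it as an output, write_text has to come first in the resulting order.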
Wrapping a Python function and converting it into a Node is closer to the original DVC API. It provides all the basic functionality and is well suited to compact functions.
The ZnTrack class API provides more powerful tools, such as zn.<method>, and can be used without configuring any file names.
Depending on personal preference, either method can be used on its own, or both can be combined to get the maximum benefit from ZnTrack and DVC.