Notes on Starting a New, Modular Analysis (Project) in Julia

The cookiecutter equivalent for Julia is PkgTemplates (github link), but you can do a lot with the built-in Pkg library alone. As someone who really likes kedro for Python (kedro docs), I want to know the best ways to create a modular, reproducible analysis in Julia.

Here are the steps I need to understand:

  • Navigate folders
  • Create a barebones project structure
  • Create a new environment for your project
  • Import local modules

Create a New Environment for Your Project

Surprisingly, the best documentation for setting up your project is the Pkg documentation itself, in its Getting Started With Environments section.

Instead of creating new environments from the command line with conda, venv, pyenv, poetry, etc., you can do it from the Julia REPL or by calling the Pkg library within your script.

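Key Terms:

  • Pkg.status() : Show the active environment and the packages installed in it
  • Pkg.activate() : Activate an environment (with no arguments, the default one)
  • Pkg.generate() : Generate a new, barebones package structure
  • Pkg.instantiate() : Install the dependencies recorded for the active environment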
In [8]:
using Pkg;
Pkg.status()
      Status `~/.julia/environments/v1.6/Project.toml`
  [336ed68f] CSV v0.9.11
  [8f4d0f93] Conda v1.5.2
  [a93c6f00] DataFrames v1.2.2
  [1313f7d8] DataFramesMeta v0.10.0
  [c91e804a] Gadfly v1.3.4
  [cd3eb016] HTTP v0.9.17
  [7073ff75] IJulia v1.23.2
  [91a5bcdd] Plots v1.24.2
  [c3e4b0f8] Pluto v0.17.2
  [438e738f] PyCall v1.92.5
  [6f49c342] RCall v0.13.12
  [ce6b1742] RDatasets v0.7.6
  [fdbf4ff8] XLSX v0.7.8

Navigating Folders

Julia's built-in filesystem functions cover the basics:

  • pwd() - prints the current working directory
  • readdir() - just like ls in bash, lists files in the current directory
  • mkdir(path) - makes a new directory
  • rm(path, recursive=true) - recursively removes a directory
In [7]:
readdir()
Out[7]:
2-element Vector{String}:
 ".ipynb_checkpoints"
 "A Julia Workflow.ipynb"
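
To make the filesystem helpers above concrete, here is a minimal sketch that runs entirely inside a throwaway temporary directory. The folder names (data, src/pipelines) and README.md are hypothetical, chosen only for illustration. mkpath and touch are additional Base functions not mentioned above: mkpath creates nested directories, touch creates an empty file.

```julia
# Work in a temporary directory so nothing in your real project is touched.
root = mktempdir()

# cd with a do-block changes directory only for the duration of the block.
cd(root) do
    println(pwd())                  # prints the temporary directory
end

# Hypothetical layout, for illustration only.
mkdir(joinpath(root, "data"))                  # single directory
mkpath(joinpath(root, "src", "pipelines"))     # nested directories in one call
touch(joinpath(root, "README.md"))             # empty file

entries_before = readdir(root)      # readdir returns a sorted Vector{String}
println(entries_before)

rm(joinpath(root, "data"), recursive=true)     # remove the directory and its contents
entries_after = readdir(root)
println(entries_after)
```

Building paths with joinpath instead of string concatenation keeps the script portable across operating systems.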

Generate a new project structure with Pkg.generate

Julia gives us a barebones package generator, Pkg.generate, that creates a Project.toml config file and a src directory containing a 'hello world' Julia file.

In [12]:
Pkg.generate("my_package")
  Generating  project my_package:
    my_package/Project.toml
    my_package/src/my_package.jl
Out[12]:
Dict{String, Base.UUID} with 1 entry:
  "my_package" => UUID("a2f78104-38a4-4b74-bda0-0fd91114a85c")

See the directory structure of my_package:

In [14]:
readdir("my_package")
Out[14]:
2-element Vector{String}:
 "Project.toml"
 "src"
In [33]:
readdir("my_package/src")
Out[33]:
1-element Vector{String}:
 "my_package.jl"
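
To see what those two generated files actually contain, the sketch below generates the package in a temporary directory (an assumption made so the example does not clutter your working directory) and reads both files back. It relies on the TOML standard library, which ships with Julia 1.6 and later.

```julia
using Pkg
using TOML   # standard library in Julia 1.6+

# Assumption: generate into a temporary directory rather than the notebook folder.
pkg_dir = joinpath(mktempdir(), "my_package")
Pkg.generate(pkg_dir)

# Project.toml holds the package metadata (name, uuid, version, ...).
project = TOML.parsefile(joinpath(pkg_dir, "Project.toml"))
println(project["name"])
println(project["version"])

# The src file defines a module named after the package, with a greet() function.
source = read(joinpath(pkg_dir, "src", "my_package.jl"), String)
println(source)
```

The name in Project.toml and the module name in src/my_package.jl have to match, which is what later lets `using my_package` find the code.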

After creating the skeleton with Pkg.generate, activate the new package's environment:

In [15]:
Pkg.activate("my_package")
Pkg.status()
  Activating environment at `~/Desktop/jul_test/workflow/my_package/Project.toml`

You can also open a REPL using your project by navigating to the my_package directory and calling something like:

# bash
julia --project=. # the '.' says to open julia using the environment in the current directory

And finally, if you open a Jupyter notebook in the my_package directory, it should use the directory's Julia environment by default.
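
A third option is to activate the environment from inside a script and confirm which one is in use. The following is a minimal sketch: the temporary directory is an assumption standing in for your project folder (in a real script you would more likely pass @__DIR__), and Base.active_project() is an unexported Base function that returns the path of the Project.toml currently in effect.

```julia
using Pkg

# Assumption: a generated package in a temp directory stands in for your project.
pkg_dir = joinpath(mktempdir(), "my_package")
Pkg.generate(pkg_dir)

Pkg.activate(pkg_dir)               # same effect as starting julia --project=<pkg_dir>
active = Base.active_project()      # path to the Project.toml now in use
println(active)

Pkg.activate()                      # switch back to the default environment
```

Calling Pkg.activate() with no arguments at the end returns the session to the default environment, so later cells are not silently run against the project.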

Loading someone else's project

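To load someone else's project, activate it first and then call instantiate, which installs the dependencies recorded in its Project.toml (and Manifest.toml, if one is present):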

In [25]:
Pkg.instantiate()
Precompiling project...
my_package
  1 dependency successfully precompiled in 1 seconds
In [29]:
Pkg.status()
     Project my_package v0.1.0
      Status `~/Desktop/jul_test/workflow/my_package/Project.toml` (empty project)
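
The activate-then-instantiate sequence is easy to wrap in a small helper that can sit at the top of an analysis script. This is only a sketch: setup_environment is a hypothetical name, not part of Pkg, and the generated package in a temporary directory stands in for a project you have cloned. For a project with real dependencies, instantiate will need network access to download them.

```julia
using Pkg

# Hypothetical helper (not part of Pkg): activate the project at `path`
# and install whatever its Project.toml / Manifest.toml record.
function setup_environment(path::AbstractString)
    Pkg.activate(path)
    Pkg.instantiate()
    return Base.active_project()    # path of the Project.toml now in use
end

# Assumption: a freshly generated package stands in for someone else's project.
pkg_dir = joinpath(mktempdir(), "my_package")
Pkg.generate(pkg_dir)

project_file = setup_environment(pkg_dir)
println(project_file)

Pkg.activate()                      # return to the default environment
```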

Import your package

In [34]:
using my_package

Then call the single function in the package:

In [35]:
my_package.greet()
Hello World!
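
The same mechanism works for your own analysis code: anything defined in src/my_package.jl becomes importable once the package's environment is active. In the sketch below, the function double and the temporary directory are hypothetical additions for illustration; the rest follows the generate, activate, using sequence shown above. Note that `using` must appear at the top level of a script or cell, not inside a function.

```julia
using Pkg

# Assumption: work in a temp directory so the example is self-contained.
pkg_dir = joinpath(mktempdir(), "my_package")
Pkg.generate(pkg_dir)

# Overwrite the generated module with one that also exports a
# hypothetical analysis function, `double`.
write(joinpath(pkg_dir, "src", "my_package.jl"), """
module my_package

export double

greet() = print("Hello World!")
double(x) = 2x

end # module
""")

Pkg.activate(pkg_dir)   # make my_package findable by `using`
using my_package        # loads src/my_package.jl

println(double(21))     # exported, so no module prefix is needed; prints 42
my_package.greet()      # not exported, so it needs the module prefix
```

Exporting a name is a design choice: exported functions land in the caller's namespace, while unexported ones stay behind the module prefix, which helps keep a larger analysis readable.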

Recap: Key functions in Pkg

  • generate : Create a new, barebones Julia package structure
  • activate : Activate a Julia environment (or create one if it doesn't already exist)
  • instantiate : Install the dependencies recorded in the active project's Project.toml and Manifest.toml

Bonus: Delete the directory

In [38]:
rm("my_package", recursive=true)
In [40]:
readdir()
Out[40]:
2-element Vector{String}:
 ".ipynb_checkpoints"
 "A Julia Workflow.ipynb"