A possibly lesser known fact about Jupyter, the wondrous framework in which we write and present these notebooks, is the origin of its name. The name is a reference to the three core languages supported by Jupyter: Julia, Python, and R. We at NumFys mainly use Python in our notebooks, but it is fitting that we also write a quick introduction to the Julia programming language. Maybe sometime later, we will also write an introduction to R.
Firstly, the Julia documentation is a very good starting point; it offers a clear and simple introduction to the language. We will however here write a quick tutorial for those who want to venture from Python to Julia. We will in this notebook assume that the reader has knowledge of both Python and NumPy. For a simple example where Julia is used, see our notebook on the Bak-Sneppen model of evolution. The code should be readily understood by anyone familiar with Python and NumPy.
First of all, you need to install Julia. For this, we refer you to Julias download-page: https://julialang.org/downloads/. Find the appropriate installer for your system. Note that if you are on a Linux-system, most major distributions have Julia in their repositories. For those who prefer an IDE, we may suggest looking into Juno, https://docs.junolab.org/latest/man/installation/, we will however here only use Jupyter.
When everything is set up correctly, you should be able to use Julia in your Jupyter notebooks. When creating a new notebook, select Julia as the kernel.
At a first glance, Julia might easily be mistaken for Python; they have many similarities.
my_var = 10
println(my_var)
my_var_sqrt = sqrt(my_var)
println(my_var_sqrt)
10 3.1622776601683795
Julia also has arrays similar to NumPy.
arr = [2, 3, 5, 7, 11]
println(arr*2)
[4, 6, 10, 14, 22]
And control flow constructs are also familiar for those who venture from Python
function add_one(x)
return x + 1
end
for i in 1:10
if add_one(i) == 6
println("i+1 = 6!")
else
println("i+1 != 6, it is ", add_one(i))
end
end
i+1 != 6, it is 2 i+1 != 6, it is 3 i+1 != 6, it is 4 i+1 != 6, it is 5 i+1 = 6! i+1 != 6, it is 7 i+1 != 6, it is 8 i+1 != 6, it is 9 i+1 != 6, it is 10 i+1 != 6, it is 11
As we saw in the previous code snippet, the Julia syntax is not exactly the same as python.
For example, all code blocks must be ended with the end
keyword, while Python only separates blocks by indentation.
# Python
def my_func(x):
# This is in my function block
...
# This is not in my function block
# Julia
function my_func(x)
# This is in my function block
...
end
# This is not in my function block
There are also other, more profound differences to be aware of.
1. Julia is one-based indexed, meaning that the index of the first element in an array is 1.
my_arr = [1, 2, 3]
println(my_arr[1])
println(my_arr[0]) # Gives error, no such index.
1
BoundsError: attempt to access 3-element Array{Int64,1} at index [0] Stacktrace: [1] getindex(::Array{Int64,1}, ::Int64) at ./array.jl:744 [2] top-level scope at In[4]:3
2. From Python, and NumPy in particular, we are used to operators automatically vectorizing when appropriate.
That is, we may write np.sqrt(my_number)
and np.sqrt(my_numpy_array)
, and expect the operator to understand that the first operation is on a normal number and that the second is on an array, vectorized.
In Julia, we have to specify this.
The standard convention is to add a dot to the operator, to indicate that we mean the vectorized version.
arr = [1, 2, 3]
println(sqrt.(arr))
println(arr .^ 2) # Note that ^ is the power operator in Julia (** in Python).
# The following causes errors, because sqrt without dot is a scalar operator.
println(sqrt(arr))
println(arr ^ 2)
[1.0, 1.4142135623730951, 1.7320508075688772] [1, 4, 9]
MethodError: no method matching sqrt(::Array{Int64,1}) Closest candidates are: sqrt(!Matched::Float16) at math.jl:1085 sqrt(!Matched::Complex{Float16}) at math.jl:1086 sqrt(!Matched::Missing) at math.jl:1138 ... Stacktrace: [1] top-level scope at In[5]:4
3. Probably the most striking difference if Python is your only previous programming language, is that Julia is (dynamically) typed.
This can increase performance and lead to clearer code, but also cause some headache for those who are not accustomed to typed languages.
So what is a typed language?
In a typed language, each variable has a "type" - they contain a certain type of data, for example integers, floating numbers, strings, chars, etc.
Julia is dynamically typed, where put a bit simplified, you can specify as much or little about the type as you wish.
Most often, the Julia compiler does a very good job of infering the types of variables that are unspecified.
There are a few specific instances where declarations are helpful, these are covered in detail in Julia's Performance Tips.
The main advantage of type declaration is most often readability and extended functionality.
More information on types and the possibilities this offers, is found at Julia's documentation on types.
function demonstrate_types()
x::Int = 1 # Define x as an integer
x = x + 0.5 # Causes error, as 1.5 is not an integer.
end
demonstrate_types()
InexactError: Int64(1.5) Stacktrace: [1] Int64 at ./float.jl:709 [inlined] [2] convert(::Type{Int64}, ::Float64) at ./number.jl:7 [3] demonstrate_types() at ./In[6]:3 [4] top-level scope at In[6]:5
There is too much to be said about Julia's powerful type-system to cover it all here. It suffices to say that Julia's type-system is extremely versatile, and well worth looking into for the serious Julia developer.
Below is a short snippet courtesy of Julia's documentation which demonstrates one of the possibilities offered.
# Courtesy of Julia's documentation.
# Define an object Point, with x- and y-coordinates of some unspecified type T.
struct Point{T}
x::T
y::T
end
# Define the norm of point, for all points where the type T is a subtype of Real (Int, Float, etc.).
function norm(p::Point{<:Real})
sqrt(p.x^2 + p.y^2)
end
norm (generic function with 1 method)
4. Two print functions.
Using print
in Julia does not append a newline.
The solution is simply to use println
, short for "print newline".
5. Strict difference between String and Char.
In Python, there is no difference between Strings and Chars.
Julia, like many programming languages, differentiates between these two.
A String is delimited by double quotation marks, as in "my string"
.
A Char is, put simply, a singe character, and is delimited by single quotation marks, as in 'C'
.
When reading Julia code, you will sometimes see function names ending with an exclamation mark, for example push!(my_vector, my_element)
.
In our notebook on the Bak Sneppen model, for example, we have such a function, called simulate!
.
This is a naming convention inherited from older languages, where one appends an exclamation mark to the function name on functions that alter one or more of its arguments.
For example, one could imagine having two variants of a function for sorting a list, one with an exclamation mark and one without.
One would then expect the latter to return a new sorted list, leaving the original list as it were, while the former would be expected to sort the list in-place, ie. altering the original list.
This specific example corresponds to sort
and sorted
in Python, where the latter returns a new list which is sorted and the former sorts the list in-place.
This is especially relevant for many numerical solvers, where one in an effort to save memory, will use the memory allocated to the arguments in the process of solving the system. As a result, many solvers, especially in linear algebra, come in two variants, one with an exclamation mark and one without.
In Python, we are so used to thinking about NumPy as an intrinsic part of Python itself, that we might forget that it is simply a package.
In Julia, linear algebra is built right into the language.
Now, some might argue that linear algebra should not be included in an introductory article, but we beg to differ.
For representing vectors and matrices, we will use the Array
-object.
The syntax for creating these arrays is quite intuitive.
Elements of the same row is separated by a space, columns are separated by semicolon.
Everything is wrapped in square brackets.
Let's first show a simple example of solving a matrix equation $Ax = b$.
We use Julia's "matrix division" operator \
, defined such that x=A\b => Ax==b
.
For those experienced with Matlab, this syntax should be familiar.
# Firstly, we must include the linear algebra module.
using LinearAlgebra
A = [1 2 3; 0 1 4; 5 6 0]
b = [1; 2; 3]
#=
Note: We here use Julia's syntax for
multiline comments, '#= ... =#'.
1 2 3
A = 0 1 4
5 6 0
,
1
b = 2
3
Solve Ax=b.
=#
x = A \ b
println("x: ", x)
# Verify
println("Ax = ", A * x)
x: [27.000000000000096, -22.00000000000008, 6.00000000000002] Ax = [0.9999999999999929, 2.0, 3.0]
WARNING: using LinearAlgebra.norm in module Main conflicts with an existing identifier.
We see that our solution is correct within some round-off error.
Note: Now, some might say that it is strange that we put emphasis on the fact that Julia has linear algebra built in, and then seeing that we had to write using LinearAlgebra
.
This is simply for importing the module, which must not be confused with it being a package in itself.
Julia's linear algebra module functions as a wrapper to the powerful LAPACK library (same as used by NumPy). For specialized matrices, such as symmetric, hermitian, upper triangular, Julia can use more specialized, and thus efficient, methods. For example, we might use the specialized method for finding eigenvalues and eigenvectors of a symmetric tridiagonal matrix.
# Construct a symmetric tridiagonal matrix
diagonal = [1, 2, 3, 4]
off_diagonal = [1, 0, 1]
A = SymTridiagonal(diagonal, off_diagonal)
#=
1 1 0 0
A = 1 2 0 0
0 0 3 1
0 0 1 4
=#
# eigen calls the correct method by looking at the type of the input, here LinearAlgebra.SymTridiagonal.
eigen(A)
Eigen{Float64,Float64,Array{Float64,2},Array{Float64,1}} eigenvalues: 4-element Array{Float64,1}: 0.381966011250106 2.381966011250106 2.618033988749895 4.618033988749895 eigenvectors: 4×4 Array{Float64,2}: 0.850651 0.0 0.525731 0.0 -0.525731 0.0 0.850651 0.0 0.0 0.850651 0.0 0.525731 0.0 -0.525731 0.0 0.850651
One important point is of course, why bother? What is the benefit of using Julia over Python? The answer is speed.
In contrast to Python, Julia is a compiled language$^1$, and as a consequence one can expect huge differences in performance between Julia and Python. We will now show a simple example: Consider some lattice, or grid if you want, of size $N \times N$. At each lattice point $(i,j)$, we attribute a value $\sigma_{ij}$. For this system, we say that the energy of one lattice point, is its value times the sum of its neighbors: $\sigma_{ij} (\sigma_{\text{right}} + \sigma_{\text{left}} + \sigma_\text{up} + \sigma_\text{down})$. To find the total energy of the system, we then iterate over each lattice point, and sum the energy of each point. This is a type of model one encounters often in Physics, for example in a nearest-neigbor spin-spin interaction model. Calculations on such systems quickly become very computationally demanding as our system grows, thus efficient code is of the essence.
Note: If you are not interested in the physics of this example, and simply want to learn Julia, you may disregard the explanation above and keep on reading without any loss.
# Create some grid
N = 1000
grid = rand(N, N) # Populate our grid with random numbers between 0 and 1.
function energy(grid)
energy = 0
# This notation gives us two for-loops,
# an outer loop over i and
# an inner loop over j.
for i in 2:N-1, j in 2:N-1
right = grid[i+1, j]
left = grid[i-1, j]
up = grid[i, j+1]
down = grid[i, j-1]
# Nearest neighbor interaction
nn_interaction = grid[i,j] * (right + left + up + down)
energy += nn_interaction
end
return energy
end
energy (generic function with 1 method)
# Measure the time it takes to execute the function
@time energy(grid)
0.221708 seconds (11.89 M allocations: 196.658 MiB, 5.63% gc time)
995700.5089307781
In this example, the function takes 0.22s on our machine. In a very similar implementation in Python, shown below, the execution time on the same computer was measured to 1.21s, so 6 times slower! That is the difference between waiting one day and almost a week for some simulation!
# A similar implementation in Python.
import numpy as np
N = 1000
grid = np.random.rand(N,N)
def energy(grid):
energy = 0
for i in range(1, N-1):
for j in range(1, N-1):
right = grid[i+1, j]
left = grid[i-1, j]
up = grid[i, j+1]
down = grid[i, j-1]
# Nearest neighbor interaction
nn_interaction = 4*grid[i,j] - (right + left + up + down)
energy += nn_interaction
return energy
Admittedly, the implementation could be made more clever. Also, one could use more NumPy-functionality to speed up the Python implementation. The example does however demonstrate that Julia may offer an advantage in performance, especially in cases where one is unable to replace for-loops with NumPy functionality. In our notebook on Monte Carlo simulations on the Ising model, we showed how replacing some inner loops with Fortran could improve performance dramatically. That is also an excelent example of a situation where we could have used Julia.
This notebook is in no way a complete guide to the Julia programming language. We do however hope that it may serve as a starting point to Julia.
$^1$ Julia is a just-in-time compiled language. We will not discuss the meaning of that here, but as a mental image think of it as a middle ground between Python, which is not compiled at all, and C-like languages where one creates a new file that is the machine code to be executed.