%matplotlib inline
(c) 2019 Steve Phelps
Python is an interpreted language, in contrast to Java and C which are compiled languages.
This means we can type statements into the interpreter and they are executed immediately.
5 + 5
10
x = 5
y = 'Hello There'
z = 10.5
x + 5
10
In Python, when we write x = 5, this means something different from the equation $x=5$.
Unlike variables in mathematical models, variables in Python can refer to different things as more statements are interpreted.
x = 1
print('The value of x is', x)
x = 2.5
print('Now the value of x is', x)
x = 'hello there'
print('Now it is', x)
The value of x is 1
Now the value of x is 2.5
Now it is hello there
We can call functions in the conventional way, using round brackets:
round(3.14)
3
Values in Python have an associated type.
If we combine types incorrectly we get an error.
print(y)
Hello There
y + 5
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-8-b85a2dbb3f6a> in <module>
----> 1 y + 5

TypeError: can only concatenate str (not "int") to str
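One way to avoid the TypeError above is to convert one operand explicitly, so that both sides of the + have the same type. The following is a minimal sketch; the names greeting, as_text and as_number are chosen for illustration and do not appear elsewhere in these notes.

```python
greeting = 'Hello There'

# Convert the integer to a string, so that + means concatenation.
as_text = greeting + ' ' + str(5)
print(as_text)      # Hello There 5

# Convert a numeric string to an integer, so that + means addition.
as_number = int('5') + 5
print(as_number)    # 10
```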
We can query the type of a value using the type function.
type(1)
int
type('hello')
str
type(2.5)
float
type(True)
bool
Sometimes we need to represent "no data" or "not applicable".
In Python we use the special value None.
This corresponds to null in Java or NULL in SQL.
result = None
When we evaluate None in the interactive interpreter, no result is printed out.
result
We can check whether a value is None using the is operator:
result is None
True
x = 5
x is None
False
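None also turns up as the result of calling a function that has nothing to return. As a small illustration (the name returned_value is made up for this sketch), the built-in print() function displays its argument but returns None:

```python
# print() writes to the screen and returns nothing,
# so returned_value ends up referring to None.
returned_value = print('hello')

# Test for None with "is" rather than "==".
print(returned_value is None)   # True
```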
x = 1
x
1
type(x)
int
y = float(x)
y
1.0
We can convert a float to an integer using the int() function.
type(y)
float
int(y)
1
y = 'hello'
print('The type of the value referred to by y is', type(y))
y = 5.0
print('And now the type of the value is', type(y))
The type of the value referred to by y is <class 'str'>
And now the type of the value is <class 'float'>
1 + 1
2
'a' + 'b'
'ab'
'1' + '1'
'11'
The syntax for control structures in Python uses colons and indentation.
Beware that white-space affects the semantics of Python code.
Statements that are indented by the same amount are grouped together into a block; the usual convention is four spaces per level.
if statements
x = 5
if x > 0:
    print('x is strictly positive.')
print(x)
print('finished.')
x is strictly positive.
5
finished.
x = 0
if x > 0:
    print('x is strictly positive.')
print(x)
print('finished.')
0
finished.
if and else
x = 0
print('Starting.')
if x > 0:
    print('x is strictly positive.')
else:
    if x < 0:
        print('x is strictly negative.')
    else:
        print('x is zero.')
print('finished.')
Starting.
x is zero.
finished.
elif
print('Starting.')
if x > 0:
    print('x is strictly positive')
elif x < 0:
    print('x is strictly negative')
else:
    print('x is zero')
print('finished.')
Starting.
x is zero
finished.
We can use lists to hold an ordered sequence of values.
l = ['first', 'second', 'third']
l
['first', 'second', 'third']
Lists can contain different types of variable, even in the same list.
another_list = ['first', 'second', 'third', 1, 2, 3]
another_list
['first', 'second', 'third', 1, 2, 3]
Lists are mutable; their contents can change as more statements are interpreted.
l.append('fourth')
l
['first', 'second', 'third', 'fourth']
Whenever we bind a variable to a value in Python we create a reference.
A reference is distinct from the value that it refers to.
Variables are names for references.
X = [1, 2, 3]
Y = X
The above code creates two different references (named X and Y) to the same value [1, 2, 3].
Because lists are mutable, changing them can have side-effects on other variables.
If we append something to X, what will happen to Y?
X.append(4)
X
[1, 2, 3, 4]
Y
[1, 2, 3, 4]
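If the side-effect above is not wanted, one option is to bind the second name to a copy of the list rather than to the list itself. The sketch below uses the list method copy(); the lowercase names are chosen for illustration only.

```python
original = [1, 2, 3]
alias = original              # a second reference to the same list
duplicate = original.copy()   # a new list with the same contents

original.append(4)

print(alias)       # [1, 2, 3, 4]  (changed, because it is the same list)
print(duplicate)   # [1, 2, 3]     (unchanged, because it is a separate list)
```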
The state referred to by a variable is different from its identity.
To compare state use the == operator.
To compare identity use the is operator.
When we compare identity we check equality of references.
When we compare state we check equality of values.
X = [1, 2]
Y = [1]
Y.append(2)
X
[1, 2]
Y
[1, 2]
X == Y
True
X is Y
False
Y.append(3)
X
[1, 2]
X == Y
False
X is Y
False
We can iterate over the elements of a list using a for loop:
my_list = ['first', 'second', 'third', 'fourth']
for i in my_list:
    print(i)
first
second
third
fourth
my_list = ['first', 'second', 'third', 'fourth']
for i in my_list:
    print("The next item is:")
    print(i)
    print()
The next item is:
first

The next item is:
second

The next item is:
third

The next item is:
fourth

for i in [0, 1, 2, 3]:
    print("Hello!")
Hello!
Hello!
Hello!
Hello!
The range function
To save writing the numbers out by hand, we can use the function range() to count for us.
We count starting at 0 (as in Java and C++).
list(range(4))
[0, 1, 2, 3]
for loops with the range function
for i in range(4):
    print("Hello!")
Hello!
Hello!
Hello!
Hello!
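The range() function also accepts a start value and a step size. A short sketch of these forms, with the expected results shown as comments:

```python
# range(start, stop) counts from start up to, but not including, stop.
print(list(range(2, 6)))       # [2, 3, 4, 5]

# A third argument gives the step size.
print(list(range(0, 10, 3)))   # [0, 3, 6, 9]

# A negative step counts downwards.
print(list(range(3, 0, -1)))   # [3, 2, 1]
```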
my_list
['first', 'second', 'third', 'fourth']
my_list[0]
'first'
my_list[1]
'second'
We can also specify a range of positions.
This is called slicing.
The example below indexes from position 0 (inclusive) to 2 (exclusive).
my_list[0:2]
['first', 'second']
my_list[:2]
['first', 'second']
my_list[2:]
['third', 'fourth']
new_list = my_list[:]
new_list
['first', 'second', 'third', 'fourth']
new_list is my_list
False
new_list == my_list
True
my_list[-1]
'fourth'
my_list[:-1]
['first', 'second', 'third']
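Slices can also take a third value, the step, and negative positions count back from the end of the list. A brief sketch, using a hypothetical list called letters:

```python
letters = ['a', 'b', 'c', 'd', 'e']

last_two = letters[-2:]        # from the second-to-last position to the end
every_second = letters[::2]    # positions 0, 2, 4
backwards = letters[::-1]      # a reversed copy; letters itself is unchanged

print(last_two)       # ['d', 'e']
print(every_second)   # ['a', 'c', 'e']
print(backwards)      # ['e', 'd', 'c', 'b', 'a']
```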
Lists are an example of a collection.
A collection is a type of value that can contain other values.
There are other collection types in Python:
tuple
set
dict
Tuples are another way to combine different values.
The combined values can be of different types.
Like lists, they have a well-defined ordering and can be indexed.
To create a tuple in Python, use round brackets instead of square brackets:
tuple1 = (50, 'hello')
tuple1
(50, 'hello')
tuple1[0]
50
type(tuple1)
tuple
tuple1.append(2)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-64-46e3866e32ee> in <module>
----> 1 tuple1.append(2)

AttributeError: 'tuple' object has no attribute 'append'
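The AttributeError above arises because tuples are immutable: once created, they cannot be changed in place. Two things we can still do are unpack a tuple into separate variables and build a new, longer tuple from an existing one. A minimal sketch, with names chosen for illustration:

```python
pair = (50, 'hello')

# Unpacking assigns each element to its own variable.
number, word = pair
print(number)    # 50
print(word)      # hello

# Concatenation builds a new tuple; pair itself is unchanged.
# Note the trailing comma, which is needed for a one-element tuple.
extended = pair + (2,)
print(extended)  # (50, 'hello', 2)
print(pair)      # (50, 'hello')
```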
Lists can contain duplicate values.
A set, in contrast, contains no duplicates.
Sets can be created from lists using the set() function.
X = set([1, 2, 3, 3, 4])
X
{1, 2, 3, 4}
type(X)
set
Alternatively, we can write a set literal using the { and } brackets.
X = {1, 2, 3, 4}
type(X)
set
X.add(5)
X
{1, 2, 3, 4, 5}
X.add(5)
X
{1, 2, 3, 4, 5}
Sets do not have an ordering.
Therefore we cannot index or slice them:
X[0]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-70-19c40ecbd036> in <module>
----> 1 X[0]

TypeError: 'set' object is not subscriptable
X = {1, 2, 3}
Y = {4, 5, 6}
X | Y
{1, 2, 3, 4, 5, 6}
X = {1, 2, 3, 4}
Y = {3, 4, 5}
X & Y
{3, 4}
X - Y
{1, 2}
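Besides union (|), intersection (&) and difference (-), sets support membership tests, symmetric difference and subset comparisons. A short sketch, using the names left and right for illustration:

```python
left = {1, 2, 3, 4}
right = {3, 4, 5}

# Membership test.
print(3 in left)        # True

# Symmetric difference: elements in exactly one of the two sets.
print(left ^ right)     # {1, 2, 5}

# Subset test: is every element of {3, 4} also in left?
print({3, 4} <= left)   # True
```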
A dictionary contains a mapping between keys and their corresponding values.
Given a key, we can very quickly look up the corresponding value.
The values can be any type (and need not all be of the same type).
Keys can be any immutable (hashable) type.
In Python the dictionary type is named dict.
In other programming languages they are sometimes called associative arrays.
A dictionary contains a set of key-value pairs.
To create a dictionary:
students = { 107564: 'Xu', 108745: 'Ian', 102567: 'Steve' }
The above initialises the dictionary students so that it contains three key-value pairs.
The keys are the student id numbers (integers).
The values are the names of the students (strings).
Although we use the same brackets as for sets, this is a different type of collection:
type(students)
dict
students[108745]
'Ian'
Accessing a key that is not in the dictionary raises a KeyError:
students[123]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-77-26e887eb0296> in <module>
----> 1 students[123]

KeyError: 123
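To avoid a KeyError we can test whether a key is present using the in operator, or use the dictionary method get(), which returns a default value when the key is absent. A minimal sketch; the default string 'unknown' is an arbitrary choice for illustration:

```python
students = {107564: 'Xu', 108745: 'Ian', 102567: 'Steve'}

# The in operator tests for the presence of a key.
print(123 in students)       # False
print(108745 in students)    # True

# get() returns the given default (or None if no default is supplied)
# instead of raising a KeyError.
name = students.get(123, 'unknown')
print(name)                  # unknown
```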
students[108745] = 'Fred'
print(students[108745])
Fred
students[104587] = 'John'
print(students[104587])
John
We can use any immutable type for the keys of a dictionary.
For example, we can map names onto integers:
age = { 'John':21, 'Steve':47, 'Xu': 22 }
age['Steve']
47
We often want to initialise a dictionary with no keys or values.
To do this call the function dict():
result = dict()
for i in range(5):
    result[i] = i**2
print(result)
{0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
for student_id in students:
    print(students[student_id])
Xu
Fred
Steve
John
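Iterating over a dictionary yields its keys. When both the key and the value are needed, the items() method yields them as pairs. The sketch below rebuilds the students dictionary in the state reached above so that it can be run on its own:

```python
students = {107564: 'Xu', 108745: 'Fred', 102567: 'Steve', 104587: 'John'}

# items() yields (key, value) pairs, which we unpack in the loop header.
for student_id, name in students.items():
    print(student_id, name)

# sorted() applied to a dictionary returns its keys in sorted order.
sorted_ids = sorted(students)
print(sorted_ids)   # [102567, 104587, 107564, 108745]
```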
We can count the number of values in a collection using the len (length) function.
This can be used with any type of collection (list, set, tuple etc.).
len(students)
4
len(['one', 'two'])
2
len({'one', 'two', 'three'})
3
empty_list = []
len(empty_list) == 0
True
Python also has arrays, which contain a single type of value; that is, we cannot have different types of value within the same array.
Arrays are mutable like lists; we can modify the existing elements of an array.
However, we typically do not change the size of an array; in practice it has a fixed length.
The numpy module
Arrays are provided by a separate module called numpy.
Modules correspond to packages in e.g. Java.
We can import the module and then give it a shorter alias.
import numpy as np
We can now use the functions defined in this package by prefixing them with np.
The function array(), defined in the numpy module, creates an array from a list:
x = np.array([0, 1, 2, 3, 4])
x
array([0, 1, 2, 3, 4])
type(x)
numpy.ndarray
y = x * 2
y
array([0, 2, 4, 6, 8])
x = np.array([-1, 2, 3, -4])
y = abs(x)
y
array([1, 2, 3, 4])
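Arithmetic on an array is applied to each element, as in the examples above. This differs from lists, where the * operator repeats the list instead. A short sketch contrasting the two, with names chosen for illustration:

```python
import numpy as np

values = [0, 1, 2]
arr = np.array(values)

# Multiplying a list repeats its contents.
print(values * 2)   # [0, 1, 2, 0, 1, 2]

# Multiplying an array multiplies each element.
print(arr * 2)      # [0 2 4]

# Adding two arrays of the same length adds them element by element.
print(arr + arr)    # [0 2 4]
```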
We can create arrays of evenly spaced values using the np.arange() function:
x = np.arange(0, 10)
x
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
x = np.arange(0, 1, 0.1)
x
array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
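As with range(), np.arange() excludes the end point. When we would rather say how many points we want, and have both end points included, numpy also provides np.linspace(). A minimal sketch; the name points is chosen for illustration:

```python
import numpy as np

# 11 evenly spaced values from 0 to 1, including both end points.
points = np.linspace(0, 1, 11)

print(len(points))             # 11
print(points[0], points[-1])   # 0.0 1.0
```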
We will use a module called matplotlib to plot some simple graphs.
This module provides functions which are very similar to MATLAB plotting commands.
import matplotlib.pyplot as plt
y = x*2 + 5
plt.plot(x, y)
plt.show()
from numpy import pi, sin
x = np.arange(0, 2*pi, 0.01)
y = sin(x)
plt.plot(x, y)
plt.show()
We can use the hist() function in matplotlib to plot a histogram:
# Generate some random data
data = np.random.randn(1000)
# hist() draws the histogram on the current axes
plt.hist(data)
plt.show()
The function histogram() in the numpy module will count frequencies into bins and return the result as a pair of arrays: the counts in each bin, and the edges of the bins.
np.histogram(data)
(array([ 14, 41, 128, 178, 243, 203, 109, 66, 14, 4]), array([-2.81515826, -2.19564948, -1.57614071, -0.95663193, -0.33712315, 0.28238562, 0.9018944 , 1.52140318, 2.14091195, 2.76042073, 3.3799295 ]))
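The two arrays returned by np.histogram() can be unpacked into separate variables. Note that there is always one more bin edge than there are bins. The sketch below uses a seeded random generator so that it can be run on its own; the names rng, sample, counts and edges are chosen for illustration.

```python
import numpy as np

# A seeded generator, used here only to make the sketch self-contained.
rng = np.random.default_rng(0)
sample = rng.standard_normal(1000)

# Unpack the bin counts and the bin edges.
counts, edges = np.histogram(sample, bins=10)

print(len(counts))    # 10 bins
print(len(edges))     # 11 edges
print(counts.sum())   # every one of the 1000 values falls in some bin
```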
def squared(x):
    return x ** 2
squared(5)
25
Variables created inside functions are local to that function.
They are not accessible to code outside of that function.
def squared(x):
    temp = x ** 2
    return temp
squared(5)
25
temp
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-102-da77557ed0c8> in <module>
----> 1 temp

NameError: name 'temp' is not defined
Functions are first-class citizens in Python.
They can be passed around just like any other value.
squared
<function __main__.squared(x)>
y = squared
y
<function __main__.squared(x)>
y(5)
25
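Because functions are values, they can also be passed as arguments to other functions. The following sketch defines a hypothetical helper apply_twice(), which is not part of Python, to illustrate the idea:

```python
def squared(x):
    return x ** 2

def apply_twice(f, value):
    # Call the function f on value, then call f again on the result.
    return f(f(value))

print(apply_twice(squared, 3))   # (3 ** 2) ** 2 = 81
```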
We can apply a function to each element of a collection using the built-in function map().
This will work with any collection: list, set, tuple or string.
It takes as arguments another function and the collection we want to apply it to.
It returns an iterator over the results, which we can convert to a list using list().
list(map(squared, [1, 2, 3, 4]))
[1, 4, 9, 16]
[squared(i) for i in [1, 2, 3, 4]]
[1, 4, 9, 16]
{squared(i) for i in [1, 2, 3, 4]}
{1, 4, 9, 16}
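The same comprehension syntax can build a dictionary, by writing a key and a value separated by a colon. As a sketch, the loop used earlier to fill result with squares can be written in one line; the name squares is chosen for illustration:

```python
# A dictionary comprehension mapping each number to its square.
squares = {i: i ** 2 for i in range(5)}

print(squares)   # {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
```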
The Cartesian product of two collections $X = A \times B$ can be expressed by using multiple for statements in a comprehension.
A = {'x', 'y', 'z'}
B = {1, 2, 3}
{(a,b) for a in A for b in B}
{('x', 1), ('x', 2), ('x', 3), ('y', 1), ('y', 2), ('y', 3), ('z', 1), ('z', 2), ('z', 3)}
first_names = ('Steve', 'John', 'Peter')
surnames = ('Smith', 'Doe', 'Rabbit')
[(first_name, surname) for first_name in first_names for surname in surnames]
[('Steve', 'Smith'), ('Steve', 'Doe'), ('Steve', 'Rabbit'), ('John', 'Smith'), ('John', 'Doe'), ('John', 'Rabbit'), ('Peter', 'Smith'), ('Peter', 'Doe'), ('Peter', 'Rabbit')]
The Cartesian product pairs every combination of elements.
If we want a 1-1 pairing we use an operation called a zip.
A zip pairs values at the same position in each sequence.
Therefore:
list(zip(first_names, surnames))
[('Steve', 'Smith'), ('John', 'Doe'), ('Peter', 'Rabbit')]
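Two further properties of zip() are worth knowing: the pairs it produces can be passed directly to dict() to build a dictionary, and when the sequences differ in length it stops at the end of the shortest one. A short sketch, with the names lookup and truncated chosen for illustration:

```python
first_names = ('Steve', 'John', 'Peter')
surnames = ('Smith', 'Doe', 'Rabbit')

# Build a dictionary mapping each first name to the surname at the same position.
lookup = dict(zip(first_names, surnames))
print(lookup['John'])   # Doe

# zip() stops when the shortest sequence is exhausted.
truncated = list(zip([1, 2, 3], ['a', 'b']))
print(truncated)        # [(1, 'a'), (2, 'b')]
```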
list(map(lambda x: x ** 2, [1, 2, 3, 4]))
[1, 4, 9, 16]
We can filter a list by applying a predicate to each element of the list.
A predicate is a function which takes a single argument, and returns a boolean value.
filter(p, X) is equivalent to $\{ x \in X : p(x) \}$ in set-builder notation.
list(filter(lambda x: x > 0, [-5, 2, 3, -10, 0, 1]))
[2, 3, 1]
We can use both filter() and map() on other collections such as strings or sets.
list(filter(lambda x: x > 0, {-5, 2, 3, -10, 0, 1}))
[1, 2, 3]
Again, because this is such a common operation, we can use simpler syntax to say the same thing.
We can express a filter using a list comprehension with the keyword if:
data = [-5, 2, 3, -10, 0, 1]
[x for x in data if x > 0]
[2, 3, 1]
from numpy import sqrt
[sqrt(x) for x in data if x > 0]
[1.4142135623730951, 1.7320508075688772, 1.0]
The reduce() function cumulatively applies another function to pairs of values over the entire list, resulting in a single return value.
from functools import reduce
reduce(lambda x, y: x + y, [0, 1, 2, 3, 4, 5])
15
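To see what reduce() is doing, it can help to write the same computation as an explicit loop that carries a running result. reduce() also accepts an optional third argument, an initial value, which is used as the starting point and is returned unchanged when the list is empty. A minimal sketch, with names chosen for illustration:

```python
from functools import reduce

numbers = [0, 1, 2, 3, 4, 5]

# The same sum written as a loop with a running total.
total = numbers[0]
for n in numbers[1:]:
    total = total + n
print(total)          # 15

# With an initial value of 1, multiply the elements together.
product = reduce(lambda x, y: x * y, [1, 2, 3, 4], 1)
print(product)        # 24

# For an empty list, the initial value is returned.
empty_sum = reduce(lambda x, y: x + y, [], 0)
print(empty_sum)      # 0
```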
The map() and reduce() functions form the basis of the map-reduce programming model.
Map-reduce is the basis of modern highly-distributed large-scale computing frameworks.
It is used in BigTable, Hadoop and Apache Spark.
See these examples in Python for Apache Spark.
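To give a flavour of the map-reduce style on a single machine, the sketch below counts words across several strings using only the built-in map() and functools.reduce(). It illustrates the idea only; it is not how Hadoop or Spark are implemented, and the names documents, count_words and merge_counts are invented for this example.

```python
from functools import reduce

documents = ['the cat sat', 'the dog sat']

def count_words(text):
    # Map step: turn one document into a dictionary of word counts.
    counts = {}
    for word in text.split():
        counts[word] = counts.get(word, 0) + 1
    return counts

def merge_counts(a, b):
    # Reduce step: combine two dictionaries of counts into one.
    merged = dict(a)
    for word, n in b.items():
        merged[word] = merged.get(word, 0) + n
    return merged

# Map each document to its counts, then reduce the results pairwise.
totals = reduce(merge_counts, map(count_words, documents), {})
print(totals)   # {'the': 2, 'cat': 1, 'sat': 2, 'dog': 1}
```

In a distributed framework the map step could run on many machines at once, since each document is processed independently; only the reduce step needs to bring results together.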