Introductory Python

Quentin CAUDRON

Ecology and Evolutionary Biology

qcaudron@princeton.edu

@QuentinCAUDRON

This section moves quickly. I'm assuming that everyone speaks at least one programming language well, or has some introductory Python experience, so this chapter gives a lightning introduction to Python syntax. The sections have subheadings, but they overlap quite a lot, so treat the headings more as a page reference than as strict divisions.

Variables and Arithmetic

In [1]:
print("He said, 'what ?'")
He said, 'what ?'

Strings are delimited by ", but can also use '. This is useful because you can now use one set of quotes inside another, and it'll still be one big string.

In [2]:
s = "This is a string."
print(s)
print(type(s))
print(len(s))
This is a string.
<class 'str'>
17
In [3]:
s = 42
print(s)
print(type(s))
42
<class 'int'>

Variables don't need to be given a type, as Python is dynamically-typed. That means if I wanted to reuse s as an integer, Python would have no issue with that.

In [4]:
print(s * 2)
print(s + 7)
# Neither statement modifies the variable.
84
49

Single-line comments use #.

Arithmetic uses the standard operators : +, -, *, /.

You can take powers using **.

Python also allows += syntax :

In [5]:
s += 2**3 # s is being incremented by 2^3
print("Same as s = s + 2**3")
print(s)
Same as s = s + 2**3
50

This statement is equivalent to s = s + 2**3; it's just shorthand. The same shorthand works with -=, *=, /=, and **=.
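As a quick sketch of the other augmented assignments, here is each one applied in turn. The variable n is a throwaway name for this illustration only, and the comments assume Python 3, where / always gives a float :

```python
# n is a throwaway name used only for this illustration.
n = 10
n -= 4     # same as n = n - 4   -> 6
n *= 3     # same as n = n * 3   -> 18
n **= 2    # same as n = n ** 2  -> 324
n /= 4     # same as n = n / 4   -> 81.0 ( true division gives a float in Python 3 )
print(n)
```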

In [6]:
print(s == 42)
print(s == 50)
print(s > 10)
False
True
True

The == operator is the equality comparison operator. Here, we also see Python's syntax for logical values : True and False. As with any programming syntax, capitalisation is important. In Python, True is equal to 1 and False is equal to 0, because booleans are a special kind of integer.
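To make the link between booleans and integers concrete, here is a small sketch. The list flags is made up for the example :

```python
print(True == 1)               # True
print(False == 0)              # True
print(isinstance(True, int))   # True : bool is a subclass of int
print(True + True)             # 2

# A handy consequence : summing booleans counts how many are True.
flags = [3 > 1, 2 > 5, 10 == 10]   # hypothetical example values
count_true = sum(flags)
print(count_true)                  # 2
```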

In [7]:
x = "Blah"
print(x + x)
print(len(x))
BlahBlah
4

Strings can be concatenated using the + operator. The len() function returns the length of a string.

Lists

In [8]:
mylist = [1, 2.41341]
mylist.append("We can mix types !")

print(mylist)
print(type(mylist))
[1, 2.41341, 'We can mix types !']
<class 'list'>

Python accesses elements in lists from 0, not from 1 as in Matlab or R. This will be familiar to C and Java users.

.append is a method of the list class - it's a function that belongs to the list object, and so you can call it directly from the object itself. This particular function appends something to the end of the list.

In [9]:
print(mylist, "\n")

print(mylist[0])
print(mylist[1])
print(mylist[2])
[1, 2.41341, 'We can mix types !'] 

1
2.41341
We can mix types !

Lists have several methods ( count, sort, reverse, pop, insert, remove, ... ). Here are a few.

In [10]:
print("The list is {} elements long.\n".format(len(mylist)))

print("There are {} ones in this list.\n".format(mylist.count(1)))

mylist.reverse()
print("Reversed ! {}".format(mylist))
The list is 3 elements long.

There are 1 ones in this list.

Reversed ! ['We can mix types !', 2.41341, 1]
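The other list methods mentioned above behave in a similar way. This is a minimal sketch on a separate list, numbers ( a name invented for this example ), so that mylist stays exactly as it was :

```python
# 'numbers' is a fresh list for this illustration, so mylist is left untouched.
numbers = [3, 1, 2]

numbers.sort()            # sorts in place              -> [1, 2, 3]
numbers.insert(0, 10)     # insert 10 at index 0        -> [10, 1, 2, 3]
last = numbers.pop()      # removes and returns the last element ( 3 )
numbers.remove(10)        # removes the first occurrence of 10 -> [1, 2]

print(numbers, last)
```

Note that sort, insert, and remove modify the list directly and return None, whereas pop returns the element it removed.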

Any thoughts on why len() is a global function in Python, and not a method of the list object ?

Control Structures

Python objects like lists are iterables. That is, we can directly iterate over them :

In [11]:
for i in mylist :
    print(i)
    print("Hello\n")
print("Finished")
We can mix types !
Hello

2.41341
Hello

1
Hello

Finished

Note the indentation. Loops in Python aren't delimited by curly braces as in C or R. Each block is marked by its own level of indentation.

Typically, people use four spaces per level ( the PEP 8 convention ), but you can use any amount of whitespace you want as long as you are consistent. To end the loop, simply unindent. We'll see that in a few lines.

Users of languages like C or Java, where code blocks are delimited by curly braces, sometimes ask that they be made available in Python.

for i in range {
    do something to i
}

Python's __future__ module has already taken care of this.

In [12]:
from __future__ import braces
  File "<ipython-input-12-2aebb3fc8ecf>", line 1
    from __future__ import braces
SyntaxError: not a chance

The keyword in can also be used to check whether something is in a container :

In [13]:
print(1 in mylist)
print(2 in mylist)
True
False

If you want to loop by indexing the list, you can use range(), which, in its simplest ( single-argument ) form, yields the integers from 0 up to that argument minus 1.

In [14]:
for i in range(len(mylist)) :
    print(i, mylist[i])
0 We can mix types !
1 2.41341
2 1

Another way to do this is the enumerate function :

In [15]:
for index, value in enumerate(mylist) :
    print("Element number {} in the list has the value {}".format(index, value))
Element number 0 in the list has the value We can mix types !
Element number 1 in the list has the value 2.41341
Element number 2 in the list has the value 1
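If you would rather count from 1, enumerate takes an optional start argument. A small sketch, using a made-up list called seasons :

```python
seasons = ["spring", "summer", "autumn"]   # hypothetical list for illustration

numbered = []
for position, name in enumerate(seasons, start=1):
    numbered.append("{} : {}".format(position, name))

print(numbered)
```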

What about if statements ?

In [16]:
x = 5

if x > 3 :
    print("x is greater than 3.")
    
elif x == 5 :
    print("We aren't going to see this. Why ?")
    
else :
    print("x is not greater than 3.")
    
print("We can see this, it's not in the if statement.")
x is greater than 3.
We can see this, it's not in the if statement.

Notice how the contents of each branch of the if statement are indented, and then code that is outside the if statement continues unindented below.

Here's a nested loop to clarify :

In [17]:
for outer in range(1, 3) :
    print("BIG CLICK, outer loop change to {}".format(outer))
    
    for inner in range(4) :
        print("*little click*, outer is still {}, and inner is {}.".format(outer, inner))
        
print("I'm done here.")
BIG CLICK, outer loop change to 1
*little click*, outer is still 1, and inner is 0.
*little click*, outer is still 1, and inner is 1.
*little click*, outer is still 1, and inner is 2.
*little click*, outer is still 1, and inner is 3.
BIG CLICK, outer loop change to 2
*little click*, outer is still 2, and inner is 0.
*little click*, outer is still 2, and inner is 1.
*little click*, outer is still 2, and inner is 2.
*little click*, outer is still 2, and inner is 3.
I'm done here.

Here, we used range() with two arguments. In Python 2, it generates a list from the first argument to the second argument minus 1. In Python 3, it returns an immutable iterable, but you can cast it to a list by calling something like list(range(5)). Also, note that we can feed the print function several things to print, separated by a comma.
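Here is a short sketch of what that looks like in practice, assuming a Python 3 interpreter ( in Python 2, range() would already be a list ) :

```python
r = range(1, 3)
print(r)                 # range(1, 3) : a lazy range object, not a list
print(list(r))           # [1, 2]
print(len(r), 2 in r)    # ranges support len() and "in" without building a list

# print accepts several arguments ; sep controls what goes between them.
print("outer", 1, "inner", 0, sep=" | ")
```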

Interacting Between Different Variable Types

Beware of integer division with Python 2. Unlike R, Python 2 doesn't assume that everything is a float unless explicitly told; it recognises that 2 is an integer, and this can be good and bad. In Python 3, we don't need to worry about this; the following code was run under a Python 3 kernel, but test it under Python 2 to see the difference.

In [18]:
myint = 2
myfloat = 3.14
print(type(myint), type(myfloat))
<class 'int'> <class 'float'>
In [19]:
# Multiplying an int with a float gives a float : the int was promoted.
print(myint * myfloat)
print(type(myint * myfloat))
6.28
<class 'float'>
In [20]:
# A minor difference between Python 2 and Python 3 :
print(7 / 3)
# Py2 : 2
# Py3 : 2.3333

# In Python 2, operations between two values of the same type give that same type :
print(type(7 / 3))
# Py2 : <type 'int'>
# Py3 : <class 'float'>
2.3333333333333335
<class 'float'>
In [21]:
# Quick hack for Py2 : make one operand a float, either by casting with float() or by writing it as a float literal
print(float(7) / 3)
print(7 / 3.0)

# In Python 3, this is handled "correctly"; you can use // as integer division
print(7 // 3)
2.3333333333333335
2.3333333333333335
2
In [22]:
# Quick note for Py2 users - see https://www.python.org/dev/peps/pep-0238/
from __future__ import division
print(7 / 3)
2.3333333333333335
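A few more details on integer division that are easy to trip over, as a small sketch. In particular, // rounds down ( floors ) rather than truncating towards zero :

```python
print(7 // 3)         # 2
print(7 % 3)          # 1 : the remainder
print(divmod(7, 3))   # (2, 1) : quotient and remainder in one call

# // floors rather than truncates, which matters for negative numbers.
print(-7 // 3)        # -3, not -2
print(7.0 // 3)       # 2.0 : floor division involving a float stays a float
```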

More Lists : Accessing Elements

Let's go back to lists. They're a type of generic, ordered container; their elements can be accessed in several ways.

In [23]:
# Create a list of integers 0, 1, 2, 3, 4
A = list(range(5))
print(A)

# Py2 vs Py3 :
# In Py2, range() returns a list already
[0, 1, 2, 3, 4]
In [24]:
# Let's replace the middle element
A[2] = "Naaaaah"
print(A)
[0, 1, 'Naaaaah', 3, 4]

What are the middle three elements ? Let's use slice notation, written with a colon. Like range(), a slice takes a start and a stop.

[1:4] will give us elements 1, 2, and 3, because we stop at n-1, like with range().

In [25]:
print(A[1:4])
[1, 'Naaaaah', 3]

We don't need to give a start or an end :

In [26]:
print(A[:2])
print(A[2:])
[0, 1]
['Naaaaah', 3, 4]

Can we access the last element ? What about the last two ?

In [27]:
print(A[len(A)-2:])
print(A[-2:])
[3, 4]
[3, 4]
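For the last element on its own, a negative index does the job. A minimal sketch on a separate list, letters ( a made-up name ), showing the difference between indexing and slicing :

```python
# 'letters' is a separate list for illustration, so A is left unchanged.
letters = ["a", "b", "c", "d", "e"]

print(letters[-1])                 # 'e' : a single element, not a list
print(letters[-1:])                # ['e'] : a slice always gives a list
print(letters[len(letters) - 1])   # 'e' : the long way round
```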

Earlier, we saw that range() can take two arguments : range(start, finish). It can actually take a third : range(start, finish, stride).

In [28]:
print(list(range(0, 5, 2)))
[0, 2, 4]

The : operator can also do the same.

In [29]:
print(A[0:5:2])
# Here, it will give us elements 0, 2, 4.
[0, 'Naaaaah', 4]

What if I don't want to explicitly remember the size of the list ?

In [30]:
# This will simply go from start to finish with a stride of 2
print(A[::2])
[0, 'Naaaaah', 4]
In [31]:
# And this one, from the second element to finish, with a stride of 2
print(A[1::2])
[1, 3]
In [32]:
# So, uh... Reverse ?
print(A[::-1])
[4, 3, 'Naaaaah', 1, 0]

List arithmetic ?

In [33]:
print(A + A)
print(A * 3)
[0, 1, 'Naaaaah', 3, 4, 0, 1, 'Naaaaah', 3, 4]
[0, 1, 'Naaaaah', 3, 4, 0, 1, 'Naaaaah', 3, 4, 0, 1, 'Naaaaah', 3, 4]

Dictionaries

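Let's take a very brief look at dictionaries. These are containers that pair keys with values, similar to a std::unordered_map if you're a C++ coder. You look things up by key rather than by position; since Python 3.7, dictionaries also remember the order in which keys were inserted.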

In [34]:
pythonPoints = { "Quentin" : 1./3, "Paul" : 42, "Matthew" : 1e3 }
print(pythonPoints)
{'Quentin': 0.3333333333333333, 'Paul': 42, 'Matthew': 1000.0}
In [35]:
# Dictionaries associate keys with values
print(pythonPoints.keys())
print(pythonPoints.values())
dict_keys(['Quentin', 'Paul', 'Matthew'])
dict_values([0.3333333333333333, 42, 1000.0])
In [36]:
# You can access them through their keys
print(pythonPoints["Paul"] * 2)
84
In [37]:
if "Ruthie" in pythonPoints :     # for dicts, "in" checks the keys
    print("Ruthie's here too !")
    
else :
    pythonPoints["Ruthie"] = 0
    
print("Ruthie has {} mad skillz.".format(pythonPoints["Ruthie"]))
Ruthie has 0 mad skillz.
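The check-then-insert pattern above can also be written with two dictionary methods, get and setdefault. This is only a sketch, on a made-up dictionary called scores so that pythonPoints is not touched :

```python
# 'scores' is a hypothetical dictionary, separate from pythonPoints above.
scores = {"Paul": 42}

# get() returns a default when the key is missing, without modifying the dict.
print(scores.get("Ruthie", 0))     # 0
print("Ruthie" in scores)          # False : get() did not add the key

# setdefault() inserts the default if the key is missing, then returns the value.
ruthie = scores.setdefault("Ruthie", 0)
print(ruthie, scores)
```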

There are a couple of other built-in containers, like tuples and sets. I won't go into them here, simply because I use them so rarely that it's not worth the time during the session. If you want to read up : http://docs.python.org/2/tutorial/datastructures.html

List Comprehension and Inlines

In [38]:
# Let's build a list of elements 0^2, 1^2, ..., 5^2
y = [i**2 for i in range(6)]

print(y)
[0, 1, 4, 9, 16, 25]
In [39]:
# Want to keep your index ? Use a dictionary.
squares = { x : x**2 for x in range(6) }

for key, val in squares.items() :
    print("{} squared is {}".format(key, val))
    
# Also useful : zip()
# for key, val in zip(squares.keys(), squares.values()) :
#     print("{} : {}".format(key, val))
0 squared is 0
1 squared is 1
2 squared is 4
3 squared is 9
4 squared is 16
5 squared is 25
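Since zip() is mentioned above, here is a small sketch of it on two parallel lists. The names and values are invented for the example :

```python
names = ["Quentin", "Paul", "Matthew"]     # hypothetical example data
points = [1, 42, 1000]

# zip pairs up the elements of its arguments, position by position.
for name, score in zip(names, points):
    print("{} : {}".format(name, score))

# zip plus dict() builds a dictionary from two parallel sequences.
paired = dict(zip(names, points))
print(paired)
```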
In [40]:
# We can inline if statements too
print(42 if type(42) is int else 32)

# Note this is interpreted as
# print(something if a else something_else)
# and not
# (print(something)) if a else (do something_else)
42
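Inline if statements and list comprehensions combine in two different ways, which are easy to mix up. A minimal sketch :

```python
# A trailing "if" filters : only elements that pass the test are kept.
even_squares = [i**2 for i in range(6) if i % 2 == 0]
print(even_squares)    # [0, 4, 16]

# A conditional expression at the front chooses a value for every element.
labels = ["even" if i % 2 == 0 else "odd" for i in range(4)]
print(labels)          # ['even', 'odd', 'even', 'odd']
```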

Functions

In [41]:
# Fibonacci numbers
# OH NO RECURSION

def fib(n) :
    if n < 2 :
        return n
    else :
        return fib(n-1) + fib(n-2)
    
print("Done defining.")
Done defining.
In [42]:
# Testing :
for i in range(10) :
    print(fib(i))
0
1
1
2
3
5
8
13
21
34

Looks good. We've just defined a function that takes one argument, n, and returns something based on what n is. The Fibonacci function is quite particular because it calls itself ( recursion ), but it's a small, fun example, so why not.
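For comparison, here is one possible non-recursive version. It is only a sketch, and fib_iterative is a name invented for this example; it should give the same numbers as fib while avoiding the repeated work that the recursive version does for larger n :

```python
def fib_iterative(n):
    """Return the nth Fibonacci number without recursion ( hypothetical helper )."""
    a, b = 0, 1
    for _ in range(n):
        # Slide the pair ( F(k), F(k+1) ) one step along the sequence.
        a, b = b, a + b
    return a

print([fib_iterative(i) for i in range(10)])
```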

In [43]:
def printFib(i) :
    print("The {}th number of the Fibonacci sequence is {}.".format(i, fib(i)))
In [44]:
printFib(20)
The 20th number of the Fibonacci sequence is 6765.

Here, each {} is a placeholder that .format() fills in with its arguments, in order. Older code often uses the % operator instead, where %d is a format code for integers, %f is for floating point numbers ( floats ), and %s is for strings.

With .format(), passing more than one thing in just means passing several arguments. With the % style, you have to put them into round brackets. This is a tuple, which we mentioned briefly earlier. It's basically just an immutable list.
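To see the two formatting styles side by side, here is a small sketch. The values are just examples, and both lines should produce the same string :

```python
i, value = 20, 6765       # example values : the 20th Fibonacci number
ratio = 1.0 / 3

# str.format : {} placeholders, optionally with a format spec after a colon.
new_style = "The {}th number is {:d}, and a third is {:.3f}.".format(i, value, ratio)

# % formatting : the values are passed as a tuple after the % operator.
old_style = "The %dth number is %d, and a third is %.3f." % (i, value, ratio)

print(new_style)
print(old_style)
```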

In [45]:
# I modified this one from Learn Python The Hard Way ( highly recommended ) :
formatstring = "Start {} {}"
print(formatstring.format(formatstring, formatstring))
Start Start {} {} Start {} {}

Also worth knowing are \n and \t : the newline and tab characters, respectively.

In [46]:
# Written on-the-fly, because I got mad skills
print("This is a haiku\n\tI'm awful at poetry\nWait, this really worked")
This is a haiku
	I'm awful at poetry
Wait, this really worked

File IO

A very, very quick look at file IO, because there are packages that can do a better job.

In [47]:
# Using "with" closes the file automatically when the block ends
with open("example.txt", "r") as myfile :
    for line in myfile :
        print(line.strip("\n"))
    
# There are other options instead of looping over each line. 
# You can instead use myfile.read().
# Writing : you can dump a variable using myfile.write() 
# after having opened it in "w" mode.

# There are many other ways to read and write files, 
# including ways to read and write CSV directly.
This is an example text file.

It contains three lines.

Maybe four.

Four, definitely four.
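Writing works much the same way as reading. This is a minimal sketch that writes to a temporary directory and reads the result back; the filename scratch.txt is made up, and the temporary directory is only there so that no real file gets overwritten :

```python
import os
import tempfile

# Write to a temporary directory so that no real file is overwritten.
# The filename "scratch.txt" is made up for this example.
path = os.path.join(tempfile.mkdtemp(), "scratch.txt")

# "w" mode creates ( or truncates ) the file ; "with" closes it for us.
with open(path, "w") as outfile:
    outfile.write("First line.\nSecond line.\n")

# read() returns the whole file as one string.
with open(path, "r") as infile:
    contents = infile.read()

print(contents.splitlines())
```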

Syntax Exercises

( very easy )

  • Generate a list A of integers from 4 to 28 inclusive in strides of 3.
    • range(), and list() if you're in Python 3
  • Generate a new list B, squaring each element of A.
    • list comprehensions
    • ** operator
  • Append a reversed version of B to itself
    • + or +=
    • [::-1] - be careful with reverse, it affects the list directly !
  • Write a function called addflip that will do all of this and return you the new list.
    • def
    • return
In [48]:
A = list(range(4, 29, 3))
print(A)
[4, 7, 10, 13, 16, 19, 22, 25, 28]
In [49]:
B = [a**2 for a in A]
print(B)
[16, 49, 100, 169, 256, 361, 484, 625, 784]
In [50]:
B += B[::-1]
print(B)
[16, 49, 100, 169, 256, 361, 484, 625, 784, 784, 625, 484, 361, 256, 169, 100, 49, 16]
In [51]:
def addflip(mylist) :
    squared = [element**2 for element in mylist]
    return squared + squared[::-1]

print(addflip(range(5)))
[0, 1, 4, 9, 16, 16, 9, 4, 1, 0]