Information Retrieval Lab: Python Tutorial 🐍

(Re)sources:

Obligatory Wikipedia excerpt:

Python is an interpreted, high-level and general-purpose programming language. Created by Guido van Rossum and first released in 1991, Python's design philosophy emphasizes code readability with its notable use of significant whitespace. Its language constructs and object-oriented approach aim to help programmers write clear, logical code for small and large-scale projects.

Python is dynamically typed and garbage-collected. It supports multiple programming paradigms, including structured (particularly, procedural), object-oriented, and functional programming. Python is often described as a "batteries included" language due to its comprehensive standard library.

An important goal of Python's developers is keeping it fun to use. This is reflected in the language's name—a tribute to the British comedy group Monty Python—and in occasionally playful approaches to tutorials and reference materials, such as examples that refer to spam and eggs (from a famous Monty Python sketch) instead of the standard foo and bar.

A common neologism in the Python community is pythonic, which can have a wide range of meanings related to program style. To say that code is pythonic is to say that it uses Python idioms well, that it is natural or shows fluency in the language, that it conforms with Python's minimalist philosophy and emphasis on readability. In contrast, code that is difficult to understand or reads like a rough transcription from another programming language is called unpythonic.

The language's core philosophy is summarized in the document The Zen of Python (PEP 20) [...]

In [ ]:
# Make the notebook page wider
from IPython.display import HTML, display
display(HTML("<style>.container { width:80% !important; }</style>"))

In [ ]:
import this


Python is Dynamically Typed

In [ ]:
x = 1         # x is an integer
x = 1.        # x is now a float
x = 'Hello, world!'   # x is now a string
x = [1, 2, 3, 4] # x is now a list


A Python integer is not a C integer:

The C struct behind the curtain:

struct _longobject {
    long ob_refcnt;
    PyTypeObject *ob_type;
    size_t ob_size;
    long ob_digit[1];
};
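
One visible consequence of this layout is that a Python integer is not tied to a fixed machine width: the digit array grows as needed, and every integer carries object overhead. The sketch below assumes CPython; the sizes reported by `sys.getsizeof` are implementation details that vary by version and platform, so no specific byte counts are claimed.

```python
import sys

big = 2 ** 100              # far beyond the range of a 64-bit C long
print(big)                  # exact value, no overflow
print(big.bit_length())     # number of bits needed to represent it

# The object header (reference count, type pointer, size) means even a
# small integer occupies more memory than a bare C integer would.
small_size = sys.getsizeof(1)
big_size = sys.getsizeof(big)
print(small_size, big_size)
```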


Everything is an object (even functions)

In [ ]:
x=1
type(x)

In [ ]:
x=1.
type(x)

In [ ]:
x = 'Hello, World!'
type(x)

In [ ]:
x = [1,2,3,4]
type(x)

In [ ]:
x = [1, 2, 3]
y = x

In [ ]:
x.append(4) # append 4 to x
print(y) # y's list is modified as well!

In [ ]:
x = 'something else'
print(y)  # y is unchanged
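
The heading also mentions functions. A minimal sketch of what that means in practice: a function has a type, can be bound to a second name, and can be stored in a container like any other value. The names `shout`, `alias`, and `actions` are made up for this illustration.

```python
def shout(text):
    """Return text in upper case followed by an exclamation mark."""
    return text.upper() + "!"

print(type(shout))          # functions have a type, like any other object

alias = shout               # a second name bound to the same function object
print(alias is shout)
print(alias("spam"))

# Functions can be stored in data structures and called later.
actions = {"loud": shout, "length": len}
results = {name: func("eggs") for name, func in actions.items()}
print(results)
```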


Arithmetic Operations

Python implements seven basic binary arithmetic operators:

| Operator | Name | Description |
|----------|------|-------------|
| a + b | Addition | Sum of a and b |
| a - b | Subtraction | Difference of a and b |
| a * b | Multiplication | Product of a and b |
| a / b | True division | Quotient of a and b |
| a // b | Floor division | Quotient of a and b, rounded down to a whole number |
| a % b | Modulus | Remainder after division of a by b |
| a ** b | Exponentiation | a raised to the power of b |
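
The division operators are the ones most likely to surprise, especially with negative operands. A short sketch with arbitrarily chosen values:

```python
a, b = 7, 2

true_div = a / b        # 3.5: always a float, even for integer operands
floor_div = a // b      # 3: rounded down
remainder = a % b       # 1
power = a ** b          # 49

# Floor division rounds toward negative infinity, not toward zero,
# and the result of % takes the sign of the right-hand operand.
neg_floor = -7 // 2     # -4
neg_mod = -7 % 2        # 1

# The two operators are linked by the identity x == (x // y) * y + (x % y).
identity_holds = (-7 // 2) * 2 + (-7 % 2) == -7
print(true_div, floor_div, remainder, power, neg_floor, neg_mod, identity_holds)
```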

Boolean Operations

When working with Boolean values, Python provides operators to combine the values using the standard concepts of "and", "or", and "not". Predictably, these operators are expressed using the words and, or, and not:

In [ ]:
a=5
b=2
c=3

In [ ]:
a==5 and b==2

In [ ]:
a==5 and not c==5

In [ ]:
(a==5 or not b==6) and not c==3

In [ ]:
a==5 or not b==6 and not c==3
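
The parenthesized and unparenthesized versions give different answers because of operator precedence: comparisons bind tightest, then `not`, then `and`, then `or`. The sketch below repeats both expressions with the values used above and spells out the implicit grouping.

```python
a, b, c = 5, 2, 3

# Parentheses force the `or` to be evaluated first.
grouped = (a == 5 or not b == 6) and not c == 3

# Without them, `and` binds tighter than `or`.
ungrouped = a == 5 or not b == 6 and not c == 3

# The same expression with every implicit grouping written out.
explicit = (a == 5) or ((not (b == 6)) and (not (c == 3)))

print(grouped, ungrouped, explicit)
```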


Identity and Membership Operators

Python also contains prose-like operators to check for identity and membership. They are the following:

| Operator | Description |
|----------|-------------|
| a is b | True if a and b are identical objects |
| a is not b | True if a and b are not identical objects |
| a in b | True if a is a member of b |
| a not in b | True if a is not a member of b |

In [ ]:
a = [1,2,3]
b = a
b is a

In [ ]:
a = [1,2,3]
b = [1,2,3]
b is a

In [ ]:
1 in [1,2,3]

In [ ]:
'foo' in [1,2,3]

In [ ]:
5 not in [1,2,3]
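
Identity (`is`) is easy to confuse with equality (`==`). A minimal sketch of the difference, plus membership tests on containers other than lists; all variable names here are illustrative.

```python
a = [1, 2, 3]
b = [1, 2, 3]

same_value = a == b       # True: the two lists have equal contents
same_object = a is b      # False: they are two separate objects

result = None
is_missing = result is None   # the usual idiom for comparing against None

# `in` on a dict tests its keys; on a string it tests for a substring.
has_key = 'one' in {'one': 1}
has_substring = 'spam' in 'spam and eggs'

print(same_value, same_object, is_missing, has_key, has_substring)
```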


Python Scalar Types

| Type | Example | Description |
|------|---------|-------------|
| int | x = 1 | Integers (i.e., whole numbers) |
| float | x = 1.0 | Floating-point numbers (i.e., real numbers) |
| complex | x = 1 + 2j | Complex numbers (i.e., numbers with real and imaginary part) |
| bool | x = True | Boolean: True/False values |
| str | x = 'abc' | String: characters or text |
| NoneType | x = None | Special object indicating nulls |
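
The type names in the table can be checked directly with `type()`. A small sketch using the same example values:

```python
examples = [1, 1.0, 1 + 2j, True, 'abc', None]
type_names = [type(value).__name__ for value in examples]
print(type_names)

# bool is a subclass of int, so True and False behave as 1 and 0 in arithmetic.
bool_is_int = isinstance(True, int)
print(bool_is_int, True + True)
```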


Python Data Structures

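| Type Name | Example | Description |
|-----------|---------|-------------|
| list | [1, 2, 3] | Ordered collection |
| tuple | (1, 2, 3) | Immutable ordered collection |
| dict | {'a':1, 'b':2, 'c':3} | (key, value) mapping; keeps insertion order since Python 3.7 |
| set | {1, 2, 3} | Unordered collection of unique values |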
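
The properties in the Description column can be observed directly. The sketch below uses throwaway names (`point`, `values`, `d`) and assumes Python 3.7 or later for the dict ordering behaviour.

```python
point = (1, 2, 3)
try:
    point[0] = 10               # tuples do not support item assignment
except TypeError as err:
    tuple_error = str(err)
print(tuple_error)

values = [3, 1, 3, 2, 1]
unique = set(values)            # duplicates collapse into one entry each
print(unique)

d = {'b': 2, 'a': 1}
d['c'] = 3
key_order = list(d)             # keys come back in insertion order
print(key_order)
```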

Lists

In [ ]:
L = [2, 3, 5, 7]

In [ ]:
# Length of a list
len(L)

In [ ]:
# Append a value to the end
L.append(11)
L

In [ ]:
# Addition concatenates lists
L + [13, 17, 19]

In [ ]:
# sort() method sorts in-place
L = [2, 5, 1, 6, 3, 4]
L.sort()
L

In [ ]:
L = [1, 'two', 3.14, [0, 3, 5]]

In [ ]:
L = [2, 3, 5, 7, 11]

In [ ]:
L[0]

In [ ]:
L[1]

In [ ]:
L[-1]

In [ ]:
L[-2]

In [ ]:
L[0:3]

In [ ]:
L[:3]

In [ ]:
L[3:]

In [ ]:
L[:3]+L[3:]

In [ ]:
L.reverse() # reverse() also works in-place
L

In [ ]:
l = [7, 1, 2]
r = [9, 6, 8]
l + r*2

In [ ]:
print(l)
l.sort()
print(l)

In [ ]:
print(r)
sorted(r)

In [ ]:
print(r)

In [ ]:
sorted?

In [ ]:
dir(l)
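
Slices also accept an optional third value, the step, written as `L[start:stop:step]`. A brief sketch on the same list of primes used above:

```python
L = [2, 3, 5, 7, 11]

every_other = L[::2]       # [2, 5, 11]
reversed_copy = L[::-1]    # [11, 7, 5, 3, 2]; L itself is unchanged
middle = L[1:-1]           # [3, 5, 7]

# Slices tolerate out-of-range bounds, whereas L[100] would raise IndexError.
clipped = L[2:100]         # [5, 7, 11]

print(every_other, reversed_copy, middle, clipped)
```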


Dicts

In [ ]:
numbers = {'one':1, 'two':2, 'three':3}

In [ ]:
numbers['two']

In [ ]:
numbers['ninety'] = 90

In [ ]:
k = numbers.keys()
k

In [ ]:
v = numbers.values()
v

In [ ]:
dir(numbers)

In [ ]:
for thing in numbers.keys():
    print(thing)
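
A few more dict operations come up constantly: iterating over key/value pairs with `items()`, looking up a key that may be absent with `get()`, and testing membership. The sketch uses its own small dict so it can run on its own.

```python
numbers = {'one': 1, 'two': 2, 'three': 3}

# items() yields (key, value) pairs.
for name, value in numbers.items():
    print(name, value)

missing = numbers.get('ten')          # None instead of a KeyError
fallback = numbers.get('ten', 10)     # an explicit default value
present = 'two' in numbers            # membership tests look at the keys
removed = numbers.pop('three')        # remove a key and return its value

print(missing, fallback, present, removed, numbers)
```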


Sets

In [ ]:
primes = {2, 3, 5, 7}
odds = {1, 3, 5, 7, 9}

In [ ]:
# union: items appearing in either
primes | odds      # with an operator
primes.union(odds) # equivalently with a method

In [ ]:
# intersection: items appearing in both
primes & odds             # with an operator
primes.intersection(odds) # equivalently with a method

In [ ]:
# difference: items in primes but not in odds
primes - odds           # with an operator
primes.difference(odds) # equivalently with a method

In [ ]:
# symmetric difference: items appearing in only one set
primes ^ odds                     # with an operator
primes.symmetric_difference(odds) # equivalently with a method

In [ ]:
numbers = {'one', 'four', 'twenty'}

In [ ]:
k & numbers

In [ ]:
k | numbers

In [ ]:
k - numbers

In [ ]:
k ^ numbers
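
The last four cells work because the object returned by `dict.keys()` is a set-like view: it supports the set operators and returns ordinary sets. It is also a live view, so later changes to the dict show through. A self-contained sketch with hypothetical names `lookup`, `keys`, and `words`:

```python
lookup = {'one': 1, 'two': 2, 'three': 3}
keys = lookup.keys()                 # a live, set-like view of the keys
words = {'one', 'four', 'twenty'}

common = keys & words                # {'one'}
only_in_dict = keys - words          # {'two', 'three'}

lookup['four'] = 4                   # the view reflects this change
common_after = keys & words          # {'one', 'four'}

print(common, only_in_dict, common_after)
```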


Control Flow

In [ ]:
x = -15

if x == 0:
    print(x, "is zero")
elif x > 0:
    print(x, "is positive")
elif x < 0:
    print(x, "is negative")
else:
    print(x, "is unlike anything I've ever seen...")


Conditional Expression

Introduced in PEP 308, and often referred to as a ternary operator:

x = x_if_true if condition else x_if_false


which is the succinct version of:

if condition:
    x = x_if_true
else:
    x = x_if_false

In [ ]:
sun_shining = False
x = 35 if sun_shining else -4
print(x)

In [ ]:
def sun_shining(sun_shining=True):
    return 35 if sun_shining else -4

In [ ]:
sun_shining(True)

In [ ]:
sun_shining(False)


for loops

In [ ]:
for N in [2, 3, 5, 7]:
    print(N, end=' ') # print all on same line

In [ ]:
for i in range(10):
    print(i, end=' ')

In [ ]:
for n in range(20):
    # if the remainder of n / 2 is 0, skip the rest of the loop
    if n % 2 == 0:
        continue
    print(n, end=' ')
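
Two built-ins are commonly paired with `for`: `enumerate`, when the position of each item is needed, and `zip`, when walking two sequences in step. A small sketch with made-up lists:

```python
primes = [2, 3, 5, 7]
names = ['two', 'three', 'five', 'seven']

# enumerate yields (index, item) pairs, avoiding range(len(...)).
indexed = []
for i, p in enumerate(primes):
    indexed.append((i, p))

# zip pairs up items from several iterables and stops at the shortest.
paired = []
for name, p in zip(names, primes):
    paired.append(f"{name}={p}")

print(indexed)
print(paired)
```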


while loops

In [ ]:
i = 0
while i < 10:
    print(i, end=' ')
    i += 1

In [ ]:
a, b = 0, 1
amax = 100
L = []

while True:
    (a, b) = (b, a + b)
    if a > amax:
        break
    L.append(a)

print(L)


Functions

In [ ]:
def fibonacci(N, a=0, b=1):
    L = []
    while len(L) < N:
        a, b = b, a + b
        L.append(a)
    return L

In [ ]:
fibonacci(10)

In [ ]:
fibonacci(10, b=3, a=1)

In [ ]:
def catch_all(*args, **kwargs):
    print("args =", args)
    print("kwargs = ", kwargs)

In [ ]:
catch_all(1, 2, 3, a=4, b=5)

In [ ]:
catch_all('a', keyword=2)

In [ ]:
inputs = (1, 2, 3)
keywords = {'pi': 3.14}

catch_all(*inputs, **keywords)

In [ ]:
add = lambda x, y: x + y

In [ ]:
things = ["cat", "apple", "boat"]
sorted(things) # alphabetically, upper case first

In [ ]:
sorted(things, key=lambda x: len(x))
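
Since functions are objects, the `key` argument accepts any callable, not only a lambda. The sketch below adds a capitalized word (`"Zebra"`, an addition for this illustration) to show what "upper case first" means and how a key function changes the result.

```python
things = ["cat", "apple", "boat", "Zebra"]

by_default = sorted(things)                    # upper-case letters sort first
by_length = sorted(things, key=len)            # same as key=lambda x: len(x)
ignoring_case = sorted(things, key=str.lower)  # compare lower-cased copies
longest_first = sorted(things, key=len, reverse=True)

print(by_default)
print(by_length)
print(ignoring_case)
print(longest_first)
```

Items with equal keys keep their original relative order, because Python's sort is stable.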


Standard Library

Math

In [ ]:
import math

math.log2(1024)

In [ ]:
math.log(math.e)

In [ ]:
math.cos(math.pi)
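
Results from `math` are floating-point numbers, so they can differ from the mathematically exact value by a tiny rounding error. A sketch of the usual remedy, `math.isclose`, under the assumption of standard IEEE 754 double-precision floats:

```python
import math

root = math.sqrt(2)
exact = root * root == 2.0                # False: rounding error creeps in
close = math.isclose(root * root, 2.0)    # True: equal within a relative tolerance

sin_pi = math.sin(math.pi)                # tiny but nonzero, since math.pi is rounded
# When comparing against zero, an absolute tolerance is needed.
near_zero = math.isclose(sin_pi, 0.0, abs_tol=1e-12)

print(exact, close, sin_pi, near_zero)
```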


Random

In [ ]:
import random as rnd  ## you can re-name imported modules

rnd.randint(1, 6)  ## Here, the end points are both included

In [ ]:
things = ['cat', 'apple', 'boat']
rnd.choice(things)
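
For reproducible experiments it helps to fix the seed of the random number generator. A minimal sketch using separate `random.Random` instances; the seed value 42 is arbitrary, and no particular output values are claimed.

```python
import random

# Two generators created with the same seed produce the same sequence.
rng_a = random.Random(42)
rng_b = random.Random(42)

rolls_a = [rng_a.randint(1, 6) for _ in range(5)]
rolls_b = [rng_b.randint(1, 6) for _ in range(5)]

print(rolls_a)
print(rolls_a == rolls_b)
```

Calling `random.seed(42)` has the same effect on the module-level functions such as `rnd.randint`.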


urllib

In [ ]:
## Modules can have sub-modules
import urllib.request as rq

response = rq.urlopen("http://en.wikipedia.org/wiki/Python")



itertools

In [ ]:
import itertools

perms = itertools.permutations([1, 2, 3], r=2)
# r-length tuples, all possible orderings, no repeated elements
# default r: length of the iterable

for p in perms:
    print(p)

In [ ]:
combs = itertools.combinations([1, 2, 3], r=2)
# r-length tuples, in sorted order, no repeated elements

print(list(combs))
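
Two things are worth checking here: the number of results matches the textbook formulas n!/(n-r)! and n!/(r!(n-r)!), and the objects returned by `itertools` are one-shot iterators that are empty after a single pass. A sketch with a four-element list chosen for illustration:

```python
import itertools
import math

items = [1, 2, 3, 4]
n, r = len(items), 2

perms = list(itertools.permutations(items, r))
combs = list(itertools.combinations(items, r))

expected_perms = math.factorial(n) // math.factorial(n - r)   # n! / (n-r)!
expected_combs = expected_perms // math.factorial(r)          # n! / (r! (n-r)!)
print(len(perms), expected_perms)
print(len(combs), expected_combs)

# An itertools iterator can only be consumed once.
it = itertools.permutations([1, 2, 3], r=2)
first_pass = list(it)
second_pass = list(it)      # already exhausted, so this is empty
print(len(first_pass), len(second_pass))
```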


Comprehensions

List Comprehensions

In [ ]:
[n for n in range(11)]

In [ ]:
perms = itertools.permutations([1, 2, 3], r=2)
[p for p in perms]

In [ ]:
perms = itertools.permutations([1, 2, 3], r=2)
[thing for thing in perms]

In [ ]:
[n**2 for n in range(11)]

In [ ]:
[n for n in range(11) if n%3] # n%3 is shorthand for n%3!=0

In [ ]:
[n if n%2 else -n for n in range(11) if n%3]

In [ ]:
[(n,n) if n%2 else (-n,9) for n in range(11) if n%3]
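
The comprehensions with both a conditional expression and a trailing `if` are easier to read once unrolled. The trailing `if` filters which items are kept; the `... if ... else ...` at the front chooses the value for each kept item. An equivalent loop, written out as a sketch:

```python
compact = [n if n % 2 else -n for n in range(11) if n % 3]

expanded = []
for n in range(11):
    if n % 3:                                 # filter: skip multiples of 3
        expanded.append(n if n % 2 else -n)   # value: negate even numbers

print(compact)
print(compact == expanded)
```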


Dictionary Comprehensions

In [ ]:
list_of_tuples = [(n,n) if n%2 else (-n,9) for n in range(11) if n%3]
{a:b for a,b in list_of_tuples}

In [ ]:
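# `numbers` was rebound to a set in the Sets section, so restore the dict here
numbers = {'one': 1, 'two': 2, 'three': 3, 'ninety': 90}
numbers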

In [ ]:
{v:k for k,v in numbers.items()}


Set Comprehensions

In [ ]:
{a%4 for a in range(1000)}


Lambdas

Lambdas can be used to create anonymous functions.

In [ ]:
sum_lambda = lambda x, y: x+y

sum_lambda(3,2)


Map, Filter, Reduce

Python supports the map, filter, and reduce functions. All three can be replaced with list comprehensions or loops, but they often provide a more elegant solution. Keep in mind that map and filter return lazy iterators, so we have to pass them to list() to actually apply the transformation and see the results; reduce returns a single value directly.

In [ ]:
list(map(lambda x: x+1, [1,2,3,4,5]))

In [ ]:
list(filter(lambda x: x % 2 == 0, [1,2,3,4,5]))

In [ ]:
# in Python3, reduce() isn't a built-in function anymore
# and has to be imported from the functools module
from functools import reduce

reduce(lambda x, y: x+y, [1,2,3,4,5])
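
For comparison, here are the same three operations written with comprehensions and built-ins, together with a demonstration that a `map` object is consumed after one pass. This is a sketch on the same five-element list, not a claim about which style is preferable.

```python
from functools import reduce

data = [1, 2, 3, 4, 5]

# map() is lazy: nothing is computed until the iterator is consumed,
# and it can only be consumed once.
mapped = map(lambda x: x + 1, data)
first_pass = list(mapped)       # [2, 3, 4, 5, 6]
second_pass = list(mapped)      # []

# Comprehension and built-in equivalents of the three cells above.
incremented = [x + 1 for x in data]
evens = [x for x in data if x % 2 == 0]
total = sum(data)

# reduce() accepts an optional initial value as a third argument,
# which also makes it safe to call on an empty sequence.
total_reduce = reduce(lambda x, y: x + y, data, 0)
empty_total = reduce(lambda x, y: x + y, [], 0)

print(first_pass, second_pass, incremented, evens, total, total_reduce, empty_total)
```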