Introduction to Python and Jupyter notebooks¶

Python is a general computing language that is very popular in the sciences and also in statistics. Code along with maths are often elegantly presented using Jupyter notebooks, where one can use Markdown and Latex. It is also possible to run R within Jupyter notebooks. A key feature to keep in mind is that indented coding is required.

Jupyter notebooks can be run online, and also locally. An easy way to obtain Python and Juptyer is to use the Anaconda package.

Maths and Stats in Python¶

Since Python is general computing language, you may need to install some packages, such as Numpy to do various basic operations that are available to you by default, in say R. In what follows we illustrate some common useful operations. In this module, we will only use basic packages, and we will do our own coding.

In [1]:

import numpy as np       #you may need to install this package

y = np.array([1,2,3])     # to make vectors
x = np.array([4,5,6])
print(x+y)                
z=np.concatenate((x,y))   # like the combine function in R
print(z)
print(len(z))             # length of a vector
print(y[0])                # indexing starts at zero
print(y[2])
print(x[-1])               # last element of the vector

t=np.linspace(0,1,num=10)
print(t)

[5 7 9]
[4 5 6 1 2 3]
6
1
3
6
[0.         0.11111111 0.22222222 0.33333333 0.44444444 0.55555556
 0.66666667 0.77777778 0.88888889 1.        ]

In [2]:

import matplotlib.pyplot as plt  # In order to plot functions
t=np.linspace(0,2,num=2000)
y = pow(t, 2)
plt.plot(t,y, label='Square')
plt.plot(t,np.exp(t), label='Exp')
plt.legend(loc='upper left')
plt.show()

Simulating random variables in Python¶

In what follows, we simulate pseudo random numbers in Python.

In [3]:

u=np.random.random()  # calls a uniform [0,1]
v=np.random.normal(0, 1)  # calls a standard normal

print(u)
print(v)

u = [np.random.random() for _ in range(10)]  # calls 10 uniforms
print(u)
v = [ np.random.normal(0, 1) for _ in range(10)] # calls 10 standard normals
print(v)



v = np.random.standard_t(4)   # calls t-distribution with 4 degrees of freedom
print(v)



x = np.random.binomial(1, 0.3, 2) # calls two biased coin flips
print(x)

0.7639152015372788
0.5241284824517142
[0.8577452281049696, 0.7002250732042746, 0.7679407345709673, 0.540571421185523, 0.6178957502185759, 0.6303587568984433, 0.44381978342373674, 0.07826520978324247, 0.0774643659498514, 0.5563360801325067]
[-0.39788099950984124, 0.5932487482514452, 1.9559293930657926, 2.456239766112847, -0.9415710934995281, -0.3874814447265034, -0.5755147958585493, -0.8992606007545129, -0.7642800460358097, -0.3047355622740535]
-0.24080569241323488
[0 1]

Von Neumann's method of creating a fair toss, from a possibly biased coin of unknown parameter¶

Suppose we are given a coin, with unknown parameter $p$, so that the probability that the outcome of a flip is heads is $p$ which is not necessary $\tfrac{1}{2}$. We want to generate a method of simulating a fair toss. It is easy to verify that the following method works. We toss the coin twice:

If we see HT, then we consider this as H;
If we see TH, then we consider this T;
If we see HH or TT, we lose, and we start all over, and have to toss the coin at least twice again.

The following code will illustrate the use a function and a while loop.

In [4]:

def vn(p):   
    x=np.array([1,1])  # notice the indent the lack of brakcets   
    num=np.sum(x)
    while num==0 or num==2:     # notice the indent
        x = np.random.binomial(1, p, 2)
        num = np.sum(x)
    return x[0]

z= vn(0.3)
print(z)

We run some simple (but not complete) verfications to make sure the defined function does indeed give fair flip, when $p=0.2$.

In [5]:

x = [vn(0.2)  for _ in range(10)]
print(x)
x = [vn(0.2)  for _ in range(1000)]
print(np.mean(x))
print(np.std(x))

[0, 0, 1, 1, 0, 0, 1, 0, 0, 1]
0.496
0.4999839997439918

Summing two independent uniforms¶

We will run some simple code to discover the resulting distribution.

In [6]:

x = [np.random.uniform() for _ in range(2)]
y = [np.random.uniform() for _ in range(2)]
print(x)
print(y)
z = x+y
print(z)  # notice what happens

x = np.array([np.random.uniform() for _ in range(10000)])
y = np.array([np.random.uniform() for _ in range(10000)])
z = x+y  #now we get component-wise addition

plt.hist(z, bins = 50, density=True)  # this gives a probability histogram, rather than a frequency one.
plt.show()

[0.3050875231505201, 0.09758995594201147]
[0.07175069206742335, 0.5286999259718799]
[0.3050875231505201, 0.09758995594201147, 0.07175069206742335, 0.5286999259718799]

R inside Jupyter¶

It is possible to use R with Jupyter notebooks. See this ipynb file. It is a little harder to run both R and Python on the same notebook. But it is possible.

Other resources¶

In [7]:

from datetime import datetime
print(datetime.now())

2021-10-02 01:14:56.762181