Lab 0: Getting Started with Python for ECE 314

This is the first half of Lab 1 for ECE 314 Probability in Engineering Lab. We post it in case you would like to learn a bit about Python in advance of taking the course. At this point in your academic careers you should have some knowledge of object oriented computer programming. It would certainly help if you've had experience with Python, but if not, have no fear. Python is a very intuitive programming language. If you've coded in C#, JAVA, or Matlab you should have no trouble learning Python. Before we get too far into the code, we present a few general notions of what the environment will look like.

IPython Notebook:

The computer you are using to read this file probably has installed on it the Jupyter Notebook App or similar application to read IPython version 4 notebooks. We also assume the notebooks are run using Python version 2.7XX rather than version 3.4XX. For more information on installation or using an engineering work station (EWS) Linux machine, see instructions on the course webpage. An IPython Notebook file (with extension .ipynb) is an accumulation of cells, each composed of either code or markdown (i.e., text). Each code cell is individually executable. Each markdown cell can contain (among many things) LaTex and HTML. Throughout each lab you will be shown examples of code, probability theory, and coding applications. *You will need to be able modify this file to include your own answers and edits. Each of the questions is numbered in bold and we ask that you put all your responses/code in cells just after the stated questions. Let's go over some of the basics:

  1. Running a Cell: While the file is running one cell has the focus. To run the cell that is the current focus you can press the play button in the toolbar or use the shortcut SHIFT-ENTER. You will notice it brings the focus to the next cell after it completes. To run and keep focus in the same cell, you can use CTRL-ENTER. The order the cells run in can be important. In these labs the order will always go from top to bottom. In order to run code in the middle of the lab you may need to have run the code in a cell prior to it. 
  2. Stopping a Cell:There may come times when a particular section of code is causing errors or running an infinite loop. You may need to interrupt the cell from running. To do this simply click the stop button in the toolbar or use the shortcut CTRL-C
  3. Creating a Cell: A new cell can be created using the Insert tab at the top of the page. It will default to be a code type. You can change the cell type of any cell by clicking on it and then using the Cell tab at the top of the page. For normal text, use the "markdown" type. It allows you to use HTML and LaTex as well.
  4. Clearing Output: If your screen becomes too busy, it may be useful to be able to clear output. This can be done again from the Cell tab under "All Output". The program is still running, but has been reset.
  5. Saving Your File: There is an autosave that can be set to save your file at a given rate (default is to save once every two minutes). If you prefer saving on your own you can use the File tab or the CTRL-S shortcut. A handy feature, also under the File tab, is that you can revert to a previous saved checkpoint.
  6. Keyboard Shortcuts: It can be useful to learn the keyboard shortcuts for these. They allow you to insert cells, run code, clear code, at a much quicker a pace. The list can be displayed by typing Ctrl-m h, and can be found here:
  7. LaTex and Math: In these labs, you will be asked a number of questions, some requiring typed answers in a markdown cell, others requiring python answers in a code cell. It may be useful to learn LaTex to better explain yourself in mathematical terms. LaTex for the formulation of mathematical equations is very intuitive and can be picked up easily. For a reference, look here:

Introduction to Python Code

Importing Modules

Python is an object oriented programming language where the user has access to functions through imported packages. A package is a collection of modules in directories that have a hierarchy. The three most common packages that we will use in this course are numpy, scipy, and matplotlib, though we will pick up others along the way. Before you can use any of these, you must import them. You only need to import them once in an IPython Notebook file, and then any cell in the notebook can have access to them. Running the code below imports all the pakages you will need for this lab. The simple print statement lets you know when it's completed.

In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import scipy as sp
import scipy.stats as st
print ("Modules Imported!")
Modules Imported!

The first line is slightly different than the others and uses what is known as a "magic" function. This particular "magic" function simply makes it so that the plots we generate with the matplotlib package occur inline as opposed to opening in new windows outside of the notebook.

Basic Math

Python is very similar to Matlab and can be used to solve numerical problems. We simply need to run an expression and it will output an answer.

In [ ]:

We can also create a variable, set it equal to an expression, and print the value.

In [ ]:
x = 3+4**2
print (x)

We used ** to represent an exponent. Similarly, we can take the square root of a number this way. Here is an attempt:

In [ ]:

You should get the answer 5 if running Python 3.x. Under Python 2.7, the division of integers 1/2 would return 0 and the final output would be 4. That could be corrected by changing 1/2 to 1./2.

Python handles lists very similarly to Matlab. We can set variables equal to lists and perform operations on them. We can change the contents of the list and they don't need to be of the same type. This is called being mutable. Note that Python indexes starting with 0, as shown below.

In [ ]:
x = [1,2,3,4,5]
y = [6,7,8,9,10]
print (x, y)
x[0] = 'Dog'
print (x[0])

Python also has what is known as a tuple. A tuple is very similar to a list, but is immutable. We cannot change the contents of the tuple. Tuples are often used to input or return objects. Below is the same code as above, but with tuples. It gives us an error message when we try to set x[0].

In [ ]:
x = (1,2,3,4,5)
y = (6,7,8,9,10)
print (x, y)
x[0] = 'Dog'
print (x[0])

Below is a list of tuples. It has two tuples and each tuple has five elements.

In [ ]:
x = [(1,2,3,4,5),(6,7,8,9,10)]
print (x)
print (x[0][3])

You may like to think of lists and tuples as arrays in some sense, but try to keep them separate. An array is actually an object from the NumPy module. We'll go over them a little bit further in the lab, but there are some notable differences.

Ifs, Loops, and Functions

If statements in Python are like those of most other languages. You need to use a keyword (if or else), followed by a condition, and finally a colon (:). Keep in mind instead of using brackets for grouping, Python goes by indentation. In the if statement below all parts of the if statement are contained within that indentation.

In [2]:
x = 3
y = 1
if x>y:
    print ("I")
    if x>3:
        print ("Hate")
        print ("Love")
    print ("Probability")
print ("!")

For loops use the keyword "for" followed by a variable and the keyword "in" and a certain range or vector. The same rules for indentation apply here. Recall that indexing starts at 0. The range(n) function simply creates a integer list from 0 to n-1 in whole number increments.

In [3]:
x = [0,0,0,0,0]
for i in range(5):
    c = 2*i**2
print (x)
[0, 2, 8, 18, 32]

Similarly, you can use while loops. In the code below, we make use of the .append method of a list to keep adding to our list without needing to know the size initially. (By the way, a "method" is a function associated with an object. In this case, append is a method associated with a list.)

In [5]:
x = [0]
i = 0
while x[i]<12:
    i = i+1
print (x)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

To specify a function, we need to use the "def" keyword. You need to give the number of inputs and have a return line to end your function. Below is a function that returns the factorial of the input.

In [ ]:
def factorial(x):
    c = 1
    for i in range(x,1,-1): #range(x,1,-1) creates a vector from x to 2 in -1 increments
        c = c*i
    return c

print (factorial(5))

You can also return multiple outputs. Technically, we are still returning a single object, but it is a tuple. We can unpack the tuple when we call the function. Below is a function that returns the first and last digit of any integer.

In [7]:
def firstNlast(x):
    l = x%10 # Uses the modulus operator %
    while x>0:
        f = x%10
        x = int(x/10)
    return f,l

x = 70094921348
first, last = firstNlast(x)
print (first, last)
7 8

The returned items get returned as a tuple and you can individually retrieve them by setting them equal to another tuple.

Using Modules

One of the reasons Python is so popular is due to the building capability of the modules. Remember those files we imported initially? We have access to all of the methods they contain. We abbreviated them to shorthand signifiers so we can code more quickly. It would be impossible to give you an overview of all the useful methods because there are so many. But they are fairly intuitive, so if you think something should be a method, it's probably included. Let's start with NumPy and create an array.

In [8]:
x = np.array([1,2,3,4,5])
print (x)
print (x[3])
[1 2 3 4 5]

In order to access the "array" method we just needed to type our signifier "np" and then put a decimal and the method. If you want a list of methods to come up as you're coding, after typing the decimal, hit tab on your keyboard. We can similarly declare multidemensional arrays, but notice the use of brackets and indexing. Unlike lists, arrays can only contain a single type. Indexing is also done a little more intuitively (like Matlab) than that of lists. Arrays are also mutable and can be used in multiple dimensions (to create matrices for instance).

In [9]:
x = np.array([[1,2,3],[4,5,6],[7,8,9]])
print (x)
print (x[0,0])
print (x[:,1])
print (x[1,:])
[[1 2 3]
 [4 5 6]
 [7 8 9]]
[2 5 8]
[4 5 6]

To give you a better idea of how to use these modules, here are a number of coding examples with functions that will be particularly useful to you this semester. Below we create a function and then plot it over time. Of course we need to properly title and label the graph.

In [10]:
def f(t): #Creates the function that we are going to plot
    return t**3-t**2+t-1

t = np.linspace(-10,10,1000) #Creates an array from -10 to 10 with 1000 points in it
plt.plot(t,f(t)) #Generates a plot of these two vectors.
plt.title('Function vs. Time')
plt.ylabel('Function Value')
Text(0, 0.5, 'Function Value')

The following code is going to create a large vector of random numbers using NumPy's random function. Then it's going to plot them. It's taking the random numbers from an exponential distribution and a normal (Gaussian) distribution. These are both continuous random variables which you will learn about later in the course.

In [11]:
x = np.random.exponential(1,size = 100) #Generates a vector of 100 points from the exponential distribution
y = np.random.normal(size = 100) #Generates a vector of 100 points from the Normal distribution
plt.plot(x,'ro', label='exponential') #Plots x in red circles with the label exponential
plt.plot(y,'go', label = 'normal')
plt.title('Random values.')
<matplotlib.legend.Legend at 0x1a146e4588>

This code creates two matrices, multiplies one times the transpose of the other and then finds the eigenvalues:

In [12]:
A = np.array([(3,7,9),(4,5,1),(12,6,3)]) #Creates Matrix A
B = np.array([(1,0,3),(2,4,0),(8,3,1)]) #Creates Matrix B
A_transpose = A.T #Takes the transpose of A
C = #Takes the matrix multiplication of A_transpose and B. Note using * performs a different operation on 2-d arrays
                       # * is the usual matrix multiplication when applied to np.matrix objects
print (np.linalg.eigvals(C)) #Uses the eigvals method under linalg under NumPy to print the eigenvalues
[149.57404656   8.88119895  16.54475449]

These are just the basics to be able to program in Python. I have highlighted some of what I feel are the most important functions or modules to know. For a more complete tutorial, take a look at

Creating Probability Distribution Objects for Discrete Distributions

The scipy stats package contains a number of functions for using and analyzing distributions. Two of its classes are rv_discrete and rv_continous, for discrete type and for continuous type distributions, respectively. A discrete probability distribution is specified by a set of possible values, $c_1,c_2, \ldots $ and associated probabilities for the values, $p_1, p_2, \ldots $ which sum to one. The probability mass function $p$ is defined by $p(c_i)=p_i$ for all $i,$ and $p(c)=0$ for values $c$ not in the list of possible values. A random variable $X$ has such a discrete distribution if $P\{X = u\} = p(u)$ for all $u.$

There are several important families of discrete probability distributions that frequently arise in applications. A very basic example is the Bernoulli distribution with parameter $p,$ where $0\leq p \leq 1.$ The distribution assigns probability $p$ to value 1, and probability $1-p$ to value 0. If a random variable $X$ has the Bernoulli distribution with parameter $p$, we call $X$ a Bernoulli random variable with parameter $p,$ and we write $X \sim Bernoulli(p).$ For example, if $X \sim Bernoulli(\frac{1}{4}),$ then $P\{X = 1\}=\frac{1}{4}$ and $P\{X = 0\}=1-\frac{1}{4} = \frac{3}{4}$. There is zero probability that $X$ is any value other than $1$ or $0$. The class rv_discrete within the scipy stats package is for working with general discrete type random variables, with many instances of the class corresponding to particular well known probability distribuions. It gives a convenient way to compute the mean, variance, pmf, and other attributes for a given distribution, and for generating random variates, using random number generators, with the given distribution.

For example, one instance of the rv_discrete class is the object for the bernoulli distribution. By specifying (aka freezing) a value for the parameter $p$ we create a more specialized instance of a rv_discrete class. The cumulative distribution function (CDF) of a random variable $X$ is the function $F_X$ defined by $F_X(c)=P\{X\leq c\}$ for any real value of $c.$ The CDF for the $Bernoulli(\frac{1}{4})$ distribution has a jump of size 3/4 at zero and a jump of size 1/4 at one.

In [13]:
p = 1./4 #Sets the probability, uses decimal to create double (not integer)
bernoulli25 = st.bernoulli(p) #Generates object for Bernoulli(0.25) distribution
x = np.linspace(-2,2,1001) #Generates a vector on [-2,2] with 1001 points in it
print ('Mean:', bernoulli25.mean())  #Prints the mean (aka expected value) for the distribution
print ('Var:', bernoulli25.var())   #Prints the variance of X
plt.plot(x,bernoulli25.cdf(x))  #Creates a graph of the cumulative distribution fucntion (CDF)  of X
plt.title('CDF of Bernoulli(0.25) distribution')
plt.axis([-2, 2, 0, 1.05])  # Sets the displayed ranges of x-axis and y-axis to be [-2, 2] and [0, 1.05]
Mean: 0.25
Var: 0.1875
[-2, 2, 0, 1.05]

Above, we were able to recreate our Bernoulli distribution through scipy.stats.

Problem 1: Using the scipy.stats package do the following:

  1. Print the mean and standard deviation of a Bernoulli variable where $p=\frac{14}{17}$
  2. Create a graph of the probability mass function (pmf). (The function is zero except at zero and one. Try adapting the code in the previous cell to plot the pmf. What happens if you change np.linspace(-2,2,1001) to np.linspace(-2,2,1000)?)
In [ ]:
########Student Answer##############

Student Answer for last part of Problem 1, part 2. (Questions such as "What happens if . . . ?" and "Can you explain why?" call for answers writen out as text in a markdown cell such as this one, rather than in a code cell.):

# # This completes Lab 0 (first part of Lab 1). You will not need to turn in Lab 0; it is meant only to help you get started with Python notebooks before the course begins.