
Basic Programming Using Python: Creating Functions

Objectives

Explain the benefits of breaking a program up into functions.
Define a function of a single parameter.
Explain what a call stack is, what a variable's scope is, and how the two concepts are related.
Trace values as they are passed into and returned by functions.
Compose function calls.

Declaring a Function

We used functions several times in the previous lesson. It is now time to learn how to create new ones ourselves so that we don't have to type in the same lines of code repeatedly.

A function's job is to bundle several steps together so that they can be used as if they were a single command—in other words, to create a new verb in our programming language. The simplest possible function is one that produces the same value each time it is called:

In [1]:

def zero():
    return 0

We define a new function in Python using the keyword def, followed by the function's name. The empty parentheses signal that the function doesn't take any inputs—we'll see functions that do in a moment. The colon signals the start of a new block of code, called the body of the function, which is indented. The keyword return then specifies the value the function produces when it is called.

Defining a function tells the computer how to do something. To actually do that "something", we need to call the function:

In [2]:

result = zero()
print 'function produced:', result

function produced: 0

When Python sees the call zero() it sets aside whatever it was doing, does whatever the function zero tells it to do, and then continues with its original calculation using the function's result. In this case the overall effect is to assign 0 to result, which is then printed. We can achieve the same effect without the assignment:

In [3]:

print 'function produced:', zero()

function produced: 0
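
Because a call to zero() produces a value, it can appear anywhere a value is allowed, not only in an assignment or a print statement. A minimal sketch (the variable names total and scaled are illustrative and not part of the lesson):

```python
def zero():
    return 0

# A call can be used inside a larger expression.
total = zero() + 5

# It can also appear more than once in the same expression.
scaled = 2 * (zero() + 1) + zero()
```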

Functions that always produce the same value aren't particularly useful, so most functions take input values called parameters or arguments. A parameter is a variable that is assigned a value each time the function is called. For example, this function that converts a temperature from Fahrenheit to Kelvin has one parameter called temp:

In [4]:

def fahr_to_kelvin(temp):
    return ((temp - 32.0) * 5.0/9.0) + 273.15

print 'water freezes at', fahr_to_kelvin(32)
print 'water boils at', fahr_to_kelvin(212)

water freezes at 273.15
water boils at 373.15

Functions can have any number of parameters. When we call a function, we must provide as many values as there are parameters; values are assigned to parameters from left to right. For example, here's a function that calculates the average of three values:

In [8]:

def average3(left, middle, right):
    return (left + middle + right) / 3

x = 3
y = 5
z = 4
print 'average is:', average3(x, y, z)

average is: 4
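
To see both rules in action (values are matched to parameters from left to right, and there must be as many values as parameters), the sketch below uses a hypothetical helper, show_order, that simply returns its parameters as a tuple. Calling it with too few values raises a TypeError, which the example catches so that the script keeps running.

```python
def show_order(left, middle, right):
    # Return the parameters so we can see which value went where.
    return (left, middle, right)

# Values are matched to parameters by position, left to right.
assigned = show_order(3, 5, 4)

# Supplying only two values for three parameters is an error.
try:
    show_order(3, 5)
    outcome = 'call succeeded'
except TypeError:
    outcome = 'TypeError: wrong number of values'
```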

The Stack

The key to understanding how functions work, and to debugging them when they don't, is to understand exactly what happens when a function is called. After Python has executed the first six lines of the program above, the variables x, y, and z are stored like this in memory:

Python stack frame

The box containing those three variables is called a stack frame. When average3 is called on line 7, Python creates a new stack frame on top of the first one, then creates the variables left, middle, and right in it:

average3 stack

This pile of one set of variables on top of another is called the function call stack, or just "stack". When the function call is finished, Python discards the topmost stack frame and starts using the one beneath it again. The temporary variables left, middle, and right vanish, which is why trying to print one of their values fails:

In [10]:

print 'middle after call:', middle

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-10-8ff7c0159b4f> in <module>()
----> 1 print 'middle after call:', middle

NameError: name 'middle' is not defined

 middle after call:

The program displays the string 'middle after call:', but when Python tries to get the value of middle, it discovers that it doesn't exist any longer and reports an error. (The error appears first in the output above even though the string was printed first: error messages and normal output travel through separate channels, and the notebook does not necessarily show them in the order in which they were produced.)
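
The same behaviour can be checked without stopping the program by catching the NameError. This is only a sketch: it redefines average3 (dividing by 3.0 so the result does not depend on integer division) and assumes that no variable called middle has been defined outside the function.

```python
def average3(left, middle, right):
    return (left + middle + right) / 3.0

result = average3(3, 5, 4)

# middle lived only in average3's stack frame, which is gone now.
try:
    middle
    middle_exists = True
except NameError:
    middle_exists = False
```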

Let us create a new function kelvin_to_celsius to convert temperatures from, well, Kelvin to Celsius.

In [17]:

def kelvin_to_celsius(temp):
    return temp - 273.15

print 'absolute zero is', kelvin_to_celsius(0.0), 'degrees Celsius'

absolute zero is -273.15 degrees Celsius

Now that we have it, we don't have to do any calculations to convert Fahrenheit to Celsius. Instead, we can combine the two functions we already have. This idea of composing function calls is very powerful and is at the core of what makes functions so useful.

In [19]:

def fahr_to_celsius(temp):
    degrees_k = fahr_to_kelvin(temp)
    degrees_c = kelvin_to_celsius(degrees_k)
    return degrees_c

body_temp_f = 98.6
print 'body temperature in Celsius:', fahr_to_celsius(body_temp_f)

body temperature in Celsius: 37.0
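
The intermediate variables degrees_k and degrees_c are there for readability; the same composition can be written by nesting one call inside the other. The sketch below repeats the conversion functions from this lesson so that it runs on its own, and compares the two styles. The name fahr_to_celsius_nested is introduced here for illustration.

```python
def fahr_to_kelvin(temp):
    return ((temp - 32.0) * 5.0 / 9.0) + 273.15

def kelvin_to_celsius(temp):
    return temp - 273.15

def fahr_to_celsius(temp):
    degrees_k = fahr_to_kelvin(temp)
    degrees_c = kelvin_to_celsius(degrees_k)
    return degrees_c

def fahr_to_celsius_nested(temp):
    # The inner call runs first; its result becomes the outer call's input.
    return kelvin_to_celsius(fahr_to_kelvin(temp))

stepwise = fahr_to_celsius(98.6)
nested = fahr_to_celsius_nested(98.6)
same_result = (stepwise == nested)
```

Both versions perform exactly the same arithmetic, so which one to use is a matter of readability.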

Why does Python go to all the trouble of creating and discarding stack frames? To understand the answer let's trace what happens when we calculate human body temperature in °C. Just before Python executes line 7 of our program, the stack consists of a single frame that contains the variable body_temp_f:

body temperature stack frame

When fahr_to_celsius is called, Python puts a new frame on the stack containing a variable temp (the parameter of fahr_to_celsius):

fahr_to_celsius stack

The first thing fahr_to_celsius does is call fahr_to_kelvin. Python creates yet another stack frame to keep track of this call, and this frame also contains a variable called temp:

fahr_to_kelvin stack

fahr_to_kelvin's temp is not the same variable as fahr_to_celsius's temp. The two variables have the same name, but because they're in different stack frames, they're different variables. In this case, they happen to reference the same object, but that is not because their names are the same.
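
One way to convince yourself that the two variables are separate is to change one and check the other. In this sketch, add_one is a hypothetical function whose parameter happens to share its name with a variable in the calling code.

```python
def add_one(temp):
    # This assignment changes only add_one's own temp.
    temp = temp + 1
    return temp

temp = 10
result = add_one(temp)
# The caller's temp is untouched, even though the names match.
```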

When fahr_to_kelvin finishes running, its stack frame is thrown away:

fahr_to_kelvin result stack

and the value fahr_to_kelvin produced is assigned to a new variable called degrees_k. This variable is created in fahr_to_celsius's stack frame; like the parameter temp, it only exists as long as the function is being executed. fahr_to_celsius then passes this value to kelvin_to_celsius, and once again, Python creates a new stack frame to keep track of the call:

kelvin_to_celsius stack

When kelvin_to_celsius is finished running, its result is assigned to degrees_c and its stack frame is discarded:

kelvin_to_celsius result stack

Since fahr_to_celsius is now finished, Python discards its stack frame and prints the final result.

fahr_to_celsius result stack frame
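
The order in which frames are created and discarded can also be recorded by the program itself. The sketch below adds bookkeeping to the three conversion functions: each one appends a note to a list called call_log (a name introduced here, not part of the lesson) when it starts and again just before it returns. print is called with a single parenthesized argument so that the code behaves the same under Python 2 and Python 3.

```python
call_log = []

def fahr_to_kelvin(temp):
    call_log.append('enter fahr_to_kelvin')
    result = ((temp - 32.0) * 5.0 / 9.0) + 273.15
    call_log.append('leave fahr_to_kelvin')
    return result

def kelvin_to_celsius(temp):
    call_log.append('enter kelvin_to_celsius')
    result = temp - 273.15
    call_log.append('leave kelvin_to_celsius')
    return result

def fahr_to_celsius(temp):
    call_log.append('enter fahr_to_celsius')
    degrees_k = fahr_to_kelvin(temp)
    degrees_c = kelvin_to_celsius(degrees_k)
    call_log.append('leave fahr_to_celsius')
    return degrees_c

body_temp_c = fahr_to_celsius(98.6)

# Show the recorded order of calls and returns.
for entry in call_log:
    print(entry)
```

The log shows fahr_to_kelvin and kelvin_to_celsius each starting and finishing while fahr_to_celsius is still active, which matches the frames being stacked on top of fahr_to_celsius's frame and then removed.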

You can see an interactive visualization of the frames in our temperature conversion program at the Online Python Tutor, where stepping through the code shows how frames are created and discarded as the program runs:

http://www.pythontutor.com/visualize.html#code=def+kelvin_to_celsius(temp)%3A%0A++++return+temp+-+273.15%0A%0Adef+fahr_to_kelvin(temp)%3A%0A++++return+((temp+-+32.0)+*+5.0/9.0)+%2B+273.15%0A%0Adef+fahr_to_celsius(temp)%3A%0A++++degrees_k+%3D+fahr_to_kelvin(temp)%0A++++degrees_c+%3D+kelvin_to_celsius(degrees_k)%0A++++return+degrees_c%0A%0Abody_temp_f+%3D+98.6%0Aprint+%27body+temperature+in+Celsius%3A%27,+fahr_to_celsius(body_temp_f)&mode=display&cumulative=true&heapPrimitives=false&drawParentPointers=false&textReferences=false&showOnlyOutputs=false&py=2&curInstr=0

We use the idea behind the call stack intuitively in many situations. Imagine you are sitting an exam in which you have to solve problems like ((1.02*4.96)+8.7*2.3*1.1)/3 with only pen and paper. You may write nothing but the final answer on the solution sheet, but you may use small pieces of draft paper. So you compute the intermediate results on draft paper: one piece for the first multiplication and one for the second, then a fresh piece onto which you copy their results for the addition, and finally one for the division. The pieces of draft paper are like stack frames: the intermediate results (the local variables) are discarded once they have been used, and only the final result is kept.
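
Written as code, each piece of draft paper corresponds to one function call. This is only a loose illustration of the analogy: multiply, add, and divide are hypothetical helpers, and the intermediate names exist just to mirror the sheets of paper.

```python
def multiply(a, b):
    return a * b

def add(a, b):
    return a + b

def divide(a, b):
    return a / b

# Each call gets its own frame (draft sheet); only its result survives.
first_product = multiply(1.02, 4.96)
second_product = multiply(multiply(8.7, 2.3), 1.1)
total = add(first_product, second_product)
solution = divide(total, 3)
```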

Creating Thumbnails

We're now ready to write a function that creates thumbnails. The program we had at the end of our previous lesson was:

In [1]:

from skimage import novice
flower = novice.open('flower.png')
new_height = flower.height * (100.0 / flower.width)
flower.size = (100, new_height)
flower.show()

skimage.dtype_converter: WARNING: Possible precision loss when converting from float64 to uint8

Out[1]:

Most programmers put import statements at the top of their programs rather than inside functions, both to make it easier for people to see what libraries a program depends on, and because many functions might depend on the contents of a particular library. That leaves us with four lines of code to encapsulate in our function, which we will rather unimaginatively call make_thumbnail:

In [16]:

def make_thumbnail(filename):
    picture = novice.open(filename)
    new_height = int(picture.height * 100.0 / picture.width)
    picture.size = (100, new_height)
    return picture

Our function takes a single parameter, which is the name of the image file to be thumbnailed, and loads and thumbnails that picture. As always, defining the function tells Python how to do something new, but it doesn't actually do that "something" until we call the function:

In [19]:

flower = make_thumbnail('flower.png')
flower.show()

skimage.dtype_converter: WARNING: Possible precision loss when converting from float64 to uint8

Out[19]:

We can now make other thumbnails with a single call:

In [20]:

biking = make_thumbnail('biking.png')
biking.show()

skimage.dtype_converter: WARNING: Possible precision loss when converting from float64 to uint8

Out[20]:
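
The only calculation in make_thumbnail is the new height, and it can be tried out without an image file. The sketch below pulls that line into a hypothetical helper, thumbnail_height, that takes the original width and height as plain numbers. The 640 by 480 and 300 by 200 sizes are made-up examples.

```python
def thumbnail_height(width, height):
    # Same formula as in make_thumbnail: scale the height by 100/width.
    return int(height * 100.0 / width)

# 480 * 100.0 / 640 is exactly 75.0.
landscape = thumbnail_height(640, 480)

# 200 * 100.0 / 300 is about 66.67; int() drops the fractional part.
other = thumbnail_height(300, 200)
```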

Default Parameter Values

What if we want to create thumbnails that are 80 pixels wide as well? One possibility would be to create another function called make_thumbnail_80 that contained exactly the same lines of code, but with the number 80 everywhere instead of the number 100. This would work, but would be a bad design. A major reason for writing functions is to reduce duplicated code. If we have two or more functions that contain almost the same code, we haven't really achieved that. (It's also bad design to have one function called make_thumbnail_80 and another called make_thumbnail: if one function's name specifies the thumbnail size, the other should as well.)

A better design is to require users to tell us how wide they want thumbnails to be:

In [24]:

def make_thumbnail(filename, width):
    picture = novice.open(filename)
    new_height = int(picture.height * float(width) / picture.width)
    picture.size = (width, new_height)
    return picture

Let's try it for two different sizes of thumbnails:

In [25]:

test_100 = make_thumbnail('flower.png', 100)
test_100.show()

skimage.dtype_converter: WARNING: Possible precision loss when converting from float64 to uint8

Out[25]:

In [26]:

test_80 = make_thumbnail('flower.png', 80)
test_80.show()

skimage.dtype_converter: WARNING: Possible precision loss when converting from float64 to uint8

Out[26]:

It does what we want, but we can go one step further. Suppose that thumbnails are almost always 100 pixels wide, and that other sizes are rare. In that case, we can define make_thumbnail_width with a width parameter to handle the general case, and make_thumbnail_default without such a parameter to handle the usual case. Rather than duplicating code, we have the second call the first with the default width as a second parameter:

In [27]:

def make_thumbnail_width(filename, width):
    picture = novice.open(filename)
    new_height = int(picture.height * float(width) / picture.width)
    picture.size = (width, new_height)
    return picture

def make_thumbnail_default(filename):
    return make_thumbnail_width(filename, 100)

In some languages, like C and Fortran, this is the best we can do. In Python and many other modern languages, though, we can improve our design even further by writing a single function with two parameters, and specifying a default value for the second parameter:

In [28]:

def make_thumbnail(filename, width=100):
    picture = novice.open(filename)
    new_height = int(picture.height * float(width) / picture.width)
    picture.size = (width, new_height)
    return picture

If we call this function with one parameter, the default value of 100 is assigned to width:

In [29]:

temp = make_thumbnail('biking.png')
temp.show()

skimage.dtype_converter: WARNING: Possible precision loss when converting from float64 to uint8

Out[29]:

If we call it with two, though, the second value we give it overrides that default:

In [30]:

temp = make_thumbnail('biking.png', 60)
temp.show()

skimage.dtype_converter: WARNING: Possible precision loss when converting from float64 to uint8

Out[30]:
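
Default values work the same way for any function, so the mechanism can be tried without loading an image. In this sketch, thumbnail_size is a hypothetical stand-in that only computes the new (width, height) pair from numbers passed in directly.

```python
def thumbnail_size(orig_width, orig_height, width=100):
    new_height = int(orig_height * float(width) / orig_width)
    return (width, new_height)

# Two values: width falls back to its default of 100.
default_size = thumbnail_size(640, 480)

# Three values: the third overrides the default.
small_size = thumbnail_size(640, 480, 60)
```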

When creating functions with default values, the parameters that have default values must come at the end of the parameter list. For example, if we want our function to save the thumbnail to a file as well, the following definition produces an error:

In [1]:

def make_thumbnail_and_save(filename, width=100, thumbnailname):
    picture = novice.open(filename)
    new_height = int(picture.height * float(width) / picture.width)
    picture.size = (width, new_height)
    picture.save(thumbnailname)
    return picture

  File "<ipython-input-1-dd640766165e>", line 1
    def make_thumbnail_and_save(filename, width=100, thumbnailname):
SyntaxError: non-default argument follows default argument

This fails because Python does not allow parameters without default values to follow a parameter that has one. We can fix this in either of two ways:

a) reorder the parameters so that those with default values come last, or
b) give the new parameter a default value as well.

Taking option b), the correct function definition now looks like this:

In [2]:

def make_thumbnail_and_save(filename, width=100, thumbnailname='mythumb.png'):
    picture = novice.open(filename)
    new_height = int(picture.height * float(width) / picture.width)
    picture.size = (width, new_height)
    picture.save(thumbnailname)
    return picture
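
For comparison, option a) is sketched below with a hypothetical stand-in, plan_thumbnail, that returns its parameters instead of touching any files. The sketch also uses the built-in compile function to show that the invalid ordering is rejected before any code runs, whatever the function body contains.

```python
# The invalid ordering is rejected when the definition is compiled.
bad_definition = "def f(filename, width=100, thumbnailname):\n    pass\n"
try:
    compile(bad_definition, '<example>', 'exec')
    rejected = False
except SyntaxError:
    rejected = True

# Option a): parameters without defaults first, defaults last.
def plan_thumbnail(filename, thumbnailname, width=100):
    return (filename, thumbnailname, width)

plan = plan_thumbnail('biking.png', 'littlebike.png')
```

With this ordering, both file names can be passed by position while width keeps its default.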

If we want to call make_thumbnail_and_save() with filename and thumbnailname but not width, we have to tell Python explicitly which of the parameters with default values we are passing. If we call

In [ ]:

make_thumbnail_and_save('biking.png', 'littlebike.png')

Python assumes the second value is meant for the second parameter in the function definition, width, and tries to use 'littlebike.png' as the image width, which fails. We therefore have to tell Python that the second value is thumbnailname:

In [ ]:

make_thumbnail_and_save('biking.png', thumbnailname='littlebike.png')

Now Python knows that it should use the default width and use the string as thumbnailname. Naming parameters in the call like this is especially helpful for functions with dozens of parameters, such as plotting routines or mathematical optimization functions.
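
The difference between the two calls can be seen with a stand-in that returns its parameters rather than opening an image. plan_thumbnail_and_save below is hypothetical and shares only its parameter list with make_thumbnail_and_save.

```python
def plan_thumbnail_and_save(filename, width=100, thumbnailname='mythumb.png'):
    return {'filename': filename, 'width': width, 'thumbnailname': thumbnailname}

# Positional: the second value lands in width, which is not what we meant.
by_position = plan_thumbnail_and_save('biking.png', 'littlebike.png')

# Keyword: width keeps its default and the name goes where we want it.
by_keyword = plan_thumbnail_and_save('biking.png', thumbnailname='littlebike.png')
```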

Seven Plus Or Minus Two

We set out to write a thumbnailing function so that we wouldn't have to type in the same calculations over and over. Now that we have it, we can see another reason for building programs out of functions. Human short-term memory can only hold a few items at a time; the value is sometimes given as "seven plus or minus two", and while that is an over-simplification, it's a good guideline. If we need to remember more unrelated bits of information than that for more than a few seconds, they become jumbled and we start making mistakes. The same applies to code: if we have to keep more than half a dozen things straight in our mind in order to understand or change a piece of code, we will make mistakes. Since most calculations involve more than half a dozen steps, we have to group those steps together and give them names if we're to have any hope at all of getting them right.

Using functions also makes code easier to maintain. Suppose we have a set of data that we are processing, and we want to visualize it after each processing step. If we copy the visualization code after each step and later want to change something in the visualization, we have to make the same change in several places. That invites inconsistencies, because it is easy to overlook one copy or introduce a typo. If we instead write one function that does the visualization and call it after each step, we only have to make the change in one place.
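
As a small illustration of the maintenance argument, the sketch below puts the reporting step in one function and calls it after each processing step. The data, the two processing steps, and the function name summarize are all invented for this example; a real program would plot the data instead of building a string. print is called with a single parenthesized argument so that the code behaves the same under Python 2 and Python 3.

```python
def summarize(label, values):
    # Changing the report format here changes it everywhere it is used.
    return '%s: min=%s max=%s count=%d' % (label, min(values), max(values), len(values))

data = [4, 8, 15, 16, 23, 42]
reports = [summarize('raw', data)]

# Processing step 1: keep only the even values.
data = [value for value in data if value % 2 == 0]
reports.append(summarize('even only', data))

# Processing step 2: halve every value.
data = [value // 2 for value in data]
reports.append(summarize('halved', data))

for line in reports:
    print(line)
```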

Key Points

Define a function using def name(...).
The body of a function must be indented.
Use name(...) to call a function.
Use return to return a value from a function.
The values passed into a function are assigned to its parameters in left-to-right order.
Function calls are recorded on a call stack.
Every function call creates a new stack frame.
The variables in a stack frame are discarded when the function call completes.
Grouping operations in functions makes code easier to understand and re-use.