We used functions several times in the previous lesson. It is now time to learn how to create new ones ourselves so that we don't have to type in the same lines of code repeatedly.
A function's job is to bundle several steps together so that they can be used as if they were a single command—in other words, to create a new verb in our programming language. The simplest possible function is one that produces the same value each time it is called:
def zero():
    return 0
We define a new function in Python using the keyword def,
followed by the function's name.
The empty parentheses signal that the function doesn't take any inputs—we'll
see functions that do in a moment.
The colon signals the start of a new block of code,
called the body of the function,
which is indented.
The keyword return
then specifies the value the function produces
when it is called.
Defining a function tells the computer how to do something. To actually do that "something", we need to call the function:
result = zero()
print 'function produced:', result
function produced: 0
When Python sees the call zero()
it sets aside whatever it was doing,
does whatever the function zero
tells it to do,
and then continues with its original calculation using the function's result.
In this case the overall effect is to assign 0 to result,
which is then printed.
We can achieve the same effect without the assignment:
print 'function produced:', zero()
function produced: 0
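The difference between defining and calling can be checked directly. The sketch below is illustrative only: the function announce and the list log are made-up names, not part of this lesson's programs, and print is written with parentheses so that the code also runs on Python 3.

```python
log = []                      # hypothetical list used to record each call

def announce():
    log.append('called')      # this line runs only when the function is called
    return len(log)

calls_after_def = len(log)    # still 0: defining the function did not run its body
first = announce()            # the body runs now
second = announce()           # and runs again on the second call
print('calls recorded: %d' % len(log))
```

Nothing is recorded in log until the first call, which is the sense in which a definition only tells Python how to do something.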
Functions that always produce the same value aren't particularly useful,
so most functions take input values
called parameters or arguments.
A parameter is a variable that is assigned a value each time the function is called.
For example,
this function that converts a temperature from Fahrenheit to Kelvin
has one parameter called temp:
def fahr_to_kelvin(temp):
    return ((temp - 32.0) * 5.0/9.0) + 273.15
print 'water freezes at', fahr_to_kelvin(32)
print 'water boils at', fahr_to_kelvin(212)
water freezes at 273.15
water boils at 373.15
Functions can have any number of parameters. When we call a function, we must provide as many values as there are parameters; values are assigned to parameters from left to right. For example, here's a function that calculates the average of three values:
def average3(left, middle, right):
    return (left + middle + right) / 3

x = 3
y = 5
z = 4
print 'average is:', average3(x, y, z)
average is: 4
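Because an average does not depend on the order of its inputs, the left-to-right rule is easier to see with an operation where order matters. The following is a minimal sketch using a hypothetical function subtract; print is written with parentheses so that it also runs on Python 3.

```python
def subtract(first, second):
    # first receives the leftmost value in the call, second the next one
    return first - second

x = 10
y = 3
forward = subtract(x, y)     # first = 10, second = 3
backward = subtract(y, x)    # first = 3, second = 10
print('forward: %d, backward: %d' % (forward, backward))
```

Swapping the values in the call swaps which parameter each one is assigned to, so the two calls give different results.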
The key to understanding how functions work,
and to debugging them when they don't,
is to understand exactly what happens when a function is called.
After Python has executed the first six lines of the program above,
the variables x, y, and z
are stored like this in memory:
The box containing those three variables is called a stack frame.
When average3
is called on line 7,
Python creates a new stack frame on top of the first one,
then creates the variables left, middle, and right in it:
This pile of one set of variables on top of another is called the function call stack,
or just "stack".
When the function call is finished,
Python discards the topmost stack frame
and starts using the one beneath it again.
The temporary variables left, middle, and right vanish,
which is why trying to print one of their values fails:
The program displays the string 'middle after call:',
but when Python tries to get the value of middle,
it discovers that it doesn't exist any longer
and reports an error.
(The error appears first because the Python interpreter gives higher priority to error messages than "normal" output.)
print 'middle after call:', middle
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-10-8ff7c0159b4f> in <module>()
----> 1 print 'middle after call:', middle

NameError: name 'middle' is not defined
middle after call:
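The same behaviour can be observed without stopping the program by catching the NameError. This is only a sketch: the variable still_defined is introduced for illustration, the division uses 3.0 so the result is the same on Python 2 and Python 3, and print is written with parentheses for the same reason.

```python
def average3(left, middle, right):
    return (left + middle + right) / 3.0

result = average3(3, 5, 4)

try:
    middle                      # this name existed only while the call was running
    still_defined = True
except NameError:
    still_defined = False       # the stack frame that held middle has been discarded

print('result: %.1f, middle still defined: %s' % (result, still_defined))
```

The function's result survives because it was returned and assigned to result in the outer frame; the parameters themselves do not.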
Let us create a new function kelvin_to_celsius
to convert temperatures from, well, Kelvin to Celsius.
def kelvin_to_celsius(temp):
    return temp - 273.15
print 'absolute zero is', kelvin_to_celsius(0.0), 'degrees Celsius'
absolute zero is -273.15 degrees Celsius
Now that we have it, we don't have to do any calculations to convert Fahrenheit to Celsius. Instead, we can combine the two functions we already have. This idea of composing function calls is very powerful and is at the core of what makes functions so useful.
def fahr_to_celsius(temp):
    degrees_k = fahr_to_kelvin(temp)
    degrees_c = kelvin_to_celsius(degrees_k)
    return degrees_c
body_temp_f = 98.6
print 'body temperature in Celsius:', fahr_to_celsius(body_temp_f)
body temperature in Celsius: 37.0
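Composition can also be written as a nested call, without naming the intermediate values. The sketch below repeats the two conversion functions so that it is self-contained; the variable boiling_c is introduced for illustration, and print is written with parentheses so that it also runs on Python 3.

```python
def fahr_to_kelvin(temp):
    return ((temp - 32.0) * 5.0 / 9.0) + 273.15

def kelvin_to_celsius(temp):
    return temp - 273.15

def fahr_to_celsius(temp):
    # the inner call runs first; its result becomes the outer call's input
    return kelvin_to_celsius(fahr_to_kelvin(temp))

boiling_c = fahr_to_celsius(212)
print('water boils at %.1f degrees Celsius' % boiling_c)
```

Whether to name intermediate values such as degrees_k is a readability choice; both versions perform the same two calls in the same order.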
Why does Python go to all the trouble of creating and discarding stack frames?
To understand the answer, let's trace what happens when we calculate human body temperature in °C.
Just before Python executes line 7 of our program,
the stack consists of a single frame
that contains the variable body_temp_f:
When fahr_to_celsius
is called,
Python puts a new frame on the stack
containing a variable temp
(the parameter of fahr_to_celsius):
The first thing fahr_to_celsius
does is call fahr_to_kelvin.
Python creates yet another stack frame to keep track of this call,
and this frame also contains a variable called temp:
fahr_to_kelvin's temp is not the same variable as fahr_to_celsius's temp.
The two variables have the same name, but because they're in different
stack frames, they're different variables. In this case, they happen to
reference the same object, but that is not because their names are the
same.
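One way to convince ourselves that the two variables are separate is to reassign one of them and check the other. The functions inner and outer below are hypothetical stand-ins for fahr_to_kelvin and fahr_to_celsius, and print is written with parentheses so that the sketch also runs on Python 3.

```python
def inner(temp):
    temp = temp + 1.0        # rebinds inner's own temp only
    return temp

def outer(temp):
    changed = inner(temp)
    return temp, changed     # outer's temp is untouched by inner's assignment

before, after = outer(10.0)
print('outer temp: %.1f, inner temp: %.1f' % (before, after))
```

Assigning to temp inside inner changes only the variable in inner's stack frame, so outer still sees its original value.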
When fahr_to_kelvin
finishes running,
its stack frame is thrown away:
and the value fahr_to_kelvin
produced is assigned to a new variable called degrees_k.
This variable is created in fahr_to_celsius's stack frame;
like the parameter temp,
it only exists as long as the function is being executed.
fahr_to_celsius
then passes this value to kelvin_to_celsius,
and once again,
Python creates a new stack frame to keep track of the call:
When kelvin_to_celsius
is finished running,
its result is assigned to degrees_c
and its stack frame is discarded:
Since fahr_to_celsius
is now finished,
Python discards its stack frame
and prints the final result.
You can see an interactive visualization of the frames in our temperature conversion program at the Online Python Tutor: http://www.pythontutor.com/visualize.html#code=def+kelvin_to_celsius(temp)%3A%0A++++return+temp+-+273.15%0A%0Adef+fahr_to_kelvin(temp)%3A%0A++++return+((temp+-+32.0)+*+5.0/9.0)+%2B+273.15%0A%0Adef+fahr_to_celsius(temp)%3A%0A++++degrees_k+%3D+fahr_to_kelvin(temp)%0A++++degrees_c+%3D+kelvin_to_celsius(degrees_k)%0A++++return+degrees_c%0A%0Abody_temp_f+%3D+98.6%0Aprint+%27body+temperature+in+Celsius%3A%27,+fahr_to_celsius(body_temp_f)&mode=display&cumulative=true&heapPrimitives=false&drawParentPointers=false&textReferences=false&showOnlyOutputs=false&py=2&curInstr=0
Stepping through the program there gives an idea of how frames are created and discarded.
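The stack can also be inspected from inside a running program with the standard library's inspect module. The sketch below is an illustration rather than part of the lesson's program: the list snapshot is a made-up name, fahr_to_celsius is shortened to a single subtraction, and print is written with parentheses so that the code also runs on Python 3.

```python
import inspect

snapshot = []   # hypothetical list for recording what the stack looks like

def fahr_to_kelvin(temp):
    # inspect.stack() lists the active frames, innermost first;
    # item 3 of each entry is the name of the function running in that frame
    frames = inspect.stack()[:2]
    for entry in frames:
        snapshot.append(entry[3])
    return ((temp - 32.0) * 5.0 / 9.0) + 273.15

def fahr_to_celsius(temp):
    degrees_k = fahr_to_kelvin(temp)
    return degrees_k - 273.15

fahr_to_celsius(98.6)
print('innermost frames: ' + ', '.join(snapshot))
```

While fahr_to_kelvin is running, its frame sits on top of the frame for fahr_to_celsius, which is the order the recorded names appear in.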
Students use the concept of function stacks intuitively in a lot of situations. Imagine you're sitting in an exam where you have to solve math problems like ((1.02*4.96)+8.7*2.3*1.1)/3 using only pen and paper. You may write nothing but the final answer on the solution sheet, but you can use small pieces of draft paper. So you compute intermediate results on draft paper, using one piece for the first multiplication and one for the second, then copy their results onto another piece for the addition, and finally do the division. The pieces of draft paper are like stack frames: the intermediate results written on them (the local variables) are discarded, and only the final result is kept.
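The exam analogy can be written out as code, with one function call per piece of draft paper. The function names multiply, add, and solve are assumptions made for this sketch, and print is written with parentheses so that it also runs on Python 3.

```python
def multiply(a, b):
    product = a * b          # a local "draft paper" value
    return product

def add(a, b):
    total = a + b
    return total

def solve():
    first = multiply(1.02, 4.96)
    second = multiply(multiply(8.7, 2.3), 1.1)
    numerator = add(first, second)
    return numerator / 3     # only this final value leaves the function

answer = solve()
print('answer: %.4f' % answer)
```

The names product, total, first, second, and numerator all disappear once their calls finish; only answer remains on the "solution sheet".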
We're now ready to write a function that creates thumbnails. The program we had at the end of our previous lesson was:
from skimage import novice
flower = novice.open('flower.png')
new_height = flower.height * (100.0 / flower.width)
flower.size = (100, new_height)
flower.show()
skimage.dtype_converter: WARNING: Possible precision loss when converting from float64 to uint8
Most programmers put import
statements at the top of their programs
rather than inside functions,
both to make it easier for people to see what libraries a program depends on,
and because many functions might depend on the contents of a particular library.
That leaves us with four lines of code to encapsulate in our function,
which we will rather unimaginatively call make_thumbnail:
def make_thumbnail(filename):
    picture = novice.open(filename)
    new_height = int(picture.height * 100.0 / picture.width)
    picture.size = (100, new_height)
    return picture
Our function takes a single parameter, which is the name of the image file to be thumbnailed, and loads and thumbnails that picture. As always, defining the function tells Python how to do something new, but it doesn't actually do that "something" until we call the function:
flower = make_thumbnail('flower.png')
flower.show()
skimage.dtype_converter: WARNING: Possible precision loss when converting from float64 to uint8
We can now make other thumbnails with a single call:
biking = make_thumbnail('biking.png')
biking.show()
skimage.dtype_converter: WARNING: Possible precision loss when converting from float64 to uint8
What if we want to create thumbnails that are 80 pixels wide as well?
One possibility would be to create another function called make_thumbnail_80
that contained exactly the same lines of code,
but with the number 80 everywhere instead of the number 100.
This would work,
but would be a bad design.
A major reason for writing functions is to reduce duplicated code.
If we have two or more functions that contain almost the same code,
we haven't really achieved that.
(It's also bad design to have one function called make_thumbnail_80
and another called make_thumbnail:
if one function's name specifies the thumbnail size,
the other should as well.)
A better design is to require users to tell us how wide they want thumbnails to be:
def make_thumbnail(filename, width):
    picture = novice.open(filename)
    new_height = int(picture.height * float(width) / picture.width)
    picture.size = (width, new_height)
    return picture
Let's try it for two different sizes of thumbnails:
test_100 = make_thumbnail('flower.png', 100)
test_100.show()
skimage.dtype_converter: WARNING: Possible precision loss when converting from float64 to uint8
test_80 = make_thumbnail('flower.png', 80)
test_80.show()
skimage.dtype_converter: WARNING: Possible precision loss when converting from float64 to uint8
It does what we want,
but we can go one step further.
Suppose that thumbnails are almost always 100 pixels wide,
and that other sizes are rare.
In that case,
we can define make_thumbnail_width
with a width
parameter to handle the general case,
and make_thumbnail_default
without such a parameter to handle the usual case.
Rather than duplicating code,
we have the second call the first with the default width as a second parameter:
def make_thumbnail_width(filename, width):
    picture = novice.open(filename)
    new_height = int(picture.height * float(width) / picture.width)
    picture.size = (width, new_height)
    return picture

def make_thumbnail_default(filename):
    return make_thumbnail_width(filename, 100)
In some languages, like C and Fortran, this is the best we can do. In Python and many other modern languages, though, we can improve our design even further by writing a single function with two parameters, and specifying a default value for the second parameter:
def make_thumbnail(filename, width=100):
    picture = novice.open(filename)
    new_height = int(picture.height * float(width) / picture.width)
    picture.size = (width, new_height)
    return picture
If we call this function with one parameter,
the default value of 100 is assigned to width:
temp = make_thumbnail('biking.png')
temp.show()
skimage.dtype_converter: WARNING: Possible precision loss when converting from float64 to uint8
If we call it with two, though, the second value we give it overrides that default:
temp = make_thumbnail('biking.png', 60)
temp.show()
skimage.dtype_converter: WARNING: Possible precision loss when converting from float64 to uint8
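Default values can be tried out without any image files. The sketch below uses a hypothetical helper, thumbnail_size, that performs only the size calculation from make_thumbnail on plain numbers; the 640 by 480 image size is an assumed example, and print is written with parentheses so that the code also runs on Python 3.

```python
def thumbnail_size(width, height, new_width=100):
    # hypothetical helper: scale (width, height) to new_width, keeping the aspect ratio
    new_height = int(height * float(new_width) / width)
    return (new_width, new_height)

default_size = thumbnail_size(640, 480)       # new_width falls back to 100
custom_size = thumbnail_size(640, 480, 60)    # 60 overrides the default
print('default: %s, custom: %s' % (default_size, custom_size))
```

The first call supplies two values, so new_width takes its default; the third value in the second call replaces that default.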
When creating functions with default values, the parameters that have default values must come at the end of the parameter list. For example, if we want to save the thumbnail directly as an image file, the following function definition would produce an error:
def make_thumbnail_and_save(filename, width=100, thumbnailname):
    picture = novice.open(filename)
    new_height = int(picture.height * float(width) / picture.width)
    picture.size = (width, new_height)
    picture.save(thumbnailname)
    return picture
  File "<ipython-input-1-dd640766165e>", line 1
    def make_thumbnail_and_save(filename, width=100, thumbnailname):
SyntaxError: non-default argument follows default argument
This fails because Python does not allow a parameter without a default value to follow a parameter that has one. There are two ways to fix it:
a) reorder the parameters so that those with default values come at the end, or
b) add a default value to the new parameter.
Taking option b), the correct function definition now looks like this:
def make_thumbnail_and_save(filename, width=100, thumbnailname='mythumb.png'):
    picture = novice.open(filename)
    new_height = int(picture.height * float(width) / picture.width)
    picture.size = (width, new_height)
    picture.save(thumbnailname)
    return picture
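For comparison, option a) might look like the sketch below. It is an illustration only: describe_thumbnail is a hypothetical stand-in that reports what it would do instead of opening an image, the rejected definition is kept in a string and checked with the built-in compile so that the example itself remains valid, and print is written with parentheses so that the code also runs on Python 3.

```python
# The rejected parameter order, held in a string so this example itself stays valid
bad_definition = "def save_as(filename, width=100, thumbnailname): pass"

try:
    compile(bad_definition, '<example>', 'exec')
    rejected = False
except SyntaxError:
    rejected = True     # Python refuses the definition before running anything

# Option a): move the parameter without a default ahead of the one with a default
def describe_thumbnail(filename, thumbnailname, width=100):
    # hypothetical stand-in that only reports what it would do
    return '%s -> %s at width %d' % (filename, thumbnailname, width)

summary = describe_thumbnail('biking.png', 'littlebike.png')
print(summary)
```

With this ordering, the second value in a call is always the thumbnail name, at the cost of making that parameter mandatory.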
If we want to call make_thumbnail_and_save() with filename and thumbnailname but not width, we have to tell Python explicitly which of the parameters with default values we are passing. If we call
make_thumbnail_and_save('biking.png', 'littlebike.png')
Python assumes the second value belongs to the second parameter in the function definition, width, so it tries to use 'littlebike.png' as the image width and fails. We therefore have to tell Python by name that the second value is thumbnailname:
make_thumbnail_and_save('biking.png', thumbnailname='littlebike.png')
Now Python knows that it should use the default width and use the string as thumbnailname. Naming parameters like this is especially helpful for functions with dozens of parameters, e.g. plotting routines or mathematical optimization functions.
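Both the failing positional call and the working named call can be reproduced without image files. In the sketch below, plan_thumbnail is a hypothetical stand-in with the same parameter list as make_thumbnail_and_save; it converts width with int, which is an assumed substitute for the image library rejecting a string as a size. print is written with parentheses so that the code also runs on Python 3.

```python
def plan_thumbnail(filename, width=100, thumbnailname='mythumb.png'):
    # hypothetical stand-in: int() rejects a non-numeric width
    width = int(width)
    return (filename, width, thumbnailname)

try:
    plan_thumbnail('biking.png', 'littlebike.png')   # the string lands in width
    positional_failed = False
except ValueError:
    positional_failed = True

by_keyword = plan_thumbnail('biking.png', thumbnailname='littlebike.png')
reordered = plan_thumbnail(thumbnailname='littlebike.png', filename='biking.png')
print('by keyword: %s' % (by_keyword,))
```

When values are passed by name their order in the call no longer matters, which is why the last two calls produce the same result.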
Use def name(...) to define a function.
Use name(...) to call a function.
Use return to return a value from a function.