As much as this series is to educate aspiring computer programmers and data scientists of all ages, after playing with computers and numbers for nearly 4 decares, I've also made this as a reminder for myself on how to have fun with computers and maths.
In this second third, we will focus on putting in to practice what we learned in Part 1 and Part 2 of this series. We will focus on understanding what algorithms are, and creating one of our own (for looking for prime numbers). As a reminder from Part 1, prime numbers are the numbers that all other numbers are made of. This also means that any number that is only divisible by 1 or itself, is a prime number. Consequently any number that is divisible by a number other than 1 or itself, is not a prime number.
In a factory there is something that comes in (for example recycled newspapers), there is a process of some sort in between (for example turning newspaper in to pulp and then in to paper), and something that comes out (for example toilet paper). Algoritms are the part in the middle, where some process takes place in order to transform what comes in to what goes out.
Using what we have already learn, let's create a very simple algoritm. One that takes in two numbers, finds out if the first number we input is divisible by the second number we input. Algorithms are sometimes called 'algos' and we will be using that shorthand from now on.
def first_algo(left, right):
left % right
Now that we have created our function, which contains an algoritm that finds out if the left number is divisible by the right, you might remember that we have to call it to get the output.
first_algo(8, 4)
How come we did not get a result even we seemingly did everything right? Actually this is an expected behavior of the function, because we are not explicitly stating that we want to print something out. This is easy to fix by modifying our function slightly.
def first_algo(left, right):
print(left % right)
first_algo(8, 4)
0
Nice, now it works. Before we move on, let's look at a better way to achieve the same thing. Not always we want to print something, later you'll learn why, so it's better to use 'return' at the end of the function. Return just means that there is some kind of thing we want to spit out of the function once its done its job.
def first_algo(left, right):
return left % right
first_algo(8, 4)
0
As you see, this behaves exactly like we want it even though we don't use print() anymore. Keep this in mind, it's one of the most commonly used features in Python programming. Let's run through a few examples of how we could use our function / and the algorithm inside it.
first_algo(5,2)
1
first_algo(15,3)
0
first_algo(1523434234234, 234323)
120990
As you can see, regardless of what numbers we use as input, we always get exactly what is expected; the remant of the modulus. In other words, we always see what is remaining after we divide the left input with the right input. Let's apply some Boolean logic to the our algoritmh.
def second_algo(left, right):
return left % right is 0
second_algo(8, 4)
True
second_algo(30, 2)
True
second_algo(7, 5)
False
One of the most important, and commonly used tools of computer programmers and data scientist are conditional statements. The easiest way to understand conditional statements is to consider a statement such as one like this:
"I will go to play football if it's not going to rain"
As you can see, this is a typical boolean statement. There are two options, it will rain or not. If "will rain" is true, then "go play football" is false. This is exactly how conditional statements work in the programming context. As everything else you learn so far, conditional statements are incredibly intuitive to use in Python. Let's see a simple example.
if 1 is 1:
print("hello world")
hello world
In this example, we are just saying that if 1 is 1, which we know is true, then print "hello world". We know exactly what to expect, and actually that is a key principle with conditional statements, we should know exactly what to expect when we are writing them! Same works reversely.
if 1 is 2:
print("hello world")
We know 1 is not 2, so we also know that there will be no print out. Let's introduce a common continuation to this, called 'else'. The best way to understand this, using our previous example, is a statement such as this:
"I will go to play football if it's not going to rain, otherwise I will play playstations"
Here we are adding one more component to our conditional statement, that if it's raining and as result we don't go to play football, we'll play playstation instead. Let's see how this looks like in Python:
if 1 is 1:
print("hello world")
else:
print("bye world")
hello world
if 1 is 2:
print("hello world")
else:
print("bye world")
bye world
At this point, let's put some of the concepts we've learn together in to something just slightly more involving.
def third_algo(left, right):
if left % right is 0:
return True
else:
return False
third_algo(8, 2)
True
third_algo(8, 3)
False
Before moving on the nex section, where we will cover generating numbers, let's consider a conditional statement with one more clause.
"I will go to play football if it's not going to rain at all, and if it rains lightly I will go for a walk still, otherwise I will play playstations"
Now we have a case where how heavy the rain is effects the oucome. If it's not raining at all, we go play football, if it's raining a little we go for a walk, but otherwise we'll play playstation. For this we're going to again modify our function. Now we're going to add 'elif' clause, which is just another way to say if between if and else. We will also introduce the idea of comments, where inside our function we use human language to explain what parts of code do. Anything that starts with '#' is consider a comment in Python. It means that part of the code will not be excecuted together with others. In other words, comments do not effect output of the function in anyway.
def third_algo(left, right):
# it will not rain
if left % right is 0:
return True
# it will rain a little
elif left % right is 1:
return
# it will rain heavily
else:
return False
In this example we decide if we will play football or not. If the output is 0, it means there is no rain and we go play, and output is True. If it rains a little, we go to walk instead and output is False, and if it's more than 0, we play playstation and output is also False. That's it, you now understand conditional statements which is not just a key concept in Python language, but is the primary means we use in order to instruct computers and tell them what we want them to do.
Before moving on to the final section of this part, which will cover loops, another important basic building block of algoritms, let's have some straighforward computing fun by generating numbers. This will become handy soon, so we don't have to key in the many numbers we want to check in terms of if they are prime or not. To do this, we will use a function called range().
range(3)
[0, 1, 2]
range(10)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
range(10, 20)
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
In the first two examples, we are simply telling the range() function to generate numbers starting from zero. In the third example we are giving also a second input, and now get numbers between the two inputs. Let's see a few more examples to make sure this is clear.
range(25, 28)
[25, 26, 27]
range(11, 18)
[11, 12, 13, 14, 15, 16, 17]
We can also add a 'step' argument, which gives us even more control over the range of numbers we want to create. For example with step argument 2, we will get every other number in a range:
range(2,20,2)
[2, 4, 6, 8, 10, 12, 14, 16, 18]
This way we only get the even numbers between 2 and 2. Let's try the same for odd numbers.
range(1,20,2)
[1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
There are many other ways you can use to create numbers, including random numbers, but this will be more than enough for what we want to do. Let's move on to the next section and learn about loops.
Even though our previous algo examples technically speaking are algoritms, it's rare that an algoritm is used without introducing at least some kind of a loop as part of the system. You guessed it, loops are just as easy and intuitive to use in Python language as everything else we've learn so far. For now, we're going to focus on the most commonly used type of loop, a 'for' loop. A for loop is a way to say that some process will go on for as long as something is true. Consider this example:
"I will play football as long as 5 goals are scored"
To do this, we can use a very simple loop which keeps going for a number of rounds. In other words, for a range of numbers.
for number in range(5):
print number
0 1 2 3 4
That's, there is nothing more to it. You've learn yet another fundamental building block of data science. Note how the numbers in the range start from 0 unless we tell range() to start from somewhere else. In Python language and most other programming languages, numbers start from 0 and not 1. Let's put this together with our algo and also introduce one more new concept (I promise it's the last one for a while), storing something on to a variable. Storing something in to the computer memory is nothing like storing something in to human memory.
Even though we talk about storing something in to the memory, it'a also nothing like storing something in to a safety deposit box where things will be kept for a long time. Computer memory is a temporary storage for something we're going to use as part of our computer program. Let's see some examples.
answer = 1 is 1
answer
True
Instead of getting the answer straight away, we store it in memory so we can access it later. This is useful when we want to use the same value many times. For example, we could use this approach to simplify the most recent version of our algo. Take a note below how we are using the same operation twice:
def third_algo(left, right):
# it will not rain
if left % right is 0: # < -- here first time
return True
# it will rain a little
elif left % right is 1: # < -- here second time
return
# it will rain heavily
else:
return False
Now let's make a slight simplifcation by storing the operation we do twice in to memory first.
def fourth_algo(left, right):
stored_value = left % right
# it will not rain
if stored_value is 0:
return True
# it will rain a little
elif stored_value is 1:
return
# it will rain heavily
else:
return False
Before wrapping up for now, let's put all that we've learn together in one example.
# first we create a range of numbers
numbers = range(1,10)
# then we create a loop
for number in numbers:
# then we perform the modulus operation
result = number % number
# then we create a conditional statement for cases when it's true
if result is 0:
print True
# and finish with else for cases when it's false
else:
print False
True True True True True True True True True
Obviously we are getting True as result everytime because we are always having both the right number and the left number the same (for example 8 % 8). Let's make a slight modification to take us step closer to something that will help us a great deal in finding prime numbers later. This time I'm removing the comments to keep the code neat.
left = 20
right_numbers = range(1,10)
for right in numbers:
result = left % right
if result is 0:
print True
else:
print False
True True False True True False False False False
So what we are doing now, is fixing the left number to be 20, and then checking it against every number in the range of 1 to 10 and see if it's divisible. This makes checking if a number is prime a whole lot simpler! Let's try an example where we know it's a prime nubmer, for example 13 (it's not divisisble by any other number than 1 and itself).
left = 13 # <-- changed
right_numbers = range(1,12) # <-- changed
for right in numbers:
result = left % right
if result is 0:
print True
else:
print False
True False False False False False False False False
Because we are starting our range from 1, one get one True in the beginning, so we have to start the range from 2 instead to get the right answer. As you can see, I changed the second line so that we scan until 12 which is the last number before 13. Let's put this inside a function as our fifth algo version and make the range start from 2 instead of 1.
def fifth_algo(left, right):
right_numbers = range(2, right) # <-- changed
for right in right_numbers: # <-- changed
result = left % right
if result is 0:
print True
else:
print False
Now things are starting to look good. We could now remove 'left' variable entirely as it comes as an argument from the function, and also instead of having to modify the function for the last number of the range, we also input that as an argument.
fifth_algo(7,6)
False False False False
That's it, we're prime number checking now! :) Because the result is False for all, we know for sure that our input, in this case 7, is a prime. There is one more very small change we can do using the skill we've already learn to make a nice improvement to what we already have. Instead of requiring the user to input the end of the range, we can automatically compute it as it's always the last number before left. In other words, it's left - 1.
def sixth_algo(left): # <-- changed
right_numbers = range(2, left - 1) # <-- changed
for right in right_numbers:
result = left % right
if result is 0:
print True
else:
print False
sixth_algo(9)
False True False False False False
sixth_algo(11)
False False False False False False False False
sixth_algo(19)
False False False False False False False False False False False False False False False False
Things are working real nicely now. But clearly we will later have a problem with larger numbers with this current approach, as if we input 1,000, we will have 1,000 True or False values printed on the screen. To overcome this, we can make a small change to our latest version.
def seventh_algo(left):
right_numbers = range(2, left - 1)
output = 0 # <-- changed
for right in right_numbers:
result = left % right
if result is 0:
output += 1 # <-- changed
else:
output += 0 # <-- changed
return output # <-- changed
What we are doing, is first we declare a variable 'output' with starting value 0. Then instead of printing out True, we silently add 1 to output, and in case of False we add 0. Only in the end we print the value out, with the return statement that is outside of the for loop (note how it's indentation is equal to the for statement, meaning it will be processed only once the for loop has completed its job).
seventh_algo(19)
0
Nice. Now we can key in much larger numbers, and just get one output.
seventh_algo(127)
0
Before wrapping up, let's simplify our code slightly and instead of outputting a number, output a True or False statement. True for 'it's a prime' and False for 'it's not a prime'.
def eight_algo(left):
right_numbers = range(2, left - 1)
output = 0
for right in right_numbers:
result = left % right
if result is 0:
output += 1
return output is 0 # <-- changed
eight_algo(19)
True
eight_algo(127)
True
eight_algo(12)
False
Note how we removed the else statements entirely. Because we are doing nothing in the cases where the left number is not divisible by the right number. In other words, whenever the product of the modulus operation is not zero, we do nothing. Therefore it's enough to just have the if statement without the else. This is quite common.
We've made great progress! Time to wrap up for now, and then in the next part we get in to the real action, looking for prime numbers! With the skills you're learn so far, you're doing a lot of the things the day-to-day of advanced programmers and data scientists is made of.