import addutils.toc ; addutils.toc.js(ipy_notebook=True)
from addutils import css_notebook
css_notebook()
As far as scientific computing is concerned, it's hard to find a good alternative to Python. Python is a do-it-all language: if you want to perform a statistical analysis, model some data, and then build a GUI and a web platform to share the results with other users, you can do all of this in Python.
Nevertheless, Python tutorials for data analysis in engineering, finance and scientific applications are difficult to find. For this reason we made a complete set of courses and tutorials to address the needs of scientists and engineers:
We use Python because it's:
with a compiled language (C/C++/Fortran) without requiring all the complexity. It is extensible in C or C++. Clear syntax enhances readability: “Executable Pseudo Code”
Windows, Windows NT, and OS/2, Android and many other platforms.
Compiled languages: C, C++, Fortran, etc.
Scripting languages: Matlab
Other scripting languages: Scilab, Octave, Igor, R, IDL, etc.
The operators are much like those in Matlab. Later we will see and play with the different data types and control structures, which are very handy and useful.
We defer the discussion of functions and classes to a more advanced tutorial. However, functions are much like Matlab functions, and classes are the basic concept of object-oriented programming, very useful for larger, more structured projects. Put simply, an object is an instance of a class, as the Colosseum is an instance of the buildings class.
Objects have many properties. For example, every object has a unique id. In the following example three variables are assigned on the same line, then d = a defines d to be the a object. In other words, d and a are two different names for the same object: this is confirmed by the identical object id.
a, b, c = 5, 6, 7
d = a
print (a, d, id(a), id(d))
5 5 94796251806336 94796251806336
Unlike "real" pointers such as those in C and C++, things change when we assign a new value to d. In this case d is bound to a different object with its own memory location, while a is left untouched. Basically, Python creates real copies only if it has to, i.e. if the programmer explicitly demands it:
d = 8
print (a, d, id(a), id(d))
5 8 94796251806336 94796251806432
Pay attention to this behaviour when assigning mutable objects (check the next chapters): a plain assignment does not copy the object. In this case we reassign an element of list_02: the list object itself is still the same one, so list_01 still has the same id() as list_02 and consequently shows the update:
list_01 = [1, 2, 3]
list_02 = list_01
list_02[1] = 5
list_01
[1, 5, 3]
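To make the difference between a second name and a real copy concrete, here is a minimal sketch. The variable names are only illustrative, and it assumes that the standard copy module is the tool you want when an independent copy is needed.

```python
import copy

original = [1, [2, 3]]            # a list that contains a nested list
alias = original                  # no copy: a second name for the same object
shallow = original.copy()         # new outer list, but the nested list is shared
deep = copy.deepcopy(original)    # fully independent copy, nested list included

original[0] = 'changed'           # replace an element of the outer list
original[1].append(4)             # modify the nested list in place

print(alias)     # ['changed', [2, 3, 4]]  (follows every change)
print(shallow)   # [1, [2, 3, 4]]          (only the shared nested list changed)
print(deep)      # [1, [2, 3]]             (unaffected)
print(alias is original, shallow is original)   # True False
```

The is operator compares object identity, so it gives the same answer as comparing the two id() values.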
'isinstance' checks whether the passed value is an instance of one of the listed types: in this case 'a' is an int:
isinstance(a, (int, float, bool))
True
Some Examples:
Some words on the organizational structure of Python code:
__init__.py, which can be empty, signals that the folder is a (sub)package. NumPy and Matplotlib are examples of packages.
Python is extended by modules. To use a module it must first be imported. There are three ways to import modules:
import modulename - preserves the full module name in the namespace. To use a module keyword in the code you write modulename.keyword
import modulename as name - replaces the full module name with a suitable alias. To use a module keyword in the code you write name.keyword
from modulename import * - THIS IS NOT ADVISABLE IN MOST CASES: it places the module keywords directly in the current namespace, which means that existing names could be overwritten. To use a module keyword in the code you write just keyword
Some examples:
import math # Then math. must be used before using any command
import numpy as np # Then the alias np. must be used before any command
from pandas import * # Import EVERYTHING in the current namespace
import math # Then 'math.' must be used before using any command
math.sin(3)
0.1411200080598672
Strings can be defined with either double or single quotes. Escape codes like \t [tab], \n [newline] or \xHH [special character] can be used. A string can be repeated multiple times by using *k
a, b, c = 'hello', "HELLO", "Hello, how's going?"
print(a, b, c, sep="-"*5)
hello-----HELLO-----Hello, how's going?
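As a small illustration of the escape codes and of the *k repetition mentioned above, consider the following sketch. The variable names are hypothetical and chosen only for this example.

```python
row = 'name\tvalue\n'        # \t inserts a tab, \n ends the line
special = '\x41\xe9'         # \xHH selects a character by its hexadecimal code
rule = '=' * 10              # *k repeats a string k times

print(row * 2, end='')       # the same row printed twice
print(special)               # Aé
print(rule)                  # ==========
print(len(row))              # 11: each escape code counts as one character
```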
The in operator can be used to find substrings:
a = '\t abcdefβγδ♻_gh \n '
'δ♻' in a
True
strip is one of the most used methods when working with strings in Python. Discover for yourself what it does by using ?!
print(a.strip())
abcdefβγδ♻_gh
*Try by yourself* the power of split by running the following code (strip, split and many other methods can be chained in the same statement by using the '.' operator):
a.strip().split('_')
b = '236 23 32 23 55'
b.split()
['236', '23', '32', '23', '55']
*Try by yourself* the following commands:
c = a.strip()
print(c) # 'abcdefβγδ♻_gh'
print(c.upper()) # 'ABCDEFΒΓΔ♻_GH'
print(c.title()) # 'Abcdefβγδ♻_Gh'
print(c.center(30,'=')) # '========abcdefβγδ♻_gh========='
print(c.find('c')) # 2 (index start from zero)
print(c.split('_')) # ['abcdefβγδ♻', 'gh']
print(c.replace('_','')) # 'abcdefβγδ♻gh'
print(' *** '.join(['one', 'two', 'three'])) # one *** two *** three
Exercise: clean up the following string by removing the leading and trailing whitespace and the internal separator characters, then format the name so that each word has the first letter capitalized and the others lowercase (output must be: Johnn Richard Thompson). Everything can be done in just one line!
name = ' JOHNN - Richard-Thompson '
print (' '.join(name.strip().replace('-',' ').title().split()))
Johnn Richard Thompson
More examples for split:
# Split
s1 = '236 23 32 23 55'
s1.split() # ['236', '23', '32', '23', '55'] - Multiple separators
s3 = '236 32 || 23 ||32--44||2|5||6'
s3.split('||') # ['236 32 ', ' 23 ', '32--44', '2|5', '6']
s3.split('||', 1) # ['236 32 ', ' 23 ||32--44||2|5||6']
# Dealing with multiple separators using 'split'
s4 = 'a;b,c;d'
s4.replace(';',',').upper().split(',')
# Alternative solution: 'regexp'
import re
phrase = "Hey, '32' you - what are you doing here???"
print(' '.join(re.findall(r'\w+', phrase))) # Keep only the words: Hey 32 you what are you doing here
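The multiple-separator problem above can also be handled in a single step with the re module. This is only a sketch of one possible approach: re.split and re.findall are standard library functions, while the patterns and variable names are chosen just for this example.

```python
import re

s4 = 'a;b,c;d'
parts = re.split(r'[;,]', s4)          # split on ';' OR ','
print(parts)                           # ['a', 'b', 'c', 'd']

messy = '236 32 || 23 ||32--44'
numbers = re.findall(r'\d+', messy)    # collect every run of digits
print(numbers)                         # ['236', '32', '23', '32', '44']
```

Compared with chaining replace and split, the character class [;,] keeps working when more separators are added to it.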
This shows how the string method .format() works for positional parameters:
print('First Argument: {} --- second one: {}'.format(47.99,11.55))
print('First Argument: {0} --- second one: {1}'.format(47.99,11.55))
print('First Argument: {1} --- second one: {0}'.format(47.99,11.55))
First Argument: 47.99 --- second one: 11.55
First Argument: 47.99 --- second one: 11.55
First Argument: 11.55 --- second one: 47.99
This shows how to use keyword parameters:
print('First Argument: {a} --- second one: {b}'.format(b=47.99, a=11.55))
First Argument: 11.55 --- second one: 47.99
Of course the parameters can be formatted individually:
print('First Argument: {a:08.0f} --- second one: {b:08.3f}'.format(b=47.99, a=11.55))
First Argument: 00000012 --- second one: 0047.990
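Assuming Python 3.6 or later, the same format specifications can also be written inline with an f-string. The sketch below is only meant to show that the part after the colon is interpreted in the same way by both mechanisms; the variable names are illustrative.

```python
a, b = 11.55, 47.99

# The classic .format() call with keyword parameters
with_format = 'First Argument: {a:08.0f} --- second one: {b:08.3f}'.format(a=a, b=b)

# The same specification inside an f-string: names are looked up directly
with_fstring = f'First Argument: {a:08.0f} --- second one: {b:08.3f}'

print(with_fstring)                   # First Argument: 00000012 --- second one: 0047.990
print(with_format == with_fstring)    # True
```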
Conversion Table:
Flags:
x = 12222
a = 'βγδ♻'
print('Signed Integer Decimal: {0:12d}'.format(x))
print('Signed Integer Decimal with thousands separator: {0:12,d}'.format(x))
print('Signed Integer Decimal padded with zeroes: {0:012d}'.format(x))
print('Signed Integer Decimal padded with zeroes signed: {0:+012d}'.format(x))
print('Signed Integer Decimal leading space if positive: {0: 012d}'.format(x))
print('Signed Integer Decimal aligned to the left: {0:<12d}'.format(x))
print('Signed Integer Decimal centered: {0:^12d}'.format(x))
print('Floating point decimal format: {0:12.2F}'.format(x))
print('Unsigned hexadecimal (uppercase): {0:12X}'.format(x))
print('Unsigned hexadecimal (uppercase): {0:#012X}'.format(x))
print('Unsigned octal: {0:12o}'.format(x))
print('Floating point exponential format (lowercase): {0:12e}'.format(x))
print('String using repr(): {0!r}'.format(a))
print('String using str(): {0!s}'.format(a))
print('-'*21, '\n') # '-'*21 is the same as '-'+'-'+'-'+... 21 times
print('{0:>22s}: {1:012d}'.format('Description', x))
print('{0:>22s}: {1:012d}'.format('Description x2', x*2))
print('{0:>22s}: {1:012d}'.format('Longer Description x4', x*4))
Signed Integer Decimal:        12222
Signed Integer Decimal with thousands separator:       12,222
Signed Integer Decimal padded with zeroes: 000000012222
Signed Integer Decimal padded with zeroes signed: +00000012222
Signed Integer Decimal leading space if positive:  00000012222
Signed Integer Decimal aligned to the left: 12222
Signed Integer Decimal centered:    12222
Floating point decimal format:     12222.00
Unsigned hexadecimal (uppercase):         2FBE
Unsigned hexadecimal (uppercase): 0X0000002FBE
Unsigned octal:        27676
Floating point exponential format (lowercase): 1.222200e+04
String using repr(): 'βγδ♻'
String using str(): βγδ♻
---------------------

           Description: 000000012222
        Description x2: 000000024444
 Longer Description x4: 000000048888
Lists are ordered, non-homogeneous containers. The main properties of Python lists are the following:
ls = [2, 3, 4, 5, 'six', 9]
ls[-1] = 8 # Redefine the last element
print(ls)
[2, 3, 4, 5, 'six', 8]
*Try by yourself* the following commands:
ls = [1, 2, 3]
ls.append([11, 12, 'one'])
ls.extend([33,44])
ls.insert(2,[55,66])
ls[1:1] = [77, 88, 99] # See 'slicing' next
ls = ls + ['aa', 'bb']
ls[1:1] = [77, 88, 99] # Here applied to the earlier list [2, 3, 4, 5, 'six', 8]
print(ls)
[2, 77, 88, 99, 3, 4, 5, 'six', 8]
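The difference between append, extend, insert and slice assignment is easier to see step by step. The following sketch uses a small hypothetical list and shows the state of the list after each call as a comment.

```python
ls = [1, 2, 3]

ls.append([11, 12])    # adds ONE element, the list itself
# [1, 2, 3, [11, 12]]

ls.extend([33, 44])    # adds each element of the argument
# [1, 2, 3, [11, 12], 33, 44]

ls.insert(1, 'x')      # inserts before position 1
# [1, 'x', 2, 3, [11, 12], 33, 44]

ls[2:4] = []           # slice assignment can also delete elements
# [1, 'x', [11, 12], 33, 44]

print(ls, len(ls))
```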
*Try by yourself* more commands:
ls = [5, 6, 3, 7, 3, 9, 7]
ls.sort()
ls.reverse()
ls.pop()
ls.count(7)
len(ls) # Length
range(10) # Generate a sequence of integers from 0 to 9 (use list(range(10)) to see it as a list)
range(4,20,3) # range(start, stop, step)
ls = [5, 6, 3, 7, 3, 9, 7]
ls.count(7)
2
sort can be used with a sort key (a function that generates the key): in this case the sort key is the length of the strings
ls= ['Zr', 'wax', 'grid', 'I', 'Sir', 'zirconium']
ls.sort(key=len)
ls
['I', 'Zr', 'wax', 'Sir', 'grid', 'zirconium']
sort modifies the list (sort in place). If you don't want to modify the list use the 'sorted' function
print(sorted(ls))
['I', 'Sir', 'Zr', 'grid', 'wax', 'zirconium']
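Any function of one argument can serve as a sort key. As a sketch, the example below sorts the same strings without regard to case by using str.lower as the key, and uses the reverse parameter to put the longest strings first. The result names are hypothetical.

```python
ls = ['Zr', 'wax', 'grid', 'I', 'Sir', 'zirconium']

# Compare the lowercase version of each string, but keep the originals
case_insensitive = sorted(ls, key=str.lower)
print(case_insensitive)   # ['grid', 'I', 'Sir', 'wax', 'zirconium', 'Zr']

# reverse=True inverts the order; items with equal keys keep their original order
longest_first = sorted(ls, key=len, reverse=True)
print(longest_first)      # ['zirconium', 'grid', 'wax', 'Sir', 'Zr', 'I']

print(ls[0])              # 'Zr': sorted() never modifies the original list
```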
in checks if one element is in the list
'wax' in ls
True
index finds the position of a given element in a list
print(ls, ls.index('grid'))
['I', 'Zr', 'wax', 'Sir', 'grid', 'zirconium'] 4
Lists can be iterated with for. In Python the index is not required, but you can have one if you need it. Check the following two examples:
for string in ls:
    print(string.rjust(10))
         I
        Zr
       wax
       Sir
      grid
 zirconium
for index, string in enumerate(ls):
    print(index, string.rjust(10))
0          I
1         Zr
2        wax
3        Sir
4       grid
5  zirconium
*List comprehension* is one of the most important constructs in Python. The general syntax is:
[expression(argument) for argument in list if boolean_expression]
The expression can contain conditional expressions such as ... if ... else ...
Let's see one example. Imagine starting from a list of numbers and building a second list containing just the string representation of the numbers that can be divided by three (in Python x%y is the remainder of the division x/y):
numbers = range(4, 20)
strings = [str(number) for number in numbers if not number%3]
print(strings)
['6', '9', '12', '15', '18']
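The if at the end of a comprehension filters elements, while an if ... else placed in the expression transforms every element. The sketch below shows both forms side by side; the placeholder '-' and the variable names are arbitrary choices for illustration.

```python
numbers = range(4, 10)

# Filter: keep only the multiples of three
filtered = [str(n) for n in numbers if n % 3 == 0]
print(filtered)    # ['6', '9']

# Transform: keep every element, replacing the others with a placeholder
labelled = [str(n) if n % 3 == 0 else '-' for n in numbers]
print(labelled)    # ['-', '-', '6', '-', '-', '9']

# The filter form is equivalent to this explicit loop
by_loop = []
for n in numbers:
    if n % 3 == 0:
        by_loop.append(str(n))
```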
Slicing can be done on any sequential object (like strings and list) and is used to extract (slice) a part of the object, delete or add elements to the object.
It works like this:
s[begin: end: step]
The resulting sequence consists of the following elements:
s[begin], s[begin + 1*step], ... s[begin + i*step] for all (begin + i*step) < end
*Try by yourself* some slicing on a string:
s = 'abcdefghi' + '123' # s is a string
s[:4]
s[5:]
s[::2]
ls = list('abcdefghi') # ls is a list
ls[-1:-1] = ['i', 1, 2, 3]
ls[0:3] = ['A', 'B', 'C']
s1 = ''.join(str(s) for s in ls)
ls = list('abcdefghi') # ls is a list
ls[-1:-1] = ['i', 1, 2, 3]
print(ls)
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 1, 2, 3, 'i']
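To check the slicing rule s[begin: end: step] against the formula given above, the following sketch builds the same result twice: once with a slice and once by collecting the indices begin + i*step explicitly. The chosen values of begin, end and step are arbitrary.

```python
s = 'abcdefghi123'
begin, end, step = 1, 10, 3

by_slice = s[begin:end:step]
# Same indices as in the formula: 1, 4, 7 (the next one, 10, is not < end)
by_formula = ''.join(s[i] for i in range(begin, end, step))
print(by_slice, by_formula)   # beh beh

print(s[-3:])     # '123': negative indices count from the end
print(s[::-1])    # '321ihgfedcba': a negative step walks backwards
```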
Sets are unordered collections of UNIQUE elements. The main properties of Python sets are the following:
frozenset
set1 = set([1, 2, 'a', 7, 7, 9])
set1.add('xyz')
set2 = set('54677788')
print (set1, set2)
{1, 2, 7, 9, 'xyz', 'a'} {'6', '8', '4', '5', '7'}
*Try by yourself* the following commands:
set1 & set2 # AND
set1 | set2 # OR
set1 ^ set2 # XOR
set1 - set2 # Difference
ls = list(set1) # To index a set, first transform it to a list
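For reference, here is what the four operators return on two small sets. This is a sketch with hypothetical values; because sets are unordered, the results are shown as sets and the printing order on your machine may differ.

```python
set1 = {1, 2, 'a', 7, 9}
set2 = {7, 9, 'b'}

both = set1 & set2        # intersection: {7, 9}
either = set1 | set2      # union: {1, 2, 'a', 7, 9, 'b'}
only_one = set1 ^ set2    # symmetric difference: {1, 2, 'a', 'b'}
only_first = set1 - set2  # difference: {1, 2, 'a'}

# A common use: remove duplicates from a list, then sort the result
unique_sorted = sorted(set([3, 1, 3, 2, 1]))
print(unique_sorted)      # [1, 2, 3]
```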
import ipywidgets
from IPython.display import display
text01 = ipywidgets.Text(value='123ab')
text02 = ipywidgets.Text(value='345bc')
text03 = ipywidgets.Text(description='JOIN:', value='---', width=450,
border_color='black', color='white', background_color='green')
button01 = ipywidgets.Button(description='JOIN SETS', tooltip='Join', value=False,
border_color='black', color='white', background_color='red')
box01 = ipywidgets.HBox(children=[text01, text02])
box02 = ipywidgets.HBox(children=[button01, text03])
display(box01, box02)
def click(b):
    set01 = set(str(text01.value))
    set02 = set(str(text02.value))
    text03.value = str(set01 | set02)
button01.on_click(click)
A tuple is an IMMUTABLE list, i.e. a tuple cannot be changed in any way once it has been created. A tuple is defined analogously to lists, except that the set of elements is enclosed in parentheses instead of square brackets. The rules for indices are the same as for lists. Once a tuple has been created, you can't add elements to a tuple or remove elements from a tuple.
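The immutability described above can be verified directly. In this small sketch an item assignment on a tuple raises a TypeError, while "adding" an element really means building a new tuple. The variable names are only illustrative.

```python
t = (1, 2, 3)

try:
    t[0] = 99                 # not allowed: tuples cannot be modified
except TypeError as err:
    message = str(err)
print(message)

bigger = t + (4,)             # creates a NEW tuple; note the comma in (4,)
print(t, bigger)              # (1, 2, 3) (1, 2, 3, 4)
```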
So, what is the reason to use tuples? Mainly three:
def myfunction(pack):
    a, b, c, d = pack
    print(a+b+c+d[0]-d[1])
t = 1, 2, 3, (8, 9) # Pack arguments to pass it to a function
myfunction(t)
5
seq = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
for a, b, c in seq:
    print(a, b, c) # Common use for tuple unpacking
1 2 3
4 5 6
7 8 9
'zip' can be used to reorganize (transpose) columns of data:
names = [('Chloe', 'Emily', 'Sophia'), ('Stuart', 'Winsor', 'Davidson')]
for firstname, lastname in zip(*names):
    print(firstname, lastname)
Chloe Stuart
Emily Winsor
Sophia Davidson
Dictionaries are 'Associative Arrays': values are indexed by generic keys. In other words, the indexing key can be an integer, but it can also be a string, a tuple or any other immutable object.
Here we make a dictionary using tuples as keys and telephone numbers as values. Then we access a dictionary item by providing a key (tuple):
data = [(('Chloe', 'Stuart'), '(831) 758-7214'),
(('Emily', 'Winsor'), '(877) 359-8474'),
(('Sophia', 'Davidson'), '(800) 445-2854')]
d = dict(data)
print(d[('Chloe', 'Stuart')])
(831) 758-7214
*Try by yourself* the following commands:
d.keys()
d.values()
d.items()
print(d.keys())
dict_keys([('Chloe', 'Stuart'), ('Emily', 'Winsor'), ('Sophia', 'Davidson')])
When reading a dictionary you should check whether it contains the key by using in. If you ask for an unknown key, Python raises an exception (KeyError). Alternatively you can use get with a default value to be used if the key is not found:
('Chloe', 'Winsor') in d
False
print(d.get(('Chloe', 'Winsor'),'Number not available'))
Number not available
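The two ways of dealing with a missing key can be compared directly. The sketch below uses a one-entry dictionary in the style of the phone book above; catching KeyError and calling get with a default lead to the same result.

```python
d = {('Chloe', 'Stuart'): '(831) 758-7214'}
key = ('Chloe', 'Winsor')          # this key is NOT in the dictionary

# Option 1: ask first, or catch the exception raised by d[key]
try:
    number = d[key]
except KeyError:
    number = 'Number not available'

# Option 2: let get() supply the default value
same = d.get(key, 'Number not available')

print(number == same)              # True
print(key in d)                    # False: neither option adds the key
```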
When iterating a dictionary in Python 3.7 or later, the items are returned in insertion order (in earlier versions the order is not guaranteed):
for key, value in d.items():
    print('KEY: ', key, '\t\t VALUE: ', value)
KEY: ('Chloe', 'Stuart') VALUE: (831) 758-7214
KEY: ('Emily', 'Winsor') VALUE: (877) 359-8474
KEY: ('Sophia', 'Davidson') VALUE: (800) 445-2854
# Iterators work on keys by default
for key in d:
    print(key, '\t\t', d[key])
('Chloe', 'Stuart') (831) 758-7214
('Emily', 'Winsor') (877) 359-8474
('Sophia', 'Davidson') (800) 445-2854
Some more examples:
# How to create a dictionary with a for loop (from a list of tuples)
d1 = dict([(n, str(n)) for n in range(5)])
print(d1) # {0: '0', 1: '1', 2: '2', 3: '3', 4: '4'}
{0: '0', 1: '1', 2: '2', 3: '3', 4: '4'}
d1.pop(2)
d2 = {'10': 'ten', '11': 'eleven'}
d1.update(d2) # {0: '0', 1: '1', 3: '3', 4: '4', '10': 'ten', '11': 'eleven'}
print(d1)
{0: '0', 1: '1', 3: '3', 4: '4', '10': 'ten', '11': 'eleven'}
# Creating dict from sequences
d3 = {}
for key, value in zip(list('abcd'), list('1234')):
    d3[key] = value
print(d3)
{'a': '1', 'b': '2', 'c': '3', 'd': '4'}
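The loop above can be written more compactly. As a sketch, dict() accepts the pairs produced by zip directly, and a dictionary comprehension can transform keys or values on the fly. The name d4 is hypothetical.

```python
# Same result as the explicit loop: zip yields (key, value) pairs
d3 = dict(zip('abcd', '1234'))
print(d3)    # {'a': '1', 'b': '2', 'c': '3', 'd': '4'}

# Dictionary comprehension: here the values are converted to integers
d4 = {key: int(value) for key, value in d3.items()}
print(d4)    # {'a': 1, 'b': 2, 'c': 3, 'd': 4}
```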
Counters are a very special type of dictionaries: they give you a simple and effective way to count items.
from collections import Counter
colorlist = ['red', 'blue', 'red', 'green', 'blue', 'blue', 'green', 'blue', 'cyan']
cnt = Counter(colorlist)
print('Total of all counts: ', sum(cnt.values()))
print('Most common elements: ')
for item, number in cnt.most_common(3):
    print('\t'*2, item, number)
print('Least common elements: ')
for item, number in cnt.most_common()[:-4:-1]:
    print('\t'*2, item, number)
Total of all counts: 9
Most common elements:
blue 4
red 2
green 2
Least common elements:
cyan 1
green 2
red 2
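Two further properties make counters convenient: a missing item simply counts as zero, and new observations can be added with update. The following is a minimal sketch that counts the letters of an arbitrary word.

```python
from collections import Counter

cnt = Counter('mississippi')     # any iterable works: here the letters of a string
print(cnt['s'], cnt['z'])        # 4 0  (a missing item counts as zero, no KeyError)

cnt.update('miss')               # add more observations to the same counter
print(cnt['s'])                  # 6
print(cnt.most_common(1))        # [('s', 6)]
print(sum(cnt.values()))         # 15 letters counted in total
```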
This is a brief overview of the flow-control instructions in Python:
a = 34
if a != 7:
    print("'a' is not 7")
if a > 15:
    print("'a' is greater than 15")
elif a == 15:
    print("'a' is exactly 15!")
else:
    print("'a' is less than 15")
'a' is not 7
'a' is greater than 15
In Python you can iterate Lists, Dictionaries, Lines in a file and all the 'ITERABLE' Objects
l = ['a', 'b', 'c', 'd', 'e', 'f']
for v in l:
    if v == 'e':
        break # Exit the loop entirely
    elif v == 'b':
        continue # Skip the rest of this iteration
    print(v)
print('Done !') # Executed after the for loop ends
a
c
d
Done !
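A for loop can also carry an else clause, which runs only when the loop finishes without hitting break. The function below is a hypothetical example written just to show this behaviour.

```python
def find_first_negative(values):
    """Return the first negative value, or None if there is none."""
    for v in values:
        if v < 0:
            result = v
            break              # leaves the loop AND skips the else clause
    else:
        result = None          # runs only if the loop was never interrupted
    return result

print(find_first_negative([3, -1, 4, -5]))   # -1
print(find_first_negative([1, 2, 3]))        # None
```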
for element in [3,4,5]: # Elements in LISTS
    print(element)
for element in (7,8,9): # Elements in TUPLES
    print(element)
for char in 'abc': # Elements in STRINGS
    print(char)
import os.path
path = os.path.join(os.path.curdir, "example_data", "my_input.txt")
with open(path) as f: # Elements in FILES (the file is closed automatically)
    for line in f:
        print(line, end='')
3
4
5
7
8
9
a
b
c
First Second
10 0.32432
20 1.324
21 7.237923
36 .83298932
56 237.327823
# Enumerate returns an enumerate object:
for i, season in enumerate(['Spring', 'Summer', 'Fall', 'Winter']):
    print(i, season)
0 Spring
1 Summer
2 Fall
3 Winter
a = 0
while a < 10:
    a += 1
    print(a, end='')
12345678910
a = 2**10
while a>0.5:
    print('{0:.1f}'.format(a), end=' - ')
    a = a/2
1024.0 - 512.0 - 256.0 - 128.0 - 64.0 - 32.0 - 16.0 - 8.0 - 4.0 - 2.0 - 1.0 -
Visit www.add-for.com for more tutorials and updates.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.