Python Getting Started

In [1]:
import addutils.toc ; addutils.toc.js(ipy_notebook=True)

After learning the basic Python concepts, there are still some skills to learn to start working effectively.

In this notebook we will see how to manage functions and how to work with imports, namespaces and packages. Then we will see how to read and write external data and how to manage the external environment. Since most of our customers work on Windows-based systems, this notebook is mainly oriented to that OS. Nevertheless, many of the concepts you will find here also apply to Mac or Linux systems.

In [2]:
from addutils import css_notebook
In [3]:

1 Comment your code !

If there is a single thing you have to keep in mind, this is it: comment your code! This is particularly important when you start to write structured code involving classes and functions and when you start to collaborate with someone else. As you see in the following code, there are two types of comments in Python:

# Single line comments are defined by the # sign
'''
    Multi-line comments are defined using
    three consecutive single quote signs.
'''

But remember:

  • Comments must be used just to explain what can't be understood by reading the code
  • Bad, obvious and out-of-date comments are much worse than no comment at all
  • First write clear code, then add comments to explain what isn't obvious
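
As a small illustration of these rules, the sketch below contrasts an obvious comment with one that explains something the code cannot say by itself. The names SECONDS_PER_DAY and days_to_seconds are hypothetical and are not part of this notebook's modules.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def days_to_seconds(days):
    # An obvious (useless) comment would be: "multiply days by SECONDS_PER_DAY"
    # A useful comment: leap seconds are deliberately ignored, so the
    # result is only approximate for long calendar intervals
    return days * SECONDS_PER_DAY


print(days_to_seconds(2))   # 172800
```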

2 Functions

2.1 Local Functions

Local functions are used to avoid code repetition and to keep your code tidy. Have a look at the code in the next cell and notice the following things:

  1. Always use intelligible names for variables. In this case we used spacing_string
  2. Function arguments can have default values. This means that you don't have to reassign all the values every time you call a function: you pass just the values you need. Notice: mandatory (non-default) arguments always come first in the function definition
  3. Function arguments are named: you can specify the arguments in any order when you call the function using their names
  4. Functions can be defined anywhere in the code (preferably at the beginning), but always before being called
  5. Arguments are passed by object reference: a function can modify a mutable argument (such as a list) in place, while rebinding the argument name inside the function does not affect the caller
In [4]:
def local(spacing_string, n=5):
    '''Print n carriage returns
       "spacing_string" must be provided
       "n" can be omitted and gets the default value'''
    # Variables defined inside functions are local

local('-')     # n = 5 (default value)
local('*',n=9) # n = 9 (named argument)
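
The points about default values, named arguments and argument passing can be checked with a short sketch. The functions describe and append_item are hypothetical names introduced only for this illustration.

```python
def describe(label, width=10, fill='.'):
    '''Return label padded on the right with fill up to width characters.'''
    return label.ljust(width, fill)


# Only the mandatory argument is given: the defaults are used for the rest
print(describe('abc'))                            # abc.......
# Named arguments can be given in any order
print(describe(fill='*', width=6, label='abc'))   # abc***


def append_item(items):
    '''Modify the list received from the caller.'''
    items.append('new')      # changes the caller's list in place
    items = ['rebound']      # rebinding the local name does not affect the caller


my_list = ['old']
append_item(my_list)
print(my_list)               # ['old', 'new']
```

The last call shows why it matters whether an argument is mutable: the append is visible outside the function, the reassignment is not.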
In [5]:
# Since you wrote a nice description for your function
# you can invoke help with help(local) or alternatively with local?

2.2 External Functions

External functions are saved in external files. As an example, the next cell loads the module addutils.my_module, which is stored in an external file. This is the code:

In [6]:
import addutils.my_module as my
%pfile my.my_function
# Check the code below ↓

my_function(name) accepts a tuple made of two strings and calls _my_private_function. Functions whose names begin with '_' are meant to be private: by convention they should not be called from outside the module. Let's try a call:

In [7]:
import addutils.my_module as my
print(my.my_function(('rick', 'bayes')))
rick [BAYES]

Try the following commands yourself:

In [8]:

Testing your code with if __name__ == '__main__':

To write reliable code, one of the most important things is continuous testing. Python offers an easy way to test your code every time you modify your functions. When the check __name__ == '__main__' is True, the module is being run as a script (for example from the command line) rather than imported. You can use this check to write your unit-testing code:

if __name__ == '__main__':
    ''' This is a Unit Test: use "run my_module" from Python interpreter'''
    print('This is the testing code:')
    print(my_function(('John', 'Doe')))

Try to call your module from the command line:

In [9]:
%run -m addutils.my_module
This is the testing code:
Johnn [DOE]
/home/matteo/anaconda3/envs/addfor_tutorials/lib/python3.6/ RuntimeWarning: 'addutils.my_module' found in sys.modules after import of package 'addutils', but prior to execution of 'addutils.my_module'; this may result in unpredictable behaviour

2.3 Private Methods

Methods whose names start with '_' are meant to be private: this means you aren't supposed to access them. This is an example:

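def _my_private_function(first_name, second_name):
    # Assumed body, inferred from the output 'rick [BAYES]' shown earlier
    full_name = '{} [{}]'.format(first_name, second_name.upper())
    return full_name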

If you "import my_module as my" and try to type my.[TAB] you'll see just the public methods and variables.

Python actually allows you to call private methods anyway, but we advise you to do so only when you are much more proficient in the language. Try it if you want:

my._my_private_function('John', 'McEnroe')
In [10]:
#my._my_private_function('John', 'McEnroe')

Try some more examples yourself:

# Explore other private methods with: my._ + TAB
import numpy as np
name = ('Graham', 'Chapman')
my?               # Module documentation: OBJECT INTROSPECTION
my??              # will also show the function's source code if possible
np.*load*?        # ? can be also used to search the namespace
my.my_function?   # Module function documentation
help(my)          # Module Help: notice private functions not listed
In [11]:

2.4 Anonymous Functions (lambda functions)

Python supports the creation of anonymous functions (i.e. functions that are not bound to a name) at runtime, using a construct called "lambda".

This piece of code shows the difference between a normal function definition f and a lambda function g:

In [12]:
def f(x):
    return x**2

g = lambda x: x**2
print(f(4), g(4))
16 16

Note that the lambda definition does not include a 'return' statement (it always contains an expression which is returned).

Also note that you can put a lambda definition anywhere a function is expected, and you don't have to assign it to a variable at all.
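
As a minimal sketch of a lambda used without ever being assigned to a name, the example below passes one directly as the key argument of the built-in sorted function. The list of words is made up for illustration.

```python
words = ['banana', 'Cherry', 'fig']

# Default sorting is case-sensitive: uppercase letters come first
print(sorted(words))                            # ['Cherry', 'banana', 'fig']

# The lambda is passed directly as the 'key' argument, without a name
print(sorted(words, key=lambda w: w.lower()))   # ['banana', 'Cherry', 'fig']
print(sorted(words, key=lambda w: len(w)))      # ['fig', 'banana', 'Cherry']
```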

Check the following code to get an idea of the typical usage of lambda functions: here we sanitize a list of strings by 'mapping' a function over it:

In [13]:
import re
states = [' Alabama ', 'Georgia!', '  ## Georgia', ' ? georgia', 'FlOrIda']
# Strip blanks first, remove the unwanted characters, then capitalize
clean = lambda s: re.sub('[!#?]', '', s.strip()).title()
for c in map(clean, states):
    print(c)

3 File I/O

3.1 Simple I/O

In Python it is very easy to work with files. Try this self-explanatory code yourself:

In [14]:
import os.path
path = os.path.join(os.path.curdir, "example_data", "my_input.txt")
ifile = open(path, 'r')
for l in ifile:          # ifile is an iterator
    print(l, end='')     # end='' suppresses the extra newline added by print
ifile.close()
First Second
10     0.32432
20  1.324
21 7.237923
36 .83298932
56        237.327823
In [15]:
# Read all the lines in a list
ifile = open(path, 'r')
lines = ifile.readlines()
ifile.close()
print(lines)
['First Second\n', '10     0.32432\n', '20  1.324\n', '21 7.237923\n', '36 .83298932\n', '56        237.327823\n']

Read a file, format and write back

In [16]:
ifile = open(path)         # 'read mode' is default
path_2 = os.path.join(os.path.curdir, "tmp", "my_input2.txt")
ofile = open(path_2, 'w')   # Open Output file in 'write mode'
for line in ifile:                   # Read ONE line at a time
    s = line.split()
    try:                             # Numeric rows: reformat both columns
        ofile.write('{:g} {:14.3e}\n'.format(float(s[0]), float(s[1])))
        print('{:g} {:14.3e}\n'.format(float(s[0]), float(s[1])), end='')
    except ValueError:               # Header row: copy the strings unchanged
        ofile.write('{0} {1}\n'.format(s[0], s[1]))
        print('{} {}\n'.format(s[0], s[1]))
ifile.close()
ofile.close()
# Notice: 'print' automatically adds a newline at the end of the string

First Second

10      3.243e-01
20      1.324e+00
21      7.238e+00
36      8.330e-01
56      2.373e+02

Whenever possible, use the "with" syntax: it closes the file automatically, even when an exception prevents the program flow from reaching the 'close' statements. This is also considered a "more pythonic" style.

In [17]:
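with open(path) as fid:
    for line in fid:
        s = line.split()
        try:                         # Numeric rows: reformat both columns
            print('{:g} {:14.3e}\n'.format(float(s[0]), float(s[1])), end='')
        except ValueError:           # Header row: print the strings unchanged
            print('{} {}\n'.format(s[0], s[1]))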
First Second

10      3.243e-01
20      1.324e+00
21      7.238e+00
36      8.330e-01
56      2.373e+02
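
The claim that "with" closes the file even when an exception occurs can be verified with a short sketch. It writes a throwaway file in a temporary directory; the file name demo.txt and its content are made up for illustration.

```python
import os
import tempfile

# Write a small two-column file in a temporary directory (made-up data)
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, 'demo.txt')
with open(demo_path, 'w') as fid:
    fid.write('10 0.5\n20 oops\n')

try:
    with open(demo_path) as fid:
        for line in fid:
            first, second = line.split()
            print(float(first), float(second))   # fails on the second line
except ValueError as err:
    print('Conversion failed:', err)

# The file object is closed even though the loop was interrupted
print(fid.closed)   # True
```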

3.2 Pickle / cPickle

This is the most common way to serialize any type of Python object and save it to disk. Mind that if you need to save and share complex, structured data, pickle is not the preferred method: consider using a dedicated file format such as HDF5 instead.

A Python pickle file is (and always has been) a byte stream. Which means that you should always open a pickle file in binary mode: “wb” to write it, and “rb” to read it. The Python docs contain correct example code.

See also Programming Python for absolute beginners: Chapter 7 Storing Complex Data on stackoverflow.

In [18]:
import pickle                                # in Python 3 cPickle doesn't exist anymore
ls = ['one', 'two', 'three']
with open('tmp/out_ascii.pkl', 'wb') as f:   # Can choose an arbitrary extension
    pickle.dump(ls, f, 0)                    # dump with protocol '0': readable ASCII
with open('tmp/out_compb.pkl', 'wb') as f:   # Can choose an arbitrary extension
    pickle.dump(ls, f, 2)                    # dump with protocol '2': binary format

with open('tmp/out_compb.pkl', 'rb') as f:
    d2 = pickle.load(f)                      # Protocol is automatically detected
print(d2)
['one', 'two', 'three']
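
The same round trip can be sketched in memory with pickle.dumps and pickle.loads, which makes it easy to see that a pickle is always a byte stream whatever the protocol. The dictionary used here is made-up example data.

```python
import pickle

data = {'name': 'example', 'values': [1, 2.5, 'three']}

ascii_bytes = pickle.dumps(data, protocol=0)    # oldest, text-like protocol
binary_bytes = pickle.dumps(data, protocol=2)   # binary protocol

# Both results are bytes objects, which is why files use 'wb' and 'rb'
print(type(ascii_bytes), type(binary_bytes))

# The protocol is detected automatically when loading
print(pickle.loads(ascii_bytes) == data)        # True
print(pickle.loads(binary_bytes) == data)       # True
```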

4 Operating System

4.1 General Info

Python gives you extensive possibilities to access your PC's operating system. There are three modules in the Python Standard Library that you must be aware of: sys, os and glob.

4.2 sys — System-specific parameters and functions

Try some example commands by running the following cells

In [19]:
import sys
# Platform identifier
sys.platform
In [20]:
# Version number of the Python interpreter
sys.version
'3.6.4 |Anaconda, Inc.| (default, Jan 16 2018, 18:10:19) \n[GCC 7.2.0]'
In [21]:
# PYTHONPATH: folders searched when looking for modules
for p in sys.path:
    print(p)
In [22]:
# Shows where the Python files are installed
# (sys.prefix is assumed here: the original command is missing)
sys.prefix
In [23]:
# Information about the float DataType
sys.float_info
sys.float_info(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308, min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307, dig=15, mant_dig=53, epsilon=2.220446049250313e-16, radix=2, rounds=1)
In [24]:
# In Python 2.x the largest (simple) positive integer was sys.maxint.
# In Python 3 integers have arbitrary precision and can exceed sys.maxsize. Example:
sys.maxsize + 1

In [25]:
# Maximum size that lists, strings, dicts and other containers can have
sys.maxsize

4.3 os — Miscellaneous operating system interfaces

Try some example commands by running the following cells

In [26]:
import os
for counter, osvariable in enumerate(os.environ):
    if counter >= 10:                # Show only the first ten variables
        print('AND MORE ...')
        break
    print('{:>25s}: {:s}'.format(osvariable, os.environ[osvariable][:64]))
else:                                # Runs only if the loop was not interrupted
    print('============ No more OS Variables ============')
                 XDG_VTNR: 7
                 LC_PAPER: it_IT.UTF-8
               LC_ADDRESS: it_IT.UTF-8
           XDG_SESSION_ID: c2
     XDG_GREETER_DATA_DIR: /var/lib/lightdm-data/matteo
              LC_MONETARY: it_IT.UTF-8
        CLUTTER_IM_MODULE: xim
                  SESSION: ubuntu
           GPG_AGENT_INFO: /home/matteo/.gnupg/S.gpg-agent:0:1
                     TERM: xterm-color
In [27]:
# How to check a system variable:
if 'NUMBER_OF_PROCESSORS' in os.environ:
    print('Number of processors in this machine:', os.environ['NUMBER_OF_PROCESSORS'])
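
Since NUMBER_OF_PROCESSORS is defined on Windows only, a more portable way to express the same check is sketched below. MY_DEMO_SETTING is a hypothetical variable name used only for illustration; os.environ.get and os.cpu_count are standard library calls.

```python
import os

# os.environ.get returns a default instead of raising KeyError
# ('MY_DEMO_SETTING' is a hypothetical variable name)
value = os.environ.get('MY_DEMO_SETTING', 'not set')
print('MY_DEMO_SETTING:', value)

# os.cpu_count() works on every platform and does not rely on the
# Windows-only NUMBER_OF_PROCESSORS variable (it may return None)
print('CPU count:', os.cpu_count())
```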
In [28]:
# Working directory
os.getcwd()
In [29]:
# List the files in the current directory
for filename in sorted(os.listdir(os.getcwd())):
    print(filename)

Difference between os.name and sys.platform:

  • sys.platform will distinguish between linux, other unixes, and OS X
  • os.name is "posix" for all of them
In [30]:
In [31]:
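# Correctly handle paths and filenames
# In Python 3 os.name can be 'posix', 'nt' or 'java'
if os.name == 'posix':
    full_path = "/Users/dani/"
elif os.name == 'nt':
    full_path = 'C:\\'
# The two calls below are inferred from the output shown
print(os.path.splitdrive(full_path))   # (drive, rest of the path)
print(os.path.split(full_path))        # (head, tail)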

('', '/Users/dani/')
('/Users/dani', '')
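
Instead of hard-coding separators for each operating system, paths can be built and taken apart with os.path, which picks the right separator for the platform. The folder and file names in this sketch are made up for illustration.

```python
import os.path

# os.path.join inserts the separator used by the current platform
demo_path = os.path.join('data', 'results', 'run_01.csv')

folder, filename = os.path.split(demo_path)       # directory part and file name
stem, extension = os.path.splitext(filename)      # name and extension

print(filename, stem, extension)                  # run_01.csv run_01 .csv
```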
In [32]:
if os.name == 'posix':

4.4 glob — Unix style pathname pattern expansion

Try some example commands by running the following cells

In [33]:
import glob
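
A minimal sketch of pattern matching with glob is shown below. It creates three throwaway files in a temporary directory (the file names are made up) and then selects only those matching '*.txt'.

```python
import glob
import os
import tempfile

# Create a few empty-looking demo files in a temporary directory
demo_dir = tempfile.mkdtemp()
for name in ('a.txt', 'b.txt', 'c.csv'):
    with open(os.path.join(demo_dir, name), 'w') as fid:
        fid.write('demo\n')

# '*' matches any sequence of characters, as in a Unix shell
txt_files = sorted(glob.glob(os.path.join(demo_dir, '*.txt')))
print([os.path.basename(p) for p in txt_files])   # ['a.txt', 'b.txt']
```

glob.glob returns the matches in arbitrary order, which is why the result is sorted before printing.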


This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.