In [1]:
import time
print('Last updated: %s' %time.strftime('%d/%m/%Y'))
Last updated: 17/06/2014


I am really looking forward to your comments and suggestions to improve and
extend this little collection! Just send me a quick note
via Twitter: @rasbt
or Email: bluewoodtree@gmail.com


Day 11 - One Python Benchmark per Day

The deque container data type



The collections module in Python's standard library provides some convienient container datatypes that not only make your code more "Pythonic", but they also are optimized for performance for certain operations. Previously, we have seen how we can use the collections.Counter class (Day 3: 6 different ways for counting elements using a dictionary) to write short and efficient code to quantify items.
Today, we will take a look at another collections class: the deque container data type.

Sections





A short introduction to deques

Most of you might already be familiar with deques due to their popularity in C++ and might want to skip this short introduction.

For all the others: deque (pronounced "deck") is short for "double-ended queue" and thus it is sometimes also written as "dequeue". This abstract data structure is quite similar to an array (or Python list), which is optimized for adding and removing elements from either the back or front.
The deque class comes with methods that are already familiar from the Python list type, such as

Common methods for Python's list and deque objects

  • append(val)
    appends and element to the right
  • clear()
    removes all elements
  • count(val) returns the number of elements that are equal to val
  • extend(iterable)
    adds elements to the right from an iterable object
  • pop()
    removes and returns an element from the right end
  • reverse()
    reverses elements in-place; returns None
  • remove(val)
    removes first occurence of val

Methods unique to deque objects

  • appendleft(val)
    appends an element to the left
  • extendleft(iterable)
    adds elements to the left from an iterable object
  • popleft()
    removes and returns an element from the left end
  • rotate(n)
    Rotate the deque n steps to the right. If n is negative, rotate to the left. Rotating one step to the right is equivalent to: d.appendleft(d.pop()).
In [1]:
import collections
a = collections.deque([1,2,3])
a.rotate(2)
print(a)
deque([2, 3, 1])

Note that deques don't have equivalents to the Python list methods, such as index() for looking up values, or the .sort() method (although the sorted() function can be used as a workaround). If we want to get a slice from a deque, equivalent to my_list[0:3], we would need to use itertools, for example.

import itertools
list(itertools.islice(my_deque, 0, 3))



Benchmarking via timeit

In [3]:
import timeit
import collections
from copy import copy



labels = ['.append()', '.insert(0,x)\n.appendleft(x)', 
          '.pop()', '.pop(0)\n.popleft()', '.reverse()']
list_time, deque_time = [], []

n = 10000
l = [i for i in range(n)]
d = collections.deque([i for i in range(n)])

l_cp, d_cp = copy(l), copy(d)
list_time.append(min(timeit.Timer('l_cp.append(1)', 
            'from __main__ import l_cp').repeat(repeat=3, number=1000)))
deque_time.append(min(timeit.Timer('d_cp.append(1)', 
            'from __main__ import d_cp').repeat(repeat=3, number=1000)))

l_cp, d_cp = copy(l), copy(d)
list_time.append(min(timeit.Timer('l_cp.insert(0,1)', 
            'from __main__ import l_cp').repeat(repeat=3, number=1000)))
deque_time.append(min(timeit.Timer('d_cp.appendleft(1)', 
            'from __main__ import d_cp').repeat(repeat=3, number=1000)))

l_cp, d_cp = copy(l), copy(d)
list_time.append(min(timeit.Timer('l_cp.pop()', 
            'from __main__ import l_cp').repeat(repeat=3, number=1000)))
deque_time.append(min(timeit.Timer('d_cp.pop()', 
            'from __main__ import d_cp').repeat(repeat=3, number=1000)))

l_cp, d_cp = copy(l), copy(d)
list_time.append(min(timeit.Timer('l_cp.pop(0)', 
            'from __main__ import l_cp').repeat(repeat=3, number=1000)))
deque_time.append(min(timeit.Timer('d_cp.popleft()', 
            'from __main__ import d_cp').repeat(repeat=3, number=1000)))

l_cp, d_cp = copy(l), copy(d)
list_time.append(min(timeit.Timer('l_cp.reverse()', 
            'from __main__ import l_cp').repeat(repeat=3, number=1000)))
deque_time.append(min(timeit.Timer('d_cp.reverse()', 
            'from __main__ import d_cp').repeat(repeat=3, number=1000)))



Preparing to plot the results

In [4]:
import platform
import multiprocessing

def print_sysinfo():
    
    print('\nPython version  :', platform.python_version())
    print('compiler        :', platform.python_compiler())
    
    print('\nsystem     :', platform.system())
    print('release    :', platform.release())
    print('machine    :', platform.machine())
    print('processor  :', platform.processor())
    print('CPU count  :', multiprocessing.cpu_count())
    print('interpreter:', platform.architecture()[0])
    print('\n\n')
In [5]:
%matplotlib inline
In [13]:
import matplotlib.pyplot as plt

def plot():
    plt.rcParams.update({'font.size': 12})

    fig = plt.figure(figsize=(11,13))

    # Setting the positions and width for the bars
    pos = list(range(len(list_time))) 
    width = 0.2 
    
    # Plotting the bars
    fig, ax = plt.subplots(figsize=(8,6))

    plt.bar(pos, list_time, width,
                 alpha=0.5,
                 color='g',
                 label=labels[0])

    plt.bar([p + width for p in pos], deque_time, width,
                 alpha=0.5,
                 color='b',
                 label=labels[1])

    # Setting axis labels and ticks
    ax.set_ylabel('time in seconds\n(for 1,000 operations on starting size 10,000)')
    ax.set_title('Python list and collections.deque operations')
    ax.set_xticks([p + 1.5 * width for p in pos])
    ax.set_xticklabels(labels)

    # Setting the x-axis and y-axis limits
    plt.xlim(0-width, len(list_time))
    plt.ylim([0, max(list_time + deque_time) * 1.5])

    # Adding the legend and showing the plot
    plt.legend(['list', 'collections.deque'], loc='upper left')
    plt.grid()
    plt.show()



Results

In [14]:
plot()
print_sysinfo()
<matplotlib.figure.Figure at 0x1064c8ba8>
Python version  : 3.4.1
compiler        : GCC 4.2.1 (Apple Inc. build 5577)

system     : Darwin
release    : 13.2.0
machine    : x86_64
processor  : i386
CPU count  : 4
interpreter: 64bit



Conclusion

As we can see in the plot above, the performance of the shared methods between Python list and deque objects are about the same. However, as expected, the appendleft(x) and popleft(x) methods on deques are significantly faster than workaround-approaches on Python lists.
Thus, if our application requires a lot of adding and removing of items at both ends of the data structure, deques are definitely to be preferred over lists.