#!/usr/bin/env python
# coding: utf-8

# #Pairinglist
# Some tools or processes generate output, usually in some form of CSV output which holds a line entry for a given type of object, and any number of associated properties, all of the same type. This can be viewed as a compressed form of a table describing a many to many relationship, sometimes called a [junction table](https://en.wikipedia.org/wiki/Junction_table) or as I prefer in the context of this example, a pairing table.
# ##Example
# Consider a list of people and the operating systems they work with. I have included the trailing empty entries that can occur in csv files.

# In[1]:


example = [
    ['Tom', 'Windows', 'Linux', ''],
    ['Dick', 'Windows', 'OSX', 'AIX'],
    ['Harry', 'Linux', 'BSD', '']
    ]


# ##2-dimensional table
# Although this might originally be stored somehwere as a pairing table already, in a database somewhere, if we do not control the source, or whatever mechanism generates the output, how do we extract the underlying structure? Let's say we want to end up with a representation like this:
# <table>
#   <tr>
#     <th></th>
#     <th>Windows</th>
#     <th>Linux</th>
#     <th>OSX</th>
#     <th>BSD</th>
#     <th>AIX</th>
#   </tr>
#   <tr>
#     <td>Tom</td>
#     <td>X</td>
#     <td>X</td>
#     <td></td>
#     <td></td>
#     <td></td>
#   </tr>
#   <tr>
#     <td>Dick</td>
#     <td>X</td>
#     <td></td>
#     <td>X</td>
#     <td></td>
#     <td>X</td>
#   </tr>
#   <tr>
#     <td>Harry</td>
#     <td></td>
#     <td>X</td>
#     <td></td>
#     <td>X</td>
#     <td></td>
#   </tr>
# </table>
# 
# Because the columns in the original data are not aligned we need to extract the original pairing table to enable a pivot transformation.

# In[2]:


def pairinglist(csvlist):
    # clear out the empty strings from csv conversion
    nonemptylist = map(lambda line: filter(None, line), csvlist)
    
    # generate all combinations as sublists of tuples
    pairings = map(lambda line: [(line[0], pairing) for pairing in line[1:]], nonemptylist)

    # flatten the sublists
    return [pairing for line in pairings for pairing in line]


# In[3]:


for entry in example:
    print entry


# In[4]:


for entry in pairinglist(example):
    print entry


# ##Visualizations
# The list of tuples generated by the function can be fed to a csvwriter, or some other tool. I tend to use Excel pivottables for this kind of data, but some python visualization could also be used.