#!/usr/bin/env python
# coding: utf-8

# ## DataFrame Creation
#
# In this notebook, we will learn to create a new ```DataFrame``` object from other data structures (e.g., NumPy arrays and dictionaries) and to convert a data frame to a NumPy array or dictionary. The default signature of the pandas ```DataFrame``` constructor is
#
# ```pandas.DataFrame(data=None, index=None, columns=None, dtype=None, copy=False)```
#
# (Since pandas 1.3, ```copy``` defaults to ```None``` rather than ```False```.)

# In[1]:


import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
get_ipython().run_line_magic('matplotlib', 'inline')
sns.set()


# #### 1. To create a new ```DataFrame``` from a NumPy array.
#
# Let's create a random array of size (100, 10) and random column names. We will use this array and these column names to create the ```DataFrame``` in the next step.

# In[7]:


import random

# 100 rows x 10 columns of uniform random values in [0, 1)
A = np.random.rand(100, 10)
letter = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'X']

def namer(n):
    """Return a list of n random four-letter names drawn from `letter`."""
    col_names = [random.choice(letter)
                 + random.choice(letter)
                 + random.choice(letter)
                 + random.choice(letter) for i in range(n)]
    return col_names


# In[8]:


print(namer(A.shape[1]))


# In[9]:


# namer() returns the names; keep them in a variable so they can be passed as columns
col_names = namer(A.shape[1])
df = pd.DataFrame(A, columns=col_names)
df.head()


# - To save the data from the new ```DataFrame``` to a file:

# In[19]:


# The 'data' directory must already exist
df.to_csv('data/test.csv')


# #### 2. To create a new ```DataFrame``` from a list of dictionaries.
#
# Here we will create a list of dictionaries, each with the same keys. Using this list of dictionaries, we will create another ```DataFrame```. The keys of the dictionaries will serve as the column names.

# In[18]:


LD = []
for i in range(100):
    # One dictionary per row: a random player name and five random scores
    LD.append({'Player': namer(1)[0],
               'game1': random.uniform(0, 1),
               'game2': random.uniform(0, 1),
               'game3': random.uniform(0, 1),
               'game4': random.uniform(0, 1),
               'game5': random.uniform(0, 1)})


# In[19]:


LD[0]


# In[20]:


DF = pd.DataFrame(LD)
# Use the player name as the row index instead of the default integer index
DF = DF.set_index("Player")


# In[21]:


DF.head(10)


# #### 3. To create a ```DataFrame``` from lists:

# In[26]:


A = [random.uniform(0, 1) for i in range(10)]
B = [random.uniform(0, 1) for i in range(10)]
C = [random.uniform(0, 1) for i in range(10)]
D = [random.uniform(0, 1) for i in range(10)]

# Start from an empty DataFrame and add each list as a column
df = pd.DataFrame()
df['A'], df['B'], df['C'], df['D'] = A, B, C, D
df.head()
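

# #### 4. To convert a ```DataFrame``` to a NumPy array or dictionary.
#
# The introduction mentions converting a data frame back to a NumPy array and a dictionary. The cell below is a minimal sketch of one way to do that, using ```DataFrame.to_numpy()``` and ```DataFrame.to_dict()```. The frame ```small_df``` and its values are hypothetical and only serve as an illustration; they are not part of the data created above.

# In[ ]:


import numpy as np
import pandas as pd

# Hypothetical example frame (illustrative names and values only)
small_df = pd.DataFrame({'game1': [0.1, 0.2], 'game2': [0.3, 0.4]},
                        index=['AAAA', 'BBBB'])

# 2-D NumPy array holding the values; index and column labels are dropped
arr = small_df.to_numpy()

# Default orientation: {column name: {index label: value}}
by_column = small_df.to_dict()

# orient='records': a list with one {column name: value} dictionary per row;
# the index labels are not included
by_row = small_df.to_dict(orient='records')


# The ```'records'``` orientation mirrors the list of dictionaries used in section 2, so it is likely the more convenient choice when the rows need to be processed one at a time.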