Before getting started, let’s do a quick refresher. Pandas is a Python library that provides high-performance, easy-to-use data structures and data analysis tools for the Python programming language. The official pandas documentation is a good way to deepen your understanding of the library, and it is well worth reading.
Note → If you are using Google Colab or any Jupyter notebook environment, then you can skip the Python installation step.
Let’s understand a scenario here and come up with a solution using Pandas DataFrame.
“Everybody shops for groceries, and most of us love it. We don’t always shop at the same store; we pick one based on discounts and offers (I do). So it’s good practice to keep track of shopping details such as the store name, location, amount, and date.”
To store and manipulate this data, I know you would normally use “Microsoft Excel”, but the catch here is that we will use a “Pandas DataFrame”, which is easy and fun to use. Don’t worry, I’ll show you how.
Don’t get me wrong, “Microsoft Excel” is great for this. But for all the Python enthusiasts out there, this is a good example for understanding Pandas and taking it to the next level.
Let’s solve the above scenario using a Pandas DataFrame. To do that, it helps to follow a few steps in order so that you avoid mistakes. I will lay out a systematic way of using Pandas that you can reuse in your upcoming projects.
Let’s use aliasing and import pandas as pd, so that in later steps we can write pd instead of typing pandas every time.
import pandas as pd
If you are not familiar with Python lists, please look them up before continuing. I currently live in Regina, Saskatchewan, Canada, so the grocery stores listed here may differ from the ones near you. It really does not matter; you can enter whatever data you like.
date = ['1-9-20', '3-9-20', '3-9-20', '6-9-20', '9-9-20']
storeName = ['Walmart', 'Real Canadian Superstore', 'Co-op Food Store', 'Sobeys', 'M&M Food Market']
storeLocation = ['Gordon Road', 'Albert Street', 'Albert Street', 'Quance Street', 'Gordon Street']
amount = [55.65, 21.62, 7.10, 15.56, 5.85]
By “for the first time” I mean that, if you keep grocery shopping in the future, you should not keep typing new values into the lists by hand. The lists would grow as long as a train, which is not recommended. I have written a way to handle this situation in a later section.
Now that all the values are stored in lists, let’s put them in a DataFrame. We can call pd.DataFrame() and pass it a dictionary that maps each column name to its list.
df = pd.DataFrame({'Date': date,
                   'Store Name': storeName,
                   'Store Location': storeLocation,
                   'Amount Purchased': amount})
df
| | Date | Store Name | Store Location | Amount Purchased |
|---|---|---|---|---|
| 0 | 1-9-20 | Walmart | Gordon Road | 55.65 |
| 1 | 3-9-20 | Real Canadian Superstore | Albert Street | 21.62 |
| 2 | 3-9-20 | Co-op Food Store | Albert Street | 7.10 |
| 3 | 6-9-20 | Sobeys | Quance Street | 15.56 |
| 4 | 9-9-20 | M&M Food Market | Gordon Street | 5.85 |
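The Date column above holds plain strings. If you later want to sort or group by date, one option is to convert it with pd.to_datetime(). The sketch below is only an illustration on a trimmed-down copy of the data; it assumes the dates are in day-month-year order, as the dd-mm-yy prompt later in this article suggests.

```python
import pandas as pd

# A trimmed-down copy of the shopping data, just for illustration
df = pd.DataFrame({
    'Date': ['1-9-20', '3-9-20', '3-9-20', '6-9-20', '9-9-20'],
    'Amount Purchased': [55.65, 21.62, 7.10, 15.56, 5.85],
})

# Spell out the format (assumed day-month-year); otherwise '1-9-20'
# could be read as January 9th instead of September 1st
df['Date'] = pd.to_datetime(df['Date'], format='%d-%m-%y')

print(df.dtypes)
print(df['Date'].dt.month.tolist())
```

With real datetime values in place, date-based operations such as sorting become available on that column.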
As I said, it’s not good practice to type every value into the lists by hand. There is a neater way to let the lists grow as new data arrives. First, let’s take the input from the user and store it in temporary variables as shown below:
dateNew = input("Enter the date in dd-mm-yy format ---> ")
storeNameNew = input("Enter the name of the store ---> ")
storeLocationNew = input("Enter the location of the store ---> ")
amountNew = float(input("Enter the total amount purchased ---> "))
So on the next day, 10-9-20, I went to the grocery store to shop for some spices with Gordon Ramsay. Just kidding, I went alone. Below are the details of the new purchase.
Enter the date in dd-mm-yy format ---> 10-9-20
Enter the name of the store ---> India Market
Enter the location of the store ---> Albert Street
Enter the total amount purchased ---> 24.68
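Typed input can easily contain a bad date or a non-numeric amount. As a rough sketch, the values could be checked before they are stored. The helper name parse_entry and its rules (a real dd-mm-yy date, a non-negative amount) are my own assumptions for illustration, not part of the original steps.

```python
from datetime import datetime


def parse_entry(date_text, store_name, store_location, amount_text):
    """Validate one shopping entry (hypothetical helper); raises ValueError on bad input."""
    # strptime raises ValueError if the text is not a real dd-mm-yy date
    datetime.strptime(date_text, '%d-%m-%y')
    # float raises ValueError if the text is not a number
    amount = float(amount_text)
    if amount < 0:
        raise ValueError('amount must not be negative')
    return date_text, store_name.strip(), store_location.strip(), amount


entry = parse_entry('10-9-20', 'India Market', 'Albert Street', '24.68')
print(entry)
```

In a notebook you would pass the strings returned by input() to this helper and only append the result if no error is raised.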
This is an obvious step, because we need to add the new data (shopping details) to the existing data. Since the data lives in lists, we can do this with Python’s list append() method.
date.append(dateNew)
storeName.append(storeNameNew)
storeLocation.append(storeLocationNew)
amount.append(amountNew)
This step is trivial: all we are doing is rebuilding the DataFrame from the updated lists and displaying it, just as in the “Creating a Pandas DataFrame to store all the list values” step shown above. Note that the last column is named Amount this time.
df = pd.DataFrame({'Date': date,
                   'Store Name': storeName,
                   'Store Location': storeLocation,
                   'Amount': amount})
df
| | Date | Store Name | Store Location | Amount |
|---|---|---|---|---|
| 0 | 1-9-20 | Walmart | Gordon Road | 55.65 |
| 1 | 3-9-20 | Real Canadian Superstore | Albert Street | 21.62 |
| 2 | 3-9-20 | Co-op Food Store | Albert Street | 7.10 |
| 3 | 6-9-20 | Sobeys | Quance Street | 15.56 |
| 4 | 9-9-20 | M&M Food Market | Gordon Street | 5.85 |
| 5 | 10-9-20 | India Market | Albert Street | 24.68 |
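Rebuilding the DataFrame from the lists works well. As an alternative, a new row can also be added to an existing DataFrame directly with pd.concat(). The sketch below is only an illustration on a shortened copy of the data; the variable name new_row is my own. Note that the older DataFrame.append() method was removed in pandas 2.0, so pd.concat() is the safer choice.

```python
import pandas as pd

# A shortened copy of the existing data, just for illustration
df = pd.DataFrame({
    'Date': ['6-9-20', '9-9-20'],
    'Store Name': ['Sobeys', 'M&M Food Market'],
    'Store Location': ['Quance Street', 'Gordon Street'],
    'Amount': [15.56, 5.85],
})

# The new purchase as a one-row DataFrame with the same columns
new_row = pd.DataFrame([{
    'Date': '10-9-20',
    'Store Name': 'India Market',
    'Store Location': 'Albert Street',
    'Amount': 24.68,
}])

# ignore_index=True renumbers the rows 0, 1, 2, ... after joining
df = pd.concat([df, new_row], ignore_index=True)
print(df)
```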
There you go, you have handled the scenario, and it was a piece of cake. You can keep adding more data (shopping details) and maintain it on a monthly or yearly basis.
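To give a feel for what a monthly summary could look like, here is a minimal sketch that groups the amounts by month. It assumes day-month-year dates, and the October row is made up purely so that the result has more than one month.

```python
import pandas as pd

df = pd.DataFrame({
    # The last row (2-10-20) is a made-up entry, for illustration only
    'Date': ['1-9-20', '3-9-20', '9-9-20', '2-10-20'],
    'Amount': [55.65, 21.62, 5.85, 10.00],
})

# Convert the strings to real dates (assumed day-month-year)
df['Date'] = pd.to_datetime(df['Date'], format='%d-%m-%y')

# Group by calendar month and add up the amounts
monthly = df.groupby(df['Date'].dt.to_period('M'))['Amount'].sum()
print(monthly)
```

Swapping 'M' for 'Y' in to_period() would give yearly totals instead.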
Shown below are some tips and tricks that you can try in your free time, because they are that easy to understand. Let’s get into it.
It’s always good practice to plot the data for better readability. Let’s plot the column Amount so that we can see how much we have spent so far. This can be done using df.plot.bar() as shown below:
Note: Always remember that df.plot.bar() only plots the numeric columns, which here means Amount.
df.plot.bar()
<matplotlib.axes._subplots.AxesSubplot at 0x7fef662459e8>
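By default the bars are labelled with the row index. If you would rather see the store names on the x-axis, you can pass the x and y column names to df.plot.bar(). This is only a sketch on a shortened copy of the data; the matplotlib.use('Agg') line is my own addition so that it also runs outside a notebook, and you can leave it out in Colab or Jupyter.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; not needed inside a notebook
import pandas as pd

# A shortened copy of the data, just for illustration
df = pd.DataFrame({
    'Store Name': ['Walmart', 'Sobeys', 'India Market'],
    'Amount': [55.65, 15.56, 24.68],
})

# One bar per row, labelled with the store name
ax = df.plot.bar(x='Store Name', y='Amount', legend=False)
ax.set_ylabel('Amount')

# The total spent so far, to go with the plot
total = df['Amount'].sum()
print(f'Total spent so far: {total:.2f}')
```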
Suppose we need to delete the last row because there is a mistake in the entry. To delete an entire row we can use drop() in Pandas as shown below:
df = df.drop([5])
df
| | Date | Store Name | Store Location | Amount |
|---|---|---|---|---|
| 0 | 1-9-20 | Walmart | Gordon Road | 55.65 |
| 1 | 3-9-20 | Real Canadian Superstore | Albert Street | 21.62 |
| 2 | 3-9-20 | Co-op Food Store | Albert Street | 7.10 |
| 3 | 6-9-20 | Sobeys | Quance Street | 15.56 |
| 4 | 9-9-20 | M&M Food Market | Gordon Street | 5.85 |
Note: Here the value 5 is the index value of the last row
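If you do not want to look up the index of the last row by hand, one possible approach is to ask the DataFrame for it with df.index[-1]. The sketch below, on a shortened copy of the data, also shows reset_index(drop=True), which renumbers the rows after a deletion leaves a gap.

```python
import pandas as pd

# A shortened copy of the data, just for illustration
df = pd.DataFrame({
    'Store Name': ['Walmart', 'Sobeys', 'India Market'],
    'Amount': [55.65, 15.56, 24.68],
})

# df.index[-1] is the label of the last row, whatever it happens to be
df = df.drop(df.index[-1])
print(df)

# Dropping the first row leaves the remaining row labelled 1;
# reset_index(drop=True) renumbers from 0 again
df = df.drop(0).reset_index(drop=True)
print(df)
```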
Deleting a column is similar to deleting a row: pass the column names to drop() along with axis = 1, as in df.drop([column_name], axis = 1). The axis = 1 is important here, because it tells drop() to look for the names among the columns rather than the row index. Suppose you need to delete the column Amount from the DataFrame; then all you need to say is:
df = df.drop(['Amount'], axis = 1)
df
| | Date | Store Name | Store Location |
|---|---|---|---|
| 0 | 1-9-20 | Walmart | Gordon Road |
| 1 | 3-9-20 | Real Canadian Superstore | Albert Street |
| 2 | 3-9-20 | Co-op Food Store | Albert Street |
| 3 | 6-9-20 | Sobeys | Quance Street |
| 4 | 9-9-20 | M&M Food Market | Gordon Street |
By executing this you can now see that the column Amount has now been dropped.
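The same result can also be written with the columns keyword of drop(), which some people find easier to read than axis = 1. A minimal sketch on a shortened copy of the data; the variable name trimmed is my own. It also shows that drop() returns a new DataFrame and leaves the original untouched unless you reassign it.

```python
import pandas as pd

# A shortened copy of the data, just for illustration
df = pd.DataFrame({
    'Date': ['1-9-20', '3-9-20'],
    'Store Name': ['Walmart', 'Sobeys'],
    'Amount': [55.65, 15.56],
})

# columns=[...] is equivalent to passing the names with axis=1
trimmed = df.drop(columns=['Amount'])

print(list(trimmed.columns))
print(list(df.columns))  # the original still has Amount
```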
Suppose we need to update a specific entry in the DataFrame. In this case, the row at index 4, with the store name M&M Food Market, has a mistake in its Store Location: it says Gordon Street and we need to correct it to Gordon Road. To do this, select the cell by row index and column name with df.loc, which avoids the chained-assignment pitfalls of df['Store Location'][4]:
df.loc[4, 'Store Location'] = "Gordon Road"
df
| | Date | Store Name | Store Location |
|---|---|---|---|
| 0 | 1-9-20 | Walmart | Gordon Road |
| 1 | 3-9-20 | Real Canadian Superstore | Albert Street |
| 2 | 3-9-20 | Co-op Food Store | Albert Street |
| 3 | 6-9-20 | Sobeys | Quance Street |
| 4 | 9-9-20 | M&M Food Market | Gordon Road |
We need to know the index of the particular entry in order to update it. After executing the above, we get the updated result.
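If you do not know the index, one possible approach is to find the row by its contents instead. The sketch below, on a shortened copy of the data, builds a boolean mask on the store name and uses df.loc to update every matching row; the variable name mask is my own.

```python
import pandas as pd

# A shortened copy of the data, just for illustration
df = pd.DataFrame({
    'Store Name': ['Walmart', 'M&M Food Market'],
    'Store Location': ['Gordon Road', 'Gordon Street'],
})

# True for every row whose store name matches
mask = df['Store Name'] == 'M&M Food Market'

# Update the Store Location only in the matching rows
df.loc[mask, 'Store Location'] = 'Gordon Road'
print(df)
```

Keep in mind that this changes all rows that match, so it is worth checking the mask first if a store appears more than once.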
Well, congratulations, you have successfully completed reading and implementing this article, “Using the Pandas DataFrame in Day-To-Day Life”. This is not the end: there are many other DataFrame methods and functions that can take this to the next level. I have covered only the basics here. If you find something new or creative, share it in the comments below. I hope you have learned something new today. Stay tuned for more updates, and until then, see you next time. Bye! Have a good day and stay safe!