Spreadsheets: What most people are familiar with
| | Name | Landmass | Zone | Area | Population | Language | Religion | Bars | Stripes | Colors | ... | Saltires | Quarters | Sunstars | Crescent | Triangle | Icon | Animate | Text | Topleft | Botright |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Afghanistan | 5 | 1 | 648 | 16 | 10 | 2 | 0 | 3 | 5 | ... | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | black | green |
1 | Albania | 3 | 1 | 29 | 3 | 6 | 6 | 0 | 0 | 3 | ... | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | red | red |
2 | Algeria | 4 | 1 | 2388 | 20 | 8 | 2 | 2 | 0 | 3 | ... | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | green | white |
3 | American-Samoa | 6 | 3 | 0 | 0 | 1 | 1 | 0 | 0 | 5 | ... | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | blue | red |
4 | Andorra | 3 | 1 | 0 | 0 | 6 | 0 | 3 | 0 | 3 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | blue | red |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
189 | Western-Samoa | 6 | 3 | 3 | 0 | 1 | 1 | 0 | 0 | 3 | ... | 0 | 1 | 5 | 0 | 0 | 0 | 0 | 0 | blue | red |
190 | Yugoslavia | 3 | 1 | 256 | 22 | 6 | 6 | 0 | 3 | 4 | ... | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | blue | red |
191 | Zaire | 4 | 2 | 905 | 28 | 10 | 5 | 0 | 0 | 4 | ... | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | green | green |
192 | Zambia | 4 | 2 | 753 | 6 | 10 | 5 | 3 | 0 | 4 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | green | brown |
193 | Zimbabwe | 4 | 2 | 391 | 8 | 10 | 5 | 0 | 7 | 5 | ... | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 0 | green | green |
194 rows × 30 columns
Also what computers are familiar with. What the computer understands as a table...
| | pixel_0_0 | pixel_0_1 | pixel_0_2 | pixel_0_3 | pixel_0_4 | pixel_0_5 | pixel_0_6 | pixel_0_7 | pixel_1_0 | pixel_1_1 | ... | pixel_6_6 | pixel_6_7 | pixel_7_0 | pixel_7_1 | pixel_7_2 | pixel_7_3 | pixel_7_4 | pixel_7_5 | pixel_7_6 | pixel_7_7 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0.0 | 0.0 | 5.0 | 13.0 | 9.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 6.0 | 13.0 | 10.0 | 0.0 | 0.0 | 0.0 |
1 | 0.0 | 0.0 | 0.0 | 12.0 | 13.0 | 5.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 11.0 | 16.0 | 10.0 | 0.0 | 0.0 |
2 | 0.0 | 0.0 | 0.0 | 4.0 | 15.0 | 12.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 5.0 | 0.0 | 0.0 | 0.0 | 0.0 | 3.0 | 11.0 | 16.0 | 9.0 | 0.0 |
3 | 0.0 | 0.0 | 7.0 | 15.0 | 13.0 | 1.0 | 0.0 | 0.0 | 0.0 | 8.0 | ... | 9.0 | 0.0 | 0.0 | 0.0 | 7.0 | 13.0 | 13.0 | 9.0 | 0.0 | 0.0 |
4 | 0.0 | 0.0 | 0.0 | 1.0 | 11.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.0 | 16.0 | 4.0 | 0.0 | 0.0 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
1792 | 0.0 | 0.0 | 4.0 | 10.0 | 13.0 | 6.0 | 0.0 | 0.0 | 0.0 | 1.0 | ... | 4.0 | 0.0 | 0.0 | 0.0 | 2.0 | 14.0 | 15.0 | 9.0 | 0.0 | 0.0 |
1793 | 0.0 | 0.0 | 6.0 | 16.0 | 13.0 | 11.0 | 1.0 | 0.0 | 0.0 | 0.0 | ... | 1.0 | 0.0 | 0.0 | 0.0 | 6.0 | 16.0 | 14.0 | 6.0 | 0.0 | 0.0 |
1794 | 0.0 | 0.0 | 1.0 | 11.0 | 15.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 2.0 | 9.0 | 13.0 | 6.0 | 0.0 | 0.0 |
1795 | 0.0 | 0.0 | 2.0 | 10.0 | 7.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 2.0 | 0.0 | 0.0 | 0.0 | 5.0 | 12.0 | 16.0 | 12.0 | 0.0 | 0.0 |
1796 | 0.0 | 0.0 | 10.0 | 14.0 | 8.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | ... | 8.0 | 0.0 | 0.0 | 1.0 | 8.0 | 12.0 | 14.0 | 12.0 | 1.0 | 0.0 |
1797 rows × 64 columns
... we can recognize as images.
Example taken from Plotly Online Documentation. https://plotly.com/python/imshow/#displaying-an-image-and-the-histogram-of-color-values
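Going from a 64-column row to a picture is only a matter of reshaping. The sketch below assumes the 8 × 8 layout implied by the column names `pixel_0_0` through `pixel_7_7` and a brightness scale of 0 to 16 (the largest value visible in the table). The pixel row is a made-up example, not a row from the table, and `row_to_image` and `render` are illustrative names.

```python
def row_to_image(row, width=8):
    """Split a flat list of pixel values into rows of `width` pixels."""
    if len(row) % width != 0:
        raise ValueError("row length must be a multiple of width")
    return [row[i:i + width] for i in range(0, len(row), width)]


def render(image, max_value=16.0, shades=" .:*#"):
    """Draw each pixel as a character; brighter values get denser characters."""
    lines = []
    for pixels in image:
        chars = []
        for value in pixels:
            level = int(value / max_value * (len(shades) - 1))
            level = max(0, min(level, len(shades) - 1))  # clamp out-of-range values
            chars.append(shades[level])
        lines.append("".join(chars))
    return "\n".join(lines)


# Hypothetical 64-value row (a rough vertical stroke), not taken from the table.
row = [0.0, 0.0, 0.0, 12.0, 16.0, 0.0, 0.0, 0.0] * 8

image = row_to_image(row)
print(render(image))
```

The same row is a spreadsheet record to the computer and a small picture to us; only the arrangement of the 64 numbers changes.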
What we understand as the lyrics of Bruno Mars' Versace on the Floor.. Let's take our time tonight, girl Above us all the stars are watchin' There's no place I'd rather be in this world Your eyes are where I'm lost in Underneath the chandelier We're dancin' all alone There's no reason to hide What we're feelin' inside Right now So, baby, let's just turn down the lights and close the door Ooh, I love that dress, but you won't need it anymore No, you won't need it no more Let's just kiss 'til we're naked, baby Versace on the floor Ooh, take it off for me, for me, for me, for me now, girl Versace on the floor Ooh, take it off for me, for me, for me, for me now, girl I'll unzip the back and watch it fall While I kiss your neck and shoulders No, don't be afraid to show it all I'll be right here ready to hold you Girl, you know you're perfect from Your head down to your heels Don't be confused by my smile 'Cause I ain't ever been more for real, for real So just turn down the lights And close the door Ooh, I love that dress, but you won't need it anymore No, you won't need it no more Let's just kiss 'til we're naked, baby Versace on the floor Ooh, take it off for me, for me, for me, for me now, girl Versace on the floor Ooh, take it off for me, for me, for me, for me now, girl Dance It's warmin' up Can you feel it? It's warmin' up Can you feel it? It's warmin' up Can you feel it, baby? It's warmin' up Oh, seems like you're ready for more, more, more Let's just kiss 'til we're naked Versace on the floor Hey, baby Take it off for me, for me, for me, for me now, girl Versace on the floor Ooh, take it off for me, for me, for me, for me now, girl Versace on the floor Floor Floor
The computer can only understand them as a row of a spreadsheet.
| | floor | girl | versace | ooh | take | let | baby | kiss | need | warmin | ... | head | heel | hey | hide | hold | inside | know | alone | lose | world |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 9 | 8 | 7 | 7 | 7 | 5 | 5 | 4 | 4 | 4 | ... | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
1 rows × 61 columns
Please enter three sentences:
Please take your COVID vaccination. It doesn't matter which brand. You are protecting yourself and others against COVID.
| | covid | brand | matter | others | please | protect | take | vaccination |
---|---|---|---|---|---|---|---|---|
0 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
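The row above is a word count (a "bag of words"). Here is a minimal sketch of how the three sentences could be turned into those counts. The stop-word list and the one-entry lemma table are assumptions sized to this example; the source does not say which text-processing library produced the table.

```python
import re
from collections import Counter

# Assumed stop-word list and lemma table, just large enough for this example.
STOP_WORDS = {"your", "it", "doesn't", "which", "you", "are",
              "yourself", "and", "against"}
LEMMAS = {"protecting": "protect"}


def bag_of_words(text):
    """Lowercase, tokenize, drop stop words, lemmatize, then count."""
    tokens = re.findall(r"[a-z']+", text.lower())
    kept = [LEMMAS.get(t, t) for t in tokens if t not in STOP_WORDS]
    return Counter(kept)


text = ("Please take your COVID vaccination. It doesn't matter which brand. "
        "You are protecting yourself and others against COVID.")
counts = bag_of_words(text)
for word, n in counts.most_common():
    print(word, n)
```

Each distinct word becomes a column and each document becomes a row, which is how free text ends up in spreadsheet form.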
How about sound?
| | Fundamental | 1st Harmonic | 2nd Harmonic | 3rd Harmonic | 4th Harmonic |
---|---|---|---|---|---|
Frequency | 440.0 | 880 | 1320 | 1760.0 | 2200.0 |
Amplitude | 6.5 | 5 | 3 | 2.4 | 1.0 |
Phase Offset | 0.0 | 1 | 4 | 3.0 | 2.1 |
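To show that the table really does describe a sound, the sketch below adds up one sine wave per column. Reading the phase offset as radians, the 44100 Hz sample rate, and the names `sample_at` and `synthesize` are assumptions made for this example.

```python
import math

# (frequency in Hz, amplitude, phase offset) for each column of the table.
# Treating the phase offset as radians is an assumption.
PARTIALS = [
    (440.0, 6.5, 0.0),
    (880.0, 5.0, 1.0),
    (1320.0, 3.0, 4.0),
    (1760.0, 2.4, 3.0),
    (2200.0, 1.0, 2.1),
]


def sample_at(t, partials=PARTIALS):
    """Sum the sine components at time t (seconds)."""
    return sum(a * math.sin(2 * math.pi * f * t + phase)
               for f, a, phase in partials)


def synthesize(duration=0.01, sample_rate=44100):
    """Return a list of samples covering `duration` seconds."""
    n = round(duration * sample_rate)
    return [sample_at(i / sample_rate) for i in range(n)]


samples = synthesize()
print(len(samples), "samples, first value", round(samples[0], 3))
```

Fifteen numbers in a table are enough to regenerate the whole waveform at any length.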
DUT means Device Under Test
| | PIN 1 to GND Contact Test | PIN 2 to GND Contact Test | PIN 3 to GND Contact Test | PIN 4 to GND Contact Test | PIN 5 to GND Contact Test | PIN 6 to GND Contact Test | PIN 7 to GND Contact Test | PIN 8 to GND Contact Test | Passed/Failed |
---|---|---|---|---|---|---|---|---|---|
DUT_1 | 5.0001 | 5.0004 | 5.0004 | 5.0004 | 5.0004 | 5.00040 | 5.0004 | 5.0004 | Passed |
DUT_2 | 5.0001 | 5.0004 | 5.0004 | 5.0004 | 5.0004 | 5.00040 | 5.0004 | 5.0004 | Passed |
DUT_3 | 5.0001 | 5.0004 | 5.0004 | 5.0004 | 5.0004 | 5.00040 | 5.0004 | 5.0004 | Passed |
DUT_4 | 5.0001 | 5.0004 | 5.0004 | 5.0004 | 5.0004 | 5.00040 | 5.0004 | 5.0004 | Passed |
DUT_5 | 5.0001 | 5.0004 | 5.0004 | 5.0004 | 5.0004 | 0.13234 | 5.0004 | 5.0004 | Failed |
DUT_6 | 5.0001 | 5.0004 | 5.0004 | 5.0004 | 5.0004 | 5.00040 | 5.0004 | 5.0004 | Passed |
DUT_7 | 5.0001 | 5.0004 | 5.0004 | 5.0004 | 5.0004 | 5.00040 | 5.0004 | 5.0004 | Passed |
DUT_8 | 5.0001 | 5.0004 | 5.0004 | 5.0004 | 5.0004 | 5.00040 | 5.0004 | 5.0004 | Passed |
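The Passed/Failed column can be derived from the readings with a simple limit check. The limits below (4.9 to 5.1) are an assumption, since the pass criteria are not stated; `failing_pins` and `verdict` are illustrative names.

```python
# Assumed limits: the source does not state the pass criteria.
LOWER_LIMIT = 4.9
UPPER_LIMIT = 5.1

# Readings for two devices, copied from the table (PIN 1 to PIN 8).
readings = {
    "DUT_4": [5.0001, 5.0004, 5.0004, 5.0004, 5.0004, 5.0004, 5.0004, 5.0004],
    "DUT_5": [5.0001, 5.0004, 5.0004, 5.0004, 5.0004, 0.13234, 5.0004, 5.0004],
}


def failing_pins(values, low=LOWER_LIMIT, high=UPPER_LIMIT):
    """Return the 1-based pin numbers whose reading is outside the limits."""
    return [pin for pin, v in enumerate(values, start=1) if not low <= v <= high]


def verdict(values):
    """A device passes only if no pin is out of limits."""
    return "Failed" if failing_pins(values) else "Passed"


for name, values in readings.items():
    print(name, verdict(values), failing_pins(values))
```

Keeping the failing pin numbers alongside the verdict makes it easy to see that DUT_5 fails because of PIN 6 alone.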
All data ends up becoming a spreadsheet.
Inspect the condition and make corrections
Are there missing values?
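One way to answer the question is to compare each column's non-null count with the total number of rows. The sketch below uses a few of the counts reported in the column listing under "Correct the data formats" (59946 rows in total); `missing_report` is an illustrative name.

```python
TOTAL_ROWS = 59946

# Non-null counts for a few columns, copied from the column listing.
non_null = {"age": 59946, "height": 59943, "diet": 35551,
            "offspring": 24385, "income": 11504}


def missing_report(non_null_counts, total_rows):
    """Return {column: (missing count, missing share)} sorted worst first."""
    report = {col: (total_rows - n, (total_rows - n) / total_rows)
              for col, n in non_null_counts.items()}
    return dict(sorted(report.items(), key=lambda kv: kv[1][0], reverse=True))


report = missing_report(non_null, TOTAL_ROWS)
for col, (count, share) in report.items():
    print(f"{col:<10} {count:>6} missing ({share:.1%})")
```

Sorting worst first puts the columns that need a decision (drop, impute, or keep as is) at the top.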
Correct the data formats
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 59946 entries, 0 to 59945
Data columns (total 31 columns):
# | Column | Non-Null Count | Dtype |
---|---|---|---|
0 | age | 59946 non-null | int64 |
1 | body_type | 54650 non-null | object |
2 | diet | 35551 non-null | object |
3 | drinks | 56961 non-null | object |
4 | drugs | 45866 non-null | object |
5 | education | 53318 non-null | object |
6 | essay0 | 54458 non-null | object |
7 | essay1 | 52374 non-null | object |
8 | essay2 | 50308 non-null | object |
9 | essay3 | 48470 non-null | object |
10 | essay4 | 49409 non-null | object |
11 | essay5 | 49096 non-null | object |
12 | essay6 | 46175 non-null | object |
13 | essay7 | 47495 non-null | object |
14 | essay8 | 40721 non-null | object |
15 | essay9 | 47343 non-null | object |
16 | ethnicity | 54266 non-null | object |
17 | height | 59943 non-null | float64 |
18 | income | 11504 non-null | float64 |
19 | job | 51748 non-null | object |
20 | last_online | 59946 non-null | object |
21 | location | 59946 non-null | object |
22 | offspring | 24385 non-null | object |
23 | orientation | 59946 non-null | object |
24 | pets | 40025 non-null | object |
25 | religion | 39720 non-null | object |
26 | sex | 59946 non-null | object |
27 | sign | 48890 non-null | object |
28 | smokes | 54434 non-null | object |
29 | speaks | 59896 non-null | object |
30 | status | 59946 non-null | object |
dtypes: float64(2), int64(1), object(28)
memory usage: 14.2+ MB
| | last_online |
---|---|
0 | 2012-06-28-20-30 |
1 | 2012-06-29-21-41 |
2 | 2012-06-27-09-10 |
3 | 2012-06-28-14-22 |
4 | 2012-06-27-21-26 |
Name: last_online, dtype: object
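The `last_online` column is stored as plain text (dtype `object`), so it cannot be sorted or subtracted as a date until it is converted. Here is a minimal sketch of the conversion using the five values shown. The year-month-day-hour-minute layout is inferred from those values and is an assumption.

```python
from datetime import datetime

# The five values shown for the last_online column.
last_online = [
    "2012-06-28-20-30",
    "2012-06-29-21-41",
    "2012-06-27-09-10",
    "2012-06-28-14-22",
    "2012-06-27-21-26",
]

# Assumed layout: year-month-day-hour-minute.
FORMAT = "%Y-%m-%d-%H-%M"

parsed = [datetime.strptime(value, FORMAT) for value in last_online]

# With real datetimes, comparisons and arithmetic work as expected.
latest = max(parsed)
span = latest - min(parsed)
print("latest:", latest.isoformat(), "span:", span)
```

A value that does not match the format raises `ValueError`, which is a useful early warning that the column is less uniform than it looks.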
Are there duplicates?
(5824, 4)
| | category | scientific_name | common_names | conservation_status |
---|---|---|---|---|
8 | Mammal | Canis lupus | Gray Wolf | Endangered |
3020 | Mammal | Canis lupus | Gray Wolf, Wolf | In Recovery |
4448 | Mammal | Canis lupus | Gray Wolf, Wolf | Endangered |
29 | Mammal | Eptesicus fuscus | Big Brown Bat | Species of Concern |
3035 | Mammal | Eptesicus fuscus | Big Brown Bat, Big Brown Bat | Species of Concern |
3150 | Bird | Gavia immer | Common Loon, Great Northern Diver, Great North... | Species of Concern |
172 | Bird | Gavia immer | Common Loon | Species of Concern |
30 | Mammal | Lasionycteris noctivagans | Silver-Haired Bat | Species of Concern |
3037 | Mammal | Lasionycteris noctivagans | Silver-Haired Bat, Silver-Haired Bat | Species of Concern |
4465 | Mammal | Myotis californicus | California Myotis | Species of Concern |
3039 | Mammal | Myotis californicus | California Myotis, California Myotis, Californ... | Species of Concern |
4467 | Mammal | Myotis lucifugus | Little Brown Myotis | Species of Concern |
3042 | Mammal | Myotis lucifugus | Little Brown Bat, Little Brown Myotis, Little ... | Species of Concern |
37 | Mammal | Myotis lucifugus | Little Brown Bat, Little Brown Myotis | Species of Concern |
337 | Bird | Nycticorax nycticorax | Black-Crowned Night-Heron | Species of Concern |
4564 | Bird | Nycticorax nycticorax | Black-Crowned Night Heron | Species of Concern |
3283 | Fish | Oncorhynchus mykiss | Rainbow Trout | Threatened |
3081 | Bird | Pandion haliaetus | Osprey, Western Osprey | Species of Concern |
104 | Bird | Pandion haliaetus | Osprey | Species of Concern |
226 | Bird | Riparia riparia | Bank Swallow | Species of Concern |
3185 | Bird | Riparia riparia | Bank Swallow, Sand Martin | Species of Concern |
3029 | Mammal | Taxidea taxus | American Badger, Badger | Species of Concern |
4457 | Mammal | Taxidea taxus | Badger | Species of Concern |
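Rows like these can be found by grouping on the column that should be unique. The sketch below uses six rows from the table and groups them by `scientific_name`; `find_duplicates` is an illustrative name.

```python
from collections import defaultdict

# (row index, scientific_name, conservation_status), copied from the table.
rows = [
    (8, "Canis lupus", "Endangered"),
    (3020, "Canis lupus", "In Recovery"),
    (4448, "Canis lupus", "Endangered"),
    (3283, "Oncorhynchus mykiss", "Threatened"),
    (3081, "Pandion haliaetus", "Species of Concern"),
    (104, "Pandion haliaetus", "Species of Concern"),
]


def find_duplicates(rows):
    """Map each scientific name that occurs more than once to its row indexes."""
    by_name = defaultdict(list)
    for index, name, _status in rows:
        by_name[name].append(index)
    return {name: idx for name, idx in by_name.items() if len(idx) > 1}


duplicates = find_duplicates(rows)
print(duplicates)
```

Finding the duplicates is the easy part. The Canis lupus rows disagree on conservation status, so deciding which row to keep takes domain knowledge rather than code.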
What if you have thousands of rows and hundreds of columns? How do you summarize it?
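A common first answer is a handful of descriptive statistics per column. As a small sketch, the code below summarizes the Area values of the ten flag rows visible in the first table; `summarize` is an illustrative name, and a real analysis would use every row.

```python
import statistics

# Area values of the ten flag rows visible in the first table.
area = [648, 29, 2388, 0, 0, 3, 256, 905, 753, 391]


def summarize(values):
    """Return a small dictionary of descriptive statistics."""
    return {
        "count": len(values),
        "min": min(values),
        "median": statistics.median(values),
        "mean": statistics.mean(values),
        "max": max(values),
        "stdev": statistics.stdev(values),
    }


summary = summarize(area)
for name, value in summary.items():
    print(f"{name:<6} {value}")
```

The mean sits well above the median here, a quick hint that a few large values dominate the column.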
Start comparing different groups, and investigate relationships and associations.
The mean difference is: -19.11905597473242
The median difference is: -19.0
pval = ttest_ind(thalach_hd, thalach_no_hd)
P-value is 3.456964908430172e-14. This is lower than the 5% threshold, so the difference is statistically significant.
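The comparison above can be sketched with the standard library alone. The example swaps the t-test for a permutation test, a different technique that answers the same question: how often would a difference this large appear if the group labels were shuffled at random? The two groups are hypothetical heart-rate values, not the source's data, and `permutation_p_value` is an illustrative name.

```python
import random
import statistics

# Hypothetical maximum heart rates for two groups (not the source's data).
with_hd = [120, 125, 130, 132, 140, 142, 150, 155]
without_hd = [150, 155, 160, 162, 165, 170, 172, 180]


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(a, b, n_shuffles=10_000, seed=0):
    """Two-sided permutation test on the difference of means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_shuffles + 1)


mean_diff = mean(with_hd) - mean(without_hd)
median_diff = statistics.median(with_hd) - statistics.median(without_hd)
p_value = permutation_p_value(with_hd, without_hd)
print("mean difference:", mean_diff)
print("median difference:", median_diff)
print("significant at 5%:", p_value < 0.05)
```

The decision rule is the same as in the output above: a p-value below the 5% threshold is read as a statistically significant difference.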
Groups and Subgroups: Stratification
OKCupid Data Gender Classifier
Regression Example
Clustering Example
If your organization has a lot of data, you can harness the power of artificial intelligence.
Cambridge Analytica. Psychometrics
| | Openness | Conscientiousness | Extraversion | Agreeableness | Neuroticism | Age | Sex | Photos_Qty | Will_vote_for |
---|---|---|---|---|---|---|---|---|---|
Person_1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Duterte |
Person_2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Roxas |
Person_3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Duterte |
Person_4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Roxas |
Person_5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Duterte |
Person_6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Roxas |
Person_7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Duterte |
Person_8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Roxas |
Qualifying for a Loan
| | Age | Sex | Marital Status | Age of Spouse | Occupation | Educational Attainment | Income | Has a Car? | Number of Dependents | Will_fail_to_pay_loan |
---|---|---|---|---|---|---|---|---|---|---|
Customer_1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 20% Chance |
Customer_2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 60% Chance |
Customer_3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 20% Chance |
Customer_4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 60% Chance |
Customer_5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 20% Chance |
Customer_6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 60% Chance |
Customer_7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 20% Chance |
Customer_8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 60% Chance |
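A "% Chance" column like this is typically the output of a model that turns a row of features into a probability. The sketch below uses a logistic model as one possible choice. The weights, the bias, and the names `default_probability` and `as_label` are all hypothetical; a real model would learn its weights from historical loans.

```python
import math

# Hypothetical weights for three of the table's columns; a real model would
# learn these from historical loans.
WEIGHTS = {"Income": -0.8, "Number of Dependents": 0.3, "Has a Car?": -0.2}
BIAS = math.log(0.25)  # chosen so that all-zero features give 20%


def default_probability(features):
    """Logistic model: squash a weighted sum into the range 0 to 1."""
    score = BIAS + sum(WEIGHTS[name] * features.get(name, 0.0) for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-score))


def as_label(probability):
    """Format a probability the way the table shows it."""
    return f"{probability:.0%} Chance"


customer = {"Income": 0, "Number of Dependents": 0, "Has a Car?": 0}
print(as_label(default_probability(customer)))
```

With these made-up weights, more dependents raise the predicted chance of default and a higher income lowers it. The direction and size of each effect are exactly what training on real data would determine.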
Education
| | Algebra | Calculus | Vectors | Analytical Geometry | Differential Equations | Feedback and Control Systems | Numerical Methods | Solid Mensuration | Will_fail_Advanced_Math |
---|---|---|---|---|---|---|---|---|---|
Student_1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 30% Chance |
Student_2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 60% Chance |
Student_3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 30% Chance |
Student_4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 60% Chance |
Student_5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 30% Chance |
Student_6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 60% Chance |
Student_7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 30% Chance |
Student_8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 60% Chance |
About Me
Alexander Lacson
Data Scientist with a background in Electronics
Contact
lacsonalexanderz@gmail.com
Portfolio Projects