Create a function that receives two inputs, a and b, and returns the product of the a-th and b-th decimal digits of pi.
For example:
pi = 3.14159
if a = 2 and b = 4
result = 4 * 5
result = 20
Caveats:
from math import pi

def mult_dec_pi(a, b):
    # Add the solution here
    result = ''
    return result
mult_dec_pi(a=2, b=4)
# 20.0
mult_dec_pi(a=5, b=10)
# 45.0
mult_dec_pi(a=14, b=1)
# 9.0
mult_dec_pi(a=6, b=8)
# 10.0
# Bonus
mult_dec_pi(a=16, b=4)
# 'Error'
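One possible way to approach the exercise is sketched below. It is not the only solution: it assumes the decimals are read from the string representation of `math.pi` (which exposes 15 decimal digits in CPython), and it treats any position outside that range as the bonus `'Error'` case.

```python
from math import pi


def mult_dec_pi(a, b):
    """Multiply the a-th and b-th decimal digits of pi (1-based positions)."""
    # Assumption: take the digits after the decimal point from repr(pi),
    # i.e. '141592653589793' for the float value of math.pi.
    decimals = repr(pi).split('.')[1]

    # Positions outside the available digits are reported as 'Error'.
    if not (1 <= a <= len(decimals) and 1 <= b <= len(decimals)):
        return 'Error'

    # Convert each digit character to an int and return the product as a float,
    # matching the float outputs listed in the examples.
    return float(int(decimals[a - 1]) * int(decimals[b - 1]))


print(mult_dec_pi(a=2, b=4))
print(mult_dec_pi(a=16, b=4))
```

Reading digits from the string avoids floating-point rounding that can appear when extracting digits arithmetically (for example with repeated multiplication by 10).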
Using the given dataset, estimate a linear regression of Employed on GNP.
$$Employed = b_0 + b_1 \cdot GNP$$
$$\hat b = (X^TX)^{-1}X^TY$$
$$Y = Employed$$
$$X = [1 \quad GNP]$$
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
# Import data
raw_data = """
Year,Employed,GNP
1947,60.323,234.289
1948,61.122,259.426
1949,60.171,258.054
1950,61.187,284.599
1951,63.221,328.975
1952,63.639,346.999
1953,64.989,365.385
1954,63.761,363.112
1955,66.019,397.469
1956,67.857,419.18
1957,68.169,442.769
1958,66.513,444.546
1959,68.655,482.704
1960,69.564,502.601
1961,69.331,518.173
1962,70.551,554.894"""
data = []
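# Skip the leading empty line and the header row
for line in raw_data.splitlines()[2:]:
    words = line.split(',')
    data.append(words)
# np.float was removed in NumPy 1.24; the built-in float is equivalent
data = np.array(data, dtype=float)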
n_obs = data.shape[0]
plt.plot(data[:, 2], data[:, 1], 'bo')
plt.xlabel("GNP")
plt.ylabel("Employed")
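The estimator $\hat b = (X^TX)^{-1}X^TY$ can be computed directly with NumPy. The following is a minimal sketch, not a reference solution: the arrays repeat the Employed and GNP columns of the dataset above so the block runs on its own, and the names `employed`, `gnp`, `b_hat` and `b_check` are illustrative. It solves the normal equations with `np.linalg.solve` rather than forming the inverse explicitly, and compares the result with `np.linalg.lstsq`.

```python
import numpy as np

# Employed and GNP columns copied from the dataset above (1947-1962).
employed = np.array([60.323, 61.122, 60.171, 61.187, 63.221, 63.639,
                     64.989, 63.761, 66.019, 67.857, 68.169, 66.513,
                     68.655, 69.564, 69.331, 70.551])
gnp = np.array([234.289, 259.426, 258.054, 284.599, 328.975, 346.999,
                365.385, 363.112, 397.469, 419.18, 442.769, 444.546,
                482.704, 502.601, 518.173, 554.894])

# Design matrix X = [1  GNP]: a column of ones for b_0, then GNP for b_1.
X = np.column_stack([np.ones_like(gnp), gnp])
Y = employed

# Solve (X^T X) b = X^T Y, which is equivalent to b = (X^T X)^{-1} X^T Y.
b_hat = np.linalg.solve(X.T @ X, X.T @ Y)

# Cross-check against NumPy's least-squares solver.
b_check = np.linalg.lstsq(X, Y, rcond=None)[0]

print("b_0 (intercept):", b_hat[0])
print("b_1 (slope):", b_hat[1])
print("matches lstsq:", np.allclose(b_hat, b_check))
```

The fitted line can then be drawn over the scatter plot with `plt.plot(gnp, X @ b_hat)`.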
Analyze the baby names dataset using pandas
import pandas as pd
# Load dataset
import zipfile
with zipfile.ZipFile('../datasets/baby-names2.csv.zip', 'r') as z:
    f = z.open('baby-names2.csv')
    # read_csv parses comma-separated data by default
    names = pd.read_csv(f)
names.head()
|   | year | name | prop | sex | soundex |
|---|---|---|---|---|---|
| 0 | 1880 | John | 0.081541 | boy | J500 |
| 1 | 1880 | William | 0.080511 | boy | W450 |
| 2 | 1880 | James | 0.050057 | boy | J520 |
| 3 | 1880 | Charles | 0.045167 | boy | C642 |
| 4 | 1880 | George | 0.043292 | boy | G620 |
names[names.year == 1993].head()
|   | year | name | prop | sex | soundex |
|---|---|---|---|---|---|
| 113000 | 1993 | Michael | 0.024010 | boy | M240 |
| 113001 | 1993 | Christopher | 0.018572 | boy | C623 |
| 113002 | 1993 | Matthew | 0.017332 | boy | M300 |
| 113003 | 1993 | Joshua | 0.016268 | boy | J200 |
| 113004 | 1993 | Tyler | 0.014439 | boy | T460 |
boys = names[names.sex == 'boy'].copy()
girls = names[names.sex == 'girl'].copy()
william = boys[boys['name']=='William']
plt.plot(range(william.shape[0]), william['prop'])
plt.xticks(range(william.shape[0])[::5], william['year'].values[::5], rotation='vertical')
plt.ylim([0, 0.1])
plt.show()
daniel = boys[boys['name']=='Daniel']
plt.plot(range(daniel.shape[0]), daniel['prop'])
plt.xticks(range(daniel.shape[0])[::5], daniel['year'].values[::5], rotation='vertical')
plt.ylim([0, 0.1])
plt.show()
Which has been the most popular boy name in each decade?
Which has been the most popular girl name?
What is the most popular new girl name? (A name is new if it appears only in the 2000s.)
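As a starting point for the first question, the sketch below shows one way to group by decade with pandas. It is only an illustration under stated assumptions: `sample` is a tiny frame built from a few of the rows displayed above (not the full dataset), `top_name_per_decade` is a hypothetical helper name, and "most popular" is taken to mean the name with the largest summed `prop` within the decade.

```python
import pandas as pd

# Small sample built from rows shown in the tables above; the real analysis
# would pass the full `boys` DataFrame instead.
sample = pd.DataFrame({
    'year': [1880, 1880, 1880, 1993, 1993, 1993],
    'name': ['John', 'William', 'James', 'Michael', 'Christopher', 'Matthew'],
    'prop': [0.081541, 0.080511, 0.050057, 0.024010, 0.018572, 0.017332],
    'sex': ['boy'] * 6,
})


def top_name_per_decade(df):
    """Return the name with the largest summed prop in each decade."""
    # Map each year to the first year of its decade, e.g. 1993 -> 1990.
    df = df.assign(decade=(df['year'] // 10) * 10)
    # Total proportion of each name within each decade.
    totals = df.groupby(['decade', 'name'], as_index=False)['prop'].sum()
    # Row label of the highest total in each decade.
    idx = totals.groupby('decade')['prop'].idxmax()
    return totals.loc[idx].set_index('decade')['name']


top = top_name_per_decade(sample)
print(top)
```

The same helper could be applied to `girls`, and the "new name" question could reuse the grouping idea by comparing each name's first year of appearance against 2000.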