(c) 2016 - present. Enplus Advisors, Inc.
import numpy as np
import pandas as pd
df = pd.DataFrame({
    'ticker': ['AAPL', 'AAPL', 'MSFT', 'IBM', 'YHOO'],
    'date': ['2015-12-30', '2015-12-31', '2015-12-30', '2015-12-30', '2015-12-30'],
    'open': [426.23, 427.81, 42.3, 101.65, 35.53]
})
Exercise:
- Select the `open` column as a `Series` using attribute lookup.
- Select the `open` column as a `Series` using `dict`-style lookup.
- Select the `date` column as a `DataFrame`.
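res1a = df.open       # attribute lookup returns a Series
res1b = df['open']    # dict-style lookup returns a Series
res1c = df[['date']]  # a list of column labels returns a DataFrame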
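The difference between the three lookups shows up in the type of the result. A minimal sketch, rebuilding the same `df` so it runs on its own (the `sel_*` names are illustrative and not part of the exercise):

```python
import pandas as pd

df = pd.DataFrame({
    'ticker': ['AAPL', 'AAPL', 'MSFT', 'IBM', 'YHOO'],
    'date': ['2015-12-30', '2015-12-31', '2015-12-30', '2015-12-30', '2015-12-30'],
    'open': [426.23, 427.81, 42.3, 101.65, 35.53],
})

sel_attr = df.open        # attribute lookup, one column -> Series
sel_key = df['open']      # dict-style lookup with a single label -> Series
sel_frame = df[['date']]  # list of labels -> DataFrame, even for one column

# Print the class name of each result and the shape of the DataFrame.
print(type(sel_attr).__name__, type(sel_key).__name__, type(sel_frame).__name__)
print(sel_frame.shape)
```

Attribute lookup only works when the column name is a valid Python identifier and does not clash with an existing `DataFrame` attribute or method, so the bracket form is the safer default.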
Exercise:
- Select the rows for the `AAPL` ticker and the `date` and `open` columns.
- Create `df1`, a new `DataFrame` with `ticker` as the index.
- Create `df2`, a new `DataFrame` with `date` as the index. Create this `DataFrame` from `df1` with a single statement.
- Sort `df2` by the index values.
res2a = df.loc[df.ticker == 'AAPL', ['date', 'open']]
df1 = df.set_index('ticker')
df2 = df1.reset_index().set_index('date')
df2_sorted = df2.sort_index()
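As a quick check on the reindexing steps, the sketch below rebuilds `df`, repeats the solution, and inspects which column ends up as the index at each stage. The name `aapl` is illustrative; everything else follows the solution above.

```python
import pandas as pd

df = pd.DataFrame({
    'ticker': ['AAPL', 'AAPL', 'MSFT', 'IBM', 'YHOO'],
    'date': ['2015-12-30', '2015-12-31', '2015-12-30', '2015-12-30', '2015-12-30'],
    'open': [426.23, 427.81, 42.3, 101.65, 35.53],
})

# Boolean row mask plus a list of column labels in a single .loc call.
aapl = df.loc[df.ticker == 'AAPL', ['date', 'open']]

# set_index moves the column into the index; reset_index moves it back
# to a regular column so a different column can become the index.
df1 = df.set_index('ticker')
df2 = df1.reset_index().set_index('date')
df2_sorted = df2.sort_index()

print(aapl.shape)
print(df1.index.name, df2.index.name)
print(list(df2.columns))
print(df2.index.is_monotonic_increasing, df2_sorted.index.is_monotonic_increasing)
```

The dates here are plain strings, so `sort_index` sorts them lexicographically. For ISO-formatted dates such as `2015-12-30` that matches chronological order.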
Exercise:
- Create a copy of `df` called `df3`. Add a new column of NaNs to `df3` called `close`. Assign `close` the same value as `open` for all `open` values greater than 100.
- Sort `df3` by its `close` values.
df3 = df.copy()
# Creating the NaN column first is not strictly required when assigning
# with df3['close'] = ..., but the exercise asks for it
df3['close'] = np.nan
gt100 = df3.open[df3.open > 100]
df3.close = gt100 # you can use dot syntax b/c `close` already exists
df3
| | ticker | date | open | close |
|---|---|---|---|---|
| 0 | AAPL | 2015-12-30 | 426.23 | 426.23 |
| 1 | AAPL | 2015-12-31 | 427.81 | 427.81 |
| 2 | MSFT | 2015-12-30 | 42.30 | NaN |
| 3 | IBM | 2015-12-30 | 101.65 | 101.65 |
| 4 | YHOO | 2015-12-30 | 35.53 | NaN |
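The output above is still in the original row order, so the final step of the exercise, sorting `df3` by its `close` values, remains to be done. One way to do it, as a sketch that relies on the default `sort_values` behaviour of placing NaNs last (the name `df3_sorted` is illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'ticker': ['AAPL', 'AAPL', 'MSFT', 'IBM', 'YHOO'],
    'date': ['2015-12-30', '2015-12-31', '2015-12-30', '2015-12-30', '2015-12-30'],
    'open': [426.23, 427.81, 42.3, 101.65, 35.53],
})

df3 = df.copy()
df3['close'] = np.nan

# Assigning a Series aligns on the index, so rows whose open is not
# above 100 have no matching label and stay NaN.
df3['close'] = df3.open[df3.open > 100]

# Ascending sort on close; NaN rows go to the end by default
# (na_position='last').
df3_sorted = df3.sort_values('close')
print(df3_sorted)
```

`sort_values` returns a new `DataFrame` and leaves `df3` unchanged. Passing `na_position='first'` would put the NaN rows at the top instead.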