Pandas objects can be thought of as enhanced versions of NumPy structured arrays in which the rows and columns are identified with labels rather than simple integer indices.
Tree fundamental Pandas data structures:
the Series,
DataFrame,
and Index.
import pandas as pd
pd.__version__
'1.3.5'
print(help(pd))
Help on package pandas: NAME pandas DESCRIPTION pandas - a powerful data analysis and manipulation library for Python ===================================================================== **pandas** is a Python package providing fast, flexible, and expressive data structures designed to make working with "relational" or "labeled" data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, **real world** data analysis in Python. Additionally, it has the broader goal of becoming **the most powerful and flexible open source data analysis / manipulation tool available in any language**. It is already well on its way toward this goal. Main Features ------------- Here are just a few of the things that pandas does well: - Easy handling of missing data in floating point as well as non-floating point data. - Size mutability: columns can be inserted and deleted from DataFrame and higher dimensional objects - Automatic and explicit data alignment: objects can be explicitly aligned to a set of labels, or the user can simply ignore the labels and let `Series`, `DataFrame`, etc. automatically align the data for you in computations. - Powerful, flexible group by functionality to perform split-apply-combine operations on data sets, for both aggregating and transforming data. - Make it easy to convert ragged, differently-indexed data in other Python and NumPy data structures into DataFrame objects. - Intelligent label-based slicing, fancy indexing, and subsetting of large data sets. - Intuitive merging and joining data sets. - Flexible reshaping and pivoting of data sets. - Hierarchical labeling of axes (possible to have multiple labels per tick). - Robust IO tools for loading data from flat files (CSV and delimited), Excel files, databases, and saving/loading data from the ultrafast HDF5 format. - Time series-specific functionality: date range generation and frequency conversion, moving window statistics, date shifting and lagging. PACKAGE CONTENTS _config (package) _libs (package) _testing (package) _typing _version api (package) arrays (package) compat (package) conftest core (package) errors (package) io (package) plotting (package) testing tests (package) tseries (package) util (package) SUBMODULES _hashtable _lib _tslib offsets FUNCTIONS __getattr__(name) # GH 27101 DATA IndexSlice = <pandas.core.indexing._IndexSlice object> NA = <NA> NaT = NaT __docformat__ = 'restructuredtext' __git_version__ = '66e3805b8cabe977f40c05259cc3fcf7ead5687d' describe_option = <pandas._config.config.CallableDynamicDoc object> get_option = <pandas._config.config.CallableDynamicDoc object> options = <pandas._config.config.DictWrapper object> reset_option = <pandas._config.config.CallableDynamicDoc object> set_option = <pandas._config.config.CallableDynamicDoc object> VERSION 1.3.5 FILE /usr/local/lib/python3.7/dist-packages/pandas/__init__.py None
dir(pd)
['BooleanDtype', 'Categorical', 'CategoricalDtype', 'CategoricalIndex', 'DataFrame', 'DateOffset', 'DatetimeIndex', 'DatetimeTZDtype', 'ExcelFile', 'ExcelWriter', 'Flags', 'Float32Dtype', 'Float64Dtype', 'Float64Index', 'Grouper', 'HDFStore', 'Index', 'IndexSlice', 'Int16Dtype', 'Int32Dtype', 'Int64Dtype', 'Int64Index', 'Int8Dtype', 'Interval', 'IntervalDtype', 'IntervalIndex', 'MultiIndex', 'NA', 'NaT', 'NamedAgg', 'Period', 'PeriodDtype', 'PeriodIndex', 'RangeIndex', 'Series', 'SparseDtype', 'StringDtype', 'Timedelta', 'TimedeltaIndex', 'Timestamp', 'UInt16Dtype', 'UInt32Dtype', 'UInt64Dtype', 'UInt64Index', 'UInt8Dtype', '__builtins__', '__cached__', '__doc__', '__docformat__', '__file__', '__getattr__', '__git_version__', '__loader__', '__name__', '__package__', '__path__', '__spec__', '__version__', '_config', '_hashtable', '_is_numpy_dev', '_lib', '_libs', '_np_version_under1p18', '_testing', '_tslib', '_typing', '_version', 'api', 'array', 'arrays', 'bdate_range', 'compat', 'concat', 'core', 'crosstab', 'cut', 'date_range', 'describe_option', 'errors', 'eval', 'factorize', 'get_dummies', 'get_option', 'infer_freq', 'interval_range', 'io', 'isna', 'isnull', 'json_normalize', 'lreshape', 'melt', 'merge', 'merge_asof', 'merge_ordered', 'notna', 'notnull', 'offsets', 'option_context', 'options', 'pandas', 'period_range', 'pivot', 'pivot_table', 'plotting', 'qcut', 'read_clipboard', 'read_csv', 'read_excel', 'read_feather', 'read_fwf', 'read_gbq', 'read_hdf', 'read_html', 'read_json', 'read_orc', 'read_parquet', 'read_pickle', 'read_sas', 'read_spss', 'read_sql', 'read_sql_query', 'read_sql_table', 'read_stata', 'read_table', 'read_xml', 'reset_option', 'set_eng_float_format', 'set_option', 'show_versions', 'test', 'testing', 'timedelta_range', 'to_datetime', 'to_numeric', 'to_pickle', 'to_timedelta', 'tseries', 'unique', 'util', 'value_counts', 'wide_to_long']
A Pandas Series is a one-dimensional array of indexed data where data can be many different things:
a Python dict
an ndarray
a scalar value (like 5)
import pandas as pd; import numpy as np
pd.Series([1, 2, 3, 4, 5])
0 1 1 2 2 3 3 4 4 5 dtype: int64
pd.Series(np.random.randn(5), index=["a", "b", "c", "d", "e"])
a -0.203503 b 1.013045 c 2.181949 d -1.122588 e 1.189388 dtype: float64
pd.Series(np.array([1, 2, 3, 4, 5]))
0 1 1 2 2 3 3 4 4 5 dtype: int64
pd.Series(np.array([1, 2, 3, 4, 5])) + pd.Series(np.array([5, 4, 3, 2, 1]))
0 6 1 6 2 6 3 6 4 6 dtype: int64
d = {"b": 1, "a": 0, "c": 2}; pd.Series(d)
b 1 a 0 c 2 dtype: int64
The Series object is basically interchangeable with a one-dimensional NumPy array. The essential difference is the presence of the index: while the Numpy Array has an implicitly defined integer index used to access the values, the Pandas Series has an explicitly defined index associated with the values.
DataFrame
is a 2-dimensional labeled data structure with columns of potentially different types. You can think of it like a spreadsheet or SQL table, or a dict of Series objects. It is generally the most commonly used pandas object. Like Series, DataFrame accepts many different kinds of input:
Dict of 1D ndarrays, lists, dicts, or Series
2-D numpy.ndarray
Structured or record ndarray
A Series
Another DataFrame
df = pd.DataFrame(np.random.randn(5,4),index='A B C D E'.split(),columns='W X Y Z'.split())
df
W | X | Y | Z | |
---|---|---|---|---|
A | -1.055250 | -0.255963 | 0.523574 | -0.216817 |
B | -1.587131 | -0.128045 | -0.767518 | -0.055505 |
C | -0.589701 | -1.098669 | 1.061195 | -0.407758 |
D | -0.692038 | -0.886125 | 0.937507 | 0.980382 |
E | -1.218705 | -0.135298 | 1.196934 | 1.009017 |
import pandas as pd
d = {"one": pd.Series([1.0, 2.0, 3.0], index=["a", "b", "c"]),
"two": pd.Series([1.0, 2.0, 3.0, 4.0], index=["a", "b", "c", "d"])
}
df = pd.DataFrame(d); print(type(df)); print('\n'); df.info(); print('\n'); df
<class 'pandas.core.frame.DataFrame'> <class 'pandas.core.frame.DataFrame'> Index: 4 entries, a to d Data columns (total 2 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 one 3 non-null float64 1 two 4 non-null float64 dtypes: float64(2) memory usage: 96.0+ bytes
one | two | |
---|---|---|
a | 1.0 | 1.0 |
b | 2.0 | 2.0 |
c | 3.0 | 3.0 |
d | NaN | 4.0 |
import pandas as pd
import numpy as np
df=pd.DataFrame([[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 20, 22, 34]],
index=[0,1,2],
columns=['A', 'B', 'C', 'D'])
df
A | B | C | D | |
---|---|---|---|---|
0 | 1 | 2 | 3 | 4 |
1 | 5 | 6 | 7 | 8 |
2 | 9 | 20 | 22 | 34 |
# source: Yahoo Finance
stock = pd.DataFrame([['tech', 'AAPL', 157.25, 1.23, 26.11, 2539, 0.58, 184.01],
['tech', 'MSFT', 261.47, 0.93, 26.98, 1942, 0.95, 333.00],
['engr', 'XOM', 93.87, 1.08, 10.27, 391, 3.68, 103.28],
['engr', 'COP', 109.45, 1.38, 8.87, 137, 1,75],
['fin', 'BAC',33.47, 1.40, 10.46, 269, 2.62, 42.41],
['fin', 'JPM',114.51, 1.10, 9.18, 336, 3.52, 138.80],
['fin', 'WMT',335.50, 1.39, 7.53, 114, 3.00, 394.07]])
stock
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | |
---|---|---|---|---|---|---|---|---|
0 | tech | AAPL | 157.25 | 1.23 | 26.11 | 2539 | 0.58 | 184.01 |
1 | tech | MSFT | 261.47 | 0.93 | 26.98 | 1942 | 0.95 | 333.00 |
2 | engr | XOM | 93.87 | 1.08 | 10.27 | 391 | 3.68 | 103.28 |
3 | engr | COP | 109.45 | 1.38 | 8.87 | 137 | 1.00 | 75.00 |
4 | fin | BAC | 33.47 | 1.40 | 10.46 | 269 | 2.62 | 42.41 |
5 | fin | JPM | 114.51 | 1.10 | 9.18 | 336 | 3.52 | 138.80 |
6 | fin | WMT | 335.50 | 1.39 | 7.53 | 114 | 3.00 | 394.07 |
stock.columns=['sector', 'ticker', 'price', 'beta', 'pe', 'mktcap', 'dy', 'target']
stock.index = ["a", "b", "c", "d", "e", "f", "g"]
stock
sector | ticker | price | beta | pe | mktcap | dy | target | |
---|---|---|---|---|---|---|---|---|
a | tech | AAPL | 157.25 | 1.23 | 26.11 | 2539 | 0.58 | 184.01 |
b | tech | MSFT | 261.47 | 0.93 | 26.98 | 1942 | 0.95 | 333.00 |
c | engr | XOM | 93.87 | 1.08 | 10.27 | 391 | 3.68 | 103.28 |
d | engr | COP | 109.45 | 1.38 | 8.87 | 137 | 1.00 | 75.00 |
e | fin | BAC | 33.47 | 1.40 | 10.46 | 269 | 2.62 | 42.41 |
f | fin | JPM | 114.51 | 1.10 | 9.18 | 336 | 3.52 | 138.80 |
g | fin | WMT | 335.50 | 1.39 | 7.53 | 114 | 3.00 | 394.07 |
stock.info()
<class 'pandas.core.frame.DataFrame'> Index: 7 entries, a to g Data columns (total 8 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 sector 7 non-null object 1 ticker 7 non-null object 2 price 7 non-null float64 3 beta 7 non-null float64 4 pe 7 non-null float64 5 mktcap 7 non-null int64 6 dy 7 non-null float64 7 target 7 non-null float64 dtypes: float64(5), int64(1), object(2) memory usage: 504.0+ bytes
stock.index = [0, 1, 2, 3, 4, 5, 6]
stock
sector | ticker | price | beta | pe | mktcap | dy | target | |
---|---|---|---|---|---|---|---|---|
0 | tech | AAPL | 157.25 | 1.23 | 26.11 | 2539 | 0.58 | 184.01 |
1 | tech | MSFT | 261.47 | 0.93 | 26.98 | 1942 | 0.95 | 333.00 |
2 | engr | XOM | 93.87 | 1.08 | 10.27 | 391 | 3.68 | 103.28 |
3 | engr | COP | 109.45 | 1.38 | 8.87 | 137 | 1.00 | 75.00 |
4 | fin | BAC | 33.47 | 1.40 | 10.46 | 269 | 2.62 | 42.41 |
5 | fin | JPM | 114.51 | 1.10 | 9.18 | 336 | 3.52 | 138.80 |
6 | fin | WMT | 335.50 | 1.39 | 7.53 | 114 | 3.00 | 394.07 |
stock.drop('target',axis=1)
sector | ticker | price | beta | pe | mktcap | dy | |
---|---|---|---|---|---|---|---|
0 | tech | AAPL | 157.25 | 1.23 | 26.11 | 2539 | 0.58 |
1 | tech | MSFT | 261.47 | 0.93 | 26.98 | 1942 | 0.95 |
2 | engr | XOM | 93.87 | 1.08 | 10.27 | 391 | 3.68 |
3 | engr | COP | 109.45 | 1.38 | 8.87 | 137 | 1.00 |
4 | fin | BAC | 33.47 | 1.40 | 10.46 | 269 | 2.62 |
5 | fin | JPM | 114.51 | 1.10 | 9.18 | 336 | 3.52 |
6 | fin | WMT | 335.50 | 1.39 | 7.53 | 114 | 3.00 |
# but this wont affect the orginal data
stock
sector | ticker | price | beta | pe | mktcap | dy | target | |
---|---|---|---|---|---|---|---|---|
0 | tech | AAPL | 157.25 | 1.23 | 26.11 | 2539 | 0.58 | 184.01 |
1 | tech | MSFT | 261.47 | 0.93 | 26.98 | 1942 | 0.95 | 333.00 |
2 | engr | XOM | 93.87 | 1.08 | 10.27 | 391 | 3.68 | 103.28 |
3 | engr | COP | 109.45 | 1.38 | 8.87 | 137 | 1.00 | 75.00 |
4 | fin | BAC | 33.47 | 1.40 | 10.46 | 269 | 2.62 | 42.41 |
5 | fin | JPM | 114.51 | 1.10 | 9.18 | 336 | 3.52 | 138.80 |
6 | fin | WMT | 335.50 | 1.39 | 7.53 | 114 | 3.00 | 394.07 |
# must put down inplace option
stock.drop('target', axis=1, inplace=True)
stock
sector | ticker | price | beta | pe | mktcap | dy | |
---|---|---|---|---|---|---|---|
0 | tech | AAPL | 157.25 | 1.23 | 26.11 | 2539 | 0.58 |
1 | tech | MSFT | 261.47 | 0.93 | 26.98 | 1942 | 0.95 |
2 | engr | XOM | 93.87 | 1.08 | 10.27 | 391 | 3.68 |
3 | engr | COP | 109.45 | 1.38 | 8.87 | 137 | 1.00 |
4 | fin | BAC | 33.47 | 1.40 | 10.46 | 269 | 2.62 |
5 | fin | JPM | 114.51 | 1.10 | 9.18 | 336 | 3.52 |
6 | fin | WMT | 335.50 | 1.39 | 7.53 | 114 | 3.00 |
stock['target'] = [184.01, 333.00, 103.28, 75.00, 42.41, 138.80, 394.07]
stock
sector | ticker | price | beta | pe | mktcap | dy | target | |
---|---|---|---|---|---|---|---|---|
0 | tech | AAPL | 157.25 | 1.23 | 26.11 | 2539 | 0.58 | 184.01 |
1 | tech | MSFT | 261.47 | 0.93 | 26.98 | 1942 | 0.95 | 333.00 |
2 | engr | XOM | 93.87 | 1.08 | 10.27 | 391 | 3.68 | 103.28 |
3 | engr | COP | 109.45 | 1.38 | 8.87 | 137 | 1.00 | 75.00 |
4 | fin | BAC | 33.47 | 1.40 | 10.46 | 269 | 2.62 | 42.41 |
5 | fin | JPM | 114.51 | 1.10 | 9.18 | 336 | 3.52 | 138.80 |
6 | fin | WMT | 335.50 | 1.39 | 7.53 | 114 | 3.00 | 394.07 |
stock2 = stock.drop(6, axis=0)
stock2
sector | ticker | price | beta | pe | mktcap | dy | target | |
---|---|---|---|---|---|---|---|---|
0 | tech | AAPL | 157.25 | 1.23 | 26.11 | 2539 | 0.58 | 184.01 |
1 | tech | MSFT | 261.47 | 0.93 | 26.98 | 1942 | 0.95 | 333.00 |
2 | engr | XOM | 93.87 | 1.08 | 10.27 | 391 | 3.68 | 103.28 |
3 | engr | COP | 109.45 | 1.38 | 8.87 | 137 | 1.00 | 75.00 |
4 | fin | BAC | 33.47 | 1.40 | 10.46 | 269 | 2.62 | 42.41 |
5 | fin | JPM | 114.51 | 1.10 | 9.18 | 336 | 3.52 | 138.80 |
newstock = pd.DataFrame([['ind', 184.71, 1.00, 14.61, 98, 2.63, 214.00]],
index=['CAT'],
columns=['sector', 'price', 'beta', 'pe', 'mktcap', 'dy', 'target'])
newstock
sector | price | beta | pe | mktcap | dy | target | |
---|---|---|---|---|---|---|---|
CAT | ind | 184.71 | 1.0 | 14.61 | 98 | 2.63 | 214.0 |
stock3 = stock.append(newstock)
stock3
sector | ticker | price | beta | pe | mktcap | dy | target | |
---|---|---|---|---|---|---|---|---|
0 | tech | AAPL | 157.25 | 1.23 | 26.11 | 2539 | 0.58 | 184.01 |
1 | tech | MSFT | 261.47 | 0.93 | 26.98 | 1942 | 0.95 | 333.00 |
2 | engr | XOM | 93.87 | 1.08 | 10.27 | 391 | 3.68 | 103.28 |
3 | engr | COP | 109.45 | 1.38 | 8.87 | 137 | 1.00 | 75.00 |
4 | fin | BAC | 33.47 | 1.40 | 10.46 | 269 | 2.62 | 42.41 |
5 | fin | JPM | 114.51 | 1.10 | 9.18 | 336 | 3.52 | 138.80 |
6 | fin | WMT | 335.50 | 1.39 | 7.53 | 114 | 3.00 | 394.07 |
CAT | ind | NaN | 184.71 | 1.00 | 14.61 | 98 | 2.63 | 214.00 |
stock3 = pd.concat([stock, newstock])
stock3
sector | ticker | price | beta | pe | mktcap | dy | target | |
---|---|---|---|---|---|---|---|---|
0 | tech | AAPL | 157.25 | 1.23 | 26.11 | 2539 | 0.58 | 184.01 |
1 | tech | MSFT | 261.47 | 0.93 | 26.98 | 1942 | 0.95 | 333.00 |
2 | engr | XOM | 93.87 | 1.08 | 10.27 | 391 | 3.68 | 103.28 |
3 | engr | COP | 109.45 | 1.38 | 8.87 | 137 | 1.00 | 75.00 |
4 | fin | BAC | 33.47 | 1.40 | 10.46 | 269 | 2.62 | 42.41 |
5 | fin | JPM | 114.51 | 1.10 | 9.18 | 336 | 3.52 | 138.80 |
6 | fin | WMT | 335.50 | 1.39 | 7.53 | 114 | 3.00 | 394.07 |
CAT | ind | NaN | 184.71 | 1.00 | 14.61 | 98 | 2.63 | 214.00 |
stock.loc[6, 'ticker'] = 'GS'
stock
sector | ticker | price | beta | pe | mktcap | dy | target | |
---|---|---|---|---|---|---|---|---|
0 | tech | AAPL | 157.25 | 1.23 | 26.11 | 2539 | 0.58 | 184.01 |
1 | tech | MSFT | 261.47 | 0.93 | 26.98 | 1942 | 0.95 | 333.00 |
2 | engr | XOM | 93.87 | 1.08 | 10.27 | 391 | 3.68 | 103.28 |
3 | engr | COP | 109.45 | 1.38 | 8.87 | 137 | 1.00 | 75.00 |
4 | fin | BAC | 33.47 | 1.40 | 10.46 | 269 | 2.62 | 42.41 |
5 | fin | JPM | 114.51 | 1.10 | 9.18 | 336 | 3.52 | 138.80 |
6 | fin | GS | 335.50 | 1.39 | 7.53 | 114 | 3.00 | 394.07 |
stock['price'].replace(335.50, 435.50, inplace=True)
stock
sector | ticker | price | beta | pe | mktcap | dy | target | |
---|---|---|---|---|---|---|---|---|
0 | tech | AAPL | 157.25 | 1.23 | 26.11 | 2539 | 0.58 | 184.01 |
1 | tech | MSFT | 261.47 | 0.93 | 26.98 | 1942 | 0.95 | 333.00 |
2 | engr | XOM | 93.87 | 1.08 | 10.27 | 391 | 3.68 | 103.28 |
3 | engr | COP | 109.45 | 1.38 | 8.87 | 137 | 1.00 | 75.00 |
4 | fin | BAC | 33.47 | 1.40 | 10.46 | 269 | 2.62 | 42.41 |
5 | fin | JPM | 114.51 | 1.10 | 9.18 | 336 | 3.52 | 138.80 |
6 | fin | GS | 435.50 | 1.39 | 7.53 | 114 | 3.00 | 394.07 |
print(stock['ticker']); print('\n');
print(stock[['ticker']]); print('\n');
print(stock.ticker); print('\n');
print(stock.loc[:, 'ticker']); print('\n');
print(stock.iloc[:, 1]); print('\n');
print(stock.iloc[:, -1]); print('\n');
0 AAPL 1 MSFT 2 XOM 3 COP 4 BAC 5 JPM 6 GS Name: ticker, dtype: object ticker 0 AAPL 1 MSFT 2 XOM 3 COP 4 BAC 5 JPM 6 GS 0 AAPL 1 MSFT 2 XOM 3 COP 4 BAC 5 JPM 6 GS Name: ticker, dtype: object 0 AAPL 1 MSFT 2 XOM 3 COP 4 BAC 5 JPM 6 GS Name: ticker, dtype: object 0 AAPL 1 MSFT 2 XOM 3 COP 4 BAC 5 JPM 6 GS Name: ticker, dtype: object 0 184.01 1 333.00 2 103.28 3 75.00 4 42.41 5 138.80 6 394.07 Name: target, dtype: float64
stock.loc[:, 'ticker':'beta']
ticker | price | beta | |
---|---|---|---|
0 | AAPL | 157.25 | 1.23 |
1 | MSFT | 261.47 | 0.93 |
2 | XOM | 93.87 | 1.08 |
3 | COP | 109.45 | 1.38 |
4 | BAC | 33.47 | 1.40 |
5 | JPM | 114.51 | 1.10 |
6 | GS | 435.50 | 1.39 |
stock.loc[:, :]
sector | ticker | price | beta | pe | mktcap | dy | target | |
---|---|---|---|---|---|---|---|---|
0 | tech | AAPL | 157.25 | 1.23 | 26.11 | 2539 | 0.58 | 184.01 |
1 | tech | MSFT | 261.47 | 0.93 | 26.98 | 1942 | 0.95 | 333.00 |
2 | engr | XOM | 93.87 | 1.08 | 10.27 | 391 | 3.68 | 103.28 |
3 | engr | COP | 109.45 | 1.38 | 8.87 | 137 | 1.00 | 75.00 |
4 | fin | BAC | 33.47 | 1.40 | 10.46 | 269 | 2.62 | 42.41 |
5 | fin | JPM | 114.51 | 1.10 | 9.18 | 336 | 3.52 | 138.80 |
6 | fin | GS | 435.50 | 1.39 | 7.53 | 114 | 3.00 | 394.07 |
stock.loc[0:2, 'beta']
0 1.23 1 0.93 2 1.08 Name: beta, dtype: float64
stock.loc[3, "ticker"]
'COP'
stock.loc[5, "mktcap"] > stock.loc[6, "mktcap"]
True
stock.index
Int64Index([0, 1, 2, 3, 4, 5, 6], dtype='int64')
stock.describe()
price | beta | pe | mktcap | dy | target | |
---|---|---|---|---|---|---|
count | 7.000000 | 7.000000 | 7.000000 | 7.000000 | 7.000000 | 7.000000 |
mean | 172.217143 | 1.215714 | 14.200000 | 818.285714 | 2.192857 | 181.510000 |
std | 135.503274 | 0.184829 | 8.492224 | 991.686058 | 1.315025 | 133.400706 |
min | 33.470000 | 0.930000 | 7.530000 | 114.000000 | 0.580000 | 42.410000 |
25% | 101.660000 | 1.090000 | 9.025000 | 203.000000 | 0.975000 | 89.140000 |
50% | 114.510000 | 1.230000 | 10.270000 | 336.000000 | 2.620000 | 138.800000 |
75% | 209.360000 | 1.385000 | 18.285000 | 1166.500000 | 3.260000 | 258.505000 |
max | 435.500000 | 1.400000 | 26.980000 | 2539.000000 | 3.680000 | 394.070000 |
print(stock['price'].mean());
print(stock['beta'].median());
print(stock['dy'].std());
print(stock['mktcap'].max());
172.21714285714285 1.23 1.315024895654252 2539
stock.groupby('sector').mean().round(2)
price | beta | pe | mktcap | dy | target | |
---|---|---|---|---|---|---|
sector | ||||||
engr | 101.66 | 1.23 | 9.57 | 264.00 | 2.34 | 89.14 |
fin | 194.49 | 1.30 | 9.06 | 239.67 | 3.05 | 191.76 |
tech | 209.36 | 1.08 | 26.54 | 2240.50 | 0.76 | 258.50 |
stock.set_index('ticker', inplace=True)
stock.loc['JPM', 'mktcap'] > stock.loc['GS', 'mktcap']
True
stock
sector | price | beta | pe | mktcap | dy | target | |
---|---|---|---|---|---|---|---|
ticker | |||||||
AAPL | tech | 157.25 | 1.23 | 26.11 | 2539 | 0.58 | 184.01 |
MSFT | tech | 261.47 | 0.93 | 26.98 | 1942 | 0.95 | 333.00 |
XOM | engr | 93.87 | 1.08 | 10.27 | 391 | 3.68 | 103.28 |
COP | engr | 109.45 | 1.38 | 8.87 | 137 | 1.00 | 75.00 |
BAC | fin | 33.47 | 1.40 | 10.46 | 269 | 2.62 | 42.41 |
JPM | fin | 114.51 | 1.10 | 9.18 | 336 | 3.52 | 138.80 |
GS | fin | 435.50 | 1.39 | 7.53 | 114 | 3.00 | 394.07 |