pandas, the Python Data Analysis Library, is a widely used tool for working with tabular data. It gives you many ways to practice your brand of data science, whatever your field.
The pandas.DataFrame type lets you join pandas.Series type columns into a multi-column data table, complete with row and column names of your choice, both re-orderable.
Once you have a DataFrame defined, adding new columns based on the old, getting summary statistics, applying functions, and generating visualizations are all within reach.
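As a quick illustration of those operations, here is a minimal sketch on a toy table. The `demo` frame, its `side` column, and the row labels are hypothetical and are not the data used later in this notebook.

```python
import pandas as pd

# Hypothetical toy data, only for illustration
demo = pd.DataFrame({"side": [1.0, 2.0, 3.0]},
                    index=["small", "medium", "large"])

# a new column computed from an existing one
demo["area"] = demo["side"] ** 2

# summary statistics (count, mean, std, ...) for the numeric columns
summary = demo.describe()

# apply a function to every value of a column
demo["label"] = demo["area"].apply(lambda a: "big" if a > 3 else "little")

print(demo)
print(summary.loc["mean"])
```

Each step returns or updates ordinary pandas objects, so the results can be chained into further selections or plots.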
import math as clc
clc.sin(clc.radians(90)) # remembering trig
1.0
# and now for something completely different...
import numpy as np
import pandas as pd
Let's stack up a column of polyhedron names, using a kind of jargon or shorthand.
shapes = np.array(["Tetra", "Cubocta", "Icosa", "Cube", "Octa",
"RT5", "RT5+", "RD", "RT", "PD", "Icosa", "Cubocta",
"SuperRT", "Cube"],
dtype=np.str_)
shapes
array(['Tetra', 'Cubocta', 'Icosa', 'Cube', 'Octa', 'RT5', 'RT5+', 'RD', 'RT', 'PD', 'Icosa', 'Cubocta', 'SuperRT', 'Cube'], dtype='<U7')
So far we've created a numpy.ndarray. Now let's bring that into a Series.
shapes_col = pd.Series(shapes, name="Shape")
shapes_col
0       Tetra
1     Cubocta
2       Icosa
3        Cube
4        Octa
5         RT5
6        RT5+
7          RD
8          RT
9          PD
10      Icosa
11    Cubocta
12    SuperRT
13       Cube
Name: Shape, dtype: object
The vertical Series, a column of some data type (dtype), is the building block of the DataFrame, which sets them side by side in a tabular arrangement.
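To make the "side by side" idea concrete, here is a small sketch that lines up two named Series with `pd.concat(..., axis=1)`. The three rows are taken from the shapes in this notebook; the variable names `names`, `vols`, and `table` are only for this example.

```python
import pandas as pd

# Two named Series sharing the default 0..n-1 index
names = pd.Series(["Tetra", "Cube", "Octa"], name="Shape")
vols = pd.Series([1.0, 3.0, 4.0], name="IVM Volume")

# axis=1 places the Series next to each other as columns;
# each Series name becomes its column header
table = pd.concat([names, vols], axis=1)

print(table)
print(table.dtypes)  # each column keeps its own dtype
```

Passing a dict of Series to `pd.DataFrame` gives the same kind of table, with the dict keys as column names.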
What are the S and E modules? Let's look at some pictures:
24 S modules, 12 left and 12 right, wedge between Octa 4 and the Icosahedron inscribed inside it.
120 E modules, 60 left and 60 right, comprise the rhombic triacontahedron inside of which, and tangent to its 30 faces, is the unit radius ball.
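The module counts in these captions can be checked numerically. The sketch below uses the same E and S module formulas that this notebook defines, with plain `math` and lowercase names chosen just for this example.

```python
import math

phi = (1 + math.sqrt(5)) / 2          # golden ratio

# module volumes, using the formulas from this notebook
e_mod = math.sqrt(2) / 8 / phi**3     # E module
s_mod = phi**-5 / 2                   # S module
t_mod = 1 / 24                        # T module (same volume as A and B)

# 120 E modules fill the rhombic triacontahedron listed as RT5+
rt5_plus = 120 * e_mod
# 120 T modules fill the slightly smaller RT5
rt5 = 120 * t_mod

# 24 S modules fill the gap between Octa 4 and its inscribed icosahedron
inscribed_icosa = 2.5 * (s_mod / e_mod) ** 2
gap = 4 - inscribed_icosa

print(rt5_plus, rt5)
print(24 * s_mod, gap)
```

The last two printed values agree to floating-point precision, which is the arithmetic behind "24 S modules wedge between Octa 4 and the Icosahedron".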
# geometric constants
phi = (1 + clc.sqrt(5))/2
# volumes of specific tetrahedral wedges
Emod = clc.sqrt(2)/8 * 1/phi**3
Emod3 = clc.sqrt(2)/8
emod3 = Emod * (phi**-3)
Smod = (phi**-5) / 2
Sfactor = Smod/Emod
S3 = clc.sqrt(9/8)
# defined to have edges = 2R or 1D
Icosa = 100 * Emod3 + 20 * Emod
PentDodeca = 84 * Emod3 + 12 * Emod # 348 * Emod + 84 * emod3
# volumes corresponding to our shapes
volumes = np.array([1, 2.5, 2.5 * Sfactor**2,
3, 4, 5, 120 * Emod, 6,
7.5, PentDodeca, Icosa, 20,
20 * S3, 24], dtype=np.float64) # np.float was removed in NumPy 1.24
volumes_col = pd.Series(volumes, name="IVM Volume") # turn np.array into a pd.Series
volumes_col
0      1.000000
1      2.500000
2      2.917961
3      3.000000
4      4.000000
5      5.000000
6      5.007758
7      6.000000
8      7.500000
9     15.350018
10    18.512296
11    20.000000
12    21.213203
13    24.000000
Name: IVM Volume, dtype: float64
vols_table = pd.DataFrame({"Shape": shapes_col, "IVM Volume":volumes_col})
# vols_table.index = shapes_col # the shapes column is the index
vols_table
| | Shape | IVM Volume |
---|---|---|
0 | Tetra | 1.000000 |
1 | Cubocta | 2.500000 |
2 | Icosa | 2.917961 |
3 | Cube | 3.000000 |
4 | Octa | 4.000000 |
5 | RT5 | 5.000000 |
6 | RT5+ | 5.007758 |
7 | RD | 6.000000 |
8 | RT | 7.500000 |
9 | PD | 15.350018 |
10 | Icosa | 18.512296 |
11 | Cubocta | 20.000000 |
12 | SuperRT | 21.213203 |
13 | Cube | 24.000000 |
vols_table['XYZ Volume'] = vols_table['IVM Volume'] * 1/S3
vols_table
| | Shape | IVM Volume | XYZ Volume |
---|---|---|---|
0 | Tetra | 1.000000 | 0.942809 |
1 | Cubocta | 2.500000 | 2.357023 |
2 | Icosa | 2.917961 | 2.751080 |
3 | Cube | 3.000000 | 2.828427 |
4 | Octa | 4.000000 | 3.771236 |
5 | RT5 | 5.000000 | 4.714045 |
6 | RT5+ | 5.007758 | 4.721360 |
7 | RD | 6.000000 | 5.656854 |
8 | RT | 7.500000 | 7.071068 |
9 | PD | 15.350018 | 14.472136 |
10 | Icosa | 18.512296 | 17.453560 |
11 | Cubocta | 20.000000 | 18.856181 |
12 | SuperRT | 21.213203 | 20.000000 |
13 | Cube | 24.000000 | 22.627417 |
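The XYZ column is the IVM column divided by the constant S3 = sqrt(9/8). As a minimal sketch, the conversion can be wrapped in a pair of helper functions; `ivm_to_xyz` and `xyz_to_ivm` are hypothetical names introduced here, not part of pandas or of this notebook.

```python
import math

S3 = math.sqrt(9 / 8)   # conversion constant used in this notebook

def ivm_to_xyz(volume):
    """Convert a tetrahedron-based (IVM) volume to a cube-based (XYZ) volume."""
    return volume / S3

def xyz_to_ivm(volume):
    """Inverse conversion: XYZ volume back to IVM volume."""
    return volume * S3

tetra_xyz = ivm_to_xyz(1.0)      # the unit tetrahedron in XYZ units
super_rt_ivm = xyz_to_ivm(20.0)  # SuperRT has XYZ volume 20

print(round(tetra_xyz, 6), round(super_rt_ivm, 6))
```

Because pandas arithmetic is vectorized, the same division works on a whole column at once, which is what the `vols_table['XYZ Volume']` assignment does.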
Practice with df.loc[rows, columns] for label-based selection and df.iloc[rows, columns] for position-based selection.
vols_table.iloc[12] # entire row
Shape           SuperRT
IVM Volume    21.213203
XYZ Volume         20.0
Name: 12, dtype: object
vols_table.iloc[0] # entire row
Shape            Tetra
IVM Volume         1.0
XYZ Volume    0.942809
Name: 0, dtype: object
To select rows by value, use the pattern df.loc[df['col1'] == value]:
vols_table.loc[vols_table['Shape'] == "SuperRT" ] # every row whose Shape is "SuperRT"
| | Shape | IVM Volume | XYZ Volume |
---|---|---|---|
12 | SuperRT | 21.213203 | 20.0 |
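The boolean-mask selection returns a one-row DataFrame rather than a single value. The sketch below, on a hypothetical three-row `demo` table, contrasts position-based rows, mask-based rows, and two ways to reach one cell.

```python
import pandas as pd

# Hypothetical three-row table with values borrowed from this notebook
demo = pd.DataFrame({"Shape": ["Tetra", "Cube", "Octa"],
                     "IVM Volume": [1.0, 3.0, 4.0]})

# iloc: select by integer position, returns the row as a Series
row = demo.iloc[1]

# loc with a boolean mask: returns a DataFrame of the matching rows
sub = demo.loc[demo["Shape"] == "Octa"]

# loc with mask and column label, then take the first match: a scalar
cell = demo.loc[demo["Shape"] == "Octa", "IVM Volume"].iloc[0]

# with Shape as the index, loc[row_label, column_label] reaches a cell directly
by_label = demo.set_index("Shape").loc["Cube", "IVM Volume"]

print(row["Shape"], sub.shape, cell, by_label)
```

Label lookups like the last one assume the labels are unique; with repeated names, as in the full table here, `loc` returns every matching row.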
Now let's add some constituent modules that may be used to assemble the above shapes.
modules = np.array(["A","B", "T", "E", "S"],
dtype=np.str_)
mods_col = pd.Series(modules, name="Shape")
mod_vols = np.array([1/24, 1/24, 1/24, Emod, Smod], dtype=np.float64) # np.float was removed in NumPy 1.24
mod_vols_col = pd.Series(mod_vols, name="IVM Volume")
mods_table = pd.DataFrame({"Shape":mods_col, "IVM Volume":mod_vols_col})
# mods_table.index = mods_col
mods_table
| | Shape | IVM Volume |
---|---|---|
0 | A | 0.041667 |
1 | B | 0.041667 |
2 | T | 0.041667 |
3 | E | 0.041731 |
4 | S | 0.045085 |
mods_table['XYZ Volume'] = mods_table['IVM Volume'] * 1/S3
mods_table
| | Shape | IVM Volume | XYZ Volume |
---|---|---|---|
0 | A | 0.041667 | 0.039284 |
1 | B | 0.041667 | 0.039284 |
2 | T | 0.041667 | 0.039284 |
3 | E | 0.041731 | 0.039345 |
4 | S | 0.045085 | 0.042507 |
And now it's time to assemble the full table.
pd.concat([mods_table, vols_table])
| | Shape | IVM Volume | XYZ Volume |
---|---|---|---|
0 | A | 0.041667 | 0.039284 |
1 | B | 0.041667 | 0.039284 |
2 | T | 0.041667 | 0.039284 |
3 | E | 0.041731 | 0.039345 |
4 | S | 0.045085 | 0.042507 |
0 | Tetra | 1.000000 | 0.942809 |
1 | Cubocta | 2.500000 | 2.357023 |
2 | Icosa | 2.917961 | 2.751080 |
3 | Cube | 3.000000 | 2.828427 |
4 | Octa | 4.000000 | 3.771236 |
5 | RT5 | 5.000000 | 4.714045 |
6 | RT5+ | 5.007758 | 4.721360 |
7 | RD | 6.000000 | 5.656854 |
8 | RT | 7.500000 | 7.071068 |
9 | PD | 15.350018 | 14.472136 |
10 | Icosa | 18.512296 | 17.453560 |
11 | Cubocta | 20.000000 | 18.856181 |
12 | SuperRT | 21.213203 | 20.000000 |
13 | Cube | 24.000000 | 22.627417 |
CH = pd.concat([mods_table, vols_table])
CH = CH.reset_index(drop=True)
CH.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 19 entries, 0 to 18
Data columns (total 3 columns):
 #   Column      Non-Null Count  Dtype
---  ------      --------------  -----
 0   Shape       19 non-null     object
 1   IVM Volume  19 non-null     float64
 2   XYZ Volume  19 non-null     float64
dtypes: float64(2), object(1)
memory usage: 584.0+ bytes
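Stacking two tables keeps each table's original row labels, which is why the index is reset afterwards. As a small sketch on two hypothetical two-row frames (`top` and `bottom`), `pd.concat` can also renumber the rows in the same call with `ignore_index=True`.

```python
import pandas as pd

# Hypothetical miniature versions of the modules and shapes tables
top = pd.DataFrame({"Shape": ["A", "B"], "IVM Volume": [1 / 24, 1 / 24]})
bottom = pd.DataFrame({"Shape": ["Tetra", "Cube"], "IVM Volume": [1.0, 3.0]})

# plain concat keeps both original indexes, so labels repeat: 0, 1, 0, 1
stacked = pd.concat([top, bottom])

# ignore_index=True discards them and numbers the rows 0..n-1
renumbered = pd.concat([top, bottom], ignore_index=True)

print(list(stacked.index))
print(list(renumbered.index))
```

Both routes end in the same table; `reset_index(drop=True)` is handy when the concatenation already happened, while `ignore_index=True` saves a step.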
# df['new'] = pd.Series(dtype='int')
CH['Comments'] = pd.Series(dtype='str_')
CH
| | Shape | IVM Volume | XYZ Volume | Comments |
---|---|---|---|---|
0 | A | 0.041667 | 0.039284 | NaN |
1 | B | 0.041667 | 0.039284 | NaN |
2 | T | 0.041667 | 0.039284 | NaN |
3 | E | 0.041731 | 0.039345 | NaN |
4 | S | 0.045085 | 0.042507 | NaN |
5 | Tetra | 1.000000 | 0.942809 | NaN |
6 | Cubocta | 2.500000 | 2.357023 | NaN |
7 | Icosa | 2.917961 | 2.751080 | NaN |
8 | Cube | 3.000000 | 2.828427 | NaN |
9 | Octa | 4.000000 | 3.771236 | NaN |
10 | RT5 | 5.000000 | 4.714045 | NaN |
11 | RT5+ | 5.007758 | 4.721360 | NaN |
12 | RD | 6.000000 | 5.656854 | NaN |
13 | RT | 7.500000 | 7.071068 | NaN |
14 | PD | 15.350018 | 14.472136 | NaN |
15 | Icosa | 18.512296 | 17.453560 | NaN |
16 | Cubocta | 20.000000 | 18.856181 | NaN |
17 | SuperRT | 21.213203 | 20.000000 | NaN |
18 | Cube | 24.000000 | 22.627417 | NaN |
CH.iloc[0, -1] = '24 make a Tetra'
CH.iloc[1, -1] = 'AAB = BAA = Mite'
CH.iloc[2, -1] = '1/120 RT5'
CH.iloc[3, -1] = '1/120 RT5+'
CH.iloc[4, -1] = '(φ**-5) / 2'
CH.iloc[5, -1] = "edges D, from 4 IVM balls"
CH.iloc[6, -1] = 'some faces flush with Octa 4'
CH.iloc[7, -1] = 'some faces flush with Octa 4'
CH.iloc[8, -1] = 'Duo-Tet, face diagonals = D'
CH.iloc[9, -1] = 'Dual of Cube, edges D'
CH.iloc[10, -1] = '120 T mods'
CH.iloc[11, -1] = '120 E mods'
CH.iloc[12, -1] = 'Rhombic Dodeca, long diagonals = D'
CH.iloc[13, -1] = 'some vertexes shared with RD'
CH.iloc[14, -1] = 'Pentagonal Dodeca, dual of Icosa'
CH.iloc[15, -1] = 'edges D'
CH.iloc[16, -1] = 'edges = D, 1F, 12-balls around nuclear ball'
CH.iloc[17, -1] = 'icosa of edges D + dual PD'
CH.iloc[18, -1] = 'face diagonals = 2D, 2F'
CH
| | Shape | IVM Volume | XYZ Volume | Comments |
---|---|---|---|---|
0 | A | 0.041667 | 0.039284 | 24 make a Tetra |
1 | B | 0.041667 | 0.039284 | AAB = BAA = Mite |
2 | T | 0.041667 | 0.039284 | 1/120 RT5 |
3 | E | 0.041731 | 0.039345 | 1/120 RT5+ |
4 | S | 0.045085 | 0.042507 | (φ**-5) / 2 |
5 | Tetra | 1.000000 | 0.942809 | edges D, from 4 IVM balls |
6 | Cubocta | 2.500000 | 2.357023 | some faces flush with Octa 4 |
7 | Icosa | 2.917961 | 2.751080 | some faces flush with Octa 4 |
8 | Cube | 3.000000 | 2.828427 | Duo-Tet, face diagonals = D |
9 | Octa | 4.000000 | 3.771236 | Dual of Cube, edges D |
10 | RT5 | 5.000000 | 4.714045 | 120 T mods |
11 | RT5+ | 5.007758 | 4.721360 | 120 E mods |
12 | RD | 6.000000 | 5.656854 | Rhombic Dodeca, long diagonals = D |
13 | RT | 7.500000 | 7.071068 | some vertexes shared with RD |
14 | PD | 15.350018 | 14.472136 | Pentagonal Dodeca, dual of Icosa |
15 | Icosa | 18.512296 | 17.453560 | edges D |
16 | Cubocta | 20.000000 | 18.856181 | edges = D, 1F, 12-balls around nuclear ball |
17 | SuperRT | 21.213203 | 20.000000 | icosa of edges D + dual PD |
18 | Cube | 24.000000 | 22.627417 | face diagonals = 2D, 2F |
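Filling the Comments column one cell at a time with `iloc` works, but the same result can be had in one assignment. The sketch below assumes a hypothetical three-row `demo` table and a `notes` list given in row order; a list is used rather than a lookup by shape name because some names (Icosa, Cubocta, Cube) appear twice in the full table.

```python
import pandas as pd

# Hypothetical three-row table; comments copied from the first rows of this notebook
demo = pd.DataFrame({"Shape": ["A", "B", "T"],
                     "IVM Volume": [1 / 24] * 3})

# one comment per row, in row order
notes = ["24 make a Tetra", "AAB = BAA = Mite", "1/120 RT5"]

# assigning a list creates the whole column at once;
# its length must match the number of rows
demo["Comments"] = notes

print(demo)
```

Assigning a list of the wrong length raises a `ValueError`, which gives a useful check that no row was left without a comment.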
For further reading: