In this notebook we'll have a preliminary poke around in the object data harvested from the NMA Collection API. I'll focus here on the basic shape and stats of the data; other notebooks will explore the object data over time and space.
If you haven't already, you'll need to either harvest the object data or unzip a pre-harvested dataset.
If you haven't used one of these notebooks before, they're basically web pages in which you can write, edit, and run live code. They're meant to encourage experimentation, so don't feel nervous. Just try running a few cells and see what happens!
Some tips:
Is this thing on? If you can't edit or run any of the code cells, you might be viewing a static (read only) version of this notebook. Click here to load a live version running on Binder.
import pandas as pd
import math
from IPython.display import display, HTML, FileLink
from tinydb import TinyDB, Query
from pandas import json_normalize
# Load the harvested data from the json db
db = TinyDB('nma_object_db.json')
records = db.all()
Object = Query()
# Convert to a dataframe
df = pd.DataFrame(records)
df.head()
id | type | title | _meta | additionalType | collection | identifier | medium | extent | physicalDescription | ... | isPartOf | seeAlso | description | hasVersion | temporal | relation | hasPart | educationalSignificance | location | acknowledgement | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 145400 | object | Wahlo and Tribal law by Kevin Gilbert, reprint... | {'modified': '2018-07-09', 'issued': '2011-10-... | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
1 | 251390 | object | Pair of woven shoes made from feathers and hair | {'modified': '2019-01-17', 'issued': '2018-04-... | [Shoes] | {'id': '5244', 'type': 'Collection', 'title': ... | 2000.0014.0495 | [{'type': 'Material', 'title': 'Feather'}, {'t... | {'type': 'Measurement', 'length': 260, 'width'... | Shoes, the soles of which are made from woven ... | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2 | 124081 | object | Pair of ceremonial shoes | {'modified': '2018-12-04', 'issued': '2006-10-... | NaN | {'id': '1892', 'type': 'Collection', 'title': ... | 1992.0089.0165 | [{'type': 'Material', 'title': 'Feather'}] | {'type': 'Measurement', 'length': 246, 'width'... | A pair of ceremonial shoes made with several m... | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
3 | 21507 | object | Grinding stone | {'modified': '2018-06-19', 'issued': '2014-12-... | [Grinding stones] | {'id': '2229', 'type': 'Collection', 'title': ... | 1985.0288.0109 | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
4 | 142308 | object | 'time CHange' [sic] | {'modified': '2019-04-15', 'issued': '2012-06-... | [Compact discs] | {'id': '3893', 'type': 'Collection', 'title': ... | AR00213.012 | NaN | NaN | A compact disc, housed within a clear and blac... | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
5 rows × 25 columns
How many objects are there?
print('There are {:,} objects in the collection'.format(df.shape[0]))
There are 86,717 objects in the collection
Not every record has a value for every field, so let's create a quick count of the number of values in each field.
df.count()
id                         86717
type                       86717
title                      86558
_meta                      86717
additionalType             86690
collection                 84289
identifier                 86692
medium                     73952
extent                     64199
physicalDescription        86397
significanceStatement      32468
creator                    25119
spatial                    46773
contributor                40760
isAggregatedBy              4353
isPartOf                   10769
seeAlso                      467
description                 9128
hasVersion                 20159
temporal                   29597
relation                    3096
hasPart                     2350
educationalSignificance      201
location                    1069
acknowledgement              789
dtype: int64
Let's express those counts as a percentage of the total number of records, and display them as a bar chart using Pandas.
# Get field counts and convert to dataframe
field_counts = df.count().to_frame().reset_index()
# Change column headings
field_counts.columns = ['field', 'count']
# Calculate proportion of the total
field_counts['proportion'] = field_counts['count'].apply(lambda x: x / df.shape[0])
# Style the results as a barchart
field_counts.style.bar(subset=['proportion'], color='#d65f5f').format({'proportion': '{:.2%}'.format})
field | count | proportion | |
---|---|---|---|
0 | id | 86717 | 100.00% |
1 | type | 86717 | 100.00% |
2 | title | 86558 | 99.82% |
3 | _meta | 86717 | 100.00% |
4 | additionalType | 86690 | 99.97% |
5 | collection | 84289 | 97.20% |
6 | identifier | 86692 | 99.97% |
7 | medium | 73952 | 85.28% |
8 | extent | 64199 | 74.03% |
9 | physicalDescription | 86397 | 99.63% |
10 | significanceStatement | 32468 | 37.44% |
11 | creator | 25119 | 28.97% |
12 | spatial | 46773 | 53.94% |
13 | contributor | 40760 | 47.00% |
14 | isAggregatedBy | 4353 | 5.02% |
15 | isPartOf | 10769 | 12.42% |
16 | seeAlso | 467 | 0.54% |
17 | description | 9128 | 10.53% |
18 | hasVersion | 20159 | 23.25% |
19 | temporal | 29597 | 34.13% |
20 | relation | 3096 | 3.57% |
21 | hasPart | 2350 | 2.71% |
22 | educationalSignificance | 201 | 0.23% |
23 | location | 1069 | 1.23% |
24 | acknowledgement | 789 | 0.91% |
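The same proportions can be cross-checked without pandas by walking the raw records. This is only a minimal sketch: it assumes each record is a plain dict (as returned by db.all()) and that an absent key or a None value means the field is empty. The field_coverage function and the sample_records list are made up for illustration.

```python
from collections import Counter


def field_coverage(records):
    """Return {field: proportion of records holding a non-empty value}."""
    counts = Counter()
    for record in records:
        for field, value in record.items():
            # Treat None as missing, in the same spirit as pandas' non-null count
            if value is not None:
                counts[field] += 1
    total = len(records)
    return {field: count / total for field, count in counts.items()}


# Hypothetical records standing in for the harvested data
sample_records = [
    {'id': '1', 'title': 'Grinding stone', 'additionalType': ['Grinding stones']},
    {'id': '2', 'title': 'Pair of ceremonial shoes'},
    {'id': '3', 'title': None},
    {'id': '4', 'title': 'Boomerang', 'additionalType': ['Boomerangs']},
]

coverage = field_coverage(sample_records)
# List the fields from most to least complete
for field, proportion in sorted(coverage.items(), key=lambda item: -item[1]):
    print('{:<16}{:.2%}'.format(field, proportion))
```

Working from the raw records like this can be a handy sanity check on the dataframe figures, since it doesn't depend on how pandas fills in missing columns.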
One thing you might note is that some of the fields contain nested JSON arrays or objects. For example, additionalType contains a list of object types, while extent is a dictionary with keys and values. Let's unpack these columns for the second row (index of 1).
df['additionalType'][1][0]
'Shoes'
df['extent'][1]
{'type': 'Measurement', 'length': 260, 'width': 120, 'depth': 40, 'unitText': 'mm'}
df['extent'][1]['length']
260
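Direct indexing like this only works for rows that actually have a value. Where a record has no extent, pandas stores NaN (a float), and asking it for a 'length' key raises a TypeError. A small helper can make the lookup safe. This is a sketch only, and get_nested is a hypothetical name rather than anything provided by pandas or the NMA API.

```python
def get_nested(value, key, default=None):
    """Return value[key] if value is a dict containing key, otherwise default."""
    if isinstance(value, dict):
        return value.get(key, default)
    return default


extent = {'type': 'Measurement', 'length': 260, 'width': 120, 'depth': 40, 'unitText': 'mm'}
missing = float('nan')  # what pandas stores when a record has no extent

print(get_nested(extent, 'length'))             # key is present
print(get_nested(missing, 'length'))            # not a dict, so the default is returned
print(get_nested(extent, 'height', default=0))  # dict without the requested key
```

On the full dataframe, something along the lines of df['extent'].apply(get_nested, key='length') should then return a column of lengths with None wherever the measurement is absent.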
The additionalType field
How many objects have values in the additionalType column?
df.loc[df['additionalType'].notnull()].shape
(86690, 25)
print('{:%} of objects have an additionalType value'.format(df.loc[df['additionalType'].notnull()].shape[0] / df.shape[0]))
99.968864% of objects have an additionalType value
So which ones don't have an additionalType?
# Just show the first 5 rows
df.loc[df['additionalType'].isnull()].head()
id | type | title | _meta | additionalType | collection | identifier | medium | extent | physicalDescription | ... | isPartOf | seeAlso | description | hasVersion | temporal | relation | hasPart | educationalSignificance | location | acknowledgement | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 145400 | object | Wahlo and Tribal law by Kevin Gilbert, reprint... | {'modified': '2018-07-09', 'issued': '2011-10-... | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2 | 124081 | object | Pair of ceremonial shoes | {'modified': '2018-12-04', 'issued': '2006-10-... | NaN | {'id': '1892', 'type': 'Collection', 'title': ... | 1992.0089.0165 | [{'type': 'Material', 'title': 'Feather'}] | {'type': 'Measurement', 'length': 246, 'width'... | A pair of ceremonial shoes made with several m... | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
1728 | 180161 | object | Awelye- panel 1 by Lily Kngwarreye | {'copyright': '', 'licence': ''} | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
1939 | 224632 | object | Glass plate negative of family and horse stand... | {'copyright': '', 'licence': ''} | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
3416 | 180165 | object | Awelye- panel 3 by Lily Kngwarreye | {'copyright': '', 'licence': ''} | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
5 rows × 25 columns
How many rows have more than one additionalType?
df.loc[df['additionalType'].str.len() > 1].shape[0]
1038
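It may look odd to use the .str accessor on a column of lists, but .str.len() simply reports the length of each element, so it works for lists as well as strings. The toy series below (made-up values, not the harvested data) sketches the behaviour, including what happens to missing values.

```python
import numpy as np
import pandas as pd

# Hypothetical column mixing lists and a missing value
types = pd.Series([['Albums', 'Newspaper clippings'], ['Shoes'], np.nan])

# Length of each list; the missing value stays as NaN
lengths = types.str.len()
print(lengths.tolist())

# Comparing NaN with > 1 gives False, so rows without a value drop out of the mask
more_than_one = types[lengths > 1]
print(len(more_than_one))
```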
Let's have a look at a sample.
df.loc[df['additionalType'].str.len() > 1].head()
id | type | title | _meta | additionalType | collection | identifier | medium | extent | physicalDescription | ... | isPartOf | seeAlso | description | hasVersion | temporal | relation | hasPart | educationalSignificance | location | acknowledgement | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
45 | 202601 | object | Album of Newspaper clippings | {'modified': '2019-04-22', 'issued': '2010-11-... | [Albums, Newspaper clippings] | {'id': '4760', 'type': 'Collection', 'title': ... | 1989.0009.0108 | [{'type': 'Material', 'title': 'Cardboard'}, {... | {'type': 'Measurement', 'height': 345, 'width'... | A brown textured hardback album with gold colo... | ... | NaN | NaN | NaN | NaN | [{'type': 'Event', 'title': '1935', 'startDate... | NaN | NaN | NaN | NaN | NaN |
113 | 256766 | object | Handmade wolf figurine in yellow dress likely ... | {'modified': '2018-12-13', 'issued': '2018-10-... | [Novelty toys, Toys] | {'id': '6773', 'type': 'Collection', 'title': ... | 2013.0038.0556.005 | [{'type': 'Material', 'title': 'Cotton thread'... | {'type': 'Measurement', 'height': 88, 'width':... | A handmade wolf figurine robed in a yellow dre... | ... | NaN | NaN | NaN | NaN | [{'type': 'Event', 'title': '1925 - 1935', 'st... | NaN | NaN | NaN | NaN | NaN |
133 | 223557 | object | Receipt issued to Tirranna Race Club, 1878 | {'modified': '2019-04-23', 'issued': '2017-11-... | [Invoices, Receipts] | {'id': '6139', 'type': 'Collection', 'title': ... | 2012.0019.0170 | [{'type': 'Material', 'title': 'Ink'}, {'type'... | {'type': 'Measurement', 'height': 114, 'width'... | A receipt handwritten on a piece of grey paper... | ... | NaN | NaN | NaN | NaN | [{'type': 'Event', 'title': '1878', 'startDate... | NaN | NaN | NaN | NaN | NaN |
219 | 231018 | object | Cycling jersey worn by Harry Clarke | {'modified': '2019-04-12', 'issued': '2017-03-... | [Gee clamps, Sports clothing] | {'id': '7017', 'type': 'Collection', 'title': ... | 2013.0033.0002 | [{'type': 'Material', 'title': 'Polyester clot... | {'type': 'Measurement', 'height': 610, 'width'... | A short sleeved, striped brown, black and tan ... | ... | NaN | NaN | Brown and yellow cycling jersey worn by Harry ... | [{'id': '131401', 'type': 'StillImage', 'ident... | [{'type': 'Event', 'title': '1988', 'startDate... | NaN | NaN | NaN | NaN | NaN |
301 | 255447 | object | Pair of orange leather dolls shoes with pom pom | {'modified': '2019-04-24', 'issued': '2018-06-... | [Dolls clothing, Shoes] | {'id': '6773', 'type': 'Collection', 'title': ... | 2013.0038.0315 | [{'type': 'Material', 'title': 'Cotton thread'... | {'type': 'Measurement', 'height': 25, 'width':... | A pair of orange leather dolls shoes with one ... | ... | NaN | NaN | NaN | NaN | [{'type': 'Event', 'title': '1925 - 1935', 'st... | NaN | NaN | NaN | NaN | NaN |
5 rows × 25 columns
The additionalType field contains a nested list of values. Using json_normalize() or explode() we can explode these lists, creating a row for each separate value.
# Use json_normalize to expand 'additionalType' into separate rows, adding the id and title from the parent record
# df_types = json_normalize(df.loc[df['additionalType'].notnull()].to_dict('records'), record_path='additionalType', meta=['id', 'title'], errors='ignore').rename({0: 'additionalType'}, axis=1)
# In pandas v0.25 and above you can just use explode -- this produces the same result as above
df_types = df.loc[df['additionalType'].notnull()][['id', 'title', 'additionalType']].explode('additionalType')
df_types.head()
id | title | additionalType | |
---|---|---|---|
1 | 251390 | Pair of woven shoes made from feathers and hair | Shoes |
3 | 21507 | Grinding stone | Grinding stones |
4 | 142308 | 'time CHange' [sic] | Compact discs |
5 | 20174 | Ten Days To Live - A supposed sorcery painting. | Bark paintings |
6 | 144359 | 'The Dance of Life (1898-1902)' by Diana Boyer... | Booklets |
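To see what explode() is doing, it can help to try it on a tiny dataframe. The records below are invented for illustration; the point is that a row with two types becomes two rows that share the same index label, id, and title.

```python
import pandas as pd

# Hypothetical records: the first has two types, the second has one
toy = pd.DataFrame({
    'id': ['1', '2'],
    'title': ['Wooden toy', 'Pair of shoes'],
    'additionalType': [['Toys', 'Vegetables'], ['Shoes']],
})

# One row per list element; the original index label is repeated
exploded = toy.explode('additionalType')
print(exploded)
```

Because the index labels are repeated rather than renumbered, they can still be used to trace each exploded row back to its parent record.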
Now that we've exploded the type values, we can aggregate them in different ways. Let's look at the 25 most common object types!
df_types['additionalType'].value_counts()[:25]
Mineral samples                   6000
Photographs                       4742
Stone artefacts                   4364
Photographic postcards            4250
Drawings                          3755
Postcards                         3697
Zoological specimens              2168
Bark paintings                    2107
Geological specimens              1993
Engravings                        1498
Cartoons                          1384
Negatives                         1124
Boomerangs                        1025
Spears                            1012
Percussion and abrading stones     982
Paintings                          840
Clubs                              747
Mounts                             745
Cards                              709
Armbands                           649
Shells                             563
Letters                            543
Documents                          519
Geophysical survey equipment       509
Posters                            497
Name: additionalType, dtype: int64
How many object types only appear once?
# Name the columns explicitly so this doesn't depend on how value_counts() labels its output (the labels changed in pandas 2.0)
type_counts = df_types['additionalType'].value_counts().rename_axis('type').reset_index(name='count')
unique_types = type_counts.loc[type_counts['count'] == 1]
unique_types.shape[0]
639
unique_types.head()
type | count | |
---|---|---|
1854 | Medications | 1 |
1855 | Hollow bits | 1 |
1856 | Television cameras | 1 |
1857 | Art drawings | 1 |
1858 | Electric indicators | 1 |
Let's save the complete list of types as a CSV file.
type_counts.to_csv('nma_object_type_counts.csv', index=False)
display(FileLink('nma_object_type_counts.csv'))
Browsing the CSV I noticed that there was one item with the type Vegetables. Let's find out some more about it.
# Find in the complete data set
mask = df.loc[df['additionalType'].notnull()]['additionalType'].apply(lambda x: 'Vegetables' in x)
veggie = df.loc[df['additionalType'].notnull()][mask]
veggie
id | type | title | _meta | additionalType | collection | identifier | medium | extent | physicalDescription | ... | isPartOf | seeAlso | description | hasVersion | temporal | relation | hasPart | educationalSignificance | location | acknowledgement | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
21559 | 256742 | object | Wooden toy toad stalk | {'modified': '2019-04-24', 'issued': '2018-10-... | [Toys, Vegetables] | {'id': '6773', 'type': 'Collection', 'title': ... | 2013.0038.0540 | [{'type': 'Material', 'title': 'Paint - non sp... | {'type': 'Measurement', 'height': 65, 'diamete... | A painted wooden toy toad stalk with a red cap... | ... | NaN | NaN | NaN | NaN | [{'type': 'Event', 'title': '1925 - 1935', 'st... | NaN | NaN | NaN | NaN | NaN |
1 rows × 25 columns
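The search above filters out the null values first so that the 'in' test is only ever applied to lists. An alternative is to fold that check into the test itself. The sketch below uses an invented three-row dataframe, and has_type is a hypothetical helper name.

```python
import numpy as np
import pandas as pd

# Hypothetical records, one of them without any type
toy = pd.DataFrame({
    'id': ['1', '2', '3'],
    'additionalType': [['Toys', 'Vegetables'], np.nan, ['Shoes']],
})


def has_type(types, wanted):
    """True if types is a list that includes wanted; False for missing values."""
    return isinstance(types, list) and wanted in types


# Extra keyword arguments to apply() are passed through to has_type
matches = toy.loc[toy['additionalType'].apply(has_type, wanted='Vegetables')]
print(matches['id'].tolist())
```

This keeps the mask the same length as the dataframe, so it can be reused with other filters without re-selecting the non-null rows.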
We can create a link into the NMA Collections Explorer using the object id.
display(HTML('<a href="http://collectionsearch.nma.gov.au/?object={}">{}</a>'.format(veggie.iloc[0]['id'], veggie.iloc[0]['title'])))
Does a toadstool count as a vegetable?
The extent field
The extent field is a nested object, so once again we'll use json_normalize() to expand it out into separate columns.
# Without reset_index() the rows are misaligned
df_extent = df.loc[df['extent'].notnull()].reset_index().join(json_normalize(df.loc[df['extent'].notnull()]['extent'].tolist()).add_prefix("extent_"))
df_extent.head()
index | id | type | title | _meta | additionalType | collection | identifier | medium | extent | ... | acknowledgement | extent_type | extent_length | extent_width | extent_depth | extent_unitText | extent_height | extent_diameter | extent_weight | extent_unitTextWeight | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 251390 | object | Pair of woven shoes made from feathers and hair | {'modified': '2019-01-17', 'issued': '2018-04-... | [Shoes] | {'id': '5244', 'type': 'Collection', 'title': ... | 2000.0014.0495 | [{'type': 'Material', 'title': 'Feather'}, {'t... | {'type': 'Measurement', 'length': 260, 'width'... | ... | NaN | Measurement | 260.0 | 120.0 | 40.0 | mm | NaN | NaN | NaN | NaN |
1 | 2 | 124081 | object | Pair of ceremonial shoes | {'modified': '2018-12-04', 'issued': '2006-10-... | NaN | {'id': '1892', 'type': 'Collection', 'title': ... | 1992.0089.0165 | [{'type': 'Material', 'title': 'Feather'}] | {'type': 'Measurement', 'length': 246, 'width'... | ... | NaN | Measurement | 246.0 | 190.0 | 45.0 | mm | NaN | NaN | NaN | NaN |
2 | 5 | 20174 | object | Ten Days To Live - A supposed sorcery painting. | {'modified': '2019-04-21', 'issued': '2013-06-... | [Bark paintings] | {'id': '2202', 'type': 'Collection', 'title': ... | 1985.0246.0077 | [{'type': 'Material', 'title': 'Bark'}, {'type... | {'type': 'Measurement', 'length': 574, 'width'... | ... | NaN | Measurement | 574.0 | 185.0 | NaN | mm | NaN | NaN | NaN | NaN |
3 | 6 | 144359 | object | 'The Dance of Life (1898-1902)' by Diana Boyer... | {'modified': '2018-06-18', 'issued': '2012-06-... | [Booklets] | {'id': '3893', 'type': 'Collection', 'title': ... | 2008.0043.0022.001 | [{'type': 'Material', 'title': 'Paper'}, {'typ... | {'type': 'Measurement', 'height': 214, 'width'... | ... | NaN | Measurement | NaN | 150.0 | 5.0 | mm | 214.0 | NaN | NaN | NaN |
4 | 8 | 42084 | object | Child's drawing by Lester Moran, Cabbage Tree ... | {'modified': '2019-10-14', 'issued': '2016-10-... | [Drawings] | {'id': '2261', 'type': 'Collection', 'title': ... | 1991.0024.0027 | [{'type': 'Material', 'title': 'Paint - non sp... | {'type': 'Measurement', 'length': 560, 'width'... | ... | NaN | Measurement | 560.0 | 380.0 | 0.5 | mm | NaN | NaN | NaN | NaN |
5 rows × 35 columns
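The comment about misaligned rows is worth unpacking. After filtering, the dataframe keeps its original index labels, while the output of json_normalize() is numbered from zero, and join() matches rows on those labels. The toy example below (invented values) sketches the difference. It uses reset_index(drop=True), which discards the old labels, whereas the cell above keeps them in an 'index' column.

```python
import numpy as np
import pandas as pd

# Hypothetical records: the first has no extent, so filtering leaves index labels 1 and 2
toy = pd.DataFrame({
    'id': ['1', '2', '3'],
    'extent': [np.nan, {'length': 10, 'unitText': 'mm'}, {'height': 5, 'unitText': 'mm'}],
})
with_extent = toy.loc[toy['extent'].notnull()]

# The expanded columns are labelled 0 and 1
expanded = pd.json_normalize(with_extent['extent'].tolist()).add_prefix('extent_')

# Joining on mismatched labels pairs records with the wrong measurements
misaligned = with_extent.join(expanded)
# Renumbering the filtered rows first lines everything up
aligned = with_extent.reset_index(drop=True).join(expanded)

print(misaligned[['id', 'extent_length', 'extent_height']])
print(aligned[['id', 'extent_length', 'extent_height']])
```

In the misaligned version, record '2' picks up the height that belongs to record '3', and record '3' ends up with no measurements at all.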
Let's check to see what types of things are in the extent field.
df_extent['extent_type'].value_counts()
Measurement    64199
Name: extent_type, dtype: int64
So they're all measurements. Let's have a look at the units being used.
df_extent['extent_unitText'].value_counts()
mm    63504
MM       10
cm        9
m         5
Name: extent_unitText, dtype: int64
df_extent['extent_unitTextWeight'].value_counts()
g        1713
kg        212
lb          5
oz          4
tonne       1
Name: extent_unitTextWeight, dtype: int64
Hmmm, are those measurements really in metres, or might they be meant to be 'mm'? Let's have a look at them.
df_extent.loc[df_extent['extent_unitText'] == 'm'][['id', 'title', 'extent_length', 'extent_width', 'extent_unitText']]
id | title | extent_length | extent_width | extent_unitText | |
---|---|---|---|---|---|
8968 | 202783 | The Percival Project, Gull Twelve, in a manill... | NaN | 230.0 | m |
13210 | 257184 | Fishing line inside envelope | 137.0000 | 110.0 | m |
23356 | 171768 | Fair Breeze | NaN | 138.0 | m |
31845 | 123962 | Gunter's chain | 20.1168 | NaN | m |
63827 | 214193 | Extension tube | 55.0000 | NaN | m |
Other than 'Gunter's chain', it looks like the unit should indeed be 'mm'. We'll need to take that into account in calculations.
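A Gunter's chain is 66 feet long, which is exactly 20.1168 m, so that one record really is in metres. One way to handle such exceptions is to keep a small set of ids whose units should be trusted. The sketch below is hypothetical: GENUINE_METRES and length_factor are invented names, and it assumes that ids can be compared as strings.

```python
# Hypothetical set of records whose 'm' really does mean metres,
# keyed by object id (Gunter's chain is id 123962 in the table above)
GENUINE_METRES = {'123962'}

LENGTH_FACTORS = {'mm': 1, 'cm': 10, 'm': 1000}


def length_factor(unit, object_id):
    """Return the multiplier that converts a length in `unit` to millimetres."""
    unit = unit.lower()
    # Assumption: an 'm' on any other record is a typo for 'mm'
    if unit == 'm' and str(object_id) not in GENUINE_METRES:
        return 1
    return LENGTH_FACTORS[unit]


print(20.1168 * length_factor('m', '123962'))  # Gunter's chain, converted from metres
print(55.0 * length_factor('m', '214193'))     # Extension tube, treated as already in mm
```

For a single exception this is probably overkill, but it makes the assumption explicit and easy to revise if more genuine metre values turn up.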
Now let's convert all the measurements into a single unit – millimetre for lengths, and gram for weights.
def conversion_factor(unit):
    '''
    Get the factor required to convert the current unit to either mm or g.
    '''
    factors = {
        'mm': 1,
        'cm': 10,
        'm': 1, # Most should in fact be mm (see above)
        'g': 1,
        'kg': 1000,
        'tonne': 1000000,
        'oz': 28.35,
        'lb': 453.592
    }
    try:
        # Lower-case the unit so that 'MM' is treated the same as 'mm'
        factor = factors[unit.lower()]
    except KeyError:
        # Unknown or missing units get a factor of 0
        factor = 0
    return factor

def normalise_measurements(row):
    '''
    Convert measurements to standard units.
    '''
    l_factor = conversion_factor(str(row['extent_unitText']))
    length = row['extent_length'] * l_factor
    width = row['extent_width'] * l_factor
    depth = row['extent_depth'] * l_factor
    height = row['extent_height'] * l_factor
    diameter = row['extent_diameter'] * l_factor
    w_factor = conversion_factor(str(row['extent_unitTextWeight']))
    weight = row['extent_weight'] * w_factor
    return pd.Series([length, width, depth, height, diameter, weight])
# Add normalised measurements to the dataframe
df_extent[['length_mm', 'width_mm', 'depth_mm', 'height_mm', 'diameter_mm', 'weight_g']] = df_extent.apply(normalise_measurements, axis=1)
df_extent.head()
index | id | type | title | _meta | additionalType | collection | identifier | medium | extent | ... | extent_height | extent_diameter | extent_weight | extent_unitTextWeight | length_mm | width_mm | depth_mm | height_mm | diameter_mm | weight_g | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 251390 | object | Pair of woven shoes made from feathers and hair | {'modified': '2019-01-17', 'issued': '2018-04-... | [Shoes] | {'id': '5244', 'type': 'Collection', 'title': ... | 2000.0014.0495 | [{'type': 'Material', 'title': 'Feather'}, {'t... | {'type': 'Measurement', 'length': 260, 'width'... | ... | NaN | NaN | NaN | NaN | 260.0 | 120.0 | 40.0 | NaN | NaN | NaN |
1 | 2 | 124081 | object | Pair of ceremonial shoes | {'modified': '2018-12-04', 'issued': '2006-10-... | NaN | {'id': '1892', 'type': 'Collection', 'title': ... | 1992.0089.0165 | [{'type': 'Material', 'title': 'Feather'}] | {'type': 'Measurement', 'length': 246, 'width'... | ... | NaN | NaN | NaN | NaN | 246.0 | 190.0 | 45.0 | NaN | NaN | NaN |
2 | 5 | 20174 | object | Ten Days To Live - A supposed sorcery painting. | {'modified': '2019-04-21', 'issued': '2013-06-... | [Bark paintings] | {'id': '2202', 'type': 'Collection', 'title': ... | 1985.0246.0077 | [{'type': 'Material', 'title': 'Bark'}, {'type... | {'type': 'Measurement', 'length': 574, 'width'... | ... | NaN | NaN | NaN | NaN | 574.0 | 185.0 | NaN | NaN | NaN | NaN |
3 | 6 | 144359 | object | 'The Dance of Life (1898-1902)' by Diana Boyer... | {'modified': '2018-06-18', 'issued': '2012-06-... | [Booklets] | {'id': '3893', 'type': 'Collection', 'title': ... | 2008.0043.0022.001 | [{'type': 'Material', 'title': 'Paper'}, {'typ... | {'type': 'Measurement', 'height': 214, 'width'... | ... | 214.0 | NaN | NaN | NaN | NaN | 150.0 | 5.0 | 214.0 | NaN | NaN |
4 | 8 | 42084 | object | Child's drawing by Lester Moran, Cabbage Tree ... | {'modified': '2019-10-14', 'issued': '2016-10-... | [Drawings] | {'id': '2261', 'type': 'Collection', 'title': ... | 1991.0024.0027 | [{'type': 'Material', 'title': 'Paint - non sp... | {'type': 'Measurement', 'length': 560, 'width'... | ... | NaN | NaN | NaN | NaN | 560.0 | 380.0 | 0.5 | NaN | NaN | NaN |
5 rows × 41 columns
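Applying a function row by row is easy to read, but can be slow on tens of thousands of records. A column-wise alternative is to look up the factors with map() and multiply whole columns at once. The sketch below uses a made-up three-row dataframe with the same column names. Note one difference in behaviour: here unknown or missing units give NaN rather than 0.

```python
import numpy as np
import pandas as pd

# Hypothetical measurements using the same column names as df_extent
toy = pd.DataFrame({
    'extent_length': [260.0, 12.0, np.nan],
    'extent_unitText': ['mm', 'cm', np.nan],
    'extent_weight': [np.nan, 2.0, 500.0],
    'extent_unitTextWeight': [np.nan, 'kg', 'g'],
})

length_factors = {'mm': 1, 'cm': 10, 'm': 1}  # 'm' treated as mm, as in the notebook
weight_factors = {'g': 1, 'kg': 1000, 'tonne': 1000000, 'oz': 28.35, 'lb': 453.592}

# Lower-case the unit labels, then look each one up; unknown or missing units become NaN
l_factor = toy['extent_unitText'].str.lower().map(length_factors)
w_factor = toy['extent_unitTextWeight'].str.lower().map(weight_factors)

# Multiply whole columns at once instead of row by row
toy['length_mm'] = toy['extent_length'] * l_factor
toy['weight_g'] = toy['extent_weight'] * w_factor
print(toy[['length_mm', 'weight_g']])
```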
def calculate_volume(row):
    '''
    Look for 3 linear dimensions and multiply them to get a volume.
    '''
    # Create a list of valid linear measurements from the available fields
    dimensions = [d for d in [row['length_mm'], row['width_mm'], row['depth_mm'], row['height_mm'], row['diameter_mm']] if not math.isnan(d)]
    # If there are only 2 dimensions...
    if len(dimensions) == 2:
        # Set a default height of 1 for items with only 2 dimensions
        dimensions.append(1)
    # If there are 3 or more dimensions, multiply the first 3 together
    if len(dimensions) >= 3:
        volume = dimensions[0] * dimensions[1] * dimensions[2]
    else:
        volume = 0
    return volume
df_extent['volume'] = df_extent.apply(calculate_volume, axis=1)
print('Total length of objects is {:.2f} km'.format(df_extent['length_mm'].sum() / 1000 / 1000))
Total length of objects is 15.38 km
print('Total weight of objects is {:.2f} tonnes'.format(df_extent['weight_g'].sum() / 1000000))
Total weight of objects is 197.16 tonnes
print('Total volume of objects is {:.2f} m\N{SUPERSCRIPT THREE}'.format(df_extent['volume'].sum() / 1000000000))
Total volume of objects is 2911.19 m³
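These totals are rough estimates: each volume is a bounding box built from the first three dimensions available, and flat items are given a nominal depth of 1 mm. To make the arithmetic concrete, here is a standalone sketch of the same logic applied to two of the records shown earlier. The volume_mm3 function is a simplified stand-in that takes a list of numbers rather than a dataframe row.

```python
import math


def volume_mm3(dimensions):
    """Multiply the first three valid dimensions; pad 2D items with a depth of 1 mm."""
    valid = [d for d in dimensions if not math.isnan(d)]
    if len(valid) == 2:
        valid.append(1)
    if len(valid) < 3:
        return 0
    return valid[0] * valid[1] * valid[2]


nan = float('nan')
# Order of values: length, width, depth, height, diameter
shoes = volume_mm3([260.0, 120.0, 40.0, nan, nan])  # woven shoes, three dimensions
bark = volume_mm3([574.0, 185.0, nan, nan, nan])    # bark painting, only two recorded

# There are 1000 x 1000 x 1000 cubic millimetres in a cubic metre
print('{:,.0f} mm³ = {:.6f} m³'.format(shoes, shoes / 1000000000))
print('{:,.0f} mm³ = {:.6f} m³'.format(bark, bark / 1000000000))
```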
What's the biggest thing?
# Get the object with the largest volume
biggest = df_extent.loc[df_extent['volume'].idxmax()]
# Create a link to Collection Explorer
display(HTML('<a href="http://collectionsearch.nma.gov.au/?object={}">{}</a>'.format(biggest['id'], biggest['title'])))
Created by Tim Sherratt for the GLAM Workbench.
Work on this notebook was supported by the Humanities, Arts and Social Sciences (HASS) Data Enhanced Virtual Lab.