The examples in this document cover only a small part of the content of most NetCDF files. A NetCDF file usually contains many more variables, and therefore many other ways to display data.
A list of useful IMOS-based Python library functions is available from the IMOS GitHub website.
The examples rely on the Python packages numpy, matplotlib and netCDF4, which need to be installed. The plotting example at the end of this document also imports statsmodels.
The examples used in this course have been tested in Python 2.7.3, with numpy 1.6.1, matplotlib 1.1.1, and version 1.0.2 of netCDF4. Later versions of these packages will usually work. Earlier versions may also work.
In order to find a dataset you are interested in, please refer to the portal help: http://help.aodn.org.au/help/?q=node/6. This is a how-to guide that can help users find an IMOS NetCDF file. When downloading your chosen dataset from the portal, choose one of the download options “List of URLs”, or “All source NETCDF files” to obtain netCDF files.
For users who are already familiar with IMOS facilities and datasets, IMOS NetCDF files are also directly accessible via an OPeNDAP catalog at: http://thredds.aodn.org.au/thredds/catalog/IMOS/catalog.html
Most of the examples in the following sections use the ‘Data URL’ of a dataset. If you have downloaded your dataset from the portal, the data URL is the path to the file on your local machine. If you are using the THREDDS catalog, the file does not have to be downloaded first: the OPeNDAP data URL can be passed directly to Python. The OPeNDAP data URL is found on the ‘OPeNDAP Dataset Access Form’ page (see http://help.aodn.org.au/help/?q=node/11), inside the box labelled ‘Data URL’ just above the ‘Global Attributes’ field.
Note: the list of URLs generated by the IMOS portal when choosing that download option can be converted to a list of OPeNDAP data URLs by replacing the string http://data.aodn.org.au/IMOS/opendap with http://thredds.aodn.org.au/thredds/dodsC/IMOS in each URL.
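The prefix replacement described in the note can be sketched in a few lines of Python. The function names and the example path below are hypothetical and only serve as an illustration; the two prefixes are the ones given in the note.

```python
# Prefixes taken from the note above
PORTAL_PREFIX = "http://data.aodn.org.au/IMOS/opendap"
OPENDAP_PREFIX = "http://thredds.aodn.org.au/thredds/dodsC/IMOS"


def to_opendap_url(portal_url):
    """Swap the portal prefix for the OPeNDAP one; other URLs are returned unchanged."""
    if portal_url.startswith(PORTAL_PREFIX):
        return OPENDAP_PREFIX + portal_url[len(PORTAL_PREFIX):]
    return portal_url


def convert_url_list(lines):
    """Convert every non-empty line of a 'List of URLs' download."""
    return [to_opendap_url(line.strip()) for line in lines if line.strip()]


# Made-up path, used only to show the conversion
example_lines = ["http://data.aodn.org.au/IMOS/opendap/some/path/file.nc\n", "\n"]
opendap_urls = convert_url_list(example_lines)
print(opendap_urls[0])
```

Each converted URL can then be used wherever a data URL is expected in the sections that follow.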
The first step consists of opening a NetCDF file, whether this file is available locally or remotely on an OPeNDAP server.
Type in your Python command window (or script):
from netCDF4 import Dataset
aatams_URL = 'http://thredds.aodn.org.au/thredds/dodsC/IMOS/eMII/demos/AATAMS/marine_mammal_ctd-tag/2009_2011_ct64_Casey_Macquarie/ct64-M746-09/IMOS_AATAMS-SATTAG_TSP_20100205T043000Z_ct64-M746-09_END-20101029T071000Z_FV00.nc'
aatams_DATA = Dataset(aatams_URL)
This creates a netCDF Dataset object, through which you can access all the contents of the file.
Let's do it below. As these are Python commands, we will use a code cell this time and not a markdown (text) one.
from netCDF4 import Dataset
aatams_URL = 'http://thredds.aodn.org.au/thredds/dodsC/IMOS/eMII/demos/AATAMS/marine_mammal_ctd-tag/2009_2011_ct64_Casey_Macquarie/ct64-M746-09/IMOS_AATAMS-SATTAG_TSP_20100205T043000Z_ct64-M746-09_END-20101029T071000Z_FV00.nc'
aatams_DATA = Dataset(aatams_URL)
Please refer to the netCDF4 module documentation for a complete description of the Dataset object: http://netcdf4-python.googlecode.com/svn/trunk/docs/netCDF4.Dataset-class.html (or type help(Dataset) at the Python prompt).
help(aatams_DATA)
In order to see all the global attributes and some other information about the file, type in your command window:
print(aatams_DATA)
The output will look something like this:
root group (NETCDF3_64BIT file format):
project: Integrated Marine Observing System (IMOS)
conventions: IMOS-1.2
date_created: 2012-09-13T07:27:03Z
title: Temperature, Salinity and Depth profiles in near real time
institution: AATAMS
site: CTD Satellite Relay Data Logger
abstract: CTD Satellite Relay Data Loggers are used to explore how marine mammal behaviour relates to their oceanic environment. Loggers developped at the University of St Andrews Sea Mammal Research Unit transmit data in near real time via the Argo satellite system
source: SMRU CTD Satellite relay Data Logger on marine mammals
…
dimensions: obs, profiles
variables: TIME, LATITUDE, LONGITUDE, TEMP, PRES, PSAL, parentIndex, TIME_quality_control, LATITUDE_quality_control, LONGITUDE_quality_control, TEMP_quality_control, PRES_quality_control, PSAL_quality_control
groups:
print(aatams_DATA)
Global attributes in the netCDF file become attributes of the Dataset object. A list of global attribute names is returned by the ncattrs() method of the object. The __dict__ attribute of the object is a dictionary of all netCDF attribute names and values.
# store the dataset's title in a local variable
title_str = aatams_DATA.title
# list all global attribute names
aatams_DATA.ncattrs()
# store the complete set of attributes in a dictionary (OrderedDict) object (similar to a standard Python dict, but
# maintains the order in which items are entered)
globalAttr = aatams_DATA.__dict__
# now you can also do (same effect as first command above)
title_str = globalAttr['title']
print(title_str)
globalAttr
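As an alternative to __dict__, the same dictionary can be built from ncattrs() together with the getncattr() method, which the netCDF4 documentation recommends when an attribute name clashes with a reserved Python attribute. The sketch below is illustrative only: FakeDataset is a hypothetical stand-in that mimics those two methods so the example runs without a network connection, and global_attributes is a helper name introduced here.

```python
class FakeDataset(object):
    """Hypothetical stand-in exposing ncattrs() and getncattr() like a Dataset."""

    def __init__(self, attrs):
        self._attrs = dict(attrs)

    def ncattrs(self):
        return list(self._attrs)

    def getncattr(self, name):
        return self._attrs[name]


def global_attributes(dataset):
    """Return {name: value} for every global attribute of the dataset."""
    return {name: dataset.getncattr(name) for name in dataset.ncattrs()}


# Two attribute values copied from the example output shown earlier
demo = FakeDataset({
    "title": "Temperature, Salinity and Depth profiles in near real time",
    "institution": "AATAMS",
})
attrs = global_attributes(demo)
print(attrs["title"])
```

With the real file, the call would be global_attributes(aatams_DATA).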
To list all the variables available in the NetCDF file, type:
aatams_DATA.variables.keys()
Output:
[u'TIME',
u'LATITUDE',
u'LONGITUDE',
u'TEMP',
u'PRES',
u'PSAL',
u'parentIndex',
u'TIME_quality_control',
u'LATITUDE_quality_control',
u'LONGITUDE_quality_control',
u'TEMP_quality_control',
u'PRES_quality_control',
u'PSAL_quality_control']
aatams_DATA.variables.keys()
(In Python 2, the 'u' prefix means each variable name is represented by a Unicode string. Python 3 does not print this prefix.)
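Beyond the names, it is often handy to see each variable's dimensions and standard_name at a glance. The following is a minimal sketch, assuming only that the object has a variables dictionary whose values expose dimensions and, optionally, standard_name, as netCDF4 Variable objects do. The helper name describe_variables and the stand-in values are hypothetical and are not read from the real file.

```python
from types import SimpleNamespace


def describe_variables(dataset):
    """Return one (name, dimensions, standard_name) tuple per variable."""
    rows = []
    for name, var in dataset.variables.items():
        # Not every variable carries a standard_name, so fall back to None
        rows.append((name, var.dimensions, getattr(var, "standard_name", None)))
    return rows


# Hypothetical stand-in for an open Dataset (made-up values), so the sketch runs offline
fake = SimpleNamespace(variables={
    "TEMP": SimpleNamespace(dimensions=("obs",), standard_name="sea_water_temperature"),
    "parentIndex": SimpleNamespace(dimensions=("obs",)),
})
for row in describe_variables(fake):
    print(row)
```

With the real file, the call would be describe_variables(aatams_DATA).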
Each variable is accessed via a Variable object, in a similar way to the Dataset object. To access the temperature variable:
# netCDF4 Variable object
TEMP = aatams_DATA.variables['TEMP']
# now you can print the variable's attributes and other info
print(TEMP)
# access variable attributes, e.g. its standard_name
TEMP.standard_name
# extract the data values (as a numpy array)
TEMP[:]
# the variable's dimensions (as a tuple)
TEMP.dimensions
print(TEMP[:])
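# read the values once, then use the array's own min/max (avoids element-by-element access)
print(TEMP[:].min(), TEMP[:].max())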
Following the same approach as for the temperature variable, extract the salinity variable.
# netCDF4 Variable object
PSAL = aatams_DATA.variables['PSAL']
# now you can print the variable's attributes and other info
print(PSAL)
# access variable attributes, e.g. its standard_name
PSAL.standard_name
# extract the data values (as a numpy array)
PSAL[:]
print(PSAL[:])
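# read the values once, then use the array's own min/max (avoids element-by-element access)
print(PSAL[:].min(), PSAL[:].max())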
We can now work with these variables and plot them, for example to explore their relationships and their evolution in time and/or space.
To do so we need to import some useful Python libraries...
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.nonparametric.smoothers_lowess import lowess
%matplotlib inline
%config InlineBackend.figure_format = 'svg'
plt.rcParams['mathtext.fontset'] = 'cm'
Now let us define the temperature and salinity variables more simply:
PSAL[:] to sal for salinity
TEMP[:] to temp for temperature
sal = np.asarray(PSAL[:])
temp = np.asarray(TEMP[:])
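One caveat with np.asarray: when a variable contains missing values, netCDF4 may return a numpy masked array, and np.asarray drops the mask while keeping the underlying fill values. The sketch below uses a small synthetic array in place of TEMP[:]; the fill value 99999.0 is an assumption made for the illustration, not the value used in the IMOS file.

```python
import numpy as np

# Synthetic stand-in for TEMP[:]; 99999.0 plays the role of a fill value (assumed)
raw = np.ma.masked_values([2.5, 99999.0, 3.1, 1.8], 99999.0)

# np.asarray drops the mask, so the fill value reappears as ordinary data
plain = np.asarray(raw)

# Replacing masked entries with NaN keeps them out of NaN-aware statistics
filled = np.ma.filled(raw.astype(float), np.nan)

print(plain.max())           # the fill value dominates
print(np.nanmax(filled))     # largest valid value
print(raw.min(), raw.max())  # masked arrays skip masked entries on their own
```

If the file you are working with has missing values, applying the same np.ma.filled step to PSAL[:] and TEMP[:] is one way to keep fill values out of the scatter plot.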
We then use the smoothing function lowess (Locally Weighted Scatterplot Smoothing) to get a general trend of how salinity varies with temperature from the existing scattered dataset.
ys = lowess(sal, temp, it=5, frac=0.2)
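In the call above, frac is the fraction of the data used for each local fit and it is the number of robustness reweighting iterations. The result is an array whose first column holds the sorted x values (here temperature) and whose second column holds the smoothed y values (salinity). To give an idea of what a local fit does, here is a deliberately simplified pure-Python sketch: a single pass of tricube-weighted linear fits, without the robustness iterations. It is not the statsmodels implementation, and simple_lowess and tricube are names introduced for this illustration.

```python
def tricube(u):
    """Tricube kernel: weight 1 at distance 0, falling to 0 at distance 1."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def simple_lowess(x, y, frac=0.5):
    """Single-pass locally weighted linear fit; returns (x, fitted) pairs sorted by x."""
    n = len(x)
    k = min(n, max(3, int(round(frac * n))))  # neighbours used in each local fit
    pairs = []
    for x0 in x:
        # Bandwidth: distance from x0 to its k-th nearest point (guard against 0)
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1e-12
        w = [tricube((xi - x0) / h) for xi in x]
        # Weighted least-squares line through the neighbourhood of x0
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        pairs.append((x0, my + slope * (x0 - mx)))
    return sorted(pairs)


# Exactly linear synthetic data: the smoother should reproduce the line
x_demo = list(range(10))
y_demo = [2.0 * xi + 1.0 for xi in x_demo]
smooth = simple_lowess(x_demo, y_demo, frac=0.5)
print(smooth[0], smooth[-1])
```

A larger frac widens each neighbourhood and gives a smoother curve; a smaller one follows the scatter more closely.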
Now we plot the result with matplotlib:
plt.figure(figsize=(10,7))
plt.scatter(temp, sal, s=2, marker='o', facecolor='r', lw = 1,label='T-S dataset')
plt.xlabel('Temperature (degrees Celsius)')
plt.ylabel('Salinity (parts per thousand, 1.e-3)')
plt.title(title_str)
plt.plot(ys[:,0],ys[:,1],'k',linewidth=2,label='smoothing')
plt.legend(loc=0, fontsize=10)
plt.show()
plt.close()