The following example introduces you to the Aerosol Index (AI) product from the Sentinel-5P TROPOMI instrument. The Aerosol Index (AI) is a qualitative index indicating the presence of elevated layers of aerosols with significant absorption. The main aerosol types that cause signals detected in the AI are desert dust
, biomass burning
and volcanic ash plumes
. An advantage of the AI is that it can be derived for clear as well as (partly) cloudy ground pixels.
The Copernicus Sentinel-5 Precursor mission is the first Copernicus mission dedicated to atmospheric monitoring. The main objective of the Sentinel-5P mission is to perform atmospheric measurements with high spatio-temporal resolution, to be used for air quality, ozone & UV radiation, and climate monitoring and forecasting.
Sentinel-5P carries the TROPOMI
instrument, which is a spectrometer in the UV-VIS-NIR-SWIR spectral range. TROPOMI
provides measurements on:
Ozone
NO
2
SO
2
Formaldehyde
Aerosol
Carbonmonoxide
Methane
Clouds
Read more information about Sentinel-5P here.
Spatial resolution:
Up to 5.5 km x 3.5 km
(5.5 km in the satellite flight direction and 3.5 km in the perpendicular direction at nadir)
Spatial coverage:Global
Revisit time:less than one day
Data availability:since April 2018
Sentinel-5P Pre-Ops data are disseminated in the netCDF
format and can be downloaded via the Sentinel-5P Pre-Operations Data Hub. You can login with the following credentials:
s5pguest
s5pguest
import xarray as xr
import matplotlib.pyplot as plt
import matplotlib.colors
from matplotlib.cm import get_cmap
from matplotlib.axes import Axes
import cartopy.crs as ccrs
from cartopy.mpl.gridliner import LONGITUDE_FORMATTER, LATITUDE_FORMATTER
import cartopy.feature as cfeature
from cartopy.mpl.geoaxes import GeoAxes
GeoAxes._pcolormesh_patched = Axes.pcolormesh
import warnings
warnings.simplefilter(action = "ignore", category = RuntimeWarning)
warnings.simplefilter(action = "ignore", category = UserWarning)
%run ../functions.ipynb
A Sentinel-5P TROPOMI Aerosol Index Level 2 file is organised in two groups: PRODUCT
and METADATA
. The PRODUCT
group stores the main data fields of the product, including latitude
, longitude
and the variable itself. The METADATA
group provides additional metadata items.
Sentinel-5P TROPOMI variables have the following dimensions:
scanline
: the number of measurements in the granule / along-track dimension indexground_pixel
: the number of spectra in a measurement / across-track dimension indextime
: time reference for the datacorner
: pixel corner indexSentinel-5P TROPOMI data is disseminated in netCDF
. You can load a netCDF
file with the open_dataset()
function of the xarray library. In order to load the variable as part of a Sentinel-5P data files, you have to specify the following keyword arguments:
group='PRODUCT'
: to load the PRODUCT
groupLet us load a Sentinel-5P TROPOMI data file as xarray.Dataset
from 22 February 2021 and inspect the data structure:
file = xr.open_dataset('../../../eodata/dust/part1/1_satellite/sentinel5p/S5P_OFFL_L2__AER_AI_20210206T120713_20210206T134843_17197_01_010400_20210208T015719.nc', group='PRODUCT')
file
<xarray.Dataset> Dimensions: (scanline: 4172, ground_pixel: 450, time: 1, corner: 4) Coordinates: * scanline (scanline) float64 0.0 1.0 ... 4.171e+03 * ground_pixel (ground_pixel) float64 0.0 1.0 ... 449.0 * time (time) datetime64[ns] 2021-02-06 * corner (corner) float64 0.0 1.0 2.0 3.0 latitude (time, scanline, ground_pixel) float32 ... longitude (time, scanline, ground_pixel) float32 ... Data variables: delta_time (time, scanline) datetime64[ns] 2021-02-... time_utc (time, scanline) object '2021-02-06T12:2... qa_value (time, scanline, ground_pixel) float32 ... aerosol_index_354_388 (time, scanline, ground_pixel) float32 ... aerosol_index_340_380 (time, scanline, ground_pixel) float32 ... aerosol_index_354_388_precision (time, scanline, ground_pixel) float32 ... aerosol_index_340_380_precision (time, scanline, ground_pixel) float32 ...
array([0.000e+00, 1.000e+00, 2.000e+00, ..., 4.169e+03, 4.170e+03, 4.171e+03])
array([ 0., 1., 2., ..., 447., 448., 449.])
array(['2021-02-06T00:00:00.000000000'], dtype='datetime64[ns]')
array([0., 1., 2., 3.])
[1877400 values with dtype=float32]
[1877400 values with dtype=float32]
array([['2021-02-06T12:28:48.299000000', '2021-02-06T12:28:49.139000000', '2021-02-06T12:28:49.979000000', ..., '2021-02-06T13:27:10.180000000', '2021-02-06T13:27:11.020000000', '2021-02-06T13:27:11.860000000']], dtype='datetime64[ns]')
array([['2021-02-06T12:28:48.299000Z', '2021-02-06T12:28:49.139000Z', '2021-02-06T12:28:49.979000Z', ..., '2021-02-06T13:27:10.180000Z', '2021-02-06T13:27:11.020000Z', '2021-02-06T13:27:11.860000Z']], dtype=object)
[1877400 values with dtype=float32]
[1877400 values with dtype=float32]
[1877400 values with dtype=float32]
[1877400 values with dtype=float32]
[1877400 values with dtype=float32]
You see that the loaded data object contains of four dimensions and seven data variables:
Dimensions:
scanline
ground_pixel
time
corner
Data variables:
delta_time
: the offset of individual measurements within the granule, given in millisecondstime_utc
: valid time stamp of the dataqa_value
: quality descriptor, varying between 0 (nodata) and 1 (full quality data).aerosol_index_354_388
: Aerosol index from 354 and 388 nmaerosol_index_340_380
: Aerosol index from 340 and 380 nmaerosol_index_354_388_precision
: Precision of aerosol index from 354 and 388 nmaerosol_index_340_380_precision
: Precision of aerosol index from 340 and 380 nmYou can specify one variable of interest by putting the name of the variable into square brackets []
and get more detailed information about the variable. E.g. aerosol_index_354_388
is the 'Aerosol index from 354 and 388 nm' and has three dimensions, time
, scanline
and ground_pixel
respectively.
ai = file['aerosol_index_354_388']
ai
<xarray.DataArray 'aerosol_index_354_388' (time: 1, scanline: 4172, ground_pixel: 450)> [1877400 values with dtype=float32] Coordinates: * scanline (scanline) float64 0.0 1.0 2.0 ... 4.17e+03 4.171e+03 * ground_pixel (ground_pixel) float64 0.0 1.0 2.0 3.0 ... 447.0 448.0 449.0 * time (time) datetime64[ns] 2021-02-06 latitude (time, scanline, ground_pixel) float32 ... longitude (time, scanline, ground_pixel) float32 ... Attributes: units: 1 proposed_standard_name: ultraviolet_aerosol_index comment: Aerosol index from 388 and 354 nm long_name: Aerosol index from 388 and 354 nm radiation_wavelength: [354. 388.] ancillary_variables: aerosol_index_354_388_precision
[1877400 values with dtype=float32]
array([0.000e+00, 1.000e+00, 2.000e+00, ..., 4.169e+03, 4.170e+03, 4.171e+03])
array([ 0., 1., 2., ..., 447., 448., 449.])
array(['2021-02-06T00:00:00.000000000'], dtype='datetime64[ns]')
[1877400 values with dtype=float32]
[1877400 values with dtype=float32]
You can retrieve the array values of the variable with squared brackets: [:,:,:]
. One single time step can be selected by specifying one value of the time dimension, e.g. [0,:,:]
.
ai_0602 = ai[0,:,:]
ai_0602
<xarray.DataArray 'aerosol_index_354_388' (scanline: 4172, ground_pixel: 450)> [1877400 values with dtype=float32] Coordinates: * scanline (scanline) float64 0.0 1.0 2.0 ... 4.17e+03 4.171e+03 * ground_pixel (ground_pixel) float64 0.0 1.0 2.0 3.0 ... 447.0 448.0 449.0 time datetime64[ns] 2021-02-06 latitude (scanline, ground_pixel) float32 ... longitude (scanline, ground_pixel) float32 ... Attributes: units: 1 proposed_standard_name: ultraviolet_aerosol_index comment: Aerosol index from 388 and 354 nm long_name: Aerosol index from 388 and 354 nm radiation_wavelength: [354. 388.] ancillary_variables: aerosol_index_354_388_precision
[1877400 values with dtype=float32]
array([0.000e+00, 1.000e+00, 2.000e+00, ..., 4.169e+03, 4.170e+03, 4.171e+03])
array([ 0., 1., 2., ..., 447., 448., 449.])
array('2021-02-06T00:00:00.000000000', dtype='datetime64[ns]')
[1877400 values with dtype=float32]
[1877400 values with dtype=float32]
Additionally, you can save the attributes units
and longname
, which you can make use of when visualizing the data.
longname = ai_0602.long_name
units = ai_0602.units
longname, units
('Aerosol index from 388 and 354 nm', '1')
The next step is to visualize the dataset. You can use the function visualize_pcolormesh, which makes use of matploblib's function pcolormesh
and the Cartopy library.
With ?visualize_pcolormesh
you can open the function's docstring to see what keyword arguments are needed to prepare your plot.
?visualize_pcolormesh
Signature: visualize_pcolormesh( data_array, longitude, latitude, projection, color_scale, unit, long_name, vmin, vmax, set_global=True, lonmin=-180, lonmax=180, latmin=-90, latmax=90, ) Docstring: Visualizes a xarray.DataArray with matplotlib's pcolormesh function. Parameters: data_array(xarray.DataArray): xarray.DataArray holding the data values longitude(xarray.DataArray): xarray.DataArray holding the longitude values latitude(xarray.DataArray): xarray.DataArray holding the latitude values projection(str): a projection provided by the cartopy library, e.g. ccrs.PlateCarree() color_scale(str): string taken from matplotlib's color ramp reference unit(str): the unit of the parameter, taken from the NetCDF file if possible long_name(str): long name of the parameter, taken from the NetCDF file if possible vmin(int): minimum number on visualisation legend vmax(int): maximum number on visualisation legend set_global(boolean): optional kwarg, default is True lonmin,lonmax,latmin,latmax(float): optional kwarg, set geographic extent is set_global kwarg is set to False File: /tmp/ipykernel_943/2163691773.py Type: function
You can make use of the variables we have defined above:
units
long_name
latitude
longitude
Additionally, you can specify the color scale and minimum and maxium data values.
visualize_pcolormesh(data_array=ai_0602,
longitude=ai_0602.longitude,
latitude=ai_0602.latitude,
projection=ccrs.PlateCarree(),
color_scale='hot_r',
unit=units,
long_name=longname + ' ' + str(ai_0602.time.data)[0:10],
vmin=0,
vmax=1.5)
(<Figure size 1440x720 with 2 Axes>, <GeoAxesSubplot:title={'center':'Aerosol index from 388 and 354 nm 2021-02-06'}>)
The map above shows the aerosol index of one footprint along the entire latitude range. Let us create a geographical subset for Europe, in order to better analyse the Saharan dust event which occured in February 2021 over Europe.
For geographical subsetting, you can use the function generate_geographical_subset. You can use ?generate_geographical_subset
to open the docstring in order to see the function's keyword arguments.
?generate_geographical_subset
Signature: generate_geographical_subset( xarray, latmin, latmax, lonmin, lonmax, reassign=False, ) Docstring: Generates a geographical subset of a xarray.DataArray and if kwarg reassign=True, shifts the longitude grid from a 0-360 to a -180 to 180 deg grid. Parameters: xarray(xarray.DataArray): a xarray DataArray with latitude and longitude coordinates latmin, latmax, lonmin, lonmax(int): lat/lon boundaries of the geographical subset reassign(boolean): default is False Returns: Geographical subset of a xarray.DataArray. File: /tmp/ipykernel_943/3979307327.py Type: function
Define the bounding box information for Europe
latmin = 28.
latmax = 71.
lonmin = -22.
lonmax = 43
Now, let us apply the function generate_geographical_subset to subset the ai_0602
xarray.DataArray. Let us call the new xarray.DataArray
ai_0602_subset
.
ai_0602_subset = generate_geographical_subset(xarray=ai_0602,
latmin=latmin,
latmax=latmax,
lonmin=lonmin,
lonmax=lonmax)
ai_0602_subset
<xarray.DataArray 'aerosol_index_354_388' (scanline: 911, ground_pixel: 450)> array([[ nan, nan, nan, ..., -0.18838878, -0.550947 , -0.5468501 ], [ nan, nan, nan, ..., -0.20505159, -0.361677 , -0.2746366 ], [ nan, nan, nan, ..., -0.30932063, -0.32334223, -0.20750041], ..., [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan]], dtype=float32) Coordinates: * scanline (scanline) float64 2.903e+03 2.904e+03 ... 3.812e+03 3.813e+03 * ground_pixel (ground_pixel) float64 0.0 1.0 2.0 3.0 ... 447.0 448.0 449.0 time datetime64[ns] 2021-02-06 latitude (scanline, ground_pixel) float32 22.49 22.53 ... 71.66 71.63 longitude (scanline, ground_pixel) float32 -11.97 -11.89 ... 13.92 14.17 Attributes: units: 1 proposed_standard_name: ultraviolet_aerosol_index comment: Aerosol index from 388 and 354 nm long_name: Aerosol index from 388 and 354 nm radiation_wavelength: [354. 388.] ancillary_variables: aerosol_index_354_388_precision
array([[ nan, nan, nan, ..., -0.18838878, -0.550947 , -0.5468501 ], [ nan, nan, nan, ..., -0.20505159, -0.361677 , -0.2746366 ], [ nan, nan, nan, ..., -0.30932063, -0.32334223, -0.20750041], ..., [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan]], dtype=float32)
array([2903., 2904., 2905., ..., 3811., 3812., 3813.])
array([ 0., 1., 2., ..., 447., 448., 449.])
array('2021-02-06T00:00:00.000000000', dtype='datetime64[ns]')
array([[22.488323, 22.526548, 22.564207, ..., 28.004581, 28.002975, 28.001284], [22.535732, 22.573975, 22.611652, ..., 28.052963, 28.051344, 28.049639], [22.583122, 22.621386, 22.65908 , ..., 28.101353, 28.099718, 28.097998], ..., [61.79134 , 61.862812, 61.933228, ..., 71.58282 , 71.56084 , 71.53822 ], [61.8246 , 61.896133, 61.96661 , ..., 71.630035, 71.60801 , 71.58535 ], [61.857754, 61.92935 , 61.99989 , ..., 71.67725 , 71.655174, 71.63247 ]], dtype=float32)
array([[-11.969172, -11.887793, -11.807481, ..., 13.89761 , 13.988303, 14.080202], [-11.986401, -11.905004, -11.824673, ..., 13.890745, 13.98148 , 14.073421], [-12.003663, -11.922247, -11.841897, ..., 13.883866, 13.974643, 14.066626], ..., [-45.824085, -45.72454 , -45.62591 , ..., 13.632608, 13.879951, 14.130034], [-45.90402 , -45.80456 , -45.706017, ..., 13.651459, 13.899373, 14.150033], [-45.98421 , -45.884846, -45.786385, ..., 13.670239, 13.918727, 14.169963]], dtype=float32)
Let us visualize the subsetted xarray.DataArray
again. This time, you set the set_global
kwarg to False
and you specify the longitude and latitude bounds specified above.
Additionally, in order to have the time information as part of the title, we add the string of the datetime information to the longname
variable: longname + ' ' + str(ai_0602.time.data)[0:10]
.
visualize_pcolormesh(data_array=ai_0602_subset,
longitude=ai_0602_subset.longitude,
latitude=ai_0602_subset.latitude,
projection=ccrs.PlateCarree(),
color_scale='hot_r',
unit=units,
long_name=longname + ' ' + str(ai_0602_subset.time.data)[0:10],
vmin=0,
vmax=3,
lonmin=lonmin,
lonmax=lonmax,
latmin=latmin,
latmax=latmax,
set_global=False)
(<Figure size 1440x720 with 2 Axes>, <GeoAxesSubplot:title={'center':'Aerosol index from 388 and 354 nm 2021-02-06'}>)
This project is licensed under GNU General Public License v3.0 only and is developed under a Copernicus contract.