The Copernicus Sentinel-5 Precursor mission provides high spatio-temporal resolution measurements of the Earth's atmosphere. Sentinel-5P Level-2 data products include total columns of ozone, sulfur dioxide, nitrogen dioxide, carbon monoxide, and formaldehyde; tropospheric columns of ozone; vertical profiles of ozone; and cloud and aerosol information. The Planetary Computer's sentinel-5p-l2-netcdf STAC Collection contains Items for thirteen Sentinel-5P Level-2 products in NetCDF format:
L2__AER_AI: Ultraviolet aerosol index
L2__AER_LH: Aerosol layer height
L2__CH4___: Methane (CH4) total column
L2__CLOUD_: Cloud fraction, albedo, and top pressure
L2__CO____: Carbon monoxide (CO) total column
L2__HCHO__: Formaldehyde (HCHO) total column
L2__NO2___: Nitrogen dioxide (NO2) total column
L2__O3____: Ozone (O3) total column
L2__O3_TCL: Ozone (O3) tropospheric column
L2__SO2___: Sulfur dioxide (SO2) total column
L2__NP_BD3: Cloud from the Suomi NPP mission, band 3
L2__NP_BD6: Cloud from the Suomi NPP mission, band 6
L2__NP_BD7: Cloud from the Suomi NPP mission, band 7

This notebook works with or without an API key, but you will be given more permissive access to the data with an API key. If you are using the Planetary Computer Hub to run this notebook, then your API key is automatically set to the environment variable PC_SDK_SUBSCRIPTION_KEY
for you when your server is started. Otherwise, you can view your keys by signing in to the developer portal. The API key may be manually set via the environment variable PC_SDK_SUBSCRIPTION_KEY
or the following code:
import planetary_computer
planetary_computer.settings.set_subscription_key(<YOUR API Key>)
import cartopy.crs as ccrs
import fsspec
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import planetary_computer
import pystac_client
import xarray as xr
The datasets hosted by the Planetary Computer are available from Azure Blob Storage. We'll use pystac-client to search the Planetary Computer's STAC API for the subset of the data that we care about, and then we'll load the data directly from Azure Blob Storage. We'll specify a modifier
so that we can access the data stored in the Planetary Computer's private Blob Storage Containers. See Reading from the STAC API and Using tokens for data access for more.
catalog = pystac_client.Client.open(
"https://planetarycomputer.microsoft.com/api/stac/v1",
modifier=planetary_computer.sign_inplace,
)
Let's search for Items containing the formaldehyde product (L2__HCHO__) over the country of India. We'll further limit our search to an arbitrary collection date of April 2, 2023, and only include data that has been processed "offline" (OFFL). The geospatial extents of OFFL Items are much larger than those processed in near real-time (NRTI).
longitude = 79.109
latitude = 22.746
geometry = {
"type": "Point",
"coordinates": [longitude, latitude],
}
search = catalog.search(
collections="sentinel-5p-l2-netcdf",
intersects=geometry,
datetime="2023-04-02/2023-04-03",
query={"s5p:processing_mode": {"eq": "OFFL"}, "s5p:product_name": {"eq": "hcho"}},
)
items = list(search.items())
print(f"Found {len(items)} items:")
Found 2 items:
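As a quick sanity check, we can print a one-line summary of each matched Item before opening any assets. This is a minimal sketch; it only reads the s5p:processing_mode property already used in the search query above.
# Print each Item's ID, processing mode, and acquisition time (sketch)
for item in items:
    print(item.id, item.properties.get("s5p:processing_mode"), item.datetime)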
Let's take a look at the first Item in the list.
f = fsspec.open(items[0].assets["hcho"].href).open()
ds = xr.open_dataset(f, group="PRODUCT", engine="h5netcdf")
ds
<xarray.Dataset>
Dimensions:  (scanline: 4173, ground_pixel: 450, time: 1, corner: 4, layer: 34)
Coordinates:
  * scanline      (scanline) float64 0...
  * ground_pixel  (ground_pixel) float64 ...
  * time          (time) datetime64[ns] ...
  * corner        (corner) float64 0.0...
  * layer         (layer) int32 0 ... 33
Data variables:
    latitude                                              (time, scanline, ground_pixel) float32 ...
    longitude                                             (time, scanline, ground_pixel) float32 ...
    delta_time                                            (time, scanline, ground_pixel) datetime64[ns] ...
    time_utc                                              (time, scanline) object ...
    qa_value                                              (time, scanline, ground_pixel) float32 ...
    formaldehyde_tropospheric_vertical_column             (time, scanline, ground_pixel) float32 ...
    formaldehyde_tropospheric_vertical_column_precision   (time, scanline, ground_pixel) float32 ...
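The colorbar label later in this notebook reports the column in mol/m2; we can confirm that by reading the units attribute recorded on the variable. This is a minimal sketch; the attribute name follows standard NetCDF/CF conventions and is assumed to be present in this product version.
# Inspect the units attribute on the formaldehyde variable (sketch; prints None if absent)
print(ds["formaldehyde_tropospheric_vertical_column"].attrs.get("units"))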
Plotting the data against its native scanline and ground-pixel indices is not very informative.
varname = "formaldehyde_tropospheric_vertical_column"
data = ds[varname][0, :, :]
vmin, vmax = np.nanpercentile(data, [1, 99])
data.plot(vmin=vmin, vmax=vmax, cmap="viridis");
Instead, we'll plot the data in geographic coordinates (longitude and latitude) along with coastlines for context.
# formaldehyde product (NaN locations are transparent)
lon = ds["longitude"].values.squeeze()
lat = ds["latitude"].values.squeeze()
formaldehyde = data.values
fig = plt.figure(figsize=(15, 15))
ax = fig.add_subplot(1, 1, 1, projection=ccrs.PlateCarree())
ax.coastlines()
ax.gridlines(crs=ccrs.PlateCarree(), draw_labels=True, alpha=0.5, linestyle="--")
ax.set_extent([-180, 180, -90, 90], crs=ccrs.PlateCarree())
norm = matplotlib.colors.Normalize(vmin=vmin, vmax=vmax)
scatter = ax.scatter(
lon,
lat,
c=formaldehyde,
transform=ccrs.PlateCarree(),
cmap="viridis",
norm=norm,
marker=".",
s=1,
)
fig.colorbar(scatter, pad=0.05, shrink=0.35, label="formaldehyde (mol/m2)")
plt.show()
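For analysis beyond a quick look, it's common to mask low-quality retrievals using the qa_value variable included in the product. The sketch below assumes a threshold of 0.5, in line with the value commonly recommended for HCHO in the Sentinel-5P product documentation; treat the exact cutoff as an assumption and check the product readme for your use case.
# Mask low-quality pixels using the product's qa_value (sketch; threshold assumed)
qa = ds["qa_value"][0, :, :]
filtered = data.where(qa > 0.5)
print(f"Kept {int(filtered.notnull().sum())} of {filtered.size} retrievals")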