#!/usr/bin/env python
# coding: utf-8
# ## Accessing Sentinel-5P data with the Planetary Computer STAC API
#
# The Copernicus [Sentinel-5 Precursor](https://sentinels.copernicus.eu/web/sentinel/missions/sentinel-5p) mission provides high spatio-temporal resolution measurements of the Earth's atmosphere. Sentinel-5P Level-2 data products include total columns of ozone, sulfur dioxide, nitrogen dioxide, carbon monoxide and formaldehyde, tropospheric columns of ozone, vertical profiles of ozone and cloud & aerosol information. The Planetary Computer's [`sentinel-5p-l2-netcdf`](https://planetarycomputer.microsoft.com/dataset/sentinel-5p-l2-netcdf) STAC Collection contains Items for thirteen Sentinel-5P Level-2 products in NetCDF format:
#
# * [`L2__AER_AI`](http://www.tropomi.eu/data-products/uv-aerosol-index): Ultraviolet aerosol index
# * [`L2__AER_LH`](http://www.tropomi.eu/data-products/aerosol-layer-height): Aerosol layer height
# * [`L2__CH4___`](http://www.tropomi.eu/data-products/methane): Methane (CH4) total column
# * [`L2__CLOUD_`](http://www.tropomi.eu/data-products/cloud): Cloud fraction, albedo, and top pressure
# * [`L2__CO____`](http://www.tropomi.eu/data-products/carbon-monoxide): Carbon monoxide (CO) total column
# * [`L2__HCHO__`](http://www.tropomi.eu/data-products/formaldehyde): Formaldehyde (HCHO) total column
# * [`L2__NO2___`](http://www.tropomi.eu/data-products/nitrogen-dioxide): Nitrogen dioxide (NO2) total column
# * [`L2__O3____`](http://www.tropomi.eu/data-products/total-ozone-column): Ozone (O3) total column
# * [`L2__O3_TCL`](http://www.tropomi.eu/data-products/tropospheric-ozone-column): Ozone (O3) tropospheric column
# * [`L2__SO2___`](http://www.tropomi.eu/data-products/sulphur-dioxide): Sulfur dioxide (SO2) total column
# * [`L2__NP_BD3`](http://www.tropomi.eu/data-products/auxiliary): Cloud from the Suomi NPP mission, band 3
# * [`L2__NP_BD6`](http://www.tropomi.eu/data-products/auxiliary): Cloud from the Suomi NPP mission, band 6
# * [`L2__NP_BD7`](http://www.tropomi.eu/data-products/auxiliary): Cloud from the Suomi NPP mission, band 7
# ### Environment setup
#
# This notebook works with or without an API key, but you will be given more permissive access to the data with an API key. If you are using the [Planetary Computer Hub](https://planetarycomputer.microsoft.com/compute) to run this notebook, then your API key is automatically set to the environment variable `PC_SDK_SUBSCRIPTION_KEY` for you when your server is started. Otherwise, you can view your keys by signing in to the [developer portal](https://planetarycomputer.developer.azure-api.net/). The API key may be manually set via the environment variable `PC_SDK_SUBSCRIPTION_KEY` or the following code:
#
# ```python
# import planetary_computer
# planetary_computer.settings.set_subscription_key()
# ```
# In[7]:
import cartopy.crs as ccrs
import fsspec
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import planetary_computer
import pystac_client
import xarray as xr
# ### Data access
#
# The datasets hosted by the Planetary Computer are available from [Azure Blob Storage](https://docs.microsoft.com/en-us/azure/storage/blobs/). We'll use [pystac-client](https://pystac-client.readthedocs.io/) to search the Planetary Computer's [STAC API](https://planetarycomputer.microsoft.com/api/stac/v1/docs) for the subset of the data that we care about, and then we'll load the data directly from Azure Blob Storage. We'll specify a `modifier` so that we can access the data stored in the Planetary Computer's private Blob Storage Containers. See [Reading from the STAC API](https://planetarycomputer.microsoft.com/docs/quickstarts/reading-stac/) and [Using tokens for data access](https://planetarycomputer.microsoft.com/docs/concepts/sas/) for more.
# In[8]:
catalog = pystac_client.Client.open(
"https://planetarycomputer.microsoft.com/api/stac/v1",
modifier=planetary_computer.sign_inplace,
)
# ### Select an area, time, and product type and find STAC Items
#
# Let's search for Items containing the formaldehyde product (`L2__HCHO__`) over the country of India. We'll further limit our search to an arbitrary collection date of April 2, 2023 and only include data that has been processed "offline" (`OFFL`). The geospatial extents of `OFFL` Items are much larger than those processed in near real-time (`NRTI`).
# In[9]:
longitude = 79.109
latitude = 22.746
geometry = {
"type": "Point",
"coordinates": [longitude, latitude],
}
search = catalog.search(
collections="sentinel-5p-l2-netcdf",
intersects=geometry,
datetime="2023-04-02/2023-04-03",
query={"s5p:processing_mode": {"eq": "OFFL"}, "s5p:product_name": {"eq": "hcho"}},
)
items = list(search.get_items())
print(f"Found {len(items)} items:")
# Let's take a look at the first Item in the list.
# In[10]:
f = fsspec.open(items[0].assets["hcho"].href).open()
ds = xr.open_dataset(f, group="PRODUCT", engine="h5netcdf")
ds
# Plotting the data in its native coordinate system is not very informative.
# In[13]:
varname = "formaldehyde_tropospheric_vertical_column"
data = ds[varname][0, :, :]
vmin, vmax = np.nanpercentile(data, [1, 99])
data.plot(vmin=vmin, vmax=vmax, cmap="viridis");
# We'll plot the data in its native geographic coordinate reference system along with continent boundaries for context.
# In[12]:
# formaldehyde product (NaN locations are transparent)
lon = ds["longitude"].values.squeeze()
lat = ds["latitude"].values.squeeze()
formaldehyde = data.values
fig = plt.figure(figsize=(15, 15))
ax = fig.add_subplot(1, 1, 1, projection=ccrs.PlateCarree())
ax.coastlines()
ax.gridlines(crs=ccrs.PlateCarree(), draw_labels=True, alpha=0.5, linestyle="--")
ax.set_extent([-180, 180, -90, 90], crs=ccrs.PlateCarree())
norm = matplotlib.colors.Normalize(vmin=vmin, vmax=vmax)
scatter = plt.scatter(
lon,
lat,
c=formaldehyde,
transform=ccrs.PlateCarree(),
cmap="viridis",
norm=norm,
marker=".",
s=1,
)
fig.colorbar(scatter, pad=0.05, shrink=0.35, label="formaldehyde (mol/m2)")
plt.show()