#!/usr/bin/env python
# coding: utf-8

# ## Accessing Chloris Biomass data with the Planetary Computer STAC API
#
# The Chloris Global Biomass 2003–2019 dataset provides estimates of stock and change in aboveground biomass for Earth's terrestrial woody vegetation ecosystems. It covers the period 2003–2019 at annual time steps, with an approximate 4.6 km spatial resolution globally.
#
# This notebook provides an example of accessing Chloris Biomass data using the Planetary Computer STAC API, inspecting the data assets in the catalog, and doing some simple processing and plotting of the data from the Cloud Optimized GeoTIFF source.

# ### Environment setup
#
# This notebook works with or without an API key, but you will be given more permissive access to the data with an API key. The Planetary Computer Hub is pre-configured to use your API key.

# In[1]:


import matplotlib.pyplot as plt
import planetary_computer
import rioxarray
import pystac_client


# ### Query for available data

# In[2]:


catalog = pystac_client.Client.open(
    "https://planetarycomputer.microsoft.com/api/stac/v1",
    modifier=planetary_computer.sign_inplace,
)

biomass = catalog.search(collections=["chloris-biomass"])
all_items = biomass.get_all_items()
print(f"Returned {len(all_items)} Items")


# Our search returned all 17 items in the collection. Each item represents a single year, between 2003 and 2019, at a global scale.
#
# Let's see what assets are associated with one of these items:

# In[3]:


# Grab the first item and print the titles of the assets it contains
item = all_items[0]
print(item.id + ":")
print(*[f"- {key}: {asset.title}" for key, asset in item.assets.items()], sep="\n")


# There are four assets, two of which duplicate the other two in Web Mercator projection. One layer is the estimate of aboveground woody biomass for the given year (in tonnes), and the other is the change (in tonnes) from the previous year (a sketch of loading the change layer appears at the end of this notebook).
#
# Let's update our search to include just a specific year.

# In[4]:


datetime = "2016-01-01"
biomass = catalog.search(collections=["chloris-biomass"], datetime=datetime)

# The sign_inplace modifier on the client signs the matching item for us,
# so its data assets can be accessed directly
item = next(biomass.items())
print(item)


# #### Load the variable of interest
#
# By inspecting the `raster:bands` array in each asset's extra fields, we can see that this dataset uses a value of 2,147,483,647 for "nodata". So we'll pass the `masked=True` option to `rioxarray` to open the data with the nodata values converted to NaNs.

# In[5]:


da = rioxarray.open_rasterio(item.assets["biomass"].href, masked=True)

# Convert the DataArray to a Dataset along the 'band' dimension and rename
# the single band (1) to something more descriptive ('biomass')
ds = da.to_dataset(dim="band").rename({1: "biomass"})
ds


# #### Downsample and render
#
# For this global plot, it's OK to lose some detail in our rendering. First we'll downsample the entire dataset by a factor of 10 in each spatial dimension. Since we opened the file with `masked=True`, the nodata values are already NaN and are ignored when averaging.

# In[6]:


get_ipython().run_cell_magic('time', '', 'factor = 10\ncoarse_biomass = (\n    ds.biomass.coarsen(dim={"x": factor, "y": factor}, boundary="trim").mean().compute()\n)\n')
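# Before rendering, it can help to confirm how much the grid shrank. The quick check below is illustrative only (the exact dimensions depend on the source raster, so no particular values are assumed); the coarsened shape is also what we'll use to size the figure below.

# In[ ]:


# Compare the full-resolution grid with the coarsened one
print("original grid: ", ds.biomass.shape)
print("coarsened grid:", coarse_biomass.shape)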
# With our dataset nicely reduced, we can plot the aboveground biomass for the planet in the original sinusoidal projection.

# In[7]:


h, w = coarse_biomass.shape
dpi = 100

# Size the figure to the raster's aspect ratio and hide the axes so the
# rendered map fills the whole canvas
fig = plt.figure(frameon=False, figsize=(w / dpi, h / dpi), dpi=dpi)
ax = plt.Axes(fig, [0.0, 0.0, 1.0, 1.0])
ax.set_axis_off()
fig.add_axes(ax)

coarse_biomass.plot(cmap="Greens", add_colorbar=False)
ax.set_title("2016 estimated aboveground biomass")
plt.show();
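# The item also includes the year-over-year change layer described earlier. Here's one way it could be loaded and rendered, following the same pattern as the biomass layer. This is a minimal sketch: the asset key `biomass_change` is an assumption, so confirm it against the asset titles printed earlier, and a diverging colormap centered on zero is used since the change can be negative or positive.

# In[ ]:


change_key = "biomass_change"  # assumed asset key; check the asset listing above
if change_key in item.assets:
    da_change = rioxarray.open_rasterio(item.assets[change_key].href, masked=True)
    # Drop the single band dimension and downsample with the same factor as above
    coarse_change = (
        da_change.squeeze("band", drop=True)
        .coarsen(dim={"x": factor, "y": factor}, boundary="trim")
        .mean()
        .compute()
    )
    # robust=True clips the color range to the 2nd-98th percentiles so outliers
    # don't wash out the map
    coarse_change.plot(cmap="RdBu", center=0, robust=True)
    plt.title("2016 estimated change in aboveground biomass (tonnes)")
    plt.show()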