The National Agriculture Imagery Program (NAIP) provides US-wide, high-resolution aerial imagery, with four spectral bands (R, G, B, IR). NAIP is administered by the Aerial Field Photography Office (AFPO) within the US Department of Agriculture (USDA). Data are captured at least once every three years for each state. This dataset represents NAIP data from 2010-present, in cloud-optimized GeoTIFF format.
This notebook demonstrates the use of the Planetary Computer STAC API to query for NAIP imagery.
This notebook works with or without an API key, but you will be given more permissive access to the data with an API key. The Planetary Computer Hub is pre-configured to use your API key.
import pystac_client
import planetary_computer
# Set the environment variable PC_SDK_SUBSCRIPTION_KEY, or set it here.
# The Hub sets PC_SDK_SUBSCRIPTION_KEY automatically.
# pc.settings.set_subscription_key(<YOUR API Key>)
The datasets hosted by the Planetary Computer are available from Azure Blob Storage. We'll use pystac-client to search the Planetary Computer's STAC API for the subset of the data that we care about, and then we'll load the data directly from Azure Blob Storage. We'll specify a modifier
so that we can access the data stored in the Planetary Computer's private Blob Storage Containers. See Reading from the STAC API and Using tokens for data access for more.
catalog = pystac_client.Client.open(
"https://planetarycomputer.microsoft.com/api/stac/v1",
modifier=planetary_computer.sign_inplace,
)
This area is in South Jordan, Utah, which the Internet says is one of the fastest-growing cities in the US. Let's see whether we can see some development in this area in the time spanned by our NAIP collection.
area_of_interest = {
"type": "Polygon",
"coordinates": [
[
[-111.9839859008789, 40.5389819819361],
[-111.90502166748045, 40.5389819819361],
[-111.90502166748045, 40.57015381856105],
[-111.9839859008789, 40.57015381856105],
[-111.9839859008789, 40.5389819819361],
]
],
}
This collection includes data from 2010, so we'll search for one image near the beginning of that range and one from 2018.
range_old = "2010-01-01/2013-01-01"
range_new = "2018-01-01/2021-01-01"
search_old = catalog.search(
collections=["naip"], intersects=area_of_interest, datetime=range_old
)
search_new = catalog.search(
collections=["naip"], intersects=area_of_interest, datetime=range_new
)
items_old = search_old.item_collection()
items_new = search_new.item_collection()
print(f"{len(items_old)} Items found in the 'old' range")
print(f"{len(items_new)} Items found in the 'new' range")
4 Items found in the 'old' range 4 Items found in the 'new' range
As seen above, there are multiple items that intersect our area of interest for each year. The following code will choose the item that has the most overlap:
from shapely.geometry import shape
area_shape = shape(area_of_interest)
target_area = area_shape.area
def area_of_overlap(item):
overlap_area = shape(item.geometry).intersection(shape(area_of_interest)).area
return overlap_area / target_area
item_old = sorted(items_old, key=area_of_overlap, reverse=True)[0]
item_new = sorted(items_new, key=area_of_overlap, reverse=True)[0]
Each Item has a rendered_preview
which uses the Planetary Computer's Data API to dynamically create a preview image from the raw data.
from IPython.display import Image
Image(url=item_old.assets["rendered_preview"].href)
Image(url=item_new.assets["rendered_preview"].href)
import rioxarray
ds = rioxarray.open_rasterio(item_old.assets["image"].href).sel(band=[1, 2, 3])
ds
<xarray.DataArray (band: 3, y: 7630, x: 6000)> [137340000 values with dtype=uint8] Coordinates: * band (band) int64 1 2 3 * x (x) float64 4.15e+05 4.15e+05 4.15e+05 ... 4.209e+05 4.209e+05 * y (y) float64 4.491e+06 4.491e+06 ... 4.483e+06 4.483e+06 spatial_ref int64 0 Attributes: AREA_OR_POINT: Area scale_factor: 1.0 add_offset: 0.0