The Leafmap library provides a suite of tools for interactive mapping and visualization in Jupyter Notebooks. Leafmap version 0.30.0 and later offers tools specifically for accessing NASA Earthdata by building on the newly developed NASA earthaccess library. Earthaccess provides streamlined access to NASA Earthdata and simplifies the authentication and querying process compared with previously developed approaches. This notebook leverages tools within earthaccess and Leafmap to facilitate easier access and visualization of OPERA data products for a user-specified area of interest (AOI).
import earthaccess
import leafmap
import pandas as pd
import geopandas as gpd
from shapely import box
from datetime import datetime
A NASA Earthdata Login account is required to download the data used in this tutorial. You can create an account at the link provided. After establishing an account, the code in the next cell will verify authentication. If this is your first time running the notebook, you will be prompted to enter your Earthdata login credentials, which will be saved in ~/.netrc.
leafmap.nasa_data_login()
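leafmap.nasa_data_login() wraps the earthaccess authentication flow. If you prefer to call earthaccess directly (it is already imported above), a minimal sketch that persists the credentials to ~/.netrc might look like:
auth = earthaccess.login(persist=True)  # prompts for Earthdata credentials if none are stored, then saves them to ~/.netrc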
A tab-separated values (TSV) file, made available through the opengeos GitHub repository, catalogues metadata for more than 9,000 datasets available through NASA Earthdata. In the next cell we load the TSV into a pandas DataFrame and view the metadata for the first five (5) Earthdata products.
### View Earthdata datasets
earthdata_url = 'https://github.com/opengeos/NASA-Earth-Data/raw/main/nasa_earth_data.tsv'
earthdata_df = pd.read_csv(earthdata_url, sep='\t')
earthdata_df.head()
Note above that earthdata_df contains a number of columns with metadata about each available product. The ShortName column will be used to produce a new dataframe containing only OPERA products. Let's view the available products and their metadata.
opera_df = earthdata_df[earthdata_df['ShortName'].str.contains('OPERA', case=False)]
opera_df
Leafmap provides the capability to create an interactive map in the Jupyter Notebook and to use the cursor to define an area of interest (AOI). When the user defines an AOI, the boundary is saved as an object. Give this a try in the next cell.
m = leafmap.Map(center=[42,-100], zoom=4, basemap='Google Satellite')
m
### This cell initializes the AOI for future use. If the user didn't define an AOI in the previous cell, the AOI defaults to a bounding box over Ridgecrest, California (USA).
if m.user_roi is not None:
    AOI = tuple(m.user_roi_bounds())
else:
    AOI = (-117.880, 35.541, -117.33, 35.991)  # Defaults to Ridgecrest, CA (min_lon, min_lat, max_lon, max_lat)
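As an optional sanity check, the AOI tuple can be converted to a geometry (using the shapely box and geopandas imports from the first cell) and overlaid on the map above; the layer name here is arbitrary. A minimal sketch:
aoi_gdf = gpd.GeoDataFrame(geometry=[box(*AOI)], crs="EPSG:4326")  # (min_lon, min_lat, max_lon, max_lat) -> polygon
m.add_gdf(aoi_gdf, layer_name="AOI")  # the AOI outline appears on the interactive map displayed above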
The earthaccess library makes it simple to quickly query NASA's Common Metadata Repository (CMR) and return the associated metadata as a GeoDataFrame. Leafmap has recently added functionality that builds on earthaccess to enable interactive viewing of this data.
In the next cell, the user should specify the OPERA product and date range of interest. The AOI defined previously is used as the boundary in the query.
### Print the available OPERA datasets
print('Available OPERA datasets:', opera_df['ShortName'].values)
dswx_results, dswx_gdf = leafmap.nasa_data_search(
    short_name='OPERA_L3_DSWX-HLS_PROVISIONAL_V1',
    cloud_hosted=True,
    bounding_box=AOI,
    temporal=("2023-10-01", str(datetime.now().date())),
    count=-1,  # use -1 to return all datasets
    return_gdf=True,
)
Functionality within earthaccess enables more aesthetic views of the available layers, as well as display of the thumbnail. These links are clickable and will download in the browser when clicked.
dswx_results[0]
The leafmap.nasa_data_search() function returns a GeoDataFrame containing the metadata for all granules that intersect the AOI during the specified time range. Let's look at the first five granules.
dswx_gdf.head()
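It can be useful to confirm how many granules the query returned before plotting or downloading anything; a quick check might be:
print(f"{len(dswx_gdf)} DSWx granules intersect the AOI and time range")  # one row per granule returned by the query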
### Plot the location of the tiles
dswx_gdf.explore(fill=False)
dist_results, dist_gdf = leafmap.nasa_data_search(
    short_name='OPERA_L3_DIST-ALERT-HLS_PROVISIONAL_V0',
    cloud_hosted=True,
    bounding_box=AOI,
    temporal=("2023-06-01", str(datetime.now().date())),
    count=-1,  # use -1 to return all datasets
    return_gdf=True,
)
dist_results[0]
dist_gdf.head()
### Plot the location of the tiles
dist_gdf.explore(fill=False)
rtc_results, rtc_gdf = leafmap.nasa_data_search(
    short_name='OPERA_L2_RTC-S1_V1',
    cloud_hosted=True,
    bounding_box=AOI,
    temporal=("2023-06-01", str(datetime.now().date())),
    count=-1,  # use -1 to return all datasets
    return_gdf=True,
)
rtc_results[0]
rtc_gdf.head()
rtc_gdf.explore(fill=False)
Note: This will only work for AOIs over North America, as this is the extent of CSLC coverage.
cslc_results, cslc_gdf = leafmap.nasa_data_search(
    short_name='OPERA_L2_CSLC-S1_V1',
    cloud_hosted=True,
    bounding_box=AOI,
    temporal=("2023-06-01", str(datetime.now().date())),
    count=-1,  # use -1 to return all datasets
    return_gdf=True,
)
cslc_results[0]
cslc_gdf.head()
cslc_gdf.explore(fill=False)
Important note: As of Jan. 2024, OPERA RTC and CSLC may not be accessible using the earthaccess library due to additional authentication measures required by ASF DAAC. This will likely be resolved soon, but currently the notebook should be used only to access OPERA products distributed by LP DAAC and PO.DAAC. For additional details regarding the current status of using earthaccess to access data from ASF DAAC, see the linked GitHub issue (#439).
Let's download some data from one of the above queries. In the cell below we specify data from the DSWx query, but feel free to modify it to use any of the others above. *Note:* We also filter the files to keep only the layer we would like from the chosen product, so modify that step accordingly if a different product is chosen.
The files will be downloaded to a subdirectory inside a directory called data, and the subdirectory name will be the datetime at which it was created.
import os
from datetime import datetime

def create_data_directory():
    # Get the current date and time
    current_datetime = datetime.now().strftime("%m_%d_%Y_%H_%M_%S")
    # Define the base directory
    base_directory = "data"
    # Create the full path for the new directory
    new_directory_path = os.path.join(base_directory, f"data_{current_datetime}")
    # Create the new directory
    os.makedirs(new_directory_path, exist_ok=True)
    print(f"Directory '{new_directory_path}' created successfully.")
    return new_directory_path
directory_path = create_data_directory()
The cell below will download the data to your newly created subdirectory. Look on your file system for a directory data/data_<datetime>/, where <datetime> is the date and time the directory was created.
dswx_data = leafmap.nasa_data_download(dswx_results[:5], out_dir=directory_path) # Downloads the first 5 granules. Remove [:5] to download all granules or modify to keep as many as you like
We load in data from only band 1 (B01) below. If you'd like to load data from a different band, change B01 to suit your needs.
import os
# Get the current directory
current_directory = os.getcwd()
# Construct the path to the data directory
data_directory = os.path.join(current_directory, directory_path)
# Create a list of file paths and a matching list of acquisition dates (sorted so the two lists stay aligned)
filenames = sorted(f for f in os.listdir(data_directory) if os.path.isfile(os.path.join(data_directory, f)) and 'B01' in f)
images = [os.path.join(data_directory, f) for f in filenames]
image_dates = [f[25:33] for f in filenames]  # the [25:33] slice of a DSWx filename holds the acquisition date (YYYYMMDD)
m = leafmap.Map()
m.add_raster(images[0])
m
Create a split map comparing the first and last images within the directory.
leafmap.split_map(
    left_layer=images[0],
    right_layer=images[-1],
    left_label="First",
    right_label="Last",
    label_position="bottom",
    zoom=10,
)
m = leafmap.Map()
m.add_raster(images[0])
m
# Save the features drawn on the map above as a GeoJSON
m.save_draw_features("test_output.geojson")
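To confirm the export, the saved file can be read back with geopandas (this assumes at least one feature was drawn on the map above):
drawn_features = gpd.read_file("test_output.geojson")  # load the exported features back into a GeoDataFrame
drawn_features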
leafmap.create_timelapse(
    images,
    out_gif='dswx.gif',
    fps=1,
    progress_bar_color='blue',
    add_text=True,
    text_xy=('3%', '3%'),
    text_sequence=[str(date) for date in image_dates],
    font_size=20,
    font_color='red',
    mp4=False,
    reduce_size=False,
)
leafmap.show_image('dswx.gif', height="450px", width="450px")
m = leafmap.Map()
m.add("nasa_earth_data")
m
m._NASA_DATA_GDF.head()
# leafmap.nasa_data_download(m._NASA_DATA_RESULTS[:5], out_dir="data")
Make an interactive time slider to cycle through the images. This isn't a perfect solution, as the granules in our list do not have identical footprints.
m = leafmap.Map()
m.add_time_slider(
    images,
    time_interval=1,
    position='bottomright',
    zoom_to_layer=True,
)
m
earthaccess and leafmap provide a means for very simple access to OPERA data and metadata. Additional code may be required for more sophisticated filtering (cloud cover, spatial overlap). This notebook provides a guide for learning more about OPERA data products and simple functionality for their exploration. For more details and applications, see the main OPERA Applications GitHub repository.
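As one example of the kind of additional filtering mentioned above, a minimal sketch of a spatial-overlap filter is shown below. It assumes the dswx_gdf and AOI objects from earlier in the notebook and keeps only granules whose footprints cover at least half of the AOI (the 0.5 threshold is arbitrary):
aoi_geom = box(*AOI)  # AOI tuple -> shapely polygon (min_lon, min_lat, max_lon, max_lat)

# Fraction of the AOI covered by each granule footprint (computed in geographic coordinates,
# which is a rough approximation but adequate for a simple screen)
overlap_fraction = dswx_gdf.geometry.intersection(aoi_geom).area / aoi_geom.area

well_covered_gdf = dswx_gdf[overlap_fraction >= 0.5]
print(f"{len(well_covered_gdf)} of {len(dswx_gdf)} granules cover at least half of the AOI")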