This notebook will explore the data we downloaded in the Customize and Access Data
notebook to fulfill the final objective of this tutorial:
We will perform the following steps:
from collections import defaultdict
import h5py
import xarray as xr
import pyproj
import pandas as pd
import pyresample as prs
from matplotlib import pyplot as plt
import numpy as np
# to make sure we find the shell from ! cells
import os
os.environ["PATH"] += os.pathsep + os.pathsep.join(['/bin', '/bin/bash'])
# Helper functions
import tutorial_helper_functions as fn
#Allows plots to be rendered in notebook
%matplotlib inline
During the live tutorial in 2019, we temporarily staged data on Amazon Web Services (AWS). Those commands are copied below for reference. Alternatively, we can access all of the granules needed for this tutorial using the Data access for all datasets
notebook located in this folder.
!aws --no-sign-request s3 cp s3://YOUR_BUCKET/ ./Outputs --recursive
If you have data on a S3 bucket you need to download it with:
!aws --no-sign-request --region=YOUR-REGION s3 sync s3://YOUR-BUCKET/PATH Outputs
This notebook will work with 2 granules from ATL07 and ATL10 for the same segment, along with a granule from the MODIS/Terra Sea Ice Extent 5-Min L2 Swath 1km dataset (MOD29).
We will be focusing on two packages to read in our data:
We will start by defining variables to extract and combine as a single dataframe:
VARIABLES = {
'ATL07': [
'/gt1l/sea_ice_segments/delta_time',
'/gt1l/sea_ice_segments/latitude',
'/gt1l/sea_ice_segments/longitude',
'/gt1l/sea_ice_segments/heights/height_segment_confidence',
'/gt1l/sea_ice_segments/heights/height_segment_height',
'/gt1l/sea_ice_segments/heights/height_segment_quality',
'/gt1l/sea_ice_segments/heights/height_segment_surface_error_est',
'/gt1l/sea_ice_segments/heights/height_segment_length_seg',
'/gt2l/sea_ice_segments/delta_time',
'/gt2l/sea_ice_segments/latitude',
'/gt2l/sea_ice_segments/longitude',
'/gt2l/sea_ice_segments/heights/height_segment_confidence',
'/gt2l/sea_ice_segments/heights/height_segment_height',
'/gt2l/sea_ice_segments/heights/height_segment_quality',
'/gt2l/sea_ice_segments/heights/height_segment_surface_error_est',
'/gt2l/sea_ice_segments/heights/height_segment_length_seg',
'/gt3l/sea_ice_segments/delta_time',
'/gt3l/sea_ice_segments/latitude',
'/gt3l/sea_ice_segments/longitude',
'/gt3l/sea_ice_segments/heights/height_segment_confidence',
'/gt3l/sea_ice_segments/heights/height_segment_height',
'/gt3l/sea_ice_segments/heights/height_segment_quality',
'/gt3l/sea_ice_segments/heights/height_segment_surface_error_est',
'/gt3l/sea_ice_segments/heights/height_segment_length_seg'],
'ATL10': [
'/gt1l/freeboard_beam_segment/beam_freeboard/delta_time',
'/gt1l/freeboard_beam_segment/beam_freeboard/latitude',
'/gt1l/freeboard_beam_segment/beam_freeboard/longitude',
'/gt1l/freeboard_beam_segment/beam_freeboard/beam_fb_confidence',
'/gt1l/freeboard_beam_segment/beam_freeboard/beam_fb_quality_flag',
'/gt1l/freeboard_beam_segment/beam_freeboard/beam_fb_height',
'/gt2l/freeboard_beam_segment/beam_freeboard/delta_time',
'/gt2l/freeboard_beam_segment/beam_freeboard/latitude',
'/gt2l/freeboard_beam_segment/beam_freeboard/longitude',
'/gt2l/freeboard_beam_segment/beam_freeboard/beam_fb_confidence',
'/gt2l/freeboard_beam_segment/beam_freeboard/beam_fb_quality_flag',
'/gt2l/freeboard_beam_segment/beam_freeboard/beam_fb_height',
'/gt3l/freeboard_beam_segment/beam_freeboard/delta_time',
'/gt3l/freeboard_beam_segment/beam_freeboard/latitude',
'/gt3l/freeboard_beam_segment/beam_freeboard/longitude',
'/gt3l/freeboard_beam_segment/beam_freeboard/beam_fb_confidence',
'/gt3l/freeboard_beam_segment/beam_freeboard/beam_fb_quality_flag',
'/gt3l/freeboard_beam_segment/beam_freeboard/beam_fb_height'
]
}
load_icesat2_as_dataframe
function from our functions module to load points from our ATL07 and ATL10 files:¶# Load ATL07 files
atl07_1298 = fn.load_icesat2_as_dataframe('./Outputs/processed_ATL07-01_20190323101341_12980201_005_01.h5', VARIABLES)
atl07_1305 = fn.load_icesat2_as_dataframe('./Outputs/processed_ATL07-01_20190323211343_13050201_005_01.h5', VARIABLES)
atl07 = pd.concat([atl07_1298,atl07_1305]) #concatenate dataframes to use in plot below
# Load single ATL10 file to compare with ATL07
atl10_1305 = fn.load_icesat2_as_dataframe('./Outputs/processed_ATL10-01_20190323211343_13050201_005_02.h5', VARIABLES)
atl07_1298.head() # Print head of dataframe
beam | delta_time | height_segment_confidence | height_segment_height | height_segment_length_seg | height_segment_quality | height_segment_surface_error_est | latitude | longitude | filename | |
---|---|---|---|---|---|---|---|---|---|---|
0 | gt2l | 3.857279e+07 | 0.009128 | -0.138787 | 14.245700 | 1 | 0.018342 | 79.999978 | 152.056408 | processed_ATL07-01_20190323101341_12980201_005... |
1 | gt2l | 3.857279e+07 | 0.008960 | -0.036512 | 13.633933 | 1 | 0.018023 | 79.999917 | 152.056303 | processed_ATL07-01_20190323101341_12980201_005... |
2 | gt2l | 3.857279e+07 | 0.014540 | -0.046638 | 12.929725 | 1 | 0.020470 | 79.999865 | 152.056214 | processed_ATL07-01_20190323101341_12980201_005... |
3 | gt2l | 3.857279e+07 | 0.008803 | 0.036554 | 12.929654 | 1 | 0.027361 | 79.999813 | 152.056125 | processed_ATL07-01_20190323101341_12980201_005... |
4 | gt2l | 3.857279e+07 | 0.008271 | 0.328410 | 11.592193 | 1 | 0.028593 | 79.999766 | 152.056044 | processed_ATL07-01_20190323101341_12980201_005... |
data_granules = [atl07_1298, atl07_1305, atl10_1305] # Create list of each dataframe
for df in data_granules:
df['utc_datetime'] = df['delta_time'].apply(fn.convert_delta_time)
df = df.sort_values(by=['beam', 'utc_datetime'])
data_granules[0].head() # View the head of the first dataframe in list: atl07_1298
beam | delta_time | height_segment_confidence | height_segment_height | height_segment_length_seg | height_segment_quality | height_segment_surface_error_est | latitude | longitude | filename | utc_datetime | |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | gt2l | 3.857279e+07 | 0.009128 | -0.138787 | 14.245700 | 1 | 0.018342 | 79.999978 | 152.056408 | processed_ATL07-01_20190323101341_12980201_005... | 2019-03-23 10:39:50.933426 |
1 | gt2l | 3.857279e+07 | 0.008960 | -0.036512 | 13.633933 | 1 | 0.018023 | 79.999917 | 152.056303 | processed_ATL07-01_20190323101341_12980201_005... | 2019-03-23 10:39:50.934481 |
2 | gt2l | 3.857279e+07 | 0.014540 | -0.046638 | 12.929725 | 1 | 0.020470 | 79.999865 | 152.056214 | processed_ATL07-01_20190323101341_12980201_005... | 2019-03-23 10:39:50.935380 |
3 | gt2l | 3.857279e+07 | 0.008803 | 0.036554 | 12.929654 | 1 | 0.027361 | 79.999813 | 152.056125 | processed_ATL07-01_20190323101341_12980201_005... | 2019-03-23 10:39:50.936269 |
4 | gt2l | 3.857279e+07 | 0.008271 | 0.328410 | 11.592193 | 1 | 0.028593 | 79.999766 | 152.056044 | processed_ATL07-01_20190323101341_12980201_005... | 2019-03-23 10:39:50.937079 |
There were 13 granules returned over our time and area of interest. We can use Earthdata Search to easily visualize the coverage of each granule: MODIS Earthdata Search result
Let's choose a single MODIS granule to be used in our analysis below.
Out of the 13 returned, these all have full coverage over our study area:
And after viewing the browse images, this looks to have the best data coverage:
filepath = './Outputs/MOD29_A2019082_0140_061_2020290152422_HEGOUT.hdf' # Define local filepath
mod29 = xr.open_dataset(filepath, engine="netcdf4")
list(mod29.variables)
['Latitude', 'Longitude', 'Sea_Ice_by_Reflectance', 'Sea_Ice_by_Reflectance_Pixel_QA', 'Ice_Surface_Temperature', 'Ice_Surface_Temperature_Pixel_QA']
Before we plot the ATL07 heights over MODIS IST, we need to convert the Ice_Surface_Temperature values to kelvin. According to the MOD29 User Guide the Ice_Surface_Temperature (IST) values are stored as calibrated data. To convert to kelvin, use scale_factor = 0.01 and add_offset = 0.0 in the following equation:
IST = scale_factor × (calibrated data - add_offset)
The valid range for ISTs is 210 K to 313.20 K.
Let's apply this conversion to our xarray:
mod29['Ice_Surface_Temperature'] = mod29['Ice_Surface_Temperature']*(0.01) # Apply calibrated data to kelvin conversion
We can visualize the ICESat-2 track locations and height values overlayed with MODIS Ice Surface Temperature. We should see higher ATL07 heights along fast ice, with lower heights over thin ice and then increasing height over drift ice. We are using the Matplotlib pcolormesh
function to plot our MOD29 xarray as a 2D array. The ATL07 locations are plotted as a scatter plot.
plt.subplots(figsize=(10,5)) # set size
plt.pcolormesh(mod29.Longitude, mod29.Latitude, mod29.Ice_Surface_Temperature, vmin=240, vmax=270, cmap='viridis') # Set range to valid IST values
plt.colorbar(label='MOD29 IST (K)')
plt.scatter(atl07.longitude, atl07.latitude,
c=atl07.height_segment_height, vmax=1.5,
cmap='Reds', alpha=0.6, s=1)
plt.colorbar(label='ATL07 Height (m)')
plt.show()
Now that we've visually inspected height and IST values, let's do some analysis. We will now extract MODIS IST values at each ATL07 point. We'll focus on the 1305 track from here on out.
To easily interpolate MOD29 to ICESat-2 points, we can use the pyresample
library. The radius_of_influence
parameter determines maximum radius to look for nearest neighbor interpolation.
# Define lat lon swath geometry used for interpolation
icesat2_geometry_atl07_1305 = prs.geometry.SwathDefinition(lons=atl07_1305['longitude'], lats=atl07_1305['latitude'])
icesat2_geometry_atl10_1305 = prs.geometry.SwathDefinition(lons=atl10_1305['longitude'], lats=atl10_1305['latitude'])
mod29_geometry = prs.geometry.SwathDefinition(lons=mod29['Longitude'], lats=mod29['Latitude'])
# Add IST values to each pandas dataframe
atl07_1305['mod29_ice_surface_temperature'] = prs.kd_tree.resample_nearest(mod29_geometry, mod29['Ice_Surface_Temperature'].values, icesat2_geometry_atl07_1305, radius_of_influence=1000, fill_value=np.nan)
atl10_1305['mod29_ice_surface_temperature'] = prs.kd_tree.resample_nearest(mod29_geometry, mod29['Ice_Surface_Temperature'].values, icesat2_geometry_atl10_1305, radius_of_influence=1000, fill_value=np.nan)
We can now apply nan's to the MOD29 land mask value:
atl07_1305[atl07_1305['mod29_ice_surface_temperature'] == 50] = np.nan
atl10_1305[atl10_1305['mod29_ice_surface_temperature'] == 50] = np.nan
atl07_1305.head()
beam | delta_time | height_segment_confidence | height_segment_height | height_segment_length_seg | height_segment_quality | height_segment_surface_error_est | latitude | longitude | filename | utc_datetime | mod29_ice_surface_temperature | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | gt2l | 3.861196e+07 | NaN | NaN | 27.770977 | 0.0 | NaN | 72.240845 | 149.29619 | processed_ATL07-01_20190323211343_13050201_005... | 2019-03-23 21:32:36.311340 | 246.369995 |
1 | gt2l | 3.861196e+07 | NaN | NaN | 27.770977 | 0.0 | NaN | 72.240845 | 149.29619 | processed_ATL07-01_20190323211343_13050201_005... | 2019-03-23 21:32:36.311340 | 246.369995 |
2 | gt2l | 3.861196e+07 | NaN | NaN | 27.770977 | 0.0 | NaN | 72.240845 | 149.29619 | processed_ATL07-01_20190323211343_13050201_005... | 2019-03-23 21:32:36.311340 | 246.369995 |
3 | gt2l | 3.861196e+07 | NaN | NaN | 27.770977 | 0.0 | NaN | 72.240845 | 149.29619 | processed_ATL07-01_20190323211343_13050201_005... | 2019-03-23 21:32:36.311340 | 246.369995 |
4 | gt2l | 3.861196e+07 | NaN | NaN | 27.770977 | 0.0 | NaN | 72.240845 | 149.29619 | processed_ATL07-01_20190323211343_13050201_005... | 2019-03-23 21:32:36.311340 | 246.369995 |
We will use pandas plotting which utilizes matplotlib on the backend. The pandas loc
attribute allows access to a group of rows or columns by label.
The 1305 track is moving from low to high latitude: The start time begins at the fast ice, skips over an island, and moves towards warmer thin ice onto thicker drift ice.
# gt1l
fig, ax = plt.subplots(figsize=(10,3))
atl07_1305.loc[atl07_1305['beam'] == 'gt1l'].plot(x='latitude', y='height_segment_height', marker='.', markersize=1, linestyle='None', ax=ax)
atl07_1305.loc[atl07_1305['beam'] == 'gt1l'].plot(x='latitude', y='mod29_ice_surface_temperature', marker='.', markersize=1, linestyle='None', ax=ax, secondary_y=True)
fig.suptitle('ATL07-01_20190323211343_13050201_002_01.h5 gt1l')
ax.set_ylim(-0.2,2.2)
# gt2l
fig, ax = plt.subplots(figsize=(10,3))
atl07_1305.loc[atl07_1305['beam'] == 'gt2l'].plot(x='latitude', y='height_segment_height', marker='.', markersize=1, linestyle='None', ax=ax)
atl07_1305.loc[atl07_1305['beam'] == 'gt2l'].plot(x='latitude', y='mod29_ice_surface_temperature', marker='.', markersize=1, linestyle='None', ax=ax, secondary_y=True)
fig.suptitle('ATL07-01_20190323211343_13050201_002_01.h5 gt2l')
ax.set_ylim(-0.2,2.2)
# gt3l
fig, ax = plt.subplots(figsize=(10,3))
atl07_1305.loc[atl07_1305['beam'] == 'gt3l'].plot(x='latitude', y='height_segment_height', marker='.', markersize=1, linestyle='None', ax=ax)
atl07_1305.loc[atl07_1305['beam'] == 'gt3l'].plot(x='latitude', y='mod29_ice_surface_temperature', marker='.', markersize=1, linestyle='None', ax=ax, secondary_y=True)
fig.suptitle('ATL07-01_20190323211343_13050201_002_01.h5 gt3l')
ax.set_ylim(-0.2,2.2);
Key observations:
Hint: Copy the 5 lines of code for one of the ATL07 and MOD29 single beam plots above. Do not change the first ATL07 height_segment_height
plot line. Replace the following items on the second mod29_ice_surface_temperature
plot line:
atl07_1305
dataframe with atl10_1305
mod29_ice_surface_temperature
with beam_fb_height
as the y values
# Answer below!
# # gt3l
# fig, ax = plt.subplots(figsize=(10,3))
# atl07_1305.loc[atl07_1305['beam'] == 'gt3l'].plot(x='latitude', y='height_segment_height', marker='.', markersize=1, linestyle='None', ax=ax)
# atl10_1305.loc[atl10_1305['beam'] == 'gt3l'].plot(x='latitude', y='beam_fb_height', marker='.', markersize=1, linestyle='None', ax=ax, secondary_y=True)
# fig.suptitle('gt3l ATL07 and ATL10 comparison')
# ax.set_ylim(-0.2,2.2);
How do the ATL07 quality parameters relate to ATL10 coverage? We should see more ATL10 values with better ATL07 quality and visa versa.
The following plots show some examples of these quality flags and error estimates. You can modify plotting variable below to compare with other quality information.
We will filter height_segment_height
based on height_segment_quality
and plot the error estimates. According to the User Guide:
For ATL07 height_segment_quality: a binary indicator (1 = good, 0 = bad) of segment quality
# Available variables to plot for ATL07:
# height_segment_height
# height_segment_confidence
# height_segment_quality
# height_segment_surface_error_est
# height_segment_length_seg
# First Create a condition dictionary with the variables we wish to filter against. Here we want to only show the gt31 beam and good quality (quality = 1)
cond = {
'beam': 'gt1l',
'height_segment_quality': 1
}
# Unfiltered ATL07 heights
fig, ax = plt.subplots(figsize=(10,3))
atl07_1305.loc[(atl07_1305.beam == cond['beam'])].plot(x='latitude', y='height_segment_height', marker='.', markersize=1, linestyle='None', ax=ax)
atl07_1305.loc[(atl07_1305.beam == cond['beam'])].plot(x='latitude', y='mod29_ice_surface_temperature', marker='.', markersize=1, linestyle='None', ax=ax, secondary_y=True)
fig.suptitle('ATL07-01_20190323211343_13050201_002_01.h5 gt1l not filtered')
ax.set_ylim(-0.2,2);
# Filtered ATL07 heights by height_segment_quality
# Create list with the variables we wish to filter against
atl07_cond = (atl07_1305.beam == cond['beam']) & (atl07_1305.height_segment_quality == cond['height_segment_quality'])
fig, ax = plt.subplots(figsize=(10,3))
atl07_1305.loc[atl07_cond].plot(x='latitude', y='height_segment_height', marker='.', markersize=1, linestyle='None', ax=ax) # plot quality and beam condition
atl07_1305.loc[(atl07_1305.beam == cond['beam'])].plot(x='latitude', y='mod29_ice_surface_temperature', marker='.', markersize=1, linestyle='None', ax=ax, secondary_y=True)
plt.title('ATL07-01_20190323211343_13050201_002_01 gt1l filtered')
ax.set_ylim(-0.2,2.2);
# Filtered by quality flag and plot height estimate
atl07_plot_var = 'height_segment_surface_error_est' # set a plot variable that can be easily modified
fig, ax = plt.subplots(figsize=(10,3))
atl07_1305.loc[atl07_cond].plot(x='latitude', y=atl07_plot_var, marker='.', markersize=1, linestyle='None', ax=ax)
atl07_1305.loc[(atl07_1305.beam == cond['beam'])].plot(x='latitude', y='mod29_ice_surface_temperature', marker='.', markersize=1, linestyle='None', ax=ax, secondary_y=True)
plt.title('ATL07-01_20190323211343_13050201_002_01 gt1l error est filtered');
# Plot unfiltered ATL10 heights for comparison
fig, ax = plt.subplots(figsize=(10,3))
atl10_1305.loc[atl10_1305.beam == cond['beam']].plot(x='latitude', y='beam_fb_height', marker='.', markersize=1, linestyle='None', ax=ax)
atl10_1305.loc[(atl10_1305.beam == cond['beam'])].plot(x='latitude', y='mod29_ice_surface_temperature', marker='.', markersize=1, linestyle='None', ax=ax, secondary_y=True)
plt.title('ATL10-01_20190323211343_13050201_002_01 gt1l not filtered')
ax.set_ylim(-0.2,2.2);
What do we see above?
What else can you determine by plotting the other quality flags?
We discard data with MODIS temperatures greater than 250K and plot variables (i.e. quality flags or height)
cond = {
'beam': 'gt1l',
'mod29_ice_surface_temperature': 250.0
}
cond_atl10_tuple = (atl10_1305.beam == cond['beam']) & (atl10_1305.mod29_ice_surface_temperature >= cond['mod29_ice_surface_temperature'])
cond_atl07_tuple = (atl07_1305.beam == cond['beam']) & (atl07_1305.mod29_ice_surface_temperature >= cond['mod29_ice_surface_temperature'])
fig, ax = plt.subplots(figsize=(10,3))
atl07_1305.loc[cond_atl07_tuple].plot(x='latitude', y='height_segment_height', ax=ax, color='lightblue')
atl07_1305.loc[(atl07_1305.beam == cond['beam'])].plot(x='latitude', y='mod29_ice_surface_temperature', ax=ax, secondary_y=True, color='orange')
plt.title('ATL07-01_20190323211343_13050201_002_01 gt1l filtered')
fig, ax = plt.subplots(figsize=(10,3))
atl10_1305.loc[cond_atl10_tuple].plot(x='latitude', y='beam_fb_height', ax=ax)
atl10_1305.loc[(atl10_1305.beam == cond['beam'])].plot(x='latitude', y='mod29_ice_surface_temperature', ax=ax, secondary_y=True, color='orange')
plt.title('ATL10-01_20190323211343_13050201_002_01 gt1l filtered')
Text(0.5, 1.0, 'ATL10-01_20190323211343_13050201_002_01 gt1l filtered')