This notebook will introduce you to the analysis of deep multi-temporal SAR image data stacks in the framework of Jupyter Notebooks. The Jupyter Notebook environment is easy to launch in any web browser for interactive data exploration with provided or new training data. Notebooks are comprised of text written in a combination of executable python code and markdown formatting including latex style mathematical equations. Another advantage of Jupyter Notebooks is that they can easily be expanded, changed, and shared with new data sets or newly available time series steps. Therefore, they provide an excellent basis for collaborative and repeatable data analysis.
We introduce the following data analysis concepts:
Important Note about JupyterHub
Your JupyterHub server will automatically shutdown when left idle for more than 1 hour. Your notebooks will not be lost but you will have to restart their kernels and re-run them from the beginning. You will not be able to seamlessly continue running a partially run notebook.
import url_widget as url_w
notebookUrl = url_w.URLWidget()
display(notebookUrl)
from IPython.display import Markdown
from IPython.display import display
notebookUrl = notebookUrl.value
user = !echo $JUPYTERHUB_USER
env = !echo $CONDA_PREFIX
if env[0] == '':
env[0] = 'Python 3 (base)'
if env[0] != '/home/jovyan/.local/envs/rtc_analysis':
display(Markdown(f'<text style=color:red><strong>WARNING:</strong></text>'))
display(Markdown(f'<text style=color:red>This notebook should be run using the "rtc_analysis" conda environment.</text>'))
display(Markdown(f'<text style=color:red>It is currently using the "{env[0].split("/")[-1]}" environment.</text>'))
display(Markdown(f'<text style=color:red>Select the "rtc_analysis" from the "Change Kernel" submenu of the "Kernel" menu.</text>'))
display(Markdown(f'<text style=color:red>If the "rtc_analysis" environment is not present, use <a href="{notebookUrl.split("/user")[0]}/user/{user[0]}/notebooks/conda_environments/Create_OSL_Conda_Environments.ipynb"> Create_OSL_Conda_Environments.ipynb </a> to create it.</text>'))
display(Markdown(f'<text style=color:red>Note that you must restart your server after creating a new environment before it is usable by notebooks.</text>'))
In this notebook we will use the following scientific libraries:
Our first step is to import them:
%%capture
from pathlib import Path
import json # for loads
import math # for ceil
import re
import pandas as pd # for DatetimeIndex
from osgeo import gdal # for Info
gdal.UseExceptions()
import numpy as np # for copy, isnan, log10, ma.masked_where, max, mean, min, percentile, power, unique, var, where
import scipy.signal
%matplotlib inline
import matplotlib.pylab as plb # for figure, grid, rcParams, savefig
import matplotlib.pyplot as plt
from matplotlib import animation
from matplotlib import rc
from ipyfilechooser import FileChooser
from IPython.display import HTML
import opensarlab_lib as asfn
asfn.jupytertheme_matplotlib_format()
This notebook assumes that you've prepared your own data stack of RTC image products over your personal area of interest. This can be done using the Prepare_Data_Stack_Hyp3 and Subset_Data_Stack notebooks.
This notebook expects Radiometric Terrain Corrected (RTC) image products as input, so be sure to select an RTC process when creating the subscription for your input data within HyP3. Prefer a unique orbit geometry (ascending or descending) to keep geometric differences between images low.
Begin by writing a function to retrieve and the absolute paths to each of our tiffs:
def get_tiff_paths(paths):
tiff_paths = !ls $paths | sort -t_ -k5,5
return tiff_paths
Select the directory holding your tiffs
Select
buttonSelect
buttonChange
button to alter your selectionfc = FileChooser('/home/jovyan/notebooks')
display(fc)
Determine the path to the analysis directory containing the tiff directory:
tiff_dir = Path(fc.selected_path)
analysis_dir = tiff_dir.parent
print(f"analysis_dir: {analysis_dir}")
wildcard_path = tiff_dir/"*.tif*"
tiff_paths = get_tiff_paths(wildcard_path)
Write a function to extract the tiff dates from a wildcard path:
def get_date(pth):
date_regex = r'(?<=_)\d{8}(?=T\d{6})'
try:
return re.search(date_regex, str(pth)).group(0)
except AttributeError:
raise Exception(f"Date string not found in {pth}")
Call get_dates() to collect the product acquisition dates:
dates = sorted([get_date(t) for t in tiff_dir.rglob(f'*.tif*')])
print(dates)
Gather the upper-left and lower-right corner coordinates of the data stack:
coords = [[], []]
info = (gdal.Info(str(tiff_paths[0]), options = ['-json']))
info = json.dumps(info)
coords[0] = (json.loads(info))['cornerCoordinates']['upperLeft']
coords[1] = (json.loads(info))['cornerCoordinates']['lowerRight']
print(coords)
Grab the stack's UTM zone. Note that any UTM zone conflicts should already have been handled in the Prepare_Data_Stack_Hyp3 notebook.
info = (gdal.Info(str(tiff_paths[0]), options = ['-json']))
info = json.dumps(info)
info = (json.loads(info))['coordinateSystem']['wkt']
utm = info.split('ID')[-1].split(',')[1][0:-2]
print(f"UTM Zone: {utm}")
Now we stack up the data by creating a virtual raster table with links to all subset data files:
Create the virtual raster table for the subset GeoTiffs:
image_file = Path(f"{analysis_dir}/raster_stack.vrt")
!gdalbuildvrt -separate $image_file $wildcard_path
tindex = pd.DatetimeIndex(dates)
Print the bands and dates for all images in the virtual raster table (VRT):
j = 1
print(f"Bands and dates for {image_file}")
for i in tindex:
print("{:4d} {}".format(j, i.date()), end=' ')
j += 1
if j%5 == 1:
print()
We will open your VRT and visualize some layers using Matplotlib.
img = gdal.Open(str(image_file))
Print the bands, pixels, and lines:
print(f"Number of bands: {img.RasterCount}")
print(f"Number of pixels: {img.RasterXSize}")
print(f"Number of lines: {img.RasterYSize}")
Read in raster data for the first two bands:
raster_1 = img.GetRasterBand(1).ReadAsArray() # change the number passed to GetRasterBand() to
where_are_NaNs = np.isnan(raster_1) # read rasters from different bands
raster_1[where_are_NaNs] = 0
raster_2 = img.GetRasterBand(2).ReadAsArray() #must pass a valid band number to GetRasterBand()
where_are_NaNs = np.isnan(raster_2)
raster_2[where_are_NaNs] = 0
Plot images and histograms for bands 1 and 2:
Note: Depending the histograms plotted by this cell, you may wish to adjust vmax when calling imshow() on ax1 and ax3. Increase the vmax value if the histogram cuts off much of the end of the peak, making your image too bright to see features well. Decrease vmax if the histogram extends much beyond the end of the peak, which will make your image appear dark.
# Setup the pyplot plots
fig = plb.figure(figsize=(18,10)) # Initialize figure with a size
ax1 = fig.add_subplot(221) # 221 determines: 2 rows, 2 plots, first plot
ax2 = fig.add_subplot(222) # 222 determines: 2 rows, 2 plots, second plot
ax3 = fig.add_subplot(223) # 223 determines: 2 rows, 2 plots, third plot
ax4 = fig.add_subplot(224) # 224 determines: 2 rows, 2 plots, fourth plot
# Plot the band 1 image
band_number = 1
ax1.imshow(raster_1,cmap='gray', vmin=0, vmax=0.3) # see note above regarding vmax adjustments
ax1.set_title('Image Band {} {}'.format(band_number, tindex[band_number-1].date()))
# Flatten the band 1 image into a 1 dimensional vector and plot the histogram:
h = ax2.hist(raster_1.flatten(), bins=200, range=(0, 0.3))
ax2.xaxis.set_label_text('Amplitude? (Uncalibrated DN Values)')
ax2.set_title('Histogram Band {} {}'.format(band_number, tindex[band_number-1].date()))
# Plot the band 2 image
band_number = 2
ax3.imshow(raster_2,cmap='gray', vmin=0, vmax=0.3) # see note above regarding vmax adjustments
ax3.set_title('Image Band {} {}'.format(band_number, tindex[band_number-1].date()))
# Flatten the band 2 image into a 1 dimensional vector and plot the histogram:
h = ax4.hist(raster_2.flatten(),bins=200, range=(0,0.3))
ax4.xaxis.set_label_text('Amplitude? (Uncalibrated DN Values)')
ax4.set_title('Histogram Band {} {}'.format(band_number, tindex[band_number-1].date()))
Note, that if your data were generated by HyP3, this step is not necessary! HyP3 performs the full data calibration and provides you with calibrated data in power scale.
If, your data is from a different source, however, calibration may be necessary to ensure that image gray values correspond to proper radar cross section information.
Calibration coefficients for SAR data are often defined in the decibel (dB) scale due to the high dynamic range of the imaging system. For the L-band ALOS PALSAR data at hand, the conversion from uncalibrated DN values to calibrated radar cross section values in dB scale is performed by applying a standard calibration factor of -83 dB.
$\gamma^0_{dB} = 20 \cdot log10(DN) -83$
The data at hand are radiometrically terrain corrected images, which are often expressed as terrain flattened $\gamma^0$ backscattering coefficients. For forest and land cover monitoring applications $\gamma^o$ is the preferred metric.
To apply the calibration constant for your data and export in dB scale, uncomment the following code cell:
#caldB=20*np.log10(rasterstack)-83
While dB-scaled images are often "visually pleasing", they are often not a good basis for mathematical operations on data. For instance, when we compute the mean of observations, it makes a difference whether we do that in power or dB scale. Since dB scale is a logarithmic scale, we cannot simply average data in that scale.
Please note that the correct scale in which operations need to be performed is the power scale. This is critical, e.g. when speckle filters are applied, spatial operations like block averaging are performed, or time series are analyzed.
To convert from dB to power, apply: $\gamma^o_{pwr} = 10^{\frac{\gamma^o_{dB}}{10}}$
#calPwr=np.power(10.,caldB/10.)
Now we are ready to create a time series animation from the calibrated SAR data.
First, create a raster from band 0 and a raster stack from all the images:
band = img.GetRasterBand(1)
raster0 = band.ReadAsArray()
band_number = 0 # Needed for updates
rasterstack = img.ReadAsArray()
img = None
Create and move into a directory in which to store our plots and animations:
product_path = analysis_dir/f'plots_and_animations'
if not product_path.exists():
product_path.mkdir()
Create a masked raster stack:
rs2 = np.ma.masked_where(rasterstack==0, rasterstack)
Generate a matplotlib time-series animation:
%%capture
fig = plt.figure(figsize=(14, 8))
ax = fig.subplots()
ax.axis('off')
vmin = np.percentile(rasterstack.flatten(), 5)
vmax = np.percentile(rasterstack.flatten(), 95)
r0dB = 20 * np.log10(raster0) - 83
im = ax.imshow(raster0, cmap='gray', vmin=vmin, vmax=vmax)
ax.set_title("{}".format(tindex[0].date()))
def animate(i):
ax.set_title("{}".format(tindex[i].date()))
im.set_data(rasterstack[i])
# Interval is given in milliseconds
ani = animation.FuncAnimation(fig, animate, frames=rasterstack.shape[0], interval=400)
Configure matplotlib's RC settings for the animation:
rc('animation', embed_limit=40971520.0) # We need to increase the limit maybe to show the entire animation
Create a javascript animation of the time-series running inline in the notebook:
HTML(ani.to_jshtml())
Delete the dummy png that was saved to the current working directory while generating the javascript animation in the last code cell.
try:
product_path/Path('None0000000.png').unlink()
except FileNotFoundError:
pass
Save the animation (animation.gif):
ani.save(f"{product_path}/animation.gif", writer='pillow', fps=2)
To create the time series of means, we will go through the following steps:
Compute the means:
rs_means_pwr = np.mean(rs2, axis=(1, 2))
Convert resulting mean value time-series to dB scale for visualization:
rs_means_dB = 10.*np.log10(rs_means_pwr)
Plot and save the time series of means (RCSoverTime.png):
try:
plt.rcParams.update({'font.size': 14})
fig = plt.figure(figsize=(16, 4))
ax1 = fig.subplots()
window_length = len(rs_means_pwr)-1
if window_length % 2 == 0:
window_length -= 1
polyorder = math.ceil(window_length*0.1)
yhat = scipy.signal.savgol_filter(rs_means_pwr, window_length, polyorder)
ax1.plot(tindex, yhat, color='red', marker='o', markerfacecolor='white', linewidth=3, markersize=6)
ax1.plot(tindex, rs_means_pwr, color='gray', linewidth=0.5)
plt.grid()
ax1.set_xlabel('Date')
ax1.set_ylabel(r'$\overline{\gamma^o}$ [power]')
plt.savefig(f"{product_path}/RCSoverTime.png", dpi=72, transparent='true')
except ValueError as e:
print(f"Error: polyorder: {polyorder} >= window_length: {window_length}")
raise
The coefficient of variance describes how much the $\sigma_{0}$ or $\gamma_{0}$ measurements in a pixel vary over time. Hence, the coefficient of variance can indicate different vegetation cover and soil moisture regimes in your area.
Write a function to convert our plots into GeoTiffs:
def geotiff_from_plot(source_image, out_filename, extent, utm, cmap=None, vmin=None, vmax=None, interpolation=None, dpi=300):
assert "." not in out_filename, 'Error: Do not include the file extension in out_filename'
assert type(extent) == list and len(extent) == 2 and len(extent[0]) == 2 and len(
extent[1]) == 2, 'Error: extent must be a list in the form [[upper_left_x, upper_left_y], [lower_right_x, lower_right_y]]'
plt.figure()
plt.axis('off')
plt.imshow(source_image, cmap=cmap, vmin=vmin, vmax=vmax, interpolation=interpolation)
temp = f"{out_filename}_temp.png"
plt.savefig(temp, dpi=dpi, transparent='true', bbox_inches='tight', pad_inches=0)
cmd = f"gdal_translate -of Gtiff -a_ullr {extent[0][0]} {extent[0][1]} {extent[1][0]} {extent[1][1]} -a_srs EPSG:{utm} {temp} {out_filename}.tiff"
!{cmd}
try:
Path(temp).unlink()
except FileNotFoundError:
pass
Plot the Coefficient of Variance Map and save it as a png (Coeffvar.png):
test = np.var(rasterstack,0)
mtest = np.mean(rasterstack[rasterstack.nonzero()],0)
coeffvar = test/(mtest+0.001)
plt.rcParams.update({'font.size': 14})
fig = plt.figure(figsize=(13, 10))
ax = fig.subplots()
ax.axis('off')
vmin = np.percentile(coeffvar.flatten(), 5)
vmax = np.percentile(coeffvar.flatten(), 95)
ax.set_title('Coefficient of Variance Map')
im = ax.imshow(coeffvar, cmap='jet', vmin=vmin, vmax=vmax)
fig.colorbar(im, ax=ax)
plt.savefig(f"{product_path}/Coeffvar.png", dpi=300, transparent='true')
Save the coefficient of variance map as a GeoTiff (Coeffvar.tiff):
%%capture
geotiff_from_plot(coeffvar, f"{product_path}/Coeffvar", coords, utm, cmap='jet', vmin=vmin, vmax=vmax)
This is an example how to threshold the derived coefficient of variance map. This can be useful, e.g., to detect areas of active agriculture.
Plot and save the coefficient of variance histogram and CDF (thresh_coeff_var_histogram.png):
plt.rcParams.update({'font.size': 14})
fig = plt.figure(figsize=(14, 6)) # Initialize figure with a size
ax1 = fig.add_subplot(121) # 121 determines: 2 rows, 2 plots, first plot
ax2 = fig.add_subplot(122)
# Second plot: Histogram
# IMPORTANT: To get a histogram, we first need to *flatten*
# the two-dimensional image into a one-dimensional vector.
h = ax1.hist(coeffvar.flatten(), bins=200, range=(0, 0.03))
ax1.xaxis.set_label_text('Coefficient of Variation')
ax1.set_title('Coeffvar Histogram')
plt.grid()
n, bins, patches = ax2.hist(coeffvar.flatten(), bins=200, range=(0, 0.03), cumulative='True', density='True', histtype='step', label='Empirical')
ax2.xaxis.set_label_text('Coefficient of Variation')
ax2.set_title('Coeffvar CDF')
plt.grid()
plt.savefig(f"{product_path}/thresh_coeff_var_histogram.png", dpi=72, transparent='true')
Plot the Threshold Coefficient of Variance Map and save it as a png (Coeffvarthresh.png):
plt.rcParams.update({'font.size': 14})
outind = np.where(n > 0.80)
threshind = np.min(outind)
thresh = bins[threshind]
coeffvarthresh = np.copy(coeffvar)
coeffvarthresh[coeffvarthresh < thresh] = 0
coeffvarthresh[coeffvarthresh > 0.1] = 0
fig = plt.figure(figsize=(13, 10))
ax = fig.subplots()
ax.axis('off')
vmin = np.percentile(coeffvar.flatten(), 5)
vmax = np.percentile(coeffvar.flatten(), 95)
ax.set_title(r'Thresholded Coeffvar Map [$\alpha=95\%$]')
im = ax.imshow(coeffvarthresh, cmap='jet', vmin=vmin, vmax=vmax)
bar = fig.colorbar(im, ax=ax)
plt.savefig(f"{product_path}/Coeffvarthresh.png", dpi=300, transparent='true')
Save the Threshold Coefficient of Variance Map as a GeoTiff (Coeffvarthresh.tiff):
%%capture
geotiff_from_plot(coeffvarthresh, f"{product_path}/Coeffvarthresh", coords, utm, cmap='jet', vmin=vmin, vmax=vmax)
*Time_Series_From_Prepared_Stack.ipynb - Version 1.4.2 - February 2024
Version Changes: