In this notebook we perform time series-based surface change analysis on permanent TLS point clouds of the sandy beach at Kijkduin, acquired over a timespan of around six months (Vos et al., 2022). An introduction to the case study and dataset can be found here.
The objective is to assess surface dynamics using Principal Component Analysis (PCA; following Frantzen et al., 2023). Consult the related article to compare possible surface dynamics of the use case and for guidance on choosing suitable parameters.
The workflow is introduced throughout this notebook. Also, make use of the software documentation!
This task is solved using Python with the py4dgeo library.
You can use CloudCompare or GIS software (e.g. QGIS) to check the data and visualize your results.
The dataset is a subsampled version of the original time series, using 12-hourly epochs of point clouds and spatial subsampling to 50 cm. The dataset can be downloaded (module3.zip) from the E-learning course E-TRAINEE. In the data directory kijkduin, you find the prepared input point clouds and a core points point cloud, which has been manually cleaned of noise.
E-TRAINEE is an e-learning course on Time Series Analysis in Remote Sensing for Understanding Human-Environment Interactions. This course has been developed by research groups from four partner universities – Charles University, Heidelberg University, University of Innsbruck, and University of Warsaw.
Prepare the analysis by compiling the list of files (epochs) and reading the timestamps from the file names (format YYMMDD_hhmmss) into datetime objects. Use the point cloud files and timestamps to create a py4dgeo SpatiotemporalAnalysis object. For this you need to instantiate the M3C2 algorithm. You can use the point cloud file 170115_150816_aoi_50cm.laz as core points. Explore the point cloud properties in CloudCompare.
Hint: Given the flat topography and the predominant processes of sand deposition and erosion, it can be suitable to orient the normals purely vertically. In this case, they do not need to be computed, and you can customize the py4dgeo algorithm accordingly.
Use the first point cloud in the time series (list of files) as reference epoch. You can assume a registration error of 1.9 cm for the M3C2 distance calculation (cf. Vos et al., 2022).
Explore the spatiotemporal change information by visualizing the changes at a selected epoch and visualizing the time series at a selected location.
First, we start by setting up the Python environment and data:
# Import required modules
import py4dgeo
import os
import numpy as np
from datetime import datetime
import pooch
# Download data from zenodo and set path to point cloud folder
p = pooch.Pooch(base_url="doi:10.5281/zenodo.10003574/", path=pooch.os_cache("py4dgeo"))
p.load_registry_from_doi()
p.fetch("module3.zip", processor=pooch.Unzip())
pc_dir = p.path / "module3.zip.unzip" / "module3" / "kijkduin" / "pointclouds"
# List of point clouds (time series) in temporal order;
# sorting is required because os.listdir returns files in arbitrary order,
# and works here because the file names contain the timestamp
pc_list = sorted(os.listdir(pc_dir))
pc_list[:5]  # print the first elements
In the list of point cloud files you can see that we have one laz file per epoch. Each file name contains the timestamp of the epoch in the format YYMMDD_hhmmss. To use this information for our analysis, we read the timestamp information from the file names into datetime objects:
# Read the timestamps from file names
timestamps = []
for f in pc_list:
    if not f.endswith(".laz"):
        continue
    # Get the timestamp from the file name
    timestamp_str = "_".join(f.split(".")[0].split("_")[1:])  # yields YYMMDD_hhmmss
    # Convert string to datetime object
    timestamp = datetime.strptime(timestamp_str, "%y%m%d_%H%M%S")
    timestamps.append(timestamp)
timestamps[:5]
Now we use the point cloud files and timestamp information to create a SpatiotemporalAnalysis object.
# Create the spatiotemporal analysis object, stored as an archive file
# (force=True overwrites any existing analysis file of the same name)
analysis = py4dgeo.SpatiotemporalAnalysis(pc_dir / "kijkduin.zip", force=True)
As reference epoch, we use the first epoch in our time series:
# Specify the reference epoch
reference_epoch_file = pc_dir / pc_list[0]
# Read the reference epoch and set the timestamp
reference_epoch = py4dgeo.read_from_las(reference_epoch_file)
reference_epoch.timestamp = timestamps[0]
# Set the reference epoch in the spatiotemporal analysis object
analysis.reference_epoch = reference_epoch
For epochs to be added, we now configure the M3C2 algorithm to derive the change values. We would like to set the normals purely vertically, so we define a customized computation of cylinder directions:
# Inherit from the M3C2 algorithm class to define a custom direction algorithm
class M3C2_Vertical(py4dgeo.M3C2):
    def directions(self):
        return np.array([0, 0, 1])  # vertical vector orientation

# Specify corepoints, here all points of the reference epoch
analysis.corepoints = reference_epoch.cloud[::]

# Specify M3C2 parameters for our custom algorithm class
analysis.m3c2 = M3C2_Vertical(
    cyl_radii=(1.0,), max_distance=10.0, registration_error=0.019
)
Now we add all the other epochs with their timestamps:
# Create a list to collect epoch objects
epochs = []
laz_files = [f for f in pc_list if f.endswith(".laz")]
for e, pc_file in enumerate(laz_files[1:]):
    epoch_file = pc_dir / pc_file
    epoch = py4dgeo.read_from_las(epoch_file)
    # Use e + 1, because the first epoch (index 0) is the reference epoch
    epoch.timestamp = timestamps[e + 1]
    epochs.append(epoch)

# Add epoch objects to the spatiotemporal analysis object
analysis.add_epochs(*epochs)
# Print the spatiotemporal analysis data for the first 3 corepoints and 5 epochs
print(f"Space-time distance array:\n{analysis.distances[:3, :5]}")
print(
    f"Uncertainties of M3C2 distance calculation:\n{analysis.uncertainties['lodetection'][:3, :5]}"
)
print(f"Timestamp deltas of analysis:\n{analysis.timedeltas[:5]}")
We visualize the changes in the scene for a selected epoch, together with the time series of surface changes at a selected location. The location used here was selected separately in CloudCompare (as the corepoint id):
cp_idx_sel = 15162 # selected core point index
epoch_idx_sel = 28 # selected epoch index
# Import plotting module
import matplotlib.pyplot as plt
# Render plots inline in the notebook
%matplotlib inline
# Create the figure
fig = plt.figure(figsize=(15, 5))
ax1 = fig.add_subplot(1, 2, 1)
ax2 = fig.add_subplot(1, 2, 2)
# Get the corepoints
corepoints = analysis.corepoints.cloud
# Get change values of the selected epoch for all corepoints
distances = analysis.distances
distances_epoch = [d[epoch_idx_sel] for d in distances]

# Get the time series of changes at the selected core point location
coord_sel = analysis.corepoints.cloud[cp_idx_sel]
timeseries_sel = distances[cp_idx_sel]
# Get the list of timestamps from the reference epoch timestamp and timedeltas
timestamps = [t + analysis.reference_epoch.timestamp for t in analysis.timedeltas]
# Plot the scene
d = ax1.scatter(
    corepoints[:, 0],
    corepoints[:, 1],
    c=distances_epoch[:],
    cmap="seismic_r",
    vmin=-1.5,
    vmax=1.5,
    s=1,
    zorder=1,
)
plt.colorbar(d, format=("%.2f"), label="Distance [m]", ax=ax1, pad=0.15)
# Add the location of the selected coordinate
ax1.scatter(
    coord_sel[0],
    coord_sel[1],
    facecolor="yellow",
    edgecolor="black",
    s=100,
    zorder=2,
    label="Selected location",
    marker="*",
)
ax1.legend()
# Configure the plot layout
ax1.set_xlabel("X [m]")
ax1.set_ylabel("Y [m]")
ax1.set_aspect("equal")
ax1.set_title(
    "Changes at %s"
    % (analysis.reference_epoch.timestamp + analysis.timedeltas[epoch_idx_sel])
)
# Plot the time series (raw values and a connecting line; smoothing follows later)
ax2.scatter(timestamps, timeseries_sel, s=5, color="black", label="Raw")
ax2.plot(timestamps, timeseries_sel, color="blue", linewidth=0.5)
ax2.set_xlabel("Date")
# Add the epoch of the plotted scene
ax2.scatter(
    timestamps[epoch_idx_sel],
    timeseries_sel[epoch_idx_sel],
    facecolor="yellow",
    marker="*",
    edgecolor="black",
    s=100,
    label="Selected epoch",
)
ax2.legend()
# Format the date labels
import matplotlib.dates as mdates

dtFmt = mdates.DateFormatter("%b-%d")
ax2.xaxis.set_major_formatter(dtFmt)

# Configure the plot layout
ax2.set_ylabel("Distance [m]")
ax2.grid()
ax2.set_ylim(-0.3, 1.6)
ax2.set_title("Time series at selected location")
plt.tight_layout()
plt.show()
The map of changes in the scene shows linear structures of sand deposition near the coast, which can be interpreted as sand bars (with knowledge of coastal processes). This is confirmed by the surface behavior over time, expressed in the time series plot. However, the time series is quite noisy, especially in this part of the beach, which is regularly covered by water during high tides (leading to missing data) and whose surface moisture varies strongly (influencing the LiDAR range measurement and causing noise). We therefore continue with temporal smoothing of the time series.
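As a simple, library-independent illustration of such smoothing, a centered rolling median could be applied to the time series at the selected location (a minimal sketch; the window of 14 epochs, i.e. one week of 12-hourly data, is an assumption):
# Minimal sketch of temporal smoothing: centered rolling median, ignoring NaNs
# (the window of 14 epochs, i.e. one week of 12-hourly data, is an assumption)
def rolling_median(series, window=14):
    half = window // 2
    smoothed = np.full(len(series), np.nan)
    for t in range(len(series)):
        values = series[max(0, t - half) : t + half + 1]
        if not np.all(np.isnan(values)):
            smoothed[t] = np.nanmedian(values)
    return smoothed

# Apply to the time series at the selected location
timeseries_sel_smooth = rolling_median(timeseries_sel, window=14)
This smoothed series could then be drawn on top of the raw values in the time series plot above.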
In the following, we will use principal component analysis (PCA; also referred to as empirical orthogonal functions) as a tool to summarize the change patterns within a spatiotemporal dataset. The method is based on this master thesis, to be published in Frantzen et al. (2023).
For this technique, the 4D data is represented as a two-dimensional array in which the rows correspond to the different epochs and the columns correspond to the individual [x, y] coordinates at which a z or surface change value is available.
Reshape the data from the SpatiotemporalAnalysis object so that all coordinates are represented in one axis, i.e. an array holding [time, space]:
X = distances.transpose()
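As a quick sanity check of the layout (a sketch), the array should now have one row per epoch and one column per core point location:
# One row per epoch, one column per core point location
print(X.shape)  # e.g. (number of epochs, number of core points)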
Find the coordinates without NaN values and create a new array containing only these columns:
idxs = np.argwhere(
    ~np.isnan(X).any(axis=0)
)  # store the column indices where actual data is present
X_no_nan = X[:, idxs.flatten()].reshape((X.shape[0], -1))
Compute the loadings and scores of the principal components in the 'NaN-free' X:
from sklearn.decomposition import PCA
pca = PCA()
scores = pca.fit_transform(X_no_nan)
result = pca.components_
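To make the relation between loadings, scores, and the original data concrete, the space-time array can be approximated from the leading components (a minimal sketch; the number of retained components k is an arbitrary choice here):
# Approximate the NaN-free space-time array from the first k components:
# scores (time) x loadings (space), plus the per-coordinate mean removed by PCA
k = 3  # number of retained components (assumption for illustration)
X_approx = scores[:, :k] @ result[:k, :] + pca.mean_
rmse = np.sqrt(np.mean((X_no_nan - X_approx) ** 2))
print(f"RMSE of the rank-{k} approximation: {rmse:.3f} m")
The residual error shrinks as k grows; with all components retained, the reconstruction is exact up to numerical precision.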
Plot the principal component loadings and scores along with their fraction of explained variance:
coords_no_nan = analysis.corepoints.cloud[idxs]
for i in range(1):  # use range(pca.n_components_) to plot all components
    fig, axes = plt.subplots(1, 2, figsize=(10, 4))

    # Saturate the color scale at the mean of the absolute 10th/90th percentiles
    sat_val = (
        np.abs(np.nanpercentile(result[i], 10))
        + np.abs(np.nanpercentile(result[i], 90))
    ) / 2

    # Plot the loading of each core point location (left panel)
    s = axes[0].scatter(
        coords_no_nan[:, 0, 0],
        coords_no_nan[:, 0, 1],
        c=result[i],
        vmin=-sat_val,
        vmax=sat_val,
        cmap="PiYG",
    )
    axes[0].set_aspect("equal")

    # Plot the score of the component for each epoch (right panel)
    axes[1].plot(scores[:, i])
    axes[1].set_xlabel("Epoch")
    axes[1].set_ylabel("Score")

    fig.suptitle(
        f"PC loading (left) and scores (right) for principal component {i} "
        f"with EVF {pca.explained_variance_ratio_[i]:.3f}"
    )
    plt.colorbar(s, ax=axes[0])
    plt.tight_layout()
    plt.show()
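To decide how many components are worth interpreting, it can help to look at the explained variance of the leading components (a minimal sketch; showing ten components is an arbitrary choice):
# Scree plot sketch: individual and cumulative explained variance fractions
n_show = 10  # number of components to display (assumption)
evf = pca.explained_variance_ratio_[:n_show]
plt.figure(figsize=(6, 3))
plt.bar(range(1, n_show + 1), evf, label="Per component")
plt.plot(range(1, n_show + 1), np.cumsum(evf), color="black", marker="o", label="Cumulative")
plt.xlabel("Principal component")
plt.ylabel("Explained variance fraction")
plt.legend()
plt.tight_layout()
plt.show()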