This notebook looks for large wind events in Sandheads historical wind data. We are looking for a wind storm that did not result in a surge. We would like to ensure that the model does not make a surge if one did not actually occur.
This code uses a combination of Doug's class for reading Sandheads data and Kate's methodology for finding ssh anomalies.
from __future__ import division, print_function
import arrow
from dateutil import tz
import numpy as np
%matplotlib inline
from matplotlib import pylab
import matplotlib.pyplot as plt
class SOGWindTimeseries(object):
    """Wind time series read from a SOG-forcing wind data file.

    After :meth:`read_data` the instance carries:
      timestamps -- numpy array of timezone-aware Arrow timestamps
      cross_winds, along_winds -- wind components as plain lists
      cross_strait, along_strait -- the same components as numpy arrays
      speeds -- wind speed magnitudes, sqrt(cross**2 + along**2)
    """
    def __init__(self, datafile):
        # Path to the SOG-forcing wind data file.
        self.datafile = datafile
    def read_data(self):
        """Read time stamps and cross-/along-strait wind components from
        the data file and calculate the wind speed magnitudes.
        """
        with open(self.datafile, 'rt') as data:
            records = list(self._interesting(data))
        if records:
            stamps, cross, along = (list(col) for col in zip(*records))
        else:
            stamps, cross, along = [], [], []
        self.timestamps = np.array(stamps)
        self.cross_winds = cross
        self.along_winds = along
        self.cross_strait = np.array(cross)
        self.along_strait = np.array(along)
        self.speeds = np.sqrt(self.cross_strait**2 + self.along_strait**2)
    def _interesting(self, data):
        """Yield (timestamp, cross_wind, along_wind) tuples parsed from
        the whitespace-delimited rows of the data stream.
        """
        tzinfo = tz.gettz('Canada/Pacific')
        for row in data:
            day, month, year, hour, cross, along = row.split()
            # Hours are stored as floats in the file; truncate to the hour.
            timestamp = arrow.get(
                int(year), int(month), int(day), int(float(hour)),
                tzinfo=tzinfo)
            yield timestamp, float(cross), float(along)
# Load the full Sandheads wind record from the SOG-forcing repository.
wind = SOGWindTimeseries('/ocean/nsoontie/SOG-projects/SOG-forcing/wind/SH_pp_08Feb2014.dat')
wind.read_data()
# First and last timestamps in the record (local Canada/Pacific time).
wind.timestamps[0], wind.timestamps[-1]
(<Arrow [1967-01-01T00:00:00-08:00]>, <Arrow [2014-02-08T23:00:00-08:00]>)
# Number of hourly records in the series.
wind.timestamps.shape
(412944,)
# Index, time, and magnitude of the strongest wind in the whole record.
imax = np.argmax(wind.speeds)
imax, wind.timestamps[imax], wind.speeds[imax]
(72246, <Arrow [1975-03-30T06:00:00-08:00]>, 27.199829429796065)
First, we should think about what kinds of wind speeds are significant. The December 15, 2006 great wind event had speeds up to 25 m/s. This event did see a large sea surface height anomaly (greater than 40 cm) but this did not occur at high tide and hence caused no significant flooding.
Some other surge events in the SFU review saw average wind speeds between 6m/s and 18m/s.
Let's use 20 m/s as a threshold value. This is equivalent to 72 km/hr so is quite windy. I will identify wind events greater than this value and then cross reference them with the large ssh anomalies that Kate found. I will select dates with large wind events but no large ssh anomaly.
Something to consider: The duration of the wind event may matter. So I may have to search for the threshold and duration conditions.
Looking at dates after 2001 because of CGRF forcing runs from 2002-2010.
wind_thres = 20  # wind speed threshold (m/s); 20 m/s ~ 72 km/h
yr_thres = 2001  # keep only events after this year, i.e. 2002 onward
# Indices of all hours where the speed exceeds the threshold.
ind_thres = np.where(wind.speeds > wind_thres)
winds = wind.speeds[ind_thres]
dates_thres = wind.timestamps[ind_thres]
# Map position-within-the-strong-wind-subset -> timestamp, keeping only
# events after the year threshold.
ds = {i: stamp for i, stamp in enumerate(dates_thres) if stamp.year > yr_thres}
ds
{125: <Arrow [2002-05-19T22:00:00-07:00]>, 126: <Arrow [2003-01-03T01:00:00-08:00]>, 127: <Arrow [2003-10-28T18:00:00-08:00]>, 128: <Arrow [2006-11-15T09:00:00-08:00]>, 129: <Arrow [2006-12-11T14:00:00-08:00]>, 130: <Arrow [2006-12-15T03:00:00-08:00]>, 131: <Arrow [2007-01-05T23:00:00-08:00]>, 132: <Arrow [2007-01-09T15:00:00-08:00]>, 133: <Arrow [2007-11-12T06:00:00-08:00]>, 134: <Arrow [2007-11-12T07:00:00-08:00]>, 135: <Arrow [2007-11-12T08:00:00-08:00]>, 136: <Arrow [2008-01-14T22:00:00-08:00]>, 137: <Arrow [2009-11-18T18:00:00-08:00]>, 138: <Arrow [2009-11-18T19:00:00-08:00]>, 139: <Arrow [2010-01-18T03:00:00-08:00]>, 140: <Arrow [2010-01-18T04:00:00-08:00]>, 141: <Arrow [2010-01-18T05:00:00-08:00]>, 142: <Arrow [2010-01-18T06:00:00-08:00]>, 143: <Arrow [2010-03-29T01:00:00-07:00]>, 144: <Arrow [2010-04-02T12:00:00-07:00]>, 145: <Arrow [2010-04-02T13:00:00-07:00]>, 146: <Arrow [2010-04-02T14:00:00-07:00]>, 147: <Arrow [2010-04-02T17:00:00-07:00]>, 148: <Arrow [2010-04-02T18:00:00-07:00]>, 149: <Arrow [2012-03-12T03:00:00-07:00]>, 150: <Arrow [2012-03-12T04:00:00-07:00]>, 151: <Arrow [2012-03-12T06:00:00-07:00]>, 152: <Arrow [2012-03-12T07:00:00-07:00]>, 153: <Arrow [2012-03-12T09:00:00-07:00]>, 154: <Arrow [2012-11-18T21:00:00-08:00]>}
Some possible days with more than one hour of high wind:
Nov 12, 2007 - flagged by Kate
Nov 18, 2009
Jan 18, 2010 - flagged by Kate
Apr 2, 2010
Mar 12, 2012 - flagged by Kate and out of range anyway
This leaves Nov 18, 2009 and Apr 2, 2010 as potential dates. I may want to check on the CGRF data to compare with observations.
My preference would be to examine the Nov 18, 2009 case. We know that storm surges are more likely to occur in the winter so this would be a good test that our model does not predict a storm when one did not occur.
Let's check the wind at Sandheads during that week to confirm this is a good case.
import pytz
# Convert each Arrow timestamp to a UTC datetime for matplotlib plotting.
utc = pytz.timezone('UTC')
times = [stamp.datetime.astimezone(utc) for stamp in wind.timestamps]
times[1]
datetime.datetime(1967, 1, 1, 9, 0, tzinfo=<UTC>)
import datetime
# Plot the full wind speed record, then zoom the x axis in on the week
# around the candidate event (Nov 15-20, 2009).
fig, ax = plt.subplots(1)
fig.autofmt_xdate()
# use a more precise date string for the x axis locations in the
# toolbar
import matplotlib.dates as mdates
ax.fmt_xdata = mdates.DateFormatter('%Y-%m-%d')
pylab.plot(times,wind.speeds)
ax.set_xlim([datetime.datetime(2009,11,15), datetime.datetime(2009,11,20)])
pylab.xlabel('Time (UTC)')
pylab.ylabel('Wind speed (m/s)')
<matplotlib.text.Text at 0x106d1dd0>
Let's check on water level at Point Atkinson during this event.
# Date range for the water level comparison around the Nov 18, 2009 event.
start='15-Nov-2009'
end='20-Nov-2009'
# Elevation axis limits (m); NOTE(review): not used below in this view.
emin=-3
emax=3
plt.figure(figsize=(18,12))
# DFO station IDs and tidal datums (m) for the stations of interest.
obs = {'PointAtkinson': 7795, 'Victoria': 7120, 'PatriciaBay': 7277, 'CampbellRiver': 8074}
datums = {'PointAtkinson': 3.09, 'Victoria': 1.881, 'PatriciaBay': 2.256, 'CampbellRiver': 2.916}
wlev_pred_xtide={}
wlev_meas={}
import datetime
import pandas as pd
import pytz
from salishsea_tools import tidetools
# Fetch the measured water levels for Point Atkinson; get_dfo_wlev
# presumably writes the csv named below, which the next cell reads —
# confirm against salishsea_tools.tidetools.
key='PointAtkinson'
statID_PA =obs[key]
filename = 'wlev_' +str(statID_PA) + '_' + start +'_' +end +'.csv'
tidetools.get_dfo_wlev(statID_PA,start,end)
def dateParserMeasured(s):
    """Parse a DFO water level timestamp string ('%Y/%m/%d %H:%M'),
    interpret it as Canada/Pacific local time, and return it in UTC.
    """
    # Parse the naive (zone-less) datetime.
    unaware = datetime.datetime.strptime(s, "%Y/%m/%d %H:%M")
    # Attach the local zone with pytz's localize(): passing a pytz zone
    # via datetime.replace(tzinfo=...) silently uses the zone's first
    # historical offset (LMT, UTC-8:12 for Canada/Pacific) and never
    # applies DST, shifting every converted time by ~12 minutes.
    aware = pytz.timezone('Canada/Pacific').localize(unaware)
    # Convert to UTC for plotting against the wind record.
    return aware.astimezone(pytz.timezone('UTC'))
# Read the measured water levels (skipping the 7-line DFO header) and
# parse the timestamp column into UTC with the parser defined above.
wlev_meas[key] = pd.read_csv(filename,skiprows=7,parse_dates=[0],date_parser=dateParserMeasured)
wlev_meas[key] = wlev_meas[key].rename(columns={'Obs_date': 'time', 'SLEV(metres)': 'slev'})
pylab.plot(wlev_meas[key].time, wlev_meas[key].slev-datums[key],'k*',label='observed') #subtracted off the datum from t_xtide
# NOTE(review): `i` is a leftover loop variable from an earlier cell;
# this increment looks vestigial (nothing below reads it) — confirm.
i=i+1
#plotting t_xtide predictions
filename = key+ ' (2)_xtide_prediction_' + start+'_'+end+'.csv'
def dateParserMeasured2(s):
    """Parse a t_xtide timestamp string ('%d-%b-%Y %H:%M:%S ' — note the
    trailing space), interpret it as Canada/Pacific local time, and
    return it in UTC.
    """
    # Parse the naive (zone-less) datetime.
    unaware = datetime.datetime.strptime(s, "%d-%b-%Y %H:%M:%S ")
    # Attach the local zone with pytz's localize(): passing a pytz zone
    # via datetime.replace(tzinfo=...) silently uses the zone's first
    # historical offset (LMT, UTC-8:12 for Canada/Pacific) and never
    # applies DST, shifting every converted time by ~12 minutes.
    aware = pytz.timezone('Canada/Pacific').localize(unaware)
    # Convert to UTC for plotting against the observations.
    return aware.astimezone(pytz.timezone('UTC'))
# Read the t_xtide predicted water levels (tab-delimited) with the
# second parser, and plot them on the same axes as the observations.
wlev_pred_xtide[key] = pd.read_csv(filename,delimiter='\t',parse_dates=[0],date_parser=dateParserMeasured2)
wlev_pred_xtide[key] = wlev_pred_xtide[key].rename(columns={'Time_Local ': 'time', ' wlev_pred ': 'slev'})
pylab.plot(wlev_pred_xtide[key].time, wlev_pred_xtide[key].slev-datums[key], label='t_xtide')
pylab.grid()
pylab.legend(loc=0)
<matplotlib.legend.Legend at 0x10d9ed10>
This confirms that the high wind event does not result in a large ssh anomaly. The anomaly on Nov 17 is fairly large, but it doesn't last more than 12 hours, and this date does not come up in the literature searches.