Eric Gagliano (egagli@uw.edu)
Updated: February 20th, 2024
Thanks for checking out this notebook! My hope is that this repository makes it easier to retrieve daily SNOTEL and CCSS data without having to do clunky downloads and conversions. Snow depth / SWE / PRCPSA are in meters, temperatures are in celsius. The only required packages you'll need are geopandas and pandas, the rest of the imports are for applications :)
import pandas as pd
import geopandas as gpd
import numpy as np
import matplotlib.pyplot as plt
import datetime
import tqdm
import contextily as ctx
endDate
columnall_stations_gdf = gpd.read_file('https://raw.githubusercontent.com/egagli/snotel_ccss_stations/main/all_stations.geojson').set_index('code')
all_stations_gdf = all_stations_gdf[all_stations_gdf['csvData']==True]
all_stations_gdf
name | network | elevation_m | latitude | longitude | state | HUC | mgrs | mountainRange | beginDate | endDate | csvData | geometry | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
code | |||||||||||||
301_CA_SNTL | Adin Mtn | SNOTEL | 1886.712036 | 41.235828 | -120.791924 | California | 180200021403 | 10TFL | Great Basin Ranges | 1983-10-01 | 2100-01-01 | True | POINT (-120.79192 41.23583) |
907_UT_SNTL | Agua Canyon | SNOTEL | 2712.719971 | 37.522171 | -112.271179 | Utah | 160300020301 | 12SUG | Colorado Plateau | 1994-10-01 | 2100-01-01 | True | POINT (-112.27118 37.52217) |
916_MT_SNTL | Albro Lake | SNOTEL | 2529.840088 | 45.597229 | -111.959023 | Montana | 100200050701 | 12TVR | Central Montana Rocky Mountains | 1996-09-01 | 2100-01-01 | True | POINT (-111.95902 45.59723) |
1267_AK_SNTL | Alexander Lake | SNOTEL | 48.768002 | 61.749668 | -150.889664 | Alaska | 190205051106 | 05VPJ | NaN | 2014-08-28 | 2100-01-01 | True | POINT (-150.88966 61.74967) |
908_WA_SNTL | Alpine Meadows | SNOTEL | 1066.800049 | 47.779572 | -121.698471 | Washington | 171100100501 | 10TET | Cascade Range | 1994-09-01 | 2100-01-01 | True | POINT (-121.69847 47.77957) |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
SLT | Slate Creek | CCSS | 1737.360000 | 41.043980 | -122.480103 | California | 180200050304 | 10TEL | Klamath Mountains | 2004-10-01 | 2024-02-12 | True | POINT (-122.48010 41.04398) |
SLI | Slide Canyon | CCSS | 2804.160000 | 38.091234 | -119.431881 | California | 180400090501 | 11SKC | Sierra Nevada | 2005-10-01 | 2024-02-20 | True | POINT (-119.43188 38.09123) |
SLK | South Lake | CCSS | 2926.080000 | 37.175903 | -118.562660 | California | 180901020601 | 11SLB | Sierra Nevada | 2004-10-01 | 2024-02-20 | True | POINT (-118.56266 37.17590) |
STL | State Lakes | CCSS | 3169.920000 | 36.926483 | -118.573250 | California | 180300100305 | 11SLA | Sierra Nevada | 2018-11-01 | 2024-02-20 | True | POINT (-118.57325 36.92648) |
TMR | Tamarack Summit | CCSS | 2301.240000 | 37.163750 | -119.200531 | California | 180400060903 | 11SLB | Sierra Nevada | 2011-01-01 | 2024-02-20 | True | POINT (-119.20053 37.16375) |
969 rows × 13 columns
GeoDataFrame.explore()
on the all_stations_gdf
geodataframe to interactively view the stations¶all_stations_gdf.astype(dict(beginDate=str, endDate=str)).explore(column='network',cmap='bwr')
pd.read_csv()
with index_col='datetime'
and parse_dates=True
so we interpret the datetime column as pandas datetime objectsstation_id = '679_WA_SNTL'
paradise_snotel = pd.read_csv(f'https://raw.githubusercontent.com/egagli/snotel_ccss_stations/main/data/{station_id}.csv',index_col='datetime', parse_dates=True)
paradise_snotel
TAVG | TMIN | TMAX | SNWD | WTEQ | PRCPSA | |
---|---|---|---|---|---|---|
datetime | ||||||
1980-10-01 | NaN | NaN | NaN | NaN | 0.0000 | 0.0000 |
1980-10-02 | NaN | NaN | NaN | NaN | 0.0000 | 0.0000 |
1980-10-03 | NaN | NaN | NaN | NaN | 0.0000 | 0.0000 |
1980-10-04 | NaN | NaN | NaN | NaN | 0.0000 | 0.0000 |
1980-10-05 | NaN | NaN | NaN | NaN | 0.0000 | 0.0000 |
... | ... | ... | ... | ... | ... | ... |
2024-02-16 | -5.1 | -7.5 | -2.5 | 2.2860 | 1.0033 | 0.0076 |
2024-02-17 | -2.6 | -4.8 | -0.4 | 2.3368 | 1.0109 | 0.0025 |
2024-02-18 | 0.6 | -2.1 | 2.6 | 2.2860 | 1.0135 | 0.0152 |
2024-02-19 | 1.3 | -0.2 | 2.8 | 2.3368 | 1.0287 | 0.0025 |
2024-02-20 | NaN | NaN | NaN | 2.3114 | 1.0312 | NaN |
15848 rows × 6 columns
Series.plot()
f,ax=plt.subplots(figsize=(12,5))
paradise_snotel['SNWD'].plot(ax=ax,label='snow depth')
paradise_snotel['WTEQ'].plot(ax=ax,label='snow water equivalent')
ax.set_xlim(pd.to_datetime(['2017-10-01','2018-09-30']))
ax.grid()
ax.legend()
ax.set_xlabel('time')
ax.set_ylabel('snow depth / SWE [meters]')
ax.set_title('Snow depth and SWE at Paradise, WA \n(water year 2018)')
f.tight_layout()
datetime_to_DOWY()
shown below to convert datetimes to day of water year and add a dedicated DOWY columndef datetime_to_DOWY(date):
try:
if date.month < 10 or (date.month == 10 and date.day < 1):
start_of_water_year = pd.Timestamp(year=date.year-1, month=10, day=1)
else:
start_of_water_year = pd.Timestamp(year=date.year, month=10, day=1)
return (date - start_of_water_year).days + 1
except:
return np.nan
def datetime_to_WY(date):
if date.month < 10 or (date.month == 10 and date.day < 1):
return date.year
else:
return date.year + 1
paradise_snotel['DOWY'] = paradise_snotel.index.map(datetime_to_DOWY)
stat_list = ['min','max','mean','std','median']
paradise_snotel_DOWY_snwd_stats = paradise_snotel.groupby('DOWY').agg(stat_list)['SNWD']
paradise_snotel_DOWY_snwd_stats
min | max | mean | std | median | |
---|---|---|---|---|---|
DOWY | |||||
1 | 0.0 | 0.2286 | 0.014111 | 0.053862 | 0.0 |
2 | 0.0 | 0.2032 | 0.011289 | 0.047895 | 0.0 |
3 | 0.0 | 0.2286 | 0.014111 | 0.053862 | 0.0 |
4 | 0.0 | 0.1270 | 0.012700 | 0.032888 | 0.0 |
5 | 0.0 | 0.1270 | 0.012700 | 0.034022 | 0.0 |
... | ... | ... | ... | ... | ... |
362 | 0.0 | 0.0254 | 0.001411 | 0.005987 | 0.0 |
363 | 0.0 | 0.0254 | 0.001411 | 0.005987 | 0.0 |
364 | 0.0 | 0.0254 | 0.002822 | 0.008214 | 0.0 |
365 | 0.0 | 0.0762 | 0.005644 | 0.018595 | 0.0 |
366 | 0.0 | 0.0000 | 0.000000 | 0.000000 | 0.0 |
366 rows × 5 columns
today = datetime.datetime.today().strftime('%Y-%m-%d')
current_WY = slice(f'{int(today[0:4])-1}-10-01',f'{today}')
current_WY_paradise_snotel = paradise_snotel[current_WY.start:current_WY.stop]
f,ax=plt.subplots(figsize=(12,7))
for stat,stat_color in zip(['min','max','mean','median'],['red','blue','mediumpurple','mediumseagreen']):
ax.plot(paradise_snotel_DOWY_snwd_stats.index, paradise_snotel_DOWY_snwd_stats[stat], label=stat, color=stat_color, linewidth=3)
ax.fill_between(paradise_snotel_DOWY_snwd_stats.index, paradise_snotel_DOWY_snwd_stats['mean'] - paradise_snotel_DOWY_snwd_stats['std'], paradise_snotel_DOWY_snwd_stats['mean'] + paradise_snotel_DOWY_snwd_stats['std'], color='mediumpurple', alpha=0.3, label='mean +/- 1 std')
ax.scatter(current_WY_paradise_snotel.DOWY,current_WY_paradise_snotel.SNWD, marker='o', color= 'black', label='Current WY')
ax.set_xlim([0,366])
ax.set_ylim([0,6])
ax.grid()
ax.legend()
ax.set_title(f'Current snow depth against historical snow depth stats by DOWY at Paradise, WA\n({paradise_snotel.index.min().date()} - {paradise_snotel.index.max().date()})')
ax.set_xlabel('Day of Water Year [Oct 1 - Sept 30]')
ax.set_ylabel('Snow depth [meters]')
f.tight_layout()
Looks like we're below the mean for snow depth for today's DOWY.
pd.DataFrame.from_dict()
¶station_list = ['356_CA_SNTL','BLK']
station_dict = {}
for station in station_list:
tmp = pd.read_csv(f'https://raw.githubusercontent.com/egagli/snotel_ccss_stations/main/data/{station}.csv',index_col='datetime',parse_dates=True)['WTEQ']
station_dict[station] = tmp
stations_swe_df = pd.DataFrame.from_dict(station_dict)#.dropna()
stations_swe_df
356_CA_SNTL | BLK | |
---|---|---|
datetime | ||
1980-07-09 | NaN | NaN |
1980-07-10 | NaN | NaN |
1980-07-11 | NaN | NaN |
1980-07-12 | NaN | NaN |
1980-07-13 | NaN | NaN |
... | ... | ... |
2024-02-16 | 0.4064 | 0.4064 |
2024-02-17 | 0.4089 | 0.4089 |
2024-02-18 | 0.4166 | 0.4166 |
2024-02-19 | 0.4267 | 0.4267 |
2024-02-20 | 0.4902 | 0.4902 |
15932 rows × 2 columns
parse_dates=True
, these should line up!f,ax=plt.subplots(figsize=(20,7))
stations_swe_df.plot(ax=ax,color=['red','blue'])
ax.legend()
ax.set_xlabel('time')
ax.set_ylabel('snow water equivalent [meters]')
ax.set_title('SNOTEL and CCSS SWE at Blue Lakes, CA \nmaybe these are the same? :)')
f.tight_layout()
#ax.set_xlim(['2018-08-01','2020-01-01'])
DataFrame.corr()
stations_swe_df.corr()
356_CA_SNTL | BLK | |
---|---|---|
356_CA_SNTL | 1.000000 | 0.999053 |
BLK | 0.999053 | 1.000000 |
These correlation values, along with the time series above, makes me think these are way too similar... no way these would agree this much even if the stations were right next to each other!
contextily
for a basemapf,ax=plt.subplots(figsize=(7,7))
all_stations_gdf[all_stations_gdf.index=='356_CA_SNTL'].to_crs('EPSG:32611').plot(ax=ax, color='red',label='356_CA_SNTL')
all_stations_gdf[all_stations_gdf.index=='BLK'].to_crs('EPSG:32611').plot(ax=ax, color='blue',label='BLK')
ax.set_xlim([244200,245700])
ax.set_ylim([4276900,4278700])
ctx.add_basemap(ax, crs='EPSG:32611', source=ctx.providers.Esri.WorldImagery)
ax.legend()
f.tight_layout()
Interesting locations :) Based on correlation and location, I'm going to say these are the same! Wonder what those tiny differences are about...
!pip install -q lxml
import requests
from io import StringIO
csv = 'https://cdec.water.ca.gov/misc/SnowSensors.html'
response = requests.get(csv)
same_stations_df = pd.read_html(StringIO(response.content.decode('utf-8')))[0].set_index('ID').sort_index()
same_stations_df = same_stations_df[same_stations_df.nunique(axis=1) > 1]
same_stations_gdf = gpd.GeoDataFrame(same_stations_df, geometry=gpd.points_from_xy(same_stations_df['Longitude'], same_stations_df['Latitude']))
same_stations_gdf.crs = "EPSG:4326"
same_stations_gdf = same_stations_gdf[same_stations_gdf['Operator Agency']=='Natural Resources Conservation Service']
same_stations_gdf
Name | Elev (feet) | Latitude | Longitude | April 1 Avg (inches) | Operator Agency | geometry | |
---|---|---|---|---|---|---|---|
ID | |||||||
ADM | ADIN MOUNTAIN | 6200 | 41.237 | -120.792 | 13.6 | Natural Resources Conservation Service | POINT (-120.79200 41.23700) |
BLK | BLUE LAKES | 8000 | 38.613 | -119.931 | 33.1 | Natural Resources Conservation Service | POINT (-119.93100 38.61300) |
BMW | BIG MEADOWS (SCS) | 8700 | 39.458 | -119.946 | 25.7 | Natural Resources Conservation Service | POINT (-119.94600 39.45800) |
BSK | BURNSIDE LAKE | 8129 | 38.719 | -119.894 | -9999.0 | Natural Resources Conservation Service | POINT (-119.89400 38.71900) |
CDP | CEDAR PASS | 7100 | 41.583 | -120.303 | 18.1 | Natural Resources Conservation Service | POINT (-120.30300 41.58300) |
CSL | CENT SIERRA SNOW LAB | 6900 | 39.325 | -120.367 | 33.6 | Natural Resources Conservation Service | POINT (-120.36700 39.32500) |
CWF | CROWDER FLAT | 5100 | 41.893 | -120.752 | -9999.0 | Natural Resources Conservation Service | POINT (-120.75200 41.89300) |
CXS | CARSON PASS | 8353 | 38.692 | -120.002 | -9999.0 | Natural Resources Conservation Service | POINT (-120.00200 38.69200) |
DSS | DISMAL SWAMP | 7050 | 41.993 | -120.165 | 29.2 | Natural Resources Conservation Service | POINT (-120.16500 41.99300) |
EBB | EBBETTS PASS | 8700 | 38.561 | -119.808 | 38.8 | Natural Resources Conservation Service | POINT (-119.80800 38.56100) |
EP5 | ECHO PEAK 5 | 7800 | 38.849 | -120.079 | 39.5 | Natural Resources Conservation Service | POINT (-120.07900 38.84900) |
FLL | FALLEN LEAF LAKE | 6250 | 38.932 | -120.056 | 7.0 | Natural Resources Conservation Service | POINT (-120.05600 38.93200) |
HGM | HAGANS MEADOW | 8000 | 38.853 | -119.940 | 16.5 | Natural Resources Conservation Service | POINT (-119.94000 38.85300) |
HVN | HEAVENLY VALLEY | 8800 | 38.929 | -119.917 | 28.1 | Natural Resources Conservation Service | POINT (-119.91700 38.92900) |
IDC | INDEPENDENCE CAMP | 7000 | 39.453 | -120.299 | 21.8 | Natural Resources Conservation Service | POINT (-120.29900 39.45300) |
IDP | INDEPENDENCE LAKE (SCS) | 8450 | 39.435 | -120.322 | 41.4 | Natural Resources Conservation Service | POINT (-120.32200 39.43500) |
INN | INDEPENDENCE CREEK | 6500 | 39.494 | -120.293 | 12.7 | Natural Resources Conservation Service | POINT (-120.29300 39.49400) |
LBD | LOBDELL LAKE | 9200 | 38.440 | -119.377 | 17.3 | Natural Resources Conservation Service | POINT (-119.37700 38.44000) |
LVM | LEAVITT MEADOWS | 7200 | 38.305 | -119.552 | 8.0 | Natural Resources Conservation Service | POINT (-119.55200 38.30500) |
LVT | LEAVITT LAKE | 9600 | 38.282 | -119.621 | -9999.0 | Natural Resources Conservation Service | POINT (-119.62100 38.28200) |
MNT | MONITOR PASS | 8350 | 38.670 | -119.615 | -9999.0 | Natural Resources Conservation Service | POINT (-119.61500 38.67000) |
MRL | MARLETTE LAKE | 8000 | 39.173 | -119.905 | 21.1 | Natural Resources Conservation Service | POINT (-119.90500 39.17300) |
MSK | MOUNT ROSE SKI AREA | 8900 | 39.326 | -119.902 | 38.5 | Natural Resources Conservation Service | POINT (-119.90200 39.32600) |
PSN | POISON FLAT | 7900 | 38.501 | -119.631 | 16.2 | Natural Resources Conservation Service | POINT (-119.63100 38.50100) |
RP2 | RUBICON PEAK 2 | 7500 | 39.001 | -120.140 | 29.1 | Natural Resources Conservation Service | POINT (-120.14000 39.00100) |
SDW | SUMMIT MEADOW | 9313 | 38.398 | -119.536 | -9999.0 | Natural Resources Conservation Service | POINT (-119.53600 38.39800) |
SPS | SONORA PASS BRIDGE | 8750 | 38.318 | -119.601 | 26.0 | Natural Resources Conservation Service | POINT (-119.60100 38.31800) |
SPT | SPRATT CREEK | 6150 | 38.666 | -119.817 | 4.5 | Natural Resources Conservation Service | POINT (-119.81700 38.66600) |
SQV | SQUAW VALLEY GOLD COAST | 8200 | 39.194 | -120.276 | 46.5 | Natural Resources Conservation Service | POINT (-120.27600 39.19400) |
TCC | TAHOE CITY CROSS | 6750 | 39.171 | -120.155 | 16.0 | Natural Resources Conservation Service | POINT (-120.15500 39.17100) |
TK2 | TRUCKEE 2 | 6400 | 39.300 | -120.194 | 14.3 | Natural Resources Conservation Service | POINT (-120.19400 39.30000) |
VRG | VIRGINIA LAKES RIDGE | 9300 | 38.077 | -119.234 | 20.3 | Natural Resources Conservation Service | POINT (-119.23400 38.07700) |
WC3 | WARD CREEK 3 | 6750 | 39.136 | -120.219 | 39.4 | Natural Resources Conservation Service | POINT (-120.21900 39.13600) |
f,ax=plt.subplots(figsize=(10,10))
same_stations_gdf.to_crs("EPSG:32611").plot(ax=ax,color='red',label='CCSS station operated by NRCS')
ccss_stations_gdf = all_stations_gdf[all_stations_gdf['network']=='CCSS'].to_crs("EPSG:32611")
ccss_stations_gdf[~ccss_stations_gdf.index.isin(same_stations_gdf.index)].plot(ax=ax,color='blue',label='CCSS station not operated by NRCS')
ctx.add_basemap(ax, crs='EPSG:32611', source=ctx.providers.Esri.WorldImagery)
ax.legend()
f.tight_layout()
There are 33 of these! These are likely also listed as SNOTEL stations...
pd.DataFrame.from_dict()
¶ccss_stations_gdf
of only CCSS stations from all_stations_gdf
by creating an index where network equals CCSSccss_stations_snwd_df
ccss_stations_gdf = all_stations_gdf[all_stations_gdf['network']=='CCSS']
ccss_stations_gdf
name | network | elevation_m | latitude | longitude | state | HUC | mgrs | mountainRange | beginDate | endDate | csvData | geometry | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
code | |||||||||||||
FRW | Farewell Gap | CCSS | 2895.60 | 36.415211 | -118.578979 | California | 180300070202 | 11SLA | Sierra Nevada | 2004-10-01 | 2024-02-20 | True | POINT (-118.57898 36.41521) |
BNK | Bonanza King | CCSS | 1965.96 | 41.083118 | -122.631271 | California | 180102110304 | 10TEL | Klamath Mountains | 2011-01-03 | 2024-02-20 | True | POINT (-122.63127 41.08312) |
CDP | Cedar Pass | CCSS | 2164.08 | 41.583000 | -120.303000 | California | 180200020603 | 10TGM | Great Basin Ranges | NaT | NaT | True | POINT (-120.30300 41.58300) |
CRL | Charlotte Lake | CCSS | 3169.92 | 36.777523 | -118.426010 | California | 180300100206 | 11SLA | Sierra Nevada | 2004-10-01 | 2023-12-11 | True | POINT (-118.42601 36.77752) |
CHM | Chilkoot Meadow | CCSS | 2179.32 | 37.408390 | -119.492188 | California | 180400061101 | 11SKB | Sierra Nevada | 2006-10-01 | 2024-02-20 | True | POINT (-119.49219 37.40839) |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
SLT | Slate Creek | CCSS | 1737.36 | 41.043980 | -122.480103 | California | 180200050304 | 10TEL | Klamath Mountains | 2004-10-01 | 2024-02-12 | True | POINT (-122.48010 41.04398) |
SLI | Slide Canyon | CCSS | 2804.16 | 38.091234 | -119.431881 | California | 180400090501 | 11SKC | Sierra Nevada | 2005-10-01 | 2024-02-20 | True | POINT (-119.43188 38.09123) |
SLK | South Lake | CCSS | 2926.08 | 37.175903 | -118.562660 | California | 180901020601 | 11SLB | Sierra Nevada | 2004-10-01 | 2024-02-20 | True | POINT (-118.56266 37.17590) |
STL | State Lakes | CCSS | 3169.92 | 36.926483 | -118.573250 | California | 180300100305 | 11SLA | Sierra Nevada | 2018-11-01 | 2024-02-20 | True | POINT (-118.57325 36.92648) |
TMR | Tamarack Summit | CCSS | 2301.24 | 37.163750 | -119.200531 | California | 180400060903 | 11SLB | Sierra Nevada | 2011-01-01 | 2024-02-20 | True | POINT (-119.20053 37.16375) |
130 rows × 13 columns
%%time
station_dict = {}
for station in tqdm.tqdm(ccss_stations_gdf.index):
try:
tmp = pd.read_csv(f'https://raw.githubusercontent.com/egagli/snotel_ccss_stations/main/data/{station}.csv',index_col='datetime',parse_dates=True)['SNWD']
station_dict[station] = tmp
except:
print(f'failed to retrieve {station}')
ccss_stations_snwd_df = pd.DataFrame.from_dict(station_dict).dropna(how='all')
100%|██████████| 130/130 [00:36<00:00, 3.60it/s]
CPU times: user 8.12 s, sys: 593 ms, total: 8.72 s Wall time: 37.2 s
ccss_stations_snwd_df
FRW | BNK | CDP | CRL | CHM | HHM | HNT | MUM | SDF | RTL | ... | QUA | LLP | FOR | GEM | WTM | SLT | SLI | SLK | STL | TMR | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
datetime | |||||||||||||||||||||
2004-10-01 | 0.3048 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.0000 | NaN | ... | NaN | 0.2032 | NaN | NaN | NaN | -0.0254 | NaN | -0.0254 | NaN | NaN |
2004-10-02 | 0.3556 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.0000 | NaN | ... | NaN | 0.2540 | NaN | NaN | NaN | 0.0254 | NaN | -0.0254 | NaN | NaN |
2004-10-03 | 0.2540 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.0000 | NaN | ... | NaN | 0.2540 | NaN | NaN | NaN | 0.0254 | NaN | -0.0254 | NaN | NaN |
2004-10-04 | 0.3302 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.0000 | NaN | ... | NaN | 0.0254 | NaN | NaN | NaN | 0.0254 | NaN | -0.0508 | NaN | NaN |
2004-10-05 | 0.3302 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | -0.0000 | NaN | ... | NaN | 0.2286 | NaN | NaN | NaN | 0.0254 | NaN | -0.0508 | NaN | NaN |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
2024-02-16 | 1.3208 | 1.7780 | NaN | NaN | 1.6764 | NaN | 1.2700 | NaN | 2.3114 | NaN | ... | NaN | NaN | NaN | 0.0 | 1.0922 | NaN | 1.9050 | 0.8128 | 1.2192 | 1.1684 |
2024-02-17 | 1.2954 | 1.7272 | NaN | NaN | 1.6510 | NaN | 1.2192 | NaN | 2.2606 | NaN | ... | NaN | 3.4290 | NaN | 0.0 | 1.0668 | NaN | 1.8288 | 0.7366 | 1.1684 | 1.1430 |
2024-02-18 | 1.3208 | 1.8542 | NaN | NaN | 1.7018 | NaN | 4.4958 | NaN | 2.4384 | NaN | ... | NaN | 3.6830 | NaN | 0.0 | 1.0922 | NaN | 1.9558 | 0.7366 | 1.1938 | 1.1684 |
2024-02-19 | 1.4224 | 2.0574 | NaN | NaN | 1.7018 | NaN | 1.2192 | NaN | 2.6162 | NaN | ... | NaN | 4.0132 | NaN | 0.0 | 1.1176 | NaN | 2.1336 | 3.7846 | 1.2954 | 1.1176 |
2024-02-20 | 1.7272 | 2.4130 | NaN | NaN | 1.7272 | NaN | 1.2446 | NaN | NaN | NaN | ... | NaN | 4.3180 | NaN | 0.0 | 1.4224 | NaN | 2.1082 | 1.0414 | 1.6510 | 1.2192 |
7075 rows × 130 columns
ccss_stations_snwd_df
like before, groupby DOWY, apply a median, and divide the observation on April 1st, 2023 by the DOWY 183 (April 1) medianccss_stations_gdf
ccss_stations_snwd_df['DOWY'] = ccss_stations_snwd_df.index.map(datetime_to_DOWY)
ccss_stations_gdf.loc[:,'april2023_percent_norm'] = 100*(ccss_stations_snwd_df['2023-04-01':'2023-04-01'].squeeze() / ccss_stations_snwd_df.groupby('DOWY').median().loc[183])
/srv/conda/envs/notebook/lib/python3.11/site-packages/geopandas/geodataframe.py:1538: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy super().__setitem__(key, value)
ccss_stations_gdf = ccss_stations_gdf.dropna(subset='april2023_percent_norm')
ccss_stations_gdf = ccss_stations_gdf[ccss_stations_gdf['april2023_percent_norm']<7500]
.sort_values()
function to get an idea of percent normal snow depth valuesDataFrame.head()
and DataFrame.tail()
to see highest and lowest percent normal snow depth valuesccss_stations_gdf.sort_values('april2023_percent_norm',ascending=False).head()
name | network | elevation_m | latitude | longitude | state | HUC | mgrs | mountainRange | beginDate | endDate | csvData | geometry | april2023_percent_norm | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
code | ||||||||||||||
FLL | Fallen Leaf Lake | CCSS | 1905.0000 | 38.932000 | -120.056000 | California | 160501010401 | 10SGJ | NaN | 2004-10-01 | 2024-02-19 | True | POINT (-120.05600 38.93200) | 1000.000000 |
DPO | Devils Postpile | CCSS | 2307.0312 | 37.629410 | -119.084671 | California | 180400060402 | 11SLB | Sierra Nevada | 2007-10-01 | 2024-02-20 | True | POINT (-119.08467 37.62941) | 647.272727 |
RCK | Rock Creek Lakes | CCSS | 2956.5600 | 37.457275 | -118.735023 | California | 180901020301 | 11SLB | Sierra Nevada | 2004-10-01 | 2024-02-20 | True | POINT (-118.73502 37.45728) | 566.666667 |
TMR | Tamarack Summit | CCSS | 2301.2400 | 37.163750 | -119.200531 | California | 180400060903 | 11SLB | Sierra Nevada | 2011-01-01 | 2024-02-20 | True | POINT (-119.20053 37.16375) | 546.428571 |
CWD | Cottonwood Lakes | CCSS | 3093.7200 | 36.483829 | -118.177559 | California | 180901030404 | 11SLA | Sierra Nevada | 2004-10-01 | 2024-02-20 | True | POINT (-118.17756 36.48383) | 473.076923 |
ccss_stations_gdf.sort_values('april2023_percent_norm',ascending=False).tail()
name | network | elevation_m | latitude | longitude | state | HUC | mgrs | mountainRange | beginDate | endDate | csvData | geometry | april2023_percent_norm | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
code | ||||||||||||||
LLP | Lower Lassen Peak | CCSS | 2541.4224 | 40.466602 | -121.508110 | California | 180201560301 | 10TFK | Cascade Range | 2004-10-01 | 2024-02-20 | True | POINT (-121.50811 40.46660) | 162.541806 |
GIN | Gin Flat | CCSS | 2148.8400 | 37.766887 | -119.774907 | California | 180400080306 | 11SKB | Sierra Nevada | 2004-10-01 | 2024-02-20 | True | POINT (-119.77491 37.76689) | 157.575758 |
TNY | Tenaya Lake | CCSS | 2484.1200 | 37.837581 | -119.449875 | California | 180400080107 | 11SKB | Sierra Nevada | 2004-10-01 | 2024-02-20 | True | POINT (-119.44988 37.83758) | 154.867257 |
TUN | Tunnel Guard Station | CCSS | 2712.7200 | 36.366349 | -118.289116 | California | 180300020101 | 11SLA | Sierra Nevada | 2022-12-01 | 2024-02-20 | True | POINT (-118.28912 36.36635) | 100.000000 |
BGP | Big Pine Creek | CCSS | 2987.0400 | 37.127815 | -118.476967 | California | 180901020709 | 11SLB | Sierra Nevada | 2004-10-01 | 2024-02-20 | True | POINT (-118.47697 37.12781) | 0.000000 |
f,ax=plt.subplots(1,2,figsize=(12,9),gridspec_kw={'width_ratios': [2, 1]})
ccss_stations_gdf.to_crs('EPSG:32611').plot(ax=ax[0], column='april2023_percent_norm',legend=True,vmin=0,vmax=500,cmap='gnuplot',edgecolor='black',s=100)
ctx.add_basemap(ax[0], crs='EPSG:32611', source=ctx.providers.Esri.WorldImagery, attribution='')
ax[0].set_title('station locations')
ax[1].scatter(ccss_stations_gdf.elevation_m,ccss_stations_gdf.april2023_percent_norm,c=ccss_stations_gdf.april2023_percent_norm,cmap='gnuplot',vmin=0,vmax=500,edgecolors='black',s=100)
ax[1].axhline(y=100,linestyle='--',color='black')
ax[1].set_title('station percent normal snow depth vs elevation')
ax[1].set_xlabel('elevation [m]')
ax[1].set_ylabel('percent of normal snow depth [%]')
f.suptitle(f'California percent normal snow depth for April 1st, 2023')
f.tight_layout()
Looks like a lot more snow than usual to me!
pd.DataFrame.from_dict()
¶all_stations_swe_df
%%time
station_dict = {}
for station in tqdm.tqdm(all_stations_gdf.index):
try:
tmp = pd.read_csv(f'https://raw.githubusercontent.com/egagli/snotel_ccss_stations/main/data/{station}.csv',index_col='datetime',parse_dates=True)['WTEQ']
station_dict[station] = tmp
except:
print(f'failed to retrieve {station}')
100%|██████████| 969/969 [04:28<00:00, 3.61it/s]
CPU times: user 56 s, sys: 3.71 s, total: 59.7 s Wall time: 4min 28s
all_stations_swe_df = pd.DataFrame.from_dict(station_dict)
all_stations_swe_df
301_CA_SNTL | 907_UT_SNTL | 916_MT_SNTL | 1267_AK_SNTL | 908_WA_SNTL | 1189_AK_SNTL | 1062_AK_SNTL | 1070_AK_SNTL | 302_OR_SNTL | 1000_OR_SNTL | ... | QUA | LLP | FOR | GEM | WTM | SLT | SLI | SLK | STL | TMR | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
datetime | |||||||||||||||||||||
1909-04-13 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
1954-12-01 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.0635 | NaN |
1954-12-02 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.0635 | NaN |
1954-12-03 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.0635 | NaN |
1954-12-04 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.1143 | NaN |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
2024-02-16 | 0.2235 | 0.1829 | 0.2184 | 0.1981 | 0.4724 | 0.127 | 0.2642 | 0.2946 | 0.2388 | 0.7214 | ... | NaN | 1.4554 | 0.2438 | NaN | 0.3152 | 0.4968 | 0.4696 | 0.1478 | 0.3183 | 0.3546 |
2024-02-17 | 0.2438 | 0.1829 | 0.2184 | 0.1981 | 0.4775 | 0.127 | 0.2642 | 0.2946 | 0.2413 | 0.7290 | ... | NaN | 1.4783 | 0.2438 | NaN | 0.3152 | 0.4999 | 0.4702 | 0.1478 | 0.3193 | 0.3546 |
2024-02-18 | 0.2565 | 0.1829 | 0.2210 | 0.2032 | 0.4801 | 0.127 | 0.2642 | 0.2972 | 0.2438 | 0.7417 | ... | NaN | 1.5255 | NaN | NaN | 0.3152 | 0.5547 | 0.4867 | 0.1478 | 0.3228 | 0.3668 |
2024-02-19 | 0.2642 | 0.1829 | 0.2210 | 0.2032 | 0.4826 | 0.127 | 0.2642 | 0.2972 | 0.2413 | 0.7544 | ... | NaN | 1.5926 | NaN | NaN | 0.3150 | 0.6157 | 0.5121 | 0.1600 | 0.3358 | 0.3912 |
2024-02-20 | 0.2692 | 0.1829 | 0.2210 | 0.2134 | 0.4826 | 0.127 | 0.2667 | 0.3023 | 0.2413 | 0.7645 | ... | NaN | 1.6939 | NaN | NaN | 0.3569 | 0.7071 | 0.5283 | 0.1875 | 0.3927 | 0.4277 |
23800 rows × 969 columns
all_stations_swe_df
:all_stations_swe_df = all_stations_swe_df.loc[slice('1966-10-01','2023-09-30')]
all_stations_swe_df['WY'] = all_stations_swe_df.index.map(datetime_to_WY)
/tmp/ipykernel_5958/4163226604.py:1: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy all_stations_swe_df['WY'] = all_stations_swe_df.index.map(datetime_to_WY)
all_stations_swe_df = all_stations_swe_df[all_stations_swe_df>=0]
all_stations_swe_diff_df = all_stations_swe_df.diff().abs()
all_stations_swe_df[all_stations_swe_diff_df>0.20] = np.nan
def check_missing_data(group):
nov_to_apr_mask = group.index.month.isin([11, 12, 1, 2, 3, 4])
filtered_group = group[nov_to_apr_mask]
missing_data_counts = filtered_group.isnull().sum()
columns_to_nan = missing_data_counts[missing_data_counts > 30].index
group[columns_to_nan] = np.nan
return group
def check_zero_swe(group):
for month in [1, 2, 3]:
month_mask = group.index.month == month
zero_swe_columns = group[month_mask].eq(0).all()
columns_to_nan = zero_swe_columns[zero_swe_columns].index
group[columns_to_nan] = np.nan
return group
all_stations_swe_df = all_stations_swe_df.groupby('WY').apply(check_missing_data).droplevel(0)
all_stations_swe_df = all_stations_swe_df.groupby('WY').apply(check_zero_swe).droplevel(0)
all_stations_swe_df
301_CA_SNTL | 907_UT_SNTL | 916_MT_SNTL | 1267_AK_SNTL | 908_WA_SNTL | 1189_AK_SNTL | 1062_AK_SNTL | 1070_AK_SNTL | 302_OR_SNTL | 1000_OR_SNTL | ... | LLP | FOR | GEM | WTM | SLT | SLI | SLK | STL | TMR | WY | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
datetime | |||||||||||||||||||||
1966-10-01 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1967.0 |
1966-10-02 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1967.0 |
1966-10-03 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1967.0 |
1966-10-04 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1967.0 |
1966-10-05 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1967.0 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
2023-09-26 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0051 | 0.0 | ... | 0.0 | 0.0 | NaN | 0.0 | 0.0 | NaN | 0.0 | NaN | 0.0 | 2023.0 |
2023-09-27 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0051 | 0.0 | ... | 0.0 | 0.0 | NaN | 0.0 | 0.0 | NaN | 0.0 | NaN | 0.0 | 2023.0 |
2023-09-28 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0051 | 0.0 | ... | 0.0 | 0.0 | NaN | 0.0 | 0.0 | NaN | 0.0 | NaN | 0.0 | 2023.0 |
2023-09-29 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0051 | 0.0 | ... | 0.0 | 0.0 | NaN | 0.0 | 0.0 | NaN | 0.0 | NaN | 0.0 | 2023.0 |
2023-09-30 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0051 | 0.0 | ... | 0.0 | 0.0 | NaN | 0.0 | 0.0 | NaN | 0.0 | NaN | 0.0 | 2023.0 |
20763 rows × 970 columns
DataFrame.groupby()
to find the date of max SWE per year and convert to a DOWY value, store in all_stations_dowy_max_swe_df
np.polyfit()
on a melted version ofDataFrame.describe()
all_stations_dowy_max_swe_df = all_stations_swe_df.groupby('WY').idxmax().applymap(datetime_to_DOWY)
stations_before_WY1982 = all_stations_gdf[all_stations_gdf.beginDate<'1981-10-01']
dowy_max_swe_melted = pd.melt(all_stations_dowy_max_swe_df.reset_index(),id_vars='WY').dropna()
dowy_max_swe_melted_before_WY1982 = dowy_max_swe_melted[dowy_max_swe_melted['variable'].isin(stations_before_WY1982.index)]
slope, intercept = np.polyfit(dowy_max_swe_melted_before_WY1982.WY,dowy_max_swe_melted_before_WY1982.value,1)
lr_years = np.unique(dowy_max_swe_melted.WY)
describe = all_stations_dowy_max_swe_df.T.describe()
describe
WY | 1967.0 | 1968.0 | 1969.0 | 1970.0 | 1971.0 | 1972.0 | 1973.0 | 1974.0 | 1975.0 | 1976.0 | ... | 2014.0 | 2015.0 | 2016.0 | 2017.0 | 2018.0 | 2019.0 | 2020.0 | 2021.0 | 2022.0 | 2023.0 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
count | 15.000000 | 23.00000 | 31.000000 | 39.000000 | 40.000000 | 40.000000 | 48.000000 | 50.000000 | 53.000000 | 55.000000 | ... | 891.000000 | 870.000000 | 896.000000 | 926.000000 | 922.000000 | 929.000000 | 932.000000 | 928.000000 | 928.000000 | 920.000000 |
mean | 211.600000 | 196.73913 | 185.741935 | 197.641026 | 188.050000 | 186.375000 | 195.729167 | 195.720000 | 213.000000 | 194.800000 | ... | 179.783389 | 156.452874 | 175.334821 | 179.018359 | 183.367679 | 185.812702 | 182.918455 | 180.346983 | 176.734914 | 192.546739 |
std | 13.113788 | 19.17199 | 14.906302 | 25.243537 | 29.203047 | 33.502918 | 18.525129 | 17.996644 | 11.560876 | 18.673015 | ... | 28.288029 | 36.552523 | 27.632212 | 25.712310 | 18.666580 | 19.338004 | 26.485426 | 18.764201 | 36.686443 | 13.905714 |
min | 188.000000 | 170.00000 | 158.000000 | 148.000000 | 106.000000 | 94.000000 | 134.000000 | 161.000000 | 187.000000 | 154.000000 | ... | 55.000000 | 33.000000 | 88.000000 | 104.000000 | 84.000000 | 95.000000 | 60.000000 | 119.000000 | 89.000000 | 86.000000 |
25% | 208.500000 | 179.50000 | 176.000000 | 167.000000 | 179.000000 | 159.500000 | 184.750000 | 186.250000 | 207.000000 | 184.500000 | ... | 164.000000 | 133.000000 | 167.000000 | 159.000000 | 175.000000 | 170.000000 | 183.000000 | 175.000000 | 162.000000 | 187.000000 |
50% | 215.000000 | 204.00000 | 182.000000 | 210.000000 | 191.500000 | 202.000000 | 197.000000 | 193.000000 | 216.000000 | 202.000000 | ... | 188.000000 | 156.000000 | 181.000000 | 180.000000 | 180.000000 | 186.000000 | 190.000000 | 178.000000 | 174.000000 | 189.000000 |
75% | 218.500000 | 209.00000 | 191.500000 | 212.500000 | 209.250000 | 211.250000 | 207.000000 | 200.000000 | 221.000000 | 211.000000 | ... | 190.000000 | 177.000000 | 185.000000 | 202.000000 | 200.000000 | 199.000000 | 200.000000 | 189.250000 | 205.000000 | 205.000000 |
max | 228.000000 | 238.00000 | 219.000000 | 227.000000 | 235.000000 | 226.000000 | 225.000000 | 238.000000 | 240.000000 | 218.000000 | ... | 238.000000 | 239.000000 | 232.000000 | 235.000000 | 224.000000 | 247.000000 | 229.000000 | 228.000000 | 250.000000 | 234.000000 |
8 rows × 57 columns
f,ax=plt.subplots(2,1,figsize=(10,6),sharex=True,gridspec_kw={'height_ratios': [3, 2]})
describe.loc['50%'].plot(ax=ax[0],label='median')
ax[0].fill_between(describe.columns,describe.loc['25%'],describe.loc['75%'],alpha=0.3,label='IQR')
ax[0].plot(lr_years,np.array(lr_years)*slope+intercept,'k--',label=f'Trend (slope={slope:.2f} Days/Year)')
#ax[0].set_xlim([1967,2023])
ax[0].legend()
describe.loc['count'].plot(ax=ax[1])
ax[0].set_title('Trend in DOWY of max SWE')
ax[1].set_title('Number of active stations')
Text(0.5, 1.0, 'Number of active stations')
all_stations_gdf
all_stations_gdf.loc[:,'dowy_max_swe_trend'] = all_stations_dowy_max_swe_df.apply(lambda y: np.polyfit(y.dropna().index.values, y.dropna(), 1)[0] if len(y.dropna()) >= 30 else np.nan)
f,ax=plt.subplots(figsize=(10,5.5))
all_stations_gdf.plot(column='dowy_max_swe_trend',ax=ax,legend=True,cmap='RdBu_r',edgecolor='k',markersize=20,vmin=-1,vmax=1,legend_kwds={'label':'[days/year]\n(Red is later in the year, blue is earlier)'})
ctx.add_basemap(ax, crs='EPSG:4326', source=ctx.providers.Esri.WorldImagery, attribution='')
ax.set_title('Trend in DOWY of max SWE\n(only stations with 30+ years of data)')
Text(0.5, 1.0, 'Trend in DOWY of max SWE\n(only stations with 30+ years of data)')
f,ax=plt.subplots()
ax.hist(all_stations_gdf['dowy_max_swe_trend'],bins=50)
ax.axvline(x=0,color='red')
ax.set_xlim([-1.5,1.5])
ax.set_xlabel('trend [days/year]')
ax.set_ylabel('count')
ax.set_title('Distribution of trends in DOWY of max SWE')
Text(0.5, 1.0, 'Distribution of trends in DOWY of max SWE')
DataFrame.groupby()
our geodataframe by mountain range to calculate mountain range specific stats, store in mountain_range_trend_df
mountain_range_count = all_stations_gdf.dropna().groupby('mountainRange')['dowy_max_swe_trend'].count()
mountain_range_median = all_stations_gdf.dropna().groupby('mountainRange')['dowy_max_swe_trend'].median()
mountain_range_mean = all_stations_gdf.dropna().groupby('mountainRange')['dowy_max_swe_trend'].mean()
mountain_range_std = all_stations_gdf.dropna().groupby('mountainRange')['dowy_max_swe_trend'].std()
mountain_range_trend_df = pd.concat([mountain_range_count,mountain_range_median,mountain_range_mean,mountain_range_std],axis=1)
mountain_range_trend_df.columns = ['station_count','median','mean','std']
mountain_range_trend_df
station_count | median | mean | std | |
---|---|---|---|---|
mountainRange | ||||
Alaska Intermountain Ranges | 5 | -0.298031 | -0.266592 | 0.112821 |
Cascade Range | 63 | 0.072125 | 0.138312 | 0.346018 |
Central Montana Rocky Mountains | 48 | -0.083318 | -0.100606 | 0.193399 |
Colorado Plateau | 37 | -0.282213 | -0.348372 | 0.205635 |
Columbia Mountains | 7 | 0.143695 | 0.082937 | 0.164431 |
Columbia Plateau | 24 | -0.138161 | -0.029927 | 0.309255 |
Great Basin Ranges | 37 | -0.209381 | -0.238700 | 0.206061 |
Greater Yellowstone Rockies | 51 | -0.044327 | -0.064588 | 0.175889 |
Idaho-Bitterroot Rocky Mountains | 57 | -0.036092 | -0.021299 | 0.242944 |
Klamath Mountains | 5 | -0.122516 | 0.019894 | 0.407079 |
Olympic Mountains | 1 | 0.267235 | 0.267235 | NaN |
Oregon Coast Range | 1 | 0.163264 | 0.163264 | NaN |
Sierra Nevada | 81 | -0.177622 | -0.176293 | 0.283223 |
South-Central Alaska | 7 | -0.073497 | -0.029532 | 0.230334 |
Southern Rocky Mountains | 90 | -0.275773 | -0.277181 | 0.258862 |
Southwest Basins and Ranges | 4 | -0.752793 | -0.740468 | 0.252450 |
Western Rocky Mountains | 58 | -0.191897 | -0.171475 | 0.199642 |
f,ax=plt.subplots(1,2,sharey=True,gridspec_kw={'width_ratios': [1, 3]})
mountain_range_trend_df['station_count'].plot.barh(ax=ax[0])
mountain_range_trend_df['median'].plot.barh(ax=ax[1],cmap='RdBu')
ax[0].set_xlabel('[#]')
ax[1].set_xlabel('[days/year]')
ax[0].set_ylabel('')
ax[0].set_title('station count')
ax[1].set_title('trend')
f.suptitle('Trend in DOWY of max SWE by mountain range')
Text(0.5, 0.98, 'Trend in DOWY of max SWE by mountain range')
mountain_range_trend_df
using DataFrame.join()
with mountain range geometries from GMBA Mountain Inventory v2url = f'https://data.earthenv.org/mountains/standard/GMBA_Inventory_v2.0_standard_300.zip'
gmba_gdf = gpd.read_file('zip+'+url)
mountain_range_trend_gdf = gpd.GeoDataFrame(mountain_range_trend_df.join(gmba_gdf[['MapName','geometry']].set_index('MapName')))
f,ax=plt.subplots(1,2,figsize=(14,7))
mountain_range_trend_gdf.plot(ax=ax[0],column='station_count',vmin=0,vmax=100,cmap='viridis',legend=True,edgecolor='k',legend_kwds={'label':'[#]'})
mountain_range_trend_gdf.plot(ax=ax[1],column='median',vmin=-0.3,vmax=0.3,cmap='RdBu_r',legend=True,edgecolor='k',legend_kwds={'label':'[days/year]\n(Red is later in the year, blue is earlier)'})
for axs in ax:
ctx.add_basemap(ax=axs, crs=mountain_range_trend_gdf.crs, source=ctx.providers.Esri.WorldImagery, attribution=False)
ctx.add_basemap(ax=axs, crs=mountain_range_trend_gdf.crs, source=ctx.providers.Esri.WorldImagery, attribution=False)
axs.set_xlim([-125,-104])
axs.set_ylim([27,55])
ax[0].set_title('count')
ax[1].set_title('trend')
f.suptitle('Trend in DOWY of max SWE by mountain range\n(only stations with 30+ years of data)')
Text(0.5, 0.98, 'Trend in DOWY of max SWE by mountain range\n(only stations with 30+ years of data)')
m = mountain_range_trend_gdf.explore(column='median',cmap='RdBu_r',vmin=-0.5,vmax=0.5)
all_stations_gdf.astype(dict(beginDate=str, endDate=str)).explore(m=m,column='dowy_max_swe_trend',cmap='RdBu_r',vmin=-0.3,vmax=0.3)
Looks like these trends are different by region, but relatively consistent within region. The majority of regions show the timing of maximum SWE happening earlier in the year, with notable exceptions being mountain ranges in the Pacific Northwest which show a reverse trend with smaller magnitude. Also of importance is the number of stations and their spatial dsitribution in each region, as 2 of the 4 regions (Olympic Mountains and Oregon Coast Range) showing the timing of maximum SWE happening later in the year only have one station each with a 30+ year record.