import pandas as pd
import mikeio
ds = mikeio.read("data/Oresund_ts.dfs0")
ds
type(ds)
The mikeio read function returns a Dataset, which is a container of DataArrays.
A DataArray can be selected by name or by index.
da = ds["Drogden: Surface elevation"] # or ds.Drogden_Surface_elevation or ds[2]
da
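The attribute form ds.Drogden_Surface_elevation implies that the item name is mapped to a valid Python identifier. The sketch below shows one plausible way to do such a mapping. The helper to_safe_name is hypothetical and is not claimed to be mikeio's exact rule.

```python
import re


def to_safe_name(name: str) -> str:
    """Hypothetical helper: turn an item name into an attribute-friendly name."""
    # Replace every character that is not a letter or a digit with an underscore
    tmp = re.sub(r"[^0-9a-zA-Z]", "_", name)
    # Collapse runs of underscores into a single underscore
    return re.sub(r"_+", "_", tmp)


safe = to_safe_name("Drogden: Surface elevation")
print(safe)
```

If in doubt about the attribute name, selecting by the full item name with ds["..."] avoids any guessing.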
Upon read, specific items can be selected with the items argument using name or index.
ds = mikeio.read("data/Oresund_ts.dfs0", items=[0,2,3])
ds
Wildcards can be used to select multiple items:
ds = mikeio.read("data/Oresund_ts.dfs0", items="*Surf*")
ds
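To get a feel for what a pattern such as "*Surf*" selects, the following sketch applies shell-style wildcard matching to a list of item names using fnmatchcase from the Python standard library. This is a stand-in for illustration, not mikeio's implementation, and details such as case sensitivity may differ. Only the first item name comes from this notebook; the other two are hypothetical.

```python
from fnmatch import fnmatchcase

# Only the first name appears in this notebook; the other two are hypothetical.
item_names = [
    "Drogden: Surface elevation",
    "Station A: Surface elevation",  # hypothetical
    "Station A: Wind speed",         # hypothetical
]


def match_items(names, pattern):
    """Return the names that match a shell-style wildcard pattern (case-sensitive)."""
    return [n for n in names if fnmatchcase(n, pattern)]


selected = match_items(item_names, "*Surf*")
print(selected)
```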
A specific time subset can be selected using .sel:
ds.sel(time=slice("2018-03-04","2018-03-04 12:00"))
Or with positional indexing using .isel:
ds.isel(time=slice(10,20))
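The two selection styles differ in how they treat the end of the slice. The plain-Python sketch below illustrates this on a hypothetical half-hourly time axis. It assumes that .sel follows pandas-style label slicing, where both endpoints are included, while .isel follows normal Python positional slicing, where the stop index is excluded.

```python
from datetime import datetime, timedelta

# Hypothetical time axis: 48 half-hourly steps starting at midnight
start = datetime(2018, 3, 4)
times = [start + timedelta(minutes=30 * i) for i in range(48)]

# Positional selection (like .isel): the stop index 20 is excluded
by_position = times[10:20]

# Label selection (like .sel, assuming pandas-style slicing): both endpoints are included
t0, t1 = datetime(2018, 3, 4), datetime(2018, 3, 4, 12)
by_label = [t for t in times if t0 <= t <= t1]

print(len(by_position), len(by_label))
```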
The Dataset and DataArray have a number of useful attributes like time, items, ndims, shape, values (only DataArray), etc.
ds.time
ds.items
da.item
da.shape
da.values
The time series can be plotted with the plot method.
ds.plot();
A simple timeseries Dataset can easily be converted to a Pandas DataFrame.
df = ds.to_pandas()
df
Often, time series data will come from a csv or an Excel file. Here is an example of how to read a csv file with pandas and then write the pandas DataFrame to a dfs0 file.
df = pd.read_csv("data/naples_fl.csv", skiprows=1, parse_dates=True, index_col=0)
df
You will probably need to parse a specific data format many times, in which case it is a good idea to create a function.
def read_ncei_obs(filename):
    # old name : new name
    mapping = {'TAVG (Degrees Fahrenheit)': 'temperature_avg_f',
               'TMAX (Degrees Fahrenheit)': 'temperature_max_f',
               'TMIN (Degrees Fahrenheit)': 'temperature_min_f',
               'PRCP (Inches)': 'prec_in'}
    df_renamed = (
        pd.read_csv(filename, skiprows=1, parse_dates=True, index_col=0)
        .rename(columns=mapping)
    )
    sel_cols = mapping.values() # No need to repeat ['temperature_avg_f',...]
    df_selected = df_renamed[sel_cols]
    return df_selected
df = read_ncei_obs("data/naples_fl.csv")
df.head()
df.tail()
df.shape
Convert temperature to Celsius and precipitation to mm.
df_final = df.assign(temperature_max_c=(df['temperature_max_f'] - 32)/1.8,
                     prec_mm=df['prec_in'] * 25.4)
df_final.head()
df_final.loc['2021'].plot();
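The two conversion formulas used above can be sanity-checked on known reference values before applying them to a whole column. The sketch below wraps them in two small helper functions; the function names are made up for this illustration.

```python
def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32) / 1.8


def inches_to_mm(length_in: float) -> float:
    """Convert a length from inches to millimetres (1 inch = 25.4 mm)."""
    return length_in * 25.4


# Reference points: freezing and boiling point of water, and one inch
freezing_c = fahrenheit_to_celsius(32.0)
boiling_c = fahrenheit_to_celsius(212.0)
one_inch_mm = inches_to_mm(1.0)
print(freezing_c, round(boiling_c, 6), one_inch_mm)
```

Because the helpers use only arithmetic operators, they should work equally well on a pandas Series, for example fahrenheit_to_celsius(df['temperature_max_f']).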
Creating a dfs0 file from a DataFrame is straightforward.
First, convert the DataFrame to a Dataset:
ds = mikeio.from_pandas(df_final)
ds
Then write the Dataset to a dfs0 file:
ds.to_dfs("output/naples_fl.dfs0")
Let's read it back in again...
saved_ds = mikeio.read("output/naples_fl.dfs0")
saved_ds
By default, EUM types are undefined, but they can be specified. Let's select a few columns.
df2 = df_final[['temperature_max_c', 'prec_in']]
df2.head()
from mikeio import ItemInfo, EUMType, EUMUnit
ds2 = mikeio.from_pandas(
    df2,
    items=[
        ItemInfo(EUMType.Temperature),
        ItemInfo(EUMType.Precipitation_Rate, EUMUnit.inch_per_day),
    ],
)
ds2
from mikeio.eum import ItemInfo, EUMType, EUMUnit
EUMType.search("wind")
EUMType.Wind_speed.units
What is the best EUM Type for "peak wave direction"? What is the default unit?
# insert your code here
df = pd.read_csv("data/precipitation.csv", parse_dates=True, index_col=0)
df.head()
from mikecore.DfsFile import DataValueType
(mikeio.from_pandas(df, items=ItemInfo(EUMType.Precipitation_Rate, EUMUnit.mm_per_hour, data_value_type=DataValueType.MeanStepBackward))
.to_dfs("output/precipitation.dfs0")
)
ds = mikeio.read("output/precipitation.dfs0", items=[1,4]) # select item by item number (starting from zero)
ds
ds = mikeio.read("output/precipitation.dfs0", items=["Precipitation station 5","Precipitation station 1"]) # or by name (in the order you like it)
ds
Read all items into a variable ds. Select "Precipitation station 3". In which different ways can you select this item?
# insert your code here
import utils
utils.sysinfo()