This notebook is intended for live tutorial sessions about TriScale.
Here is the self-study version.
To get started, we need to import a few Python modules. All the TriScale-specific functions are part of one module called triscale.
import os
from pathlib import Path
import pandas as pd
import numpy as np
import triscale
TriScale's API contains three functions for data analysis:
analysis_metric()
analysis_kpi()
analysis_variability()
These functions have a similar structure. They take as input a data sample and the definition of the metric, KPI, or variability score to compute, respectively. Each function performs the corresponding analysis and returns its results, together with some optional data visualizations.
Here are minimal examples:
# Some random data
Y = np.random.sample(100)
X = np.arange(len(Y))
data = np.array([X,Y])
df = pd.DataFrame(np.transpose(data), columns=['x','y'])
# Minimal definition of a TriScale metric
metric = {
'measure': 50, # Integer: interpreted as a percentile
}
# Basic call of analysis_metric
triscale.analysis_metric(
df,
metric,
plot=True,
);
# Minimal definition of a TriScale KPI
KPI = {
'percentile': 75,
'confidence': 95,
'bounds': [0,1]
}
# Basic call of analysis_kpi
triscale.analysis_kpi(
df.y.values,
KPI,
to_plot=['series','autocorr','horizontal']
);
# Minimal definition of a TriScale variability score
score = {
'percentile': 75, # the 25th-75th percentiles range
'confidence': 95,
'bounds': [0,1]
}
# Basic call of analysis_variability
triscale.analysis_variability(
df.y.values,
score,
to_plot=['horizontal']
);
For more details about any function, refer to the docstring, as shown below.
help(triscale.analysis_kpi)
Note. The functions return the KPI/score values as well as the result of the corresponding independence test. Do not forget to collect that result, and take extra care in the rest of the analysis should the test return False.
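To build intuition for what an independence test may look for, the sketch below flags a series as suspicious when its sample autocorrelation leaves the usual 95% white-noise band of ±1.96/√N. This is a common heuristic and an assumption on our part, not TriScale's actual test; the helper names `autocorr` and `looks_independent` are hypothetical.

```python
import math

def autocorr(x, lag):
    """Sample autocorrelation of the sequence x at the given lag (lag >= 1)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    if var == 0:
        return 0.0  # constant series: no correlation structure to detect
    cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag))
    return cov / var

def looks_independent(x, max_lag=3):
    """Heuristic (assumption, not TriScale's test): every autocorrelation
    up to max_lag stays within the white-noise band +/- 1.96 / sqrt(N)."""
    bound = 1.96 / math.sqrt(len(x))
    return all(abs(autocorr(x, k)) <= bound for k in range(1, max_lag + 1))

# A steadily increasing series: each value is predictable from the previous one
trend = list(range(20))
print(looks_independent(trend))  # False
```

A series that fails such a check carries less information than its length suggests, which is why estimates computed from it deserve extra caution.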
We have collected data for a comparative evaluation of congestion-control schemes using the Pantheon platform. We use some of these data to illustrate the main analysis functions of TriScale.
For a more extensive description of the data collection and analysis, you can check the complete case study notebook or the TriScale paper itself.
In a nutshell, the dataset contains
Let us first load and visualise the dataset.
# Load and display the entire dataset
df = pd.read_csv(Path('ExampleData/metrics_wo_convergence.csv'))
display(df)
We can easily extract the lists of schemes used and dates identifying each series.
# Extract the list of congestion control schemes
schemes = df.cc.unique()
print(schemes)
Let's create a short get_metrics function to easily extract all metric values for one scheme and one metric (e.g., the throughput of bbr).
def get_metrics(df, scheme, metric):
    '''Parse the dataset to extract the series of metric values
    for one scheme and all series of runs.
    '''
    # Initialize output
    metric_data = []
    # List of dates (identifies the series)
    dates = df.datetime.unique()
    # For each series
    for date in dates:
        # Set up the data filter (named `mask` to avoid shadowing the built-in `filter`)
        mask = (
            (df.cc == scheme) &
            (df.datetime == date)
        )
        # Filter
        df_filtered = df.where(mask).dropna()
        # Store metrics values for that series
        metric_data.append(list(df_filtered[(metric+'_value')].values))
    # Return the desired metric data
    return metric_data
The definitions of the KPIs and variability scores we use are provided below.
# KPIs
KPI_tput = {'percentile': 25,
'confidence': 75,
'name': 'KPI Throughput',
'unit': 'Mbit/s',
'bounds': [0,120], # expected value range
'tag': 'throughput' # do not change the tag
}
KPI_delay = {'percentile': 75,
'confidence': 75,
'name': 'KPI One-way delay',
'unit': 'ms',
'bounds': [0,100], # expected value range
'tag': 'delay' # do not change the tag
}
Note. We aim to estimate the 25th percentile for the throughput, where higher is better, and the 75th percentile for the delay, where lower is better. Thus, both KPIs aim to estimate the performance expected in at least 75% of the runs.
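To see how a percentile and a confidence level turn into a concrete estimate, the following sketch picks the order statistic that bounds a percentile from above, using the binomial argument that underlies non-parametric intervals of this kind. It is an illustration under our own assumptions (independent, identically distributed samples), not TriScale's code, and `upper_bound_rank` is a hypothetical helper.

```python
from math import comb

def upper_bound_rank(n, percentile, confidence):
    """Smallest 1-based rank m such that the m-th smallest of n i.i.d. samples
    is >= the given percentile with probability >= confidence (both in %).
    Returns None if even the largest sample is not enough."""
    p, c = percentile / 100, confidence / 100
    cdf = 0.0
    for k in range(n):
        # P(exactly k samples fall below the percentile)
        cdf += comb(n, k) * p**k * (1 - p) ** (n - k)
        # The (k+1)-th smallest sample is above the percentile
        # whenever at most k samples fall below it.
        if cdf >= c:
            return k + 1
    return None

print(upper_bound_rank(10, 75, 75))  # 9
print(upper_bound_rank(4, 75, 75))   # None
```

With 10 runs, the 9th smallest delay would therefore serve as the upper bound; by symmetry, the 2nd smallest throughput would serve as a lower bound on the 25th percentile.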
# Variability scores
score_tput = {'percentile': 50,
'confidence': 75,
'name': 'Throughput',
'unit': 'Mbit/s',
'bounds': [0,120], # expected value range
'tag': 'throughput' # do not change the tag
}
score_delay = {'percentile': 50,
'confidence': 75,
'name': 'One-way delay',
'unit': 'ms',
'bounds': [0,100], # expected value range
'tag': 'delay' # do not change the tag
}
As an example, let us analyze the throughput of the TCP BBR scheme.
#####################################
# Extract the metrics values for the 5 series of 10 runs
#####################################
scheme = 'bbr' # valid options: print(df.cc.unique())
metric = 'throughput' # valid options are 'throughput' and 'delay'
metric_data = get_metrics(df, scheme, metric)
# Initialize an empty list to collect the KPI values
KPI_values = []
if metric == 'throughput':
    KPI = KPI_tput
    score = score_tput
if metric == 'delay':
    KPI = KPI_delay
    score = score_delay
#####################################
## Step 1. Compute the KPIs
#####################################
for series_data in metric_data:
    indep_test_passed, KPI_value = triscale.analysis_kpi(
        series_data,
        KPI)
    if indep_test_passed:
        KPI_values.append(KPI_value)
# Print the (valid) KPI values
s = '%i valid KPIs obtained\n> ' % len(KPI_values)
for k in KPI_values:
    s += '%0.2f ' % k
s += '\n in %s\n' % KPI['unit']
print(s)
#####################################
## Step 2. Compute the variability score
#####################################
(indep_test_passed,
upper_bound,
lower_bound,
var_score,
rel_score) = triscale.analysis_variability(
KPI_values,
score
)
if not indep_test_passed:
    var_score *= -1
print('Variability score: %0.2f %s' % (var_score, score['unit']))
First of all, can we change the KPI definition? Recall that we have 5 series of 10 runs in our dataset.
Hint. Remember the triscale.experiment_sizing function? :-)
########## YOUR CODE HERE ###########
# ...
#####################################
Let's now explore the dataset a little further!
What are the KPIs and variability score for the delay metric of the bbr congestion-control scheme?
Modify the definition of the variability score to estimate the median ('percentile': 50) instead of the 25-75th percentile range.
Optional (and harder) questions:
triscale.experiment_sizing(90,75,verbose=True);
triscale.experiment_sizing(75,95,verbose=True);
# "Best" options
triscale.experiment_sizing(87,75,verbose=True);
triscale.experiment_sizing(75,94,verbose=True);
With 10 runs,
The trade-off in using these "best" KPIs is that there is no margin for poor runs: the KPI estimate will always be the largest (resp. smallest) collected value. Moreover, if one run should fail, or not converge, then one would not have enough runs left to compute the desired KPI!
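The "best" options above can be recovered from the fact that, with N independent runs, the largest value exceeds the p-th percentile with probability 1 - p^N. The sketch below is a minimal illustration assuming that relation; the helper names are hypothetical and this is not how TriScale computes its answer.

```python
import math

def max_percentile(n_runs, confidence):
    """Largest integer percentile that can be bounded with n_runs samples
    at the given confidence (both in %), i.e. the largest p with 1 - p**N >= C."""
    p_max = (1 - confidence / 100) ** (1 / n_runs)
    return math.floor(100 * p_max)

def max_confidence(n_runs, percentile):
    """Largest integer confidence level (in %) reachable for the given percentile."""
    return math.floor(100 * (1 - (percentile / 100) ** n_runs))

print(max_percentile(10, 75))  # 87
print(max_confidence(10, 75))  # 94
```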
Simply change the metric definition in the code block above from throughput to delay.
It leads to the following output:
5 valid KPIs obtained
> 87.08 86.51 86.31 87.17 85.74
in ms
Variability score: 1.43 ms
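As a sanity check on this output, the variability score equals the spread between the largest and smallest KPI values, which is consistent with the interval bounds landing on the extreme samples when only five values are available. The snippet below merely redoes that arithmetic on the printed values.

```python
# KPI values copied from the output above (in ms)
kpi_values = [87.08, 86.51, 86.31, 87.17, 85.74]

# Spread between the extreme KPI values
spread = max(kpi_values) - min(kpi_values)
print('%0.2f ms' % spread)  # 1.43 ms
```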
Change the percentile in score definitions from 75 to 50, and re-run the analysis. The output is the same: the scores are not affected by the change in score definitions.
One would expect that a two-sided estimate of the median would be narrower than the estimate of the 25-75th percentile interval. While this is generally true, having only 5 series of runs is not enough to make a difference. This can be seen with the robustness parameter from triscale.experiment_sizing:
>>> triscale.experiment_sizing(50,75,robustness=1)
(5,6)
Hence, for a two-sided confidence interval for the median, one needs at least 6 samples in order to "exclude" one.
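The (5,6) result can be reproduced with a short binomial computation. The sketch below is our own reconstruction for the median only, not TriScale's implementation, and `min_samples_median` is a hypothetical helper: it searches for the smallest N such that the probability of seeing at most `robustness` samples on one side of the median is small enough.

```python
from math import comb

def min_samples_median(confidence, robustness=0, two_sided=False):
    """Smallest N such that an interval for the median that discards the
    `robustness` most extreme sample(s) still reaches the confidence (in %)."""
    alpha = 1 - confidence / 100
    n = robustness  # need strictly more samples than the number discarded
    while True:
        n += 1
        # P(at most `robustness` of n samples fall below the median)
        tail = sum(comb(n, k) for k in range(robustness + 1)) * 0.5**n
        # A two-sided interval can miss the median on either side
        miss = 2 * tail if two_sided else tail
        if miss <= alpha:
            return n

print(min_samples_median(75, robustness=1))                  # 5
print(min_samples_median(75, robustness=1, two_sided=True))  # 6
```

With five KPI values, the two-sided interval for the median therefore cannot discard any sample and falls back to the full range of values, just like the 25-75th percentile interval.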
Next step: Seasonal Components