This notebook is intended for self-study of TriScale. Here is the version for live sessions.
This notebook contains tutorial materials for TriScale. More specifically, it presents TriScale's data analysis functions, leading to the computation of variability scores, which serve to quantify replicability.
If you don't know about Jupyter Notebooks and how to interact with them,
fear not! We compiled everything that you need to know here: Notebook Basics :-)
For more details about TriScale, you may refer to the paper.
To get started, we need to import a few Python modules.
All the TriScale-specific functions are part of one module called triscale.
import os
from pathlib import Path
import pandas as pd
import numpy as np
import triscale
Alright, we are ready to analyse some data!
In this notebook, we assume that the experiments have been designed and the corresponding data collected; we focus on the data analysis.
TriScale's methodology is structured around three time scales: runs, series of runs, and sequels (i.e., repetitions of series of runs). TriScale's API provides one function per time scale, which we will look at in the next sections.
In TriScale, metrics measure a performance dimension during one run. The computation of metrics is implemented in the analysis_metric() function, which takes two compulsory arguments:
The raw data can be passed as a file path (i.e., a string) or as a Pandas DataFrame:
- if a file path is passed, the x data is expected in the first column and the y data in the second column;
- if a DataFrame is passed, it must contain columns named x and y.
The metric definition is provided as a dictionary, with only the measure key being compulsory. This key defines what computation is to be performed on the data; in other words, what the "metric" is.
Currently supported measures are:
- mean;
- minimum;
- maximum;
- any integer between 0 and 100, interpreted as a percentile (as in the example below).
The analysis_metric() function returns 3 outputs: the convergence test result (has_converged), the metric value (metric_measure), and a plot object (plot).
The following cell illustrates the basic usage of the analysis_metric() function.
# Input data file path
data = Path('ExampleData/raw_data.csv') # one-way delay of a full-throttled flow using TCP BBR
# Definition of a TriScale metric
metric = {
    'measure': 50,  # Integer: interpreted as a percentile
    'unit': 'ms',   # For display only
}
has_converged, metric_measure, plot = triscale.analysis_metric(
    str(data),
    metric)
print('Run metric: %0.2f %s' % (metric_measure, metric['unit']))
Passing the optional argument plot=True
automatically displays the plot of the raw data.
has_converged, metric_measure, plot = triscale.analysis_metric(
    str(data),
    metric,
    plot=True,
)
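Note that the raw data does not have to come from a file: as mentioned above, a DataFrame with columns named x and y can be passed directly. The following sketch shows what such an input might look like (the values are hypothetical, for illustration only); the median computed at the end is what a 'measure': 50 metric corresponds to.

```python
import pandas as pd

# Hypothetical raw data for one run, in the DataFrame format
# expected by analysis_metric(): columns named 'x' and 'y'
raw = pd.DataFrame({
    'x': [0.0, 0.1, 0.2, 0.3, 0.4],       # e.g., time in seconds
    'y': [12.1, 11.8, 12.4, 12.0, 11.9],  # e.g., one-way delay in ms
})

# With {'measure': 50}, the metric is the 50th percentile (median) of y
median = raw['y'].quantile(0.5)
print('Run metric: %0.2f ms' % median)  # -> Run metric: 12.00 ms
```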
Note. As presented here, the analysis_metric() function is not very interesting: it "only" returns some percentile of an array. The function is more useful when the metric attempts to estimate the long-term performance of the system; that is, the value one would obtain should the run last longer or more data points be collected. When this is the case, TriScale performs a convergence test on the data, which can be triggered in the analysis_metric() function by passing the optional convergence parameter.
The study of convergence goes beyond the scope of this tutorial; refer to the paper for more details.
In TriScale, key performance indicators (KPIs) measure performance dimensions across a series of runs. Performing multiple runs helps mitigate the inherent variability of the experimental conditions. KPIs capture this variability by estimating percentiles of the (unknown) metric distributions. Concretely, a TriScale KPI is a one-sided confidence interval of a percentile; e.g., a lower bound for the 25th percentile of a metric, estimated with a 95% confidence level.
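Such one-sided confidence intervals can be derived from the order statistics of the sample, using the binomial distribution. The following is an illustrative sketch of the idea only, not TriScale's actual implementation (function and variable names are ours; refer to the paper for the real method):

```python
from math import comb

def binom_cdf(k, n, p):
    """P(Binom(n, p) <= k), computed from the binomial pmf."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def upper_bound_percentile(data, percentile, confidence):
    """One-sided upper confidence bound for a percentile.

    Returns the smallest order statistic x_(m) such that
    P(x_(m) >= q_p) = BinomCDF(m-1; n, p) >= confidence,
    or None if there are too few points for the requested confidence.
    """
    n = len(data)
    p = percentile / 100
    c = confidence / 100
    sorted_data = sorted(data)
    for m in range(1, n + 1):
        if binom_cdf(m - 1, n, p) >= c:
            return sorted_data[m - 1]
    return None  # not enough data points

# With 10 data points, an upper bound for the 75th percentile at
# 75% confidence is the 9th smallest value
print(upper_bound_percentile(list(range(1, 11)), 75, 75))  # -> 9
```

Note that with only 10 points, a 95% confidence bound on the 75th percentile cannot be obtained (the sketch returns None), which mirrors analysis_kpi() returning np.nan when there are not enough data points.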
The computation of KPIs is implemented in the analysis_kpi() function, which takes two compulsory arguments:
The metric data can be passed as a list or a NumPy array. The KPI definition is provided as a dictionary with three compulsory keys:
- percentile ($0<P<100$);
- confidence ($0<C<100$);
- bounds.
The KPI bounds are the expected extremal values for the metric. This information is used during the independence test (see below). If the metric bounds are unknown, simply pass the minimum and maximum metric values as bounds.
The analysis_kpi() function returns the output of two computations:
1. It performs the independence test and returns True (test is passed) or False (it is not);
2. It computes the KPI and returns its value.
Why the independence test? The metric data must be iid for the KPI to be a valid estimate of the underlying metric distribution. Note however that, in general, independence is a property of the data collection process, not of the data itself. Unfortunately, in many practical cases in networking, independence cannot be guaranteed; for example, because there is some correlation in the interference conditions between successive experiments. In such a case, one can perform an empirical test for independence; essentially, this test assesses whether the level of correlation in the data appears sufficiently low for the data to be reasonably assumed iid.
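To give a rough idea of what such an empirical check can look like, one simple approach is to compare the lag-1 autocorrelation of the data against the band expected for white noise. This is a simplified sketch of our own, not TriScale's actual test (which is described in the paper):

```python
import math

def lag1_autocorr(x):
    """Sample autocorrelation of the sequence x at lag 1."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    num = sum(dev[i] * dev[i + 1] for i in range(n - 1))
    den = sum(d * d for d in dev)
    return num / den

def appears_iid(x, z=1.96):
    """Crude empirical check: the lag-1 autocorrelation lies within
    the ~95% band expected for white noise, |r| < z / sqrt(n)."""
    return abs(lag1_autocorr(x)) < z / math.sqrt(len(x))

# A monotonic trend is strongly autocorrelated and fails the check
print(appears_iid(list(range(20))))  # -> False
```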
The following cell illustrates the basic usage of the analysis_kpi() function.
# Input data file path
data = Path('ExampleData/metric_data.csv') # Failure recovery time, in seconds
# Read data in a Pandas DataFrame
df = pd.read_csv(data, header=0, names=['metric'])
# KPI definition
KPI = {
    'percentile': 75,
    'confidence': 95,
    'bounds': [0, 10],
    'unit': 's'
}
# Compute the KPI
indep_test_passed, KPI_value = triscale.analysis_kpi(
    df.metric.values,
    KPI,
)
# Output
if indep_test_passed:
    print('The metric data appears iid.')
    print('KPI value: %0.2f %s' % (KPI_value, KPI['unit']))
else:
    print('The metric data does not appear iid.')
Since the metric data appears to be iid, we can interpret the KPI value as follows: with a confidence level of 95%, the 75th percentile of the metric is smaller than or equal to 1.92s. In other words, with a probability of 95%, the performance metric is smaller than or equal to 1.92s in at least three quarters of the runs.
Even when the independence test fails, the KPI value is computed and returned. However,
the user must then be aware that the resulting KPI is not a trustworthy estimate of the
corresponding percentile (i.e., it is not a valid confidence interval).
For more details about the implementation of the empirical independence test, refer to the paper.
Moreover, if there are not enough data points to compute the desired KPI, the function returns np.nan
(not-a-number).
Optionally, the analysis_kpi() function produces and displays 3 plots, selected with the to_plot argument:
- the metric data series (series);
- the autocorrelation of the data (autocorr);
- the data and KPI value on a horizontal axis (horizontal).
This is illustrated below.
# Compute the KPI and plot
indep_test_passed, KPI_value = triscale.analysis_kpi(
    df.metric.values,
    KPI,
    to_plot=['series', 'autocorr', 'horizontal']
)
Sequels are repetitions of series of runs. TriScale's variability scores measure the variations of KPI values across sequels. Hence, sequels aim to detect long-term variations of KPIs and, ultimately, to quantify the replicability of an experiment.
Concretely, a variability score is composed of two one-sided CIs for a symmetric pair of percentiles; e.g., a 75% confidence interval for the 25th-75th percentile range. The underlying computations are the same as for the KPI values. The computation of variability scores is implemented in the analysis_variability() function, which takes two compulsory arguments:
The KPI data can be passed as a list or a NumPy array. The variability score definition is provided as a dictionary with three compulsory keys:
- percentile ($0<P<100$);
- confidence ($0<C<100$);
- bounds.
The bounds are the expected extremal values for the KPIs. As for the analysis_kpi() function, the bounds are used during the independence test. If the KPI bounds are unknown, simply pass the minimum and maximum KPI values as bounds.
The analysis_variability() function returns the output of two computations:
1. It performs the independence test and returns True (test is passed) or False (it is not);
2. It computes the variability score and returns its value, along with the underlying upper and lower bounds and a relative score.
Moreover, if there are not enough data points to compute the desired variability score, the function returns np.nan (not-a-number).
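To make the idea of "two one-sided CIs for a symmetric pair of percentiles" concrete, the sketch below computes the width between an order-statistic upper bound and lower bound. This is our own simplified illustration, not TriScale's implementation:

```python
from math import comb

def binom_cdf(k, n, p):
    """P(Binom(n, p) <= k), computed from the binomial pmf."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def variability_score(kpi_values, percentile, confidence):
    """Width between an upper bound for the (100-p)th percentile and a
    lower bound for the p-th percentile, with p = min(percentile,
    100-percentile); each bound is a one-sided CI at the given
    confidence level, taken from the sorted sample."""
    n = len(kpi_values)
    p = min(percentile, 100 - percentile) / 100
    c = confidence / 100
    data = sorted(kpi_values)

    # Upper bound for the (1-p) percentile: smallest m such that
    # BinomCDF(m-1; n, 1-p) >= c
    upper = next((data[m - 1] for m in range(1, n + 1)
                  if binom_cdf(m - 1, n, 1 - p) >= c), None)
    # Lower bound for the p percentile: largest m such that
    # 1 - BinomCDF(m-1; n, p) >= c
    lower = next((data[m - 1] for m in range(n, 0, -1)
                  if 1 - binom_cdf(m - 1, n, p) >= c), None)

    if upper is None or lower is None:
        return float('nan')  # not enough data points
    return upper - lower

# 25th-75th percentile range of 10 KPI values, at 75% confidence
print(variability_score(list(range(1, 11)), 25, 75))  # -> 7
```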
The same plotting options as for analysis_kpi()
are available, as illustrated below.
# Input data file path
data = 'ExampleData/kpi_data.csv' # failure recovery time, in seconds
# Read data in a Pandas DataFrame
df = pd.read_csv(data, header=0, names=['kpi'])
# Score definition
score = {
    'percentile': 25,  # i.e., the 25th-75th percentile range
    'confidence': 95,
    'bounds': [0, 10],
    'unit': 's'
}
# Compute the variability score
(indep_test_passed,
 upper_bound,
 lower_bound,
 var_score,
 rel_score) = triscale.analysis_variability(
    df.kpi.values,
    score,
    to_plot=['series', 'horizontal']
)
# Output
if indep_test_passed:
    print('The KPI data appears iid.')
    print('Variability score: %0.2f %s' % (var_score, score['unit']))
else:
    print('The KPI data does not appear iid.')
Since the KPI data appears to be iid, we can interpret the variability score as follows: with a confidence level of 95%, the inter-quartile (25th-75th percentile) range of the KPIs is smaller than or equal to 0.4s. In other words, with a probability of 95%, across all series, the middle 50% of KPI values differ by 0.4s or less.
We have collected data for a comparative evaluation of congestion-control schemes using the Pantheon platform. More details about the experiment setup can be found in the TriScale paper. We performed five series of ten runs each, for 16 different schemes. For the purpose of this tutorial, we provide a dataset containing pre-computed metric values for each run. It contains two metrics: throughput and delay.
Your task consists in analysing this dataset using TriScale. The goal is to compute and compare the variability scores of different congestion-control schemes. We will focus on the throughput metric only (an arbitrary choice).
Let us first load and visualise the dataset.
# Load and display the entire dataset
df = pd.read_csv(Path('ExampleData/metrics_wo_convergence.csv'))
display(df)
We can easily extract the lists of schemes used and dates identifying each series.
# Extract the list of congestion control schemes
schemes = df.cc.unique()
print(schemes)
The following cell contains a simple function to filter this dataset and extract metric values
per scheme and per series.
def get_metrics(df, scheme, metric):
    '''Parse the dataset to extract the series of metric values
    for one scheme and all series of runs.
    '''
    # Initialize output
    metric_data = []
    # List of dates (identifies the series)
    dates = df.datetime.unique()
    # For each series
    for date in dates:
        # Set up the data filter
        mask = (
            (df.cc == scheme) &
            (df.datetime == date)
        )
        # Filter out all other rows
        df_filtered = df.where(mask).dropna()
        # Store the metric values for that series
        metric_data.append(list(df_filtered[metric + '_value'].values))
    # Return the desired metric data
    return metric_data
We will use this function to easily extract all metric values for one scheme and one metric (e.g., the throughput of bbr).
The definitions of the KPI and variability score to use for our analysis are provided below.
# KPI
KPI = {
    'percentile': 25,
    'confidence': 75,
    'name': 'KPI Throughput',
    'unit': 'Mbit/s',
    'bounds': [0, 120],  # expected value range
}
# Variability score
score = {
    'percentile': 75,
    'confidence': 75,
    'name': 'Throughput',
    'unit': 'Mbit/s',
    'bounds': [0, 120],  # expected value range
}
Note. We aim to estimate the 25th percentile of the throughput, where higher is better. Thus, the KPI provides the performance expected in at least 75% of the runs.
You are now all set to analyse this dataset!
In the following cell, we provide skeleton code which you should complete and execute in order to answer the following questions:
1. What variability score do you obtain for the throughput metric of the bbr congestion-control scheme?
2. Modify the definition of the variability score to estimate the median ('percentile': 50) instead of the 25th-75th percentile range.
Optional (and harder) questions:
# Extract and display the metric values for the 5 series of 10 runs
scheme = 'bbr'
metric = 'throughput'  # valid options are 'throughput' and 'delay'
metric_data = get_metrics(df, scheme, metric)

# Initialize an empty list to collect the KPI values
KPI_values = []

## Step 1. Compute the KPIs
for series_data in metric_data:
    ########## YOUR CODE HERE ###########
    # - compute the KPI value
    # - if the independence test is passed,
    #   store the value in the KPI list
    #####################################
    pass

# Print the (valid) KPI values
s = '%i valid KPIs obtained\n> ' % len(KPI_values)
for k in KPI_values:
    s += '%0.2f ' % k
s += '\n in %s\n' % KPI['unit']
print(s)

## Step 2. Compute the variability score
########## YOUR CODE HERE ###########
# - compute the variability score
# - print the result depending on the outcome
#   - if there are not enough KPIs, print `nan`
#   - if the independence test fails, print the score value *(-1)
#   - else, print the score value
#####################################
########## YOUR CODE HERE ###########
# - compute the KPI value
# - if the independence test is passed,
#   store the value in the KPI list
indep_test_passed, KPI_value = triscale.analysis_kpi(
    series_data,
    KPI)
if indep_test_passed:
    KPI_values.append(KPI_value)
#####################################
########## YOUR CODE HERE ###########
# - compute the variability score
# - print the result depending on the outcome
#   - if there are not enough KPIs, print `nan`
#   - if the independence test fails, print the score value *(-1)
#   - else, print the score value
(indep_test_passed,
 upper_bound,
 lower_bound,
 var_score,
 rel_score) = triscale.analysis_variability(
    KPI_values,
    score
)
if not indep_test_passed:
    var_score *= -1
print('Variability score: %0.2f %s' % (var_score, score['unit']))
#####################################
These code blocks lead to the following output:
5 valid KPIs obtained
> 105.07 105.04 104.68 104.92 105.08
in Mbit/s
Variability score: 0.41 Mbit/s
Next step: Seasonal Components
Back to repo