This example shows how you can use SEM to manage an ns-3 simulation campaign. We
will be working with the wifi-multi-tos simulation script: this ns-3 program
creates a WiFi network, measures the aggregate throughput and prints it out to
the standard output.
First of all, let's import some relevant libraries. We will be using sem to
run our simulations and parse the outputs, and matplotlib and seaborn to
plot the results.
import sem
import pprint
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style("white")
Most of our interaction with sem will happen through an object of type
CampaignManager. When we create our campaign, we need to tell sem where
to find our ns-3 installation, what script we want to run, and where it should
save our results.
ns_path = 'ns-3'
script = 'wifi-multi-tos'
campaign_dir = "results"
campaign = sem.CampaignManager.new(ns_path,
                                   script,
                                   campaign_dir,
                                   overwrite=True,
                                   max_parallel_processes=8)
At campaign creation time, sem compiles ns-3, asks the ns-3 script we
want to use for its command line parameters, and records the hash of the commit
the ns-3 repository is currently at. This data is saved and printed out when we
inspect the campaign object.
print(campaign)
--- Campaign info ---
script: wifi-multi-tos
params: {'nWifi': 4.0, 'distance': 1.0, 'simulationTime': 10.0, 'useRts': False, 'mcs': 7.0, 'channelWidth': None, 'useShortGuardInterval': False}
HEAD: 6aae3d04c51af2f60daabc9dd9da120a838623c0
Runner type: <class 'sem.parallelrunner.ParallelRunner'>
-----------
Now that we have created our simulation campaign object, we can use it to run some
simulations. In order to do this, we need to specify what parameter combinations
we are interested in running. The wifi-multi-tos script supports the
following command line parameters:
- nWifi: Number of STAs to simulate
- distance: Distance of STAs from the AP
- useRts: Whether to enable RTS/CTS
- useShortGuardInterval: Whether to use the short guard interval
- mcs: Modulation Coding Scheme to use
- channelWidth: Channel Width in MHz
- simulationTime: How long to simulate for
Say we are interested in seeing how a certain MCS is affected by the distance
between the STA and the AP. In this case, we might want to run simulations with
different values of the mcs and distance parameters. To do this, we
create a Python dictionary where the keys are the names of the parameters, and
the values are lists of parameter values. When we pass this dictionary to our
campaign object's run_missing_simulations function, sem will run
simulations for each combination in the specified parameter space.
We also create a runs variable to specify how many randomized experiments we
want sem to perform for each parameter combination.
params = {
    'nWifi': [1],
    'distance': list(range(0, 80, 10)),
    'useRts': [True],
    'useShortGuardInterval': [True],
    'mcs': list(range(0, 8, 2)),
    'channelWidth': [20],
    'simulationTime': [4],
}
runs = 2
campaign.run_missing_simulations(params, runs=runs)
Running simulations: 100%|██████████| 64/64 [00:11<00:00, 5.45simulation/s]
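The figure of 64 simulations follows directly from the parameter space: 8 distances times 4 MCS values give 32 combinations, each repeated twice. The sketch below reproduces this count with the standard library only. The expand helper is a hypothetical name used for illustration and is not part of sem.

```python
from itertools import product

params = {
    'nWifi': [1],
    'distance': list(range(0, 80, 10)),
    'useRts': [True],
    'useShortGuardInterval': [True],
    'mcs': list(range(0, 8, 2)),
    'channelWidth': [20],
    'simulationTime': [4],
}
runs = 2


def expand(param_space):
    """Return one dict per combination of the listed parameter values.

    Hypothetical helper for illustration; not sem's implementation.
    """
    names = list(param_space)
    value_lists = (param_space[name] for name in names)
    return [dict(zip(names, values)) for values in product(*value_lists)]


combinations = expand(params)
total_simulations = len(combinations) * runs
print(len(combinations), total_simulations)
```

Every parameter with a single value contributes a factor of one, so only distance and mcs enlarge the space here.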
Results are now saved in the campaign object's database. We can access them
through the campaign's database object, which offers a get_complete_results
function:
print("There are %s results in the database\n" % len(list(campaign.db.get_complete_results())))
example_result = campaign.db.get_complete_results()[2]
print("This is an example result:\n")
pprint.pprint(example_result)
There are 64 results in the database
This is an example result:
{'meta': {'elapsed_time': 0.43055272102355957,
          'exitcode': 0,
          'id': '8433a788-0074-4653-92c1-4a4362c10d33'},
 'output': {'stderr': '', 'stdout': 'Aggregated throughput: 0 Mbit/s\n'},
 'params': {'RngRun': 47,
            'channelWidth': 20,
            'distance': 50,
            'mcs': 6,
            'nWifi': 1,
            'simulationTime': 4,
            'useRts': True,
            'useShortGuardInterval': True}}
We can see that there are 64 results in the database, corresponding to the simulations we ran earlier. A single result is simply a dictionary with three keys:
- meta, whose value is a dictionary containing information on the simulation execution
- output, whose value is a dictionary containing a list of files created by the simulation, in addition to entries containing stdout and stderr
- params, whose value is the parameter specification that was used to obtain this result
We can thus see the output produced by the simulation easily:
print(example_result['output']['stdout'])
Aggregated throughput: 0 Mbit/s
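Since results are plain dictionaries, they can also be inspected and filtered with ordinary Python. The following is a minimal sketch built on the example result printed earlier. The matches and succeeded helpers are hypothetical names introduced here and are not part of sem.

```python
# The example result shown above, reproduced as a literal dictionary.
example_result = {
    'meta': {'elapsed_time': 0.43055272102355957,
             'exitcode': 0,
             'id': '8433a788-0074-4653-92c1-4a4362c10d33'},
    'output': {'stderr': '', 'stdout': 'Aggregated throughput: 0 Mbit/s\n'},
    'params': {'RngRun': 47, 'channelWidth': 20, 'distance': 50, 'mcs': 6,
               'nWifi': 1, 'simulationTime': 4, 'useRts': True,
               'useShortGuardInterval': True},
}


def matches(result, **wanted):
    """True if every requested parameter has the requested value (hypothetical helper)."""
    return all(result['params'].get(name) == value
               for name, value in wanted.items())


def succeeded(result):
    """True if the simulation exited with code 0 (hypothetical helper)."""
    return result['meta']['exitcode'] == 0


results = [example_result]
selected = [r for r in results if matches(r, mcs=6, distance=50) and succeeded(r)]
print(len(selected))
```

In a real campaign, the results list would come from campaign.db.get_complete_results() instead of a hand-written literal.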
If we are interested in re-running one of these simulations manually, using waf from our
ns-3 installation, sem provides a function that outputs the appropriate
command automatically:
waf_command = sem.utils.get_command_from_result(script, example_result)
print("Use this command to reproduce the example result:\n%s" % waf_command)
waf_command_debug = sem.utils.get_command_from_result(script, example_result, debug=True)
print("Or obtain a debug command by setting the debug flag to true:\n%s" % waf_command_debug)
Use this command to reproduce the example result:
python3 waf --run "wifi-multi-tos --nWifi=1 --distance=50 --useRts=True --useShortGuardInterval=True --mcs=6 --channelWidth=20 --simulationTime=4 --RngRun=47"
Or obtain a debug command by setting the debug flag to true:
python3 waf --run wifi-multi-tos --command-template="gdb --args %s --nWifi=1 --distance=50 --useRts=True --useShortGuardInterval=True --mcs=6 --channelWidth=20 --simulationTime=4 --RngRun=47"
In fact, we can see that we get the same result if we execute the script through waf in a shell:
!echo "Executing $waf_command ..."
!cd $ns_path && $waf_command
Executing python3 waf --run wifi-multi-tos --nWifi=1 --distance=50 --useRts=True --useShortGuardInterval=True --mcs=6 --channelWidth=20 --simulationTime=4 --RngRun=47 ...
Waf: Entering directory `/home/davide/Work/sem/examples/ns-3/build/optimized'
Waf: Leaving directory `/home/davide/Work/sem/examples/ns-3/build/optimized'
Build commands will be stored in build/optimized/compile_commands.json
'build' finished successfully (0.304s)
Aggregated throughput: 0 Mbit/s
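The reproduction command is essentially the result's params dictionary rendered as command line flags. As an illustration of that mapping, here is a rough sketch that rebuilds the two command strings printed earlier. build_waf_command is a hypothetical stand-in written for this example; it is not how sem.utils.get_command_from_result is implemented, and it assumes the parameters are stored in the order in which they should appear.

```python
def build_waf_command(script, result, debug=False):
    """Render a result's parameters as a waf command line (hypothetical helper)."""
    args = " ".join("--%s=%s" % (name, value)
                    for name, value in result['params'].items())
    if debug:
        # waf substitutes %s in the command template with the program path
        return ('python3 waf --run %s --command-template="gdb --args %%s %s"'
                % (script, args))
    return 'python3 waf --run "%s %s"' % (script, args)


# Parameters of the example result, listed in the order used by the command above
example_result = {'params': {'nWifi': 1, 'distance': 50, 'useRts': True,
                             'useShortGuardInterval': True, 'mcs': 6,
                             'channelWidth': 20, 'simulationTime': 4,
                             'RngRun': 47}}

print(build_waf_command('wifi-multi-tos', example_result))
print(build_waf_command('wifi-multi-tos', example_result, debug=True))
```

Note that the RngRun value is part of the parameters: it is what makes a rerun reproduce the same randomized experiment.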
Now that the results of the simulations we are interested in running are saved in the database, it's time to obtain some plots. In order to do this, we have to transform strings like:
print(example_result['output']['stdout'])
Aggregated throughput: 0 Mbit/s
into values we can plot. In this case, since we are interested in the aggregate throughput, it will be enough to get the second-to-last word of the stdout string and convert it to a float. We can do this with a function that takes the result as input and outputs the measured throughput. For reasons we will explain later, let's have this function return a list containing our throughput:
@sem.utils.output_labels(['Throughput [Mbit/s]'])
@sem.utils.only_load_some_files(['stdout'])
def get_average_throughput(result):
    return [float(result['output']['stdout'].split(" ")[-2])]
We can test the function is working properly on our example result:
get_average_throughput(example_result)
[0.0]
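Splitting on spaces works as long as stdout has exactly the expected shape; on an empty string (for instance if a simulation printed nothing) the [-2] index would raise an IndexError. A slightly more defensive variant, sketched here on plain strings, uses a regular expression and fails with an explicit message. parse_throughput is a hypothetical name, and the pattern assumes the output line always reads "Aggregated throughput: <number> Mbit/s".

```python
import re

# Assumed output format: "Aggregated throughput: <number> Mbit/s"
THROUGHPUT_RE = re.compile(
    r"Aggregated throughput:\s*([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)\s*Mbit/s")


def parse_throughput(stdout):
    """Extract the throughput in Mbit/s from the script's standard output."""
    match = THROUGHPUT_RE.search(stdout)
    if match is None:
        raise ValueError("no throughput line found in: %r" % stdout)
    return float(match.group(1))


stdout = 'Aggregated throughput: 0 Mbit/s\n'
split_value = float(stdout.split(" ")[-2])   # approach used in the text
regex_value = parse_throughput(stdout)       # defensive variant
print(split_value, regex_value)
```

Both approaches agree on the example result, yielding a throughput of 0.0 Mbit/s.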
Looks ok! sem can accept functions defined like this, and use them on all
the available results to produce neatly formatted and easy-to-use data
structures. Let's create a Pandas DataFrame using the
get_results_as_dataframe function:
# Use the parsing function to create a Pandas dataframe
results = campaign.get_results_as_dataframe(get_average_throughput,
                                            params=params)
Let's now inspect the results DataFrame:
display(results)
 | nWifi | distance | useRts | useShortGuardInterval | mcs | channelWidth | simulationTime | RngRun | Throughput [Mbit/s] |
---|---|---|---|---|---|---|---|---|---|
0 | 1 | 0 | True | True | 0 | 20 | 4 | 1 | 5.90272 |
1 | 1 | 0 | True | True | 0 | 20 | 4 | 0 | 5.91155 |
2 | 1 | 0 | True | True | 2 | 20 | 4 | 3 | 15.82990 |
3 | 1 | 0 | True | True | 2 | 20 | 4 | 2 | 15.79750 |
4 | 1 | 0 | True | True | 4 | 20 | 4 | 5 | 29.16030 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
59 | 1 | 70 | True | True | 2 | 20 | 4 | 59 | 0.00000 |
60 | 1 | 70 | True | True | 4 | 20 | 4 | 60 | 0.00000 |
61 | 1 | 70 | True | True | 4 | 20 | 4 | 61 | 0.00000 |
62 | 1 | 70 | True | True | 6 | 20 | 4 | 62 | 0.00000 |
63 | 1 | 70 | True | True | 6 | 20 | 4 | 63 | 0.00000 |
64 rows × 9 columns
Now that we have our results collected in such a data structure, plotting comes
very naturally using seaborn:
sns.catplot(data=results,
            x='distance',
            y='Throughput [Mbit/s]',
            hue='mcs',
            kind='point')
plt.show()
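Each point in such a plot summarizes the repetitions that share the same distance and mcs values (with kind='point', seaborn plots their mean by default). As a rough illustration of that aggregation, the sketch below averages the first four rows of the table above using only the standard library. The short key 'throughput' is used here instead of the full column label purely for brevity.

```python
from collections import defaultdict
from statistics import mean

# First four rows of the results table (distance 0, MCS 0 and 2, two runs each)
rows = [
    {'distance': 0, 'mcs': 0, 'RngRun': 1, 'throughput': 5.90272},
    {'distance': 0, 'mcs': 0, 'RngRun': 0, 'throughput': 5.91155},
    {'distance': 0, 'mcs': 2, 'RngRun': 3, 'throughput': 15.82990},
    {'distance': 0, 'mcs': 2, 'RngRun': 2, 'throughput': 15.79750},
]

# Group the repetitions by (distance, mcs), ignoring the RngRun value
by_point = defaultdict(list)
for row in rows:
    by_point[(row['distance'], row['mcs'])].append(row['throughput'])

# One average per plotted point
averages = {point: mean(values) for point, values in by_point.items()}
for point, value in averages.items():
    print(point, round(value, 4))
```

With only two runs per combination the averages are still noisy, which is why increasing runs later in this example gives cleaner curves.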
Say we are now interested in seeing the effect of the useRts and useShortGuardInterval
parameters. We just need to run some additional simulations and re-export results:
params = {
    'nWifi': [1],
    'distance': [10],
    'useRts': [False, True],
    'useShortGuardInterval': [False, True],
    'mcs': list(range(0, 8, 2)),
    'channelWidth': [20],
    'simulationTime': [4],
}
runs = 2
campaign.run_missing_simulations(params, runs=runs)
Running simulations: 100%|██████████| 24/24 [00:05<00:00, 4.21simulation/s]
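The progress bar reports 24 simulations, although the new parameter space contains 16 combinations, i.e. 32 simulations with two runs each. The difference is the 8 simulations at distance 10 with useRts and useShortGuardInterval both enabled, which were already part of the first batch. The bookkeeping can be sketched as follows; this is only an illustration of the counting, with a hypothetical expand helper, and not sem's actual implementation.

```python
from collections import Counter
from itertools import product


def expand(space):
    """Return each parameter combination as a hashable tuple (hypothetical helper)."""
    names = sorted(space)
    return [tuple(zip(names, values))
            for values in product(*(space[name] for name in names))]


first = {'nWifi': [1], 'distance': list(range(0, 80, 10)), 'useRts': [True],
         'useShortGuardInterval': [True], 'mcs': list(range(0, 8, 2)),
         'channelWidth': [20], 'simulationTime': [4]}
second = {'nWifi': [1], 'distance': [10], 'useRts': [False, True],
          'useShortGuardInterval': [False, True], 'mcs': list(range(0, 8, 2)),
          'channelWidth': [20], 'simulationTime': [4]}
runs = 2

# Runs already available for each combination after the first batch
completed = Counter()
for combination in expand(first):
    completed[combination] += runs

# Runs still needed to cover the second parameter space
wanted = expand(second)
missing = sum(max(0, runs - completed[combination]) for combination in wanted)
print(len(wanted) * runs, missing)
```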
sem will only perform the necessary simulations that have not already been executed to explore the new parameter space. Now, let's re-use the previous parsing function to extract the throughput for the new results:
# Use the parsing function to create a Pandas dataframe
results = campaign.get_results_as_dataframe(get_average_throughput,
                                            params=params)
#display(results)
Only 32 results are now parsed by the function, since we specified the parameter space we are interested in when we called get_results_as_dataframe. Now, let's move on to plotting.
sns.catplot(data=results,
            x='mcs',
            y='Throughput [Mbit/s]',
            col='useRts',
            hue='useShortGuardInterval',
            kind='point')
plt.show()
Once initial plots are obtained, it's easy to adjust the number of repetitions using the runs argument, and get cleaner results.
Let's see how the number of WiFi devices impacts the distribution of the aggregate throughput:
params = {
    'nWifi': list(range(1, 6)),
    'distance': [10],
    'useRts': [False],
    'useShortGuardInterval': [False],
    'mcs': [6],
    'channelWidth': [20],
    'simulationTime': [4],
}
runs = 20
campaign.run_missing_simulations(params, runs=runs)
results = campaign.get_results_as_dataframe(get_average_throughput,
                                            params=params)
sns.catplot(data=results,
            x='nWifi',
            y='Throughput [Mbit/s]',
            kind='box')
plt.show()
Running simulations: 100%|██████████| 98/98 [00:46<00:00, 2.13simulation/s]
Finally, let's see the impact of the useRts parameter. Note the usage of the lambda
function to specify we are not interested in testing [False, True]
when we have a single STA.
params = {
    'nWifi': list(range(1, 6)),
    'distance': [10],
    'useRts': lambda p: [False] if p['nWifi'] == 1 else [False, True],
    'useShortGuardInterval': [False],
    'mcs': [6],
    'channelWidth': [20],
    'simulationTime': [4],
}
runs = 20
campaign.run_missing_simulations(params, runs=runs)
results = campaign.get_results_as_dataframe(get_average_throughput,
                                            params=params)
sns.catplot(data=results,
            x='nWifi',
            y='Throughput [Mbit/s]',
            hue='useRts',
            split=True,
            kind='violin')
plt.show()
Running simulations: 100%|██████████| 80/80 [01:03<00:00, 1.26simulation/s]
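A parameter whose value is a function of the other parameters makes the space non-rectangular: here it contains 9 combinations rather than 10. One way to picture the expansion is the sketch below. It is a simplified illustration and not sem's implementation, and it assumes that a callable entry only reads parameters listed before it in the dictionary.

```python
def expand(space):
    """Expand a parameter space whose values are lists or callables.

    Hypothetical helper: a callable receives the partially built combination
    and returns the list of values to try for that parameter.
    """
    combinations = [{}]
    for name, values in space.items():
        extended = []
        for partial in combinations:
            options = values(partial) if callable(values) else values
            extended.extend({**partial, name: option} for option in options)
        combinations = extended
    return combinations


params = {
    'nWifi': list(range(1, 6)),
    'distance': [10],
    'useRts': lambda p: [False] if p['nWifi'] == 1 else [False, True],
    'useShortGuardInterval': [False],
    'mcs': [6],
    'channelWidth': [20],
    'simulationTime': [4],
}
runs = 20

combinations = expand(params)
print(len(combinations), len(combinations) * runs)
```

This accounts for the 80 simulations in the progress bar: 9 combinations with 20 runs each make 180, and the 100 runs with useRts disabled were already available from the previous step.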
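sem can also run a sensitivity analysis on the script's parameters. For each parameter we provide either a list of values or a min/max range, together with the number of samples to draw:
ranges = {
    # We fix these arguments
    'nWifi': [1],
    'channelWidth': [20],
    # These are the arguments we are interested in
    'distance': {'min': 0, 'max': 100},
    'useRts': [False, True],
    'useShortGuardInterval': [False, True],
    'mcs': list(range(0, 8)),
    'simulationTime': {'min': 1, 'max': 4},
}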
print(sem.utils.compute_sensitivity_analysis(campaign, get_average_throughput, ranges, samples=30))
Running simulations: 100%|██████████| 360/360 [00:40<00:00, 8.96simulation/s]
{'S1': array([ 0.68503299, 0.02138148, -0.00080491, 0.44651507, -0.00088851]),
 'S1_conf': array([0.39242959, 0.02033584, 0.02050351, 0.42737775, 0.00464737]),
 'ST': array([8.90244032e-01, 4.55955495e-03, 4.58472799e-03, 5.40526063e-01, 1.85347892e-04]),
 'ST_conf': array([6.19237535e-01, 4.46792103e-03, 3.62193471e-03, 3.67095422e-01, 1.96986718e-04]),
 'S2': array([[ nan, 0.03071869, 0.03581605, 0.05371067, 0.03540227],
        [ nan, nan, -0.02427913, -0.01442678, -0.02449401],
        [ nan, nan, nan, 0.02741976, -0.00805804],
        [ nan, nan, nan, nan, -0.0336892 ],
        [ nan, nan, nan, nan, nan]]),
 'S2_conf': array([[ nan, 0.44441562, 0.45434611, 0.61582631, 0.44618879],
        [ nan, nan, 0.03379865, 0.07445689, 0.03419138],
        [ nan, nan, nan, 0.05531242, 0.02064947],
        [ nan, nan, nan, nan, 0.58106304],
        [ nan, nan, nan, nan, nan]])}
We obtain an S1 score for each parameter for which we provided more than one value (five in this case). A higher score means that the parameter has a higher impact on the output.
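The keys of the returned dictionary (S1, ST, S2 and their _conf counterparts) are those of a Sobol sensitivity analysis. Assuming the samples are drawn with the Saltelli scheme including second-order terms, which requires N(2D + 2) model evaluations for N samples and D varying parameters, the 360 simulations above are accounted for. The sketch below checks this and ranks the first-order scores. The pairing of scores with parameter names is an assumption made here (the order in which the varying parameters appear in ranges) and is not stated in the output itself.

```python
samples = 30
# ASSUMPTION: scores are listed in the order the varying parameters appear in `ranges`
varying = ['distance', 'useRts', 'useShortGuardInterval', 'mcs', 'simulationTime']

# Saltelli sampling with second-order indices: N * (2D + 2) evaluations
evaluations = samples * (2 * len(varying) + 2)

# First-order scores and their confidence intervals, copied from the output above
s1 = [0.68503299, 0.02138148, -0.00080491, 0.44651507, -0.00088851]
s1_conf = [0.39242959, 0.02033584, 0.02050351, 0.42737775, 0.00464737]

# Sort parameters from most to least influential
ranking = sorted(zip(varying, s1, s1_conf), key=lambda item: item[1], reverse=True)

print(evaluations)
for name, score, conf in ranking:
    print("%-22s S1 = %7.4f +/- %.4f" % (name, score, conf))
```

Under that assumption, distance and mcs dominate the throughput. The confidence intervals are wide with only 30 samples, and the slightly negative scores are estimation noise around zero, so a larger samples value would be needed before reading much into the smaller scores.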