Running ns-3 Simulation Campaigns with SEM¶

This example shows how you can use SEM to manage an ns-3 simulation campaign. We will be working with the wifi-multi-tos simulation script: this ns-3 program creates a WiFi network, measures the aggregate throughput and prints it out to the standard output.

Creating the simulation campaign object¶

First of all, let's import some relevant libraries. We will be using sem to run our simulations and parse the outputs, and matplotlib and seaborn to plot the results.

In [1]:

import sem
import pprint
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style("white")

Most of our interaction with sem will happen through an object of type CampaignManager. When we create our campaign, we need to tell sem where to find our ns-3 installation, what script we want to run, and where it should save our results.

In [2]:

ns_path = 'ns-3'
script = 'wifi-multi-tos'
campaign_dir = "results"
campaign = sem.CampaignManager.new(ns_path, 
                                   script, 
                                   campaign_dir, 
                                   overwrite=True,
                                   max_parallel_processes=8)

At time of campaign creation, sem compiles ns-3, asks the ns-3 script we want to use for its command line parameters, and looks at the hash of the commit the ns-3 repository is currently at. This data is saved and printed out when we inspect the campaign object.

In [3]:

print(campaign)

--- Campaign info ---
script: wifi-multi-tos
params: {'nWifi': 4.0, 'distance': 1.0, 'simulationTime': 10.0, 'useRts': False, 'mcs': 7.0, 'channelWidth': None, 'useShortGuardInterval': False}
HEAD: 6aae3d04c51af2f60daabc9dd9da120a838623c0
Runner type: <class 'sem.parallelrunner.ParallelRunner'>
-----------

Running simulations¶

Now that we created our simulation campaign object, we can use it to run some simulations. In order to do this, we need to specify what parameter combinations we are interested in running. The wifi-multi-tos script supports the following command line parameters:

nWifi: Number of STAs to simulate
distance: Distance of STAs from the AP
useRts: Whether to enable RTS/CTS
useShortGuardInterval: Whether to use the short guard interval
mcs: Modulation Coding Scheme to use
channelWidth: Channel Width in MHz
simulationTime: How long to simulate for

Say we are interested in seeing how a certain MCS is affected by the distance between the STA and the AP. In this case, we might want to run simulations with different values of the mcs and distance parameters. To do this, we create a Python dictionary where the keys are the name of the parameters, and the values are lists of parameter values. When we pass this dictionary to our campaign object's run_missing_simulations function, sem will run simulations for each combination of the specified parameter space.

We also create a runs variable to specify how many randomized experiments we want sem to perform for each parameter combination.

In [4]:

params = {
    'nWifi': [1], 
    'distance': list(range(0, 80, 10)),
    'useRts': [True],  
    'useShortGuardInterval': [True],  
    'mcs': list(range(0, 8, 2)),  
    'channelWidth': [20],  
    'simulationTime': [4],  
}
runs = 2

campaign.run_missing_simulations(params, runs=runs)

Running simulations: 100%|██████████| 64/64 [00:11<00:00,  5.45simulation/s]

Viewing results¶

Results are now saved in the campaign object's database. We can access them through the campaign's database object, which offers a get_complete_results function:

In [5]:

print("There are %s results in the database\n" % len(list(campaign.db.get_complete_results())))
example_result = campaign.db.get_complete_results()[2]
print("This is an example result:\n")
pprint.pprint(example_result)

There are 64 results in the database

This is an example result:

{'meta': {'elapsed_time': 0.43055272102355957,
          'exitcode': 0,
          'id': '8433a788-0074-4653-92c1-4a4362c10d33'},
 'output': {'stderr': '', 'stdout': 'Aggregated throughput: 0 Mbit/s\n'},
 'params': {'RngRun': 47,
            'channelWidth': 20,
            'distance': 50,
            'mcs': 6,
            'nWifi': 1,
            'simulationTime': 4,
            'useRts': True,
            'useShortGuardInterval': True}}

We can see that there are 64 results in the database, corresponding to the simulations we ran earlier. A single result is simply a dictionary, with three keys:

meta, whose value is a dictionary containing information on the simulation execution
output, whose value is a dictionary containing a list of files created by the simulation, in addition to entries containing stdout and stderr
params, whose value is the parameter specification that was used to obtain this result

We can thus see the output produced by the simulation easily:

In [6]:

print(example_result['output']['stdout'])

Aggregated throughput: 0 Mbit/s

If we are interested in running one of such results manually, using waf from our ns-3 installation, sem provides a function that outputs the appropriate command automatically:

In [7]:

waf_command = sem.utils.get_command_from_result(script, example_result)
print("Use this command to reproduce the example result:\n%s" % waf_command)

waf_command_debug = sem.utils.get_command_from_result(script, example_result, debug=True)
print("Or obtain a debug command by setting the debug flag to true:\n%s" % waf_command_debug)

Use this command to reproduce the example result:
python3 waf --run "wifi-multi-tos --nWifi=1 --distance=50 --useRts=True --useShortGuardInterval=True --mcs=6 --channelWidth=20 --simulationTime=4 --RngRun=47"
Or obtain a debug command by setting the debug flag to true:
python3 waf --run wifi-multi-tos --command-template="gdb --args %s --nWifi=1 --distance=50 --useRts=True --useShortGuardInterval=True --mcs=6 --channelWidth=20 --simulationTime=4 --RngRun=47"

In fact, we can see that we get the same result if we execute the script through waf in a shell:

In [8]:

!echo "Executing $waf_command ..."
!cd $ns_path && $waf_command

Executing python3 waf --run wifi-multi-tos --nWifi=1 --distance=50 --useRts=True --useShortGuardInterval=True --mcs=6 --channelWidth=20 --simulationTime=4 --RngRun=47 ...
Waf: Entering directory `/home/davide/Work/sem/examples/ns-3/build/optimized'
Waf: Leaving directory `/home/davide/Work/sem/examples/ns-3/build/optimized'
Build commands will be stored in build/optimized/compile_commands.json
'build' finished successfully (0.304s)
Aggregated throughput: 0 Mbit/s

Exporting and plotting results¶

Now that the results of the simulations we are interested in running are saved in the database, it's time to obtain some plots. In order to do this, we have to transform strings like:

In [9]:

print(example_result['output']['stdout'])

Aggregated throughput: 0 Mbit/s

into values we can plot. In this case, since we are interested in the aggregate throughput, it will be enough to get the second-to-last word of the stdout string and convert it to a float. We can do this with a function that takes the result as input and outputs the measured throughput. For reasons we will explain later, let's have this function return a list containing our throughput:

In [10]:

@sem.utils.output_labels(['Throughput [Mbit/s]'])
@sem.utils.only_load_some_files(['stdout'])
def get_average_throughput(result):
    return [float(result['output']['stdout'].split(" ")[-2])]

We can test the function is working properly on our example result:

In [11]:

get_average_throughput(example_result)

Out[11]:

[0.0]

Looks ok! sem can accept functions defined like this, and use them on all the available results to produce neatly formatted and easy-to-use data structures. Let's create a Pandas DataFrame using the get_results_as_dataframe function:

In [12]:

# Use the parsing function to create a Pandas dataframe
results = campaign.get_results_as_dataframe(get_average_throughput,
                                            params=params)

Let's now inspect the results DataArray:

In [13]:

display(results)

	nWifi	distance	useRts	useShortGuardInterval	mcs	channelWidth	simulationTime	RngRun	Throughput [Mbit/s]
0	1	0	True	True	0	20	4	1	5.90272
1	1	0	True	True	0	20	4	0	5.91155
2	1	0	True	True	2	20	4	3	15.82990
3	1	0	True	True	2	20	4	2	15.79750
4	1	0	True	True	4	20	4	5	29.16030
...	...	...	...	...	...	...	...	...	...
59	1	70	True	True	2	20	4	59	0.00000
60	1	70	True	True	4	20	4	60	0.00000
61	1	70	True	True	4	20	4	61	0.00000
62	1	70	True	True	6	20	4	62	0.00000
63	1	70	True	True	6	20	4	63	0.00000

64 rows × 9 columns

Now that we have our results collected in such a data structure, plotting comes very naturally using seaborn:

In [14]:

sns.catplot(data=results,
            x='distance',
            y='Throughput [Mbit/s]',
            hue='mcs',
            kind='point')
plt.show()

More plot examples¶

Say we are now interested in seeing the effect of the useRts and useShortGuardInterval parameters. We just need to run some additional simulations and re-export results:

In [15]:

params = {
    'nWifi': [1], 
    'distance': [10],
    'useRts': [False, True],  
    'useShortGuardInterval': [False, True],  
    'mcs': list(range(0, 8, 2)),  
    'channelWidth': [20],  
    'simulationTime': [4],  
}
runs = 2

campaign.run_missing_simulations(params, runs=runs)

Running simulations: 100%|██████████| 24/24 [00:05<00:00,  4.21simulation/s]

sem will only perform the necessary simulations that have not already been executed to explore the new parameter space. Now, let's re-use the previous parsing function to extract the throughput for the new results:

In [16]:

# Use the parsing function to create a Pandas dataframe
results = campaign.get_results_as_dataframe(get_average_throughput,
                                            params=params)
#display(results)

As we can see, only 32 results are now parsed by the function, since we specified the parameter space we are interested in when we called get_results_as_dataframe. Now, let's move on to plotting.

In [17]:

sns.catplot(data=results,
            x='mcs',
            y='Throughput [Mbit/s]',
            col='useRts', 
            hue='useShortGuardInterval',
            kind='point')
plt.show()

Once initial plots are obtained, it's easy to adjust the number of repetitions using the runs argument, and get cleaner results.

Let's see how the number of WiFi devices impacts the distribution of the aggregate throughput:

In [18]:

params = {
    'nWifi': list(range(1, 6)), 
    'distance': [10],
    'useRts': [False],  
    'useShortGuardInterval': [False],  
    'mcs': [6],  
    'channelWidth': [20],  
    'simulationTime': [4],  
}
runs = 20

campaign.run_missing_simulations(params, runs=runs)

results = campaign.get_results_as_dataframe(get_average_throughput,
                                            params=params)
sns.catplot(data=results,
            x='nWifi',
            y='Throughput [Mbit/s]',
            kind='box')
plt.show()

Running simulations: 100%|██████████| 98/98 [00:46<00:00,  2.13simulation/s]

Finally, let's see the impact of the useRts parameter. Note the usage of the lambda function to specify we are not interested in testing [False, True] when we have a single STA.

In [19]:

params = {
    'nWifi': list(range(1, 6)), 
    'distance': [10],
    'useRts': lambda p: [False] if p['nWifi'] == 1 else [False, True],  
    'useShortGuardInterval': [False],
    'mcs': [6],  
    'channelWidth': [20],  
    'simulationTime': [4],  
}
runs = 20

campaign.run_missing_simulations(params, runs=runs)

results = campaign.get_results_as_dataframe(get_average_throughput,
                                            params=params)
sns.catplot(data=results,
            x='nWifi',
            y='Throughput [Mbit/s]',
            hue='useRts',
            split=True,
            kind='violin')
plt.show()

Running simulations: 100%|██████████| 80/80 [01:03<00:00,  1.26simulation/s]

Sensitivity Analysis¶

Through an integration with SALib, SEM allows for a quick assessment of the impact each input of the simulation to the output, using Sensitivity Analysis methods:

In [20]:

ranges = {
    # We fix these arguments
    'nWifi': [1], 
    'channelWidth': [20],
    # These are the arguments we are interested in
    'distance': {'min': 0, 'max': 100},
    'useRts': [False, True],  
    'useShortGuardInterval': [False, True],  
    'mcs': list(range(0, 8)),  
    'simulationTime': {'min': 1, 'max': 4},
}
print(sem.utils.compute_sensitivity_analysis(campaign, get_average_throughput, ranges, samples=30))

Running simulations: 100%|██████████| 360/360 [00:40<00:00,  8.96simulation/s]

{'S1': array([ 0.68503299,  0.02138148, -0.00080491,  0.44651507, -0.00088851]), 'S1_conf': array([0.39242959, 0.02033584, 0.02050351, 0.42737775, 0.00464737]), 'ST': array([8.90244032e-01, 4.55955495e-03, 4.58472799e-03, 5.40526063e-01,
       1.85347892e-04]), 'ST_conf': array([6.19237535e-01, 4.46792103e-03, 3.62193471e-03, 3.67095422e-01,
       1.96986718e-04]), 'S2': array([[        nan,  0.03071869,  0.03581605,  0.05371067,  0.03540227],
       [        nan,         nan, -0.02427913, -0.01442678, -0.02449401],
       [        nan,         nan,         nan,  0.02741976, -0.00805804],
       [        nan,         nan,         nan,         nan, -0.0336892 ],
       [        nan,         nan,         nan,         nan,         nan]]), 'S2_conf': array([[       nan, 0.44441562, 0.45434611, 0.61582631, 0.44618879],
       [       nan,        nan, 0.03379865, 0.07445689, 0.03419138],
       [       nan,        nan,        nan, 0.05531242, 0.02064947],
       [       nan,        nan,        nan,        nan, 0.58106304],
       [       nan,        nan,        nan,        nan,        nan]])}

We obtain an S1 score for each parameter for which we provided a range of values. An higher score means that parameter will have a higher impact on the output.