#!/usr/bin/env python
# coding: utf-8

# # Running ns-3 Simulation Campaigns with SEM
#
# This example shows how you can use SEM to manage an ns-3 simulation
# campaign. We will be working with the ``wifi-multi-tos`` simulation script:
# this ns-3 program creates a WiFi network, measures the aggregate throughput
# and prints it out to the standard output.

# ## Creating the simulation campaign object
#
# First of all, let's import the relevant libraries: ``sem`` to run the
# simulations and parse their output, and ``matplotlib`` / ``seaborn`` to
# plot the results.

# In[1]:

import sem
import pprint
import matplotlib.pyplot as plt
import seaborn as sns

sns.set_style("white")

# Most of the interaction with ``sem`` happens through an object of type
# ``CampaignManager``. Creating a campaign requires three pieces of
# information: where the ns-3 installation lives, which script to run, and
# where the results should be saved.

# In[2]:

ns_path = 'ns-3'
script = 'wifi-multi-tos'
campaign_dir = "results"

campaign = sem.CampaignManager.new(ns_path, script, campaign_dir,
                                   overwrite=True,
                                   max_parallel_processes=8)

# At campaign creation time, ``sem`` compiles ns-3, queries the chosen script
# for its command line parameters, and records the hash of the commit the
# ns-3 repository is currently at. This data is saved, and printed out when
# we inspect the campaign object.

# In[3]:

print(campaign)
# ## Running simulations
#
# Now that we created our simulation campaign object, we can use it to run
# some simulations. The ``wifi-multi-tos`` script supports the following
# command line parameters:
#
# - ``nWifi``: Number of STAs to simulate
# - ``distance``: Distance of STAs from the AP
# - ``useRts``: Whether to enable RTS/CTS
# - ``useShortGuardInterval``: Whether to use the short guard interval
# - ``mcs``: Modulation Coding Scheme to use
# - ``channelWidth``: Channel Width in MHz
# - ``simulationTime``: How long to simulate for
#
# Say we are interested in how a certain MCS is affected by the distance
# between the STA and the AP: we then want to run simulations with different
# values of the ``mcs`` and ``distance`` parameters. To do this, we build a
# dictionary mapping parameter names to lists of values; when this dictionary
# is passed to ``run_missing_simulations``, ``sem`` runs a simulation for
# every combination in the specified parameter space. The ``runs`` variable
# specifies how many randomized experiments to perform per combination.

# In[4]:

params = {
    'nWifi': [1],
    'distance': list(range(0, 80, 10)),  # 0, 10, ..., 70 m
    'useRts': [True],
    'useShortGuardInterval': [True],
    'mcs': list(range(0, 8, 2)),  # MCS 0, 2, 4, 6
    'channelWidth': [20],
    'simulationTime': [4],
}
runs = 2

campaign.run_missing_simulations(params, runs=runs)

# ## Viewing results
#
# Results are now saved in the campaign object's database, and can be
# accessed through the database's ``get_complete_results`` function:

# In[5]:

# 8 distances x 4 MCS values x 2 runs = 64 results.
complete_results = list(campaign.db.get_complete_results())
print("There are %s results in the database\n" % len(complete_results))

example_result = campaign.db.get_complete_results()[2]
print("This is an example result:\n")
pprint.pprint(example_result)
# We can see that there are 64 results in the database, corresponding to the
# simulations we ran earlier. A single result is simply a dictionary with
# three keys:
# - ``meta``: information on the simulation execution
# - ``output``: the files created by the simulation, plus entries for
#   stdout and stderr
# - ``params``: the parameter specification used to obtain this result
#
# We can thus inspect the output produced by the simulation easily:

# In[6]:

print(example_result['output']['stdout'])

# To reproduce one of these results manually through waf, ``sem`` can
# generate the appropriate command automatically:

# In[7]:

waf_command = sem.utils.get_command_from_result(script, example_result)
print("Use this command to reproduce the example result:\n%s" % waf_command)

waf_command_debug = sem.utils.get_command_from_result(script, example_result,
                                                      debug=True)
print("Or obtain a debug command by setting the debug flag to true:\n%s" % waf_command_debug)

# In fact, executing the script through `waf` in a shell yields the same
# result:

# In[8]:

get_ipython().system('echo "Executing $waf_command ..."')
get_ipython().system('cd $ns_path && $waf_command')

# ## Exporting and plotting results
#
# Now that the results we are interested in are saved in the database, it's
# time to obtain some plots. To do this, we have to transform strings like:

# In[9]:

print(example_result['output']['stdout'])

# into values we can plot. Since we are interested in the aggregate
# throughput, it is enough to take the second-to-last word of the stdout
# string and convert it to a float, through a function that maps a result
# to the measured throughput.
# For reasons we will explain later, let's have this function return a list
# containing our throughput:

# In[10]:

@sem.utils.output_labels(['Throughput [Mbit/s]'])
@sem.utils.only_load_some_files(['stdout'])
def get_average_throughput(result):
    """Extract the aggregate throughput (in Mbit/s) from a result's stdout.

    The script prints the throughput as the second-to-last word of its
    standard output; the value is returned wrapped in a single-element list
    to match the label list declared in the decorator.
    """
    words = result['output']['stdout'].split(" ")
    return [float(words[-2])]

# We can test the function is working properly on our example result:

# In[11]:

get_average_throughput(example_result)

# Looks ok! ``sem`` can apply functions defined like this to all available
# results, producing neatly formatted, easy-to-use data structures. Let's
# build a Pandas DataFrame with ``get_results_as_dataframe``:

# In[12]:

# Use the parsing function to create a Pandas dataframe
results = campaign.get_results_as_dataframe(get_average_throughput,
                                            params=params)

# Let's now inspect the ``results`` DataFrame:

# In[13]:

display(results)

# With results collected in such a data structure, plotting comes very
# naturally using ``seaborn``:

# In[14]:

sns.catplot(data=results,
            x='distance',
            y='Throughput [Mbit/s]',
            hue='mcs',
            kind='point')
plt.show()
# ## More plot examples
#
# Say we are now interested in the effect of the ``useRts`` and
# ``useShortGuardInterval`` parameters: we just run some additional
# simulations and re-export results. ``sem`` will only perform the
# simulations that were not already executed when exploring the new
# parameter space. Now, let's re-use the previous parsing function to
# extract the throughput for the new results:

# In[15]:

params = {
    'nWifi': [1],
    'distance': [10],
    'useRts': [False, True],
    'useShortGuardInterval': [False, True],
    'mcs': list(range(0, 8, 2)),
    'channelWidth': [20],
    'simulationTime': [4],
}
runs = 2

campaign.run_missing_simulations(params, runs=runs)

# In[16]:

# Use the parsing function to create a Pandas dataframe
results = campaign.get_results_as_dataframe(get_average_throughput,
                                            params=params)
#display(results)

# Only 32 results are now parsed by the function, since we specified the
# parameter space of interest when calling ``get_results_as_dataframe``.
# Now, let's move on to plotting.

# In[17]:

sns.catplot(data=results,
            x='mcs',
            y='Throughput [Mbit/s]',
            col='useRts',
            hue='useShortGuardInterval',
            kind='point')
plt.show()

# Once initial plots are obtained, it's easy to adjust the number of
# repetitions using the runs argument and get cleaner results. Let's see how
# the number of WiFi devices impacts the distribution of the aggregate
# throughput:

# In[18]:

params = {
    'nWifi': list(range(1, 6)),  # 1 to 5 STAs
    'distance': [10],
    'useRts': [False],
    'useShortGuardInterval': [False],
    'mcs': [6],
    'channelWidth': [20],
    'simulationTime': [4],
}
runs = 20

campaign.run_missing_simulations(params, runs=runs)
results = campaign.get_results_as_dataframe(get_average_throughput,
                                            params=params)

sns.catplot(data=results,
            x='nWifi',
            y='Throughput [Mbit/s]',
            kind='box')
plt.show()
# Finally, let's see the impact of the `useRts` parameter. Note the usage of
# a `lambda` to specify that we are not interested in testing
# `[False, True]` when there is a single STA:

# In[19]:

params = {
    'nWifi': list(range(1, 6)),
    'distance': [10],
    # RTS/CTS is pointless with one STA, so only test it for nWifi > 1.
    'useRts': lambda p: [False] if p['nWifi'] == 1 else [False, True],
    'useShortGuardInterval': [False],
    'mcs': [6],
    'channelWidth': [20],
    'simulationTime': [4],
}
runs = 20

campaign.run_missing_simulations(params, runs=runs)
results = campaign.get_results_as_dataframe(get_average_throughput,
                                            params=params)

sns.catplot(data=results,
            x='nWifi',
            y='Throughput [Mbit/s]',
            hue='useRts',
            split=True,
            kind='violin')
plt.show()

# ## Sensitivity Analysis
#
# Through an integration with
# [SALib](https://salib.readthedocs.io/en/latest/), SEM allows for a quick
# assessment of the impact each simulation input has on the output, using
# Sensitivity Analysis methods:

# In[20]:

ranges = {
    # We fix these arguments
    'nWifi': [1],
    'channelWidth': [20],
    # These are the arguments we are interested in
    'distance': {'min': 0, 'max': 100},
    'useRts': [False, True],
    'useShortGuardInterval': [False, True],
    'mcs': list(range(0, 8)),
    'simulationTime': {'min': 1, 'max': 4},
}

print(sem.utils.compute_sensitivity_analysis(campaign,
                                             get_average_throughput,
                                             ranges,
                                             samples=30))

# We obtain an S1 score for each parameter for which we provided a range of
# values. A higher score means that parameter has a higher impact on the
# output.