This notebook takes you through detailed setup of your settings for Microsoft Sentinel Notebooks and the MSTICPy library. It covers:
If you are using notebooks in the Microsoft Sentinel/Azure ML environment you can skip the first section "Configuring your Python Environment" entirely.
Warning. Due to rendering issues in Azure Machine Learning, we strongly recommend running this notebook in Jupyter Lab or VSCode.
The main part of this notebook involves setting up your msticpyconfig.yaml. While many of these settings are optional, if you do not configure them correctly you'll experience some loss of functionality. For example, using Threat Intelligence providers usually requires an API key. To save you having to type this in every time you look up an IP Address you should put this in a config file.
This section takes you through creating settings for
You'll typically need the first three of these to use most of the notebooks fully.
Section 3, "The config.json file", can also be ignored if you are happy using msticpyconfig.yaml. It is included here for background.
msticpyconfig.yaml
config.json file
If you are running in a JupyterHub environment such as Azure Notebooks, Python is already installed. When using any of the sample notebooks or copies of them, you only need to ensure that the Python 3.6 (or later) kernel is selected.
If you are running the notebooks locally, you will need to install Python 3.6 or later. The Anaconda distribution is a good starting point since it comes with many required packages already installed.
If you are running these notebooks locally, it is a good idea to create a clean Python virtual environment before installing any of the packages. This will prevent installed packages conflicting with versions that you may need for other applications.
For standard Python use the venv command.
For Conda use the conda env command.
In both cases be sure to activate the environment before running jupyter, using venvpath/Scripts/activate or conda activate {my_env_name}.
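As a rough sketch, the same kind of environment can also be created from Python itself with the standard-library venv module. The helper name `create_env` and the folder name used in the comment are placeholders chosen for illustration; they are not part of the notebook.

```python
import venv
from pathlib import Path


def create_env(env_dir, with_pip=True):
    """Create a virtual environment in env_dir and return the path of its config file."""
    env_path = Path(env_dir)
    # with_pip=True bootstraps pip into the new environment
    venv.create(env_path, with_pip=with_pip)
    return env_path / "pyvenv.cfg"


# Example call ("msticpy_env" is a hypothetical folder name):
# create_env("msticpy_env")
```

The environment still has to be activated from a shell before starting Jupyter, as described above.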
# Run this cell to view requirements.txt
%pfile requirements.txt
Although you can use pip inside a conda environment it is usually better to try to install conda packages whenever possible.
conda activate {my_env_name}
conda config --append channels conda-forge
conda install package1 package2
See Managing packages in Anaconda.
For packages that are not available as conda packages, use pip from within a Conda prompt/shell to install the remaining packages.
If you are using a shared installation of Python (i.e. one installed by the administrator) you will need to add the --user option to your pip install commands. E.g.
pip install pkg_name --user --upgrade
This will avoid permission errors by installing into your user folder.
Note: the use of the --user option is usually not required in a Conda environment since the Python site packages are normally already installed in a per-user folder.
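If you are unsure which situation applies, a short check like the following may help. It is only a sketch: the two helper names are made up for this example, and the `sys.prefix` comparison detects venv-style environments only, not Conda environments.

```python
import site
import sys


def in_virtual_env():
    """Return True if the interpreter is running inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


def user_site_dir():
    """Return the per-user site-packages folder used by `pip install --user`."""
    return site.getusersitepackages()


print("Virtual environment:", in_virtual_env())
print("User site-packages:", user_site_dir())
```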
The first time this cell runs for a new Azure ML or Azure Notebooks notebook or other Python environment it will do the following things:
- If the installed MSTICPy version does not meet the required version (REQ_MSTICPY_VER), it will attempt to install a new version (you will be prompted whether you want to do this). The install can take several minutes depending on the versions of packages that you already have installed.
- The init_notebook function is run.
Note: In subsequent runs, this cell should run quickly since you will already have the required packages installed.
Warning: you may see some warnings about incompatibility with certain packages. This should not affect the functionality of this notebook but you may need to upgrade the packages producing the warnings to a more recent version.
from pathlib import Path
import os
import sys
import warnings
from IPython.display import display, HTML, Markdown
REQ_PYTHON_VER="3.8"
REQ_MSTICPY_VER="1.5.0"
# If not using Azure Notebooks, install msticpy with
# %pip install msticpy
from msticpy.nbtools import nbinit
nbinit.init_notebook(
    namespace=globals(),
);
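The kind of Python version check described above can be approximated as follows. This is an illustrative sketch, not the code that the notebook's checking module actually runs; the function names are invented for the example and only plain numeric versions such as "3.8" are handled.

```python
import sys


def parse_version(ver):
    """Convert a dotted version string such as "3.8" into a tuple of ints."""
    return tuple(int(part) for part in ver.split("."))


def python_version_ok(required, current=None):
    """Return True if the current interpreter meets the required version."""
    if current is None:
        current = tuple(sys.version_info[:3])
    # Tuples of ints compare numerically, so (3, 10) is newer than (3, 8)
    return tuple(current) >= parse_version(required)


print("Python 3.8 or later:", python_version_ok("3.8"))
```

Comparing integer tuples rather than strings avoids the trap where "3.10" sorts before "3.8".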
msticpyconfig.yaml
MSTICPy is a Python package used in most of the Jupyter notebooks on Azure-Sentinel-Notebooks. It provides a lot of functionality specific to threat hunting and investigations, including:
Note: the configuration actions in this section are an abbreviated version of the MPSettingsEditor notebook. Use that notebook for a fuller guide on how to configure your settings.
Also, see these sections in the MSTICPy documentation:
MSTICPy Package Configuration
MSTICPy Settings Editor
config.json provides some basic configuration for connecting to your Microsoft Sentinel workspace.
However, there are many features that require additional configuration information. Some examples are:
Settings for these are stored in the msticpyconfig.yaml file. This file is read from the current directory or you can set an environment variable (MSTICPYCONFIG) pointing to its location.
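The lookup described here can be pictured with a small sketch. It is not MSTICPy's own implementation: the function name `find_config` is hypothetical, and checking the environment variable before the current directory is an assumption of this example.

```python
import os
from pathlib import Path

CONFIG_ENV_VAR = "MSTICPYCONFIG"
CONFIG_FILE = "msticpyconfig.yaml"


def find_config(cwd=None, environ=None):
    """Return the path of the first config file found, or None."""
    environ = os.environ if environ is None else environ
    cwd = Path.cwd() if cwd is None else Path(cwd)
    candidates = []
    env_path = environ.get(CONFIG_ENV_VAR)
    if env_path:
        # Assumption: the environment variable is checked first
        candidates.append(Path(env_path).expanduser())
    candidates.append(cwd / CONFIG_FILE)
    for path in candidates:
        if path.is_file():
            return path
    return None


print("Config file found:", find_config())
```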
For more information about msticpy configuration see msticpy Package Configuration.
The most commonly-used sections are described below.
For more information on the msticpy Threat Intel lookup class see the documentation here.
Primary providers are used by default. Secondary providers are not run by default but can be invoked by using the providers parameter to lookup_ioc() or lookup_iocs(). Set the Primary config setting to True or False for each provider ID according to how you want to use them. The providers parameter should be a list of strings identifying the provider(s) to use.
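The effect of the Primary setting and the providers parameter can be illustrated with a small sketch. The dictionary below mirrors the general layout of a TIProviders section, but the key values are placeholders and `select_providers` is a hypothetical helper written for this example, not a MSTICPy function.

```python
# Hypothetical TIProviders settings (AuthKey values are placeholders)
TI_SETTINGS = {
    "OTX": {
        "Args": {"AuthKey": "your-otx-key"},
        "Primary": True,
        "Provider": "OTX",
    },
    "VirusTotal": {
        "Args": {"AuthKey": "your-vt-key"},
        "Primary": False,
        "Provider": "VirusTotal",
    },
}


def select_providers(settings, providers=None):
    """Return the provider IDs that a lookup would use."""
    if providers:
        # An explicit list selects exactly those providers,
        # whether they are primary or secondary
        return [name for name in providers if name in settings]
    # Otherwise only providers marked Primary are used
    return [name for name, conf in settings.items() if conf.get("Primary")]


print(select_providers(TI_SETTINGS))
print(select_providers(TI_SETTINGS, providers=["VirusTotal"]))
```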
Note: there is a Provider: setting for each of the TI providers - do not alter this value.
Like the TI providers, these services normally need an API key to access. You can read more about configuring the supported providers here: msticpy GeoIP Providers
The functionality to screenshot a URL in msticpy.sectools.domain_utils relies on a service called BrowShot (https://browshot.com/). An API key is required to use this service and it needs to be defined in the msticpyconfig file as well. As this is not a threat intelligence provider, it does not fall under the TIProviders section of msticpyconfig but instead sits alone. See the cell below for example configuration.
msticpyconfig.yaml
We'll be using some of the MSTICPy configuration tools, MpConfigEdit and MpConfigFile, so we'll import these first.
from msticpy import MpConfigFile, MpConfigEdit
Then run MpConfigFile to view your current settings.
mpconfig = MpConfigFile()
mpconfig.load_default()
mpconfig.view_settings()
...in the settings view above it means that you probably need to create a msticpyconfig.yaml.
If you know that you have configured a msticpyconfig file, you can search for this file using MpConfigFile. Click on Load file. Once you've done that, go to the "Setting the path to your msticpyconfig.yaml" section.
Follow these steps:
Setting the path to your msticpyconfig.yaml file
This is a good point to set up an environment variable so that you can keep a single configuration file in a known location and always load the same settings. (Of course, you're free to use multiple configs if you need to use different settings for each notebook folder.)
- Decide on a location for your msticpyconfig.yaml - this could be in "~/.msticpyconfig.yaml" or "%userprofile%/msticpyconfig.yaml".
- Copy the msticpyconfig.yaml file that you just created to this location.
- Set the MSTICPYCONFIG environment variable to point to that location:
In your .bashrc (or somewhere else convenient) add:
export MSTICPYCONFIG=~/.msticpyconfig.yaml
In Azure ML, you need to decide whether to store your msticpyconfig.yaml in the AML file store or on the Compute file system. If you have any secret key material in the file, we recommend storing it on the Compute instance, since the AML file store is shared storage, whereas the Compute instance is accessible only by the user who created it.
If you are happy to leave the file in the AML file store, you should be set. The init_notebook function run at the start of the notebook will find it there in your root folder and set the MSTICPYCONFIG environment variable to point to it.
Pointing to a path on a compute instance
Verify your msticpyconfig.yaml is accessible
Your current directory should be your AML file store home directory (this is mounted in the Compute Linux system) and the prompt will look something like the example below.
If you created a msticpyconfig.yaml in the previous step, this should be visible if you type ls.
azureuser@ianhelle-azml7:~/cloudfiles/code/Users/ianhelle$ ls msti*
msticpyconfig.yaml
Move the file to your home folder
mv msticpyconfig.yaml ~
Add an environment variable
Because the Jupyter server is started before you connect, its process will not inherit any environment variables from your .bashrc. You can set it in one of two places:
- In the kernel.json file for your Python kernel (there are kernels for both Python 3.6 and Python 3.8).
- By adding nbuser_settings.py to the root of your user folder.
These options are described in the following sections.
kernel.json
/usr/local/share/jupyter/kernels/python38-azureml/kernel.json
/usr/local/share/jupyter/kernels/python3-azureml/kernel.json
Make a copy of the file and open the original in an editor (you may need to use sudo to be able to overwrite this file). The file will look something like this:
{
  "argv": [
    "/anaconda/envs/azureml_py38/bin/python",
    "-m",
    "ipykernel_launcher",
    "-f",
    "{connection_file}"
  ],
  "display_name": "Python 3.8 - AzureML",
  "language": "python"
}
Add the following line after the "language" item.
"env": { "MSTICPYCONFIG": "~/msticpyconfig.yaml" }
Your file should look like this (remember to add a comma at the end of the "language": "python" line):
{
  "argv": [
    "/anaconda/envs/azureml_py38/bin/python",
    "-m",
    "ipykernel_launcher",
    "-f",
    "{connection_file}"
  ],
  "display_name": "Python 3.8 - AzureML",
  "language": "python",
  "env": { "MSTICPYCONFIG": "~/msticpyconfig.yaml" }
}
If you use both kernels you will need to edit both files.
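If you prefer not to edit the files by hand, the same change could be scripted. The following is a minimal sketch, assuming you have write access to the kernel folder; `add_kernel_env` is a helper invented for this example. The commented call passes an absolute path rather than "~", on the assumption that a tilde may not be expanded when the kernel is launched.

```python
import json
import shutil
from pathlib import Path


def add_kernel_env(kernel_json, var_name, value):
    """Add var_name=value to the "env" block of a kernel.json file."""
    kernel_path = Path(kernel_json)
    # Keep a copy of the original file before changing it
    shutil.copy2(kernel_path, kernel_path.with_name(kernel_path.name + ".bak"))
    spec = json.loads(kernel_path.read_text(encoding="utf-8"))
    # Preserve any environment variables that are already defined
    spec.setdefault("env", {})[var_name] = value
    kernel_path.write_text(json.dumps(spec, indent=1), encoding="utf-8")
    return spec


# Example call (may need elevated permissions for system kernel folders):
# add_kernel_env(
#     "/usr/local/share/jupyter/kernels/python38-azureml/kernel.json",
#     "MSTICPYCONFIG",
#     str(Path.home() / "msticpyconfig.yaml"),
# )
```

Because the file is written with json.dumps, the comma after the "language" item is handled automatically.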
nbuser_settings.py
Create this file (you can do this from the AML workspace) in the root of your user folder (i.e. inside the folder with your username) and add the following lines
import os
os.environ["MSTICPYCONFIG"] = "~/msticpyconfig.yaml"
This file, if it exists, is imported by the nb_check.check_versions function at the start of the notebook. It will set the environment variable at the start of each notebook before any configuration is read.
This is simpler and less intrusive than editing the kernel.json. However, it only works if you run check_versions. If you load a notebook without running this, MSTICPy may not be able to find its configuration file.
If you loaded a config.json file into your msticpyconfig.yaml, you should see your workspace displayed when you run the following cell. If not, you can add one or more workspaces here. The Name, WorkspaceId and TenantId are mandatory. The other fields are helpful but not essential.
Use the Help drop-down panel to find more information about adding workspaces and finding the correct values for your workspace.
If this is the workspace that you use frequently or all of the time, you may want to set this as the default. This creates a duplicate entry named "Default". This is used when you connect to AzureSentinel without needing to supply a workspace name. You can override this by specifying a workspace name at connect time, which you need to do if you are working with multiple workspaces.
When you've finished, type a file name (usually "msticpyconfig.yaml") into the Conf File text box and click Save File.
You can also try the Validate Settings button. This should show that you have a few missing sections (we'll fill these in later) but should show nothing under the "Type Validation Results".
mpedit = MpConfigEdit(settings=mpconfig)
mpedit.set_tab("AzureSentinel")
mpedit
You will likely want to do lookups of IP Addresses, URLs and other items to check for any Threat Intelligence reports. To do that you need to add the providers that you want to use. Most TI providers require that you have an account with them and supply an API key or other authentication items when you connect.
Most providers have a free use tier (or, in cases like AlienVault OTX, are entirely free). Free tiers for paid providers usually limit the number of requests that you can make in a given time period.
For account creation, each provider does this slightly differently. Use the help links in the editor help to find where to go set each of these up.
Assuming that you have done this, we can configure a provider. Be sure to store any authentication keys somewhere safe (and memorable).
We are going to use VirusTotal (VT) as an example TI Provider.
For this you will need a VirusTotal API key from the VirusTotal website.
We also support a range of other threat intelligence providers - you can read about this here: MSTICPy TIProviders
Taking VirusTotal as our example.
This should show you the values that you need to provide:
You can paste the key into the Value field and click the Save button.
You can opt to store the VT AuthKey as an environment variable. This is a bit more secure than having it lying around in configuration files. Assuming that you have set your VT key as an environment variable:
set VT_KEY=VGhpcyBzaG91bGQgc2hvdyB5b3UgdGhlIHZhbHVlcyB (Windows)
export VT_KEY=VGhpcyBzaG91bGQgc2hvdyB5b3UgdGhlIHZhbHVlcyB (Linux/MAC)
Flip the Storage radio button to EnvironmentVar and type the name of the variable (VT_KEY in our example) into the value box.
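Conceptually, a setting stored this way holds the name of a variable rather than the secret itself, and the value is read from the environment when it is needed. The sketch below illustrates that idea only; `resolve_setting` and the dictionary layout are assumptions made for this example, not MSTICPy internals.

```python
import os


def resolve_setting(value, environ=None):
    """Return a literal value, or the value of the named environment variable."""
    environ = os.environ if environ is None else environ
    if isinstance(value, dict) and "EnvironmentVar" in value:
        var_name = value["EnvironmentVar"]
        if var_name not in environ:
            raise KeyError(f"Environment variable {var_name} is not set")
        return environ[var_name]
    # Anything else is treated as a literal value held in the file
    return value


# A literal key and an environment-variable reference resolve the same way
print(resolve_setting("literal-key"))
print(resolve_setting({"EnvironmentVar": "VT_KEY"}, environ={"VT_KEY": "key-from-env"}))
```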
You can also use Azure Key Vault to store secrets like these but we will need to set up the Key Vault settings before this will work.
Click the Save File button to save your changes.
mpedit.set_tab("TI Providers")
mpedit
MSTICPy supports two Geo IP providers - MaxMind GeoIPLite and IPStack. The main difference between the two is that MaxMind downloads and uses a local database, while IPStack is a purely online solution.
For either provider you need an API key, either to download the free database from MaxMind or to access the IPStack online lookup.
We'll use GeoIPLite as our example. You can sign up for a free account and API key at https://www.maxmind.com/en/geolite2/signup. You'll need the API key for the following steps.
Set the MaxMind data folder:
- On Windows: %USERPROFILE%/.msticpy
- On Linux/Mac: .msticpy in your home folder.
Note: as with the TI providers you can opt to store your key as an environment variable or keep it in Key Vault.
mpedit.set_tab("GeoIP Providers")
mpedit
You might not be too comfortable leaving API keys stored in text files. You can opt to have these settings stored either:
- as Environment Variables
- in Azure Key Vault
To see how to do this see these resources
To access Azure APIs (such as the Sentinel APIs or Azure resource APIs) you need to be able to use Azure Authentication. The setting is named "AzureCLI" for historical reasons - don't let that confuse you. We currently support two ways of authenticating:
The former can try up to four methods of authentication:
To use chained authentication methods, select the methods you want to use and leave the clientId/tenantId/clientSecret fields empty.
mpedit.set_tab("Data Providers")
mpedit
This section controls which, if any, query providers you want to load automatically when you run nbinit.init_notebook.
This can save a lot of time if you are frequently authoring new notebooks. It also allows the right providers to be loaded before other components that might use them such as
(more about these in the next section)
There are two types of provider support:
Available Microsoft Sentinel workspaces are taken from the items you configured in the Microsoft Sentinel tab. Other providers are taken from the list of available provider types in MSTICPy.
There are two options for each of these:
Note: if you lose track of which providers have been loaded by this mechanism, they are added to the current_providers attribute of msticpy.
import msticpy
msticpy.current_providers
mpedit.set_tab("Autoload QueryProvs")
mpedit
This section controls which, if any, other components you want to load automatically when you run nbinit.init_notebook().
This includes
These are loaded in this order, since the Pivot component needs query and other providers loaded in order to find the pivot functions that it will attach to entities. For more information see pivot functions
Some components do not require any parameters (e.g. TILookup and Pivot). Others do support or require additional settings:
GeoIpLookup
You must type the name of the GeoIP provider that you want to use - either "GeoLiteLookup" or "IPStack"
AzureData and AzureSentinelAPI
Notebooklets
This has a single parameter block, AzureSentinel. At minimum you should specify the workspace name. This needs to be in the following format:
workspace:WORKSPACENAME
WORKSPACENAME must be one of the workspaces defined in the Microsoft Sentinel tab.
You can also add additional parameters to send to the notebooklets init function. Specify these as additional key:value pairs, separated by newlines.
workspace:WORKSPACENAME
providers=["LocalData","geolitelookup"]
See the msticnb init documentation for more details.
mpedit.set_tab("Autoload Components")
mpedit
Save your file and, if you haven't yet done so, create an environment variable to point to it. See "Setting the path to your msticpyconfig.yaml".
msticpyconfig.yaml
settings
MpConfigFile includes a validation function that can help you diagnose setup problems.
You can run this interactively or from Python.
The examples below assume that you have set MSTICPYCONFIG to point to your config file. If not, you will need to use the load_from_file() function (or Load File button) to load the file before validating.
mpconfig = MpConfigFile()
mpconfig.load_default()
mpconfig.validate_settings()
To validate interactively:
mpconfig = MpConfigFile()
mpconfig.load_default()
mpconfig
config.json
file
When you start a notebook from Microsoft Sentinel for the first time it will create a config.json file in your notebooks folder. This should be populated with the workspace and tenant IDs needed to authenticate to Microsoft Sentinel.
If you are using notebooks in a different environment you may need to create a config.json or msticpyconfig.yaml (see below) to supply this information to your notebook.
We recommend creating a msticpyconfig.yaml since this can hold a wide variety of settings for your notebook, including multiple Microsoft Sentinel workspace settings. The config.json, in contrast, only holds settings for a single Microsoft Sentinel workspace.
For more information see msticpy Package Configuration.
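To make the relationship between the two files concrete, the sketch below turns config.json-style values into a nested workspace entry of the kind held in msticpyconfig.yaml. It is illustrative only: the `workspace_entry` helper is hypothetical, the sample IDs are placeholders, and the exact key names in `KEY_MAP` (other than WorkspaceId and TenantId, which are named earlier in this notebook) are assumptions.

```python
import json

# Assumed mapping from config.json keys to msticpyconfig.yaml workspace keys
KEY_MAP = {
    "workspace_id": "WorkspaceId",
    "tenant_id": "TenantId",
    "subscription_id": "SubscriptionId",
    "resource_group": "ResourceGroup",
    "workspace_name": "WorkspaceName",
}


def workspace_entry(config):
    """Build an AzureSentinel settings dict from config.json-style values."""
    entry = {new: config[old] for old, new in KEY_MAP.items() if config.get(old)}
    # Use the workspace name as the entry name, falling back to "Default"
    name = config.get("workspace_name") or "Default"
    return {"AzureSentinel": {"Workspaces": {name: entry}}}


# Placeholder values for illustration only
sample = {
    "workspace_id": "00000000-0000-0000-0000-000000000001",
    "tenant_id": "00000000-0000-0000-0000-000000000002",
    "workspace_name": "MyWorkspace",
}
print(json.dumps(workspace_entry(sample), indent=2))
```

A msticpyconfig.yaml can hold several such entries side by side, which is what allows it to describe multiple workspaces.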
If you need to create or modify your config.json you can run the following cell.
You will need the subscription and workspace IDs for your Microsoft Sentinel Workspace. These can be found here in the Microsoft Sentinel portal as shown below.
Copy the subscription and workspace IDs:
import json
from datetime import datetime
from pathlib import Path

import ipywidgets as widgets
import requests
from IPython.display import display

config_dict = {}


def get_tenant_for_subscription(sub_id):
    """Return the tenant ID for a subscription, or None if it cannot be found."""
    aad_url = (
        f"https://management.azure.com/subscriptions/{sub_id}?api-version=2016-01-01"
    )
    # An unauthenticated request is expected to return 401 with the
    # authorization URI (which contains the tenant ID) in the response header.
    resp = requests.get(aad_url, timeout=30)
    if resp.status_code == 401:
        hdr_list = resp.headers["WWW-Authenticate"].split(",")
        hdr_dict = {
            item.split("=", 1)[0].strip(): item.split("=", 1)[1].strip()
            for item in hdr_list
            if "=" in item
        }
        return hdr_dict["Bearer authorization_uri"].strip('"').split("/")[3]
    return None


def save_config_json(file_path, **kwargs):
    """Write the settings to file_path, backing up any existing file first."""
    if Path(file_path).exists():
        bk_file = (
            str(Path(file_path))
            + ".bak"
            + datetime.now().isoformat(timespec="seconds").replace(":", "-")
        )
        print(f"Existing config found. Saving current config.json to {bk_file}")
        Path(file_path).rename(bk_file)
    with open(file_path, "w") as fp:
        json.dump(kwargs, fp, indent=2)
    print(f"Settings saved to {file_path}")


def save_config(b):
    """Button click handler: save the widget values to config.json."""
    tenant = input_wgt["tenant"].value
    if not tenant:
        # No tenant entered: look it up from the subscription ID
        tenant = get_tenant_for_subscription(input_wgt["sub_id"].value)
        print(f"TenantID found: {tenant}")
    save_config_json(
        file_path=input_wgt["path"].value,
        tenant_id=tenant,
        subscription_id=input_wgt["sub_id"].value,
        workspace_id=input_wgt["ws_id"].value,
        workspace_name=input_wgt["workspace"].value,
        resource_group=input_wgt["res_grp"].value,
    )


DEFAULT_CONFIG = "./config.json"
WIDGET_DEFAULTS = {
    "layout": widgets.Layout(width="95%"),
    "style": {"description_width": "200px"},
}
input_wgt = {
    "path": widgets.Text(
        description="Path to config.json", value=DEFAULT_CONFIG, **WIDGET_DEFAULTS
    ),
    "workspace": widgets.Text(
        description="Workspace name", placeholder="Workspace name", **WIDGET_DEFAULTS
    ),
    "sub_id": widgets.Text(
        description="Microsoft Sentinel Subscription ID",
        placeholder="for example, ef28a760-8c61-41d7-8167-5c8e5d91268b",
        **WIDGET_DEFAULTS,
    ),
    "ws_id": widgets.Text(
        description="Microsoft Sentinel Workspace ID",
        placeholder="for example, ef28a760-8c61-41d7-8167-5c8e5d91268b",
        **WIDGET_DEFAULTS,
    ),
    "res_grp": widgets.Text(
        description="Resource group", placeholder="Resource group", **WIDGET_DEFAULTS
    ),
    "tenant": widgets.Text(
        description="TenantId", placeholder="Leave blank to look up", **WIDGET_DEFAULTS
    ),
}

# Pre-populate the widgets from an existing config.json, if there is one
if Path(DEFAULT_CONFIG).exists():
    with open(DEFAULT_CONFIG, "r") as fp:
        config_dict = json.load(fp)
    input_wgt["path"].value = DEFAULT_CONFIG
    input_wgt["sub_id"].value = config_dict.get("subscription_id", "")
    input_wgt["ws_id"].value = config_dict.get("workspace_id", "")
    input_wgt["workspace"].value = config_dict.get("workspace_name", "")
    input_wgt["res_grp"].value = config_dict.get("resource_group", "")
    input_wgt["tenant"].value = config_dict.get("tenant_id", "")

save_button = widgets.Button(description="Save config.json file")
save_button.on_click(save_config)
display(widgets.VBox([*(input_wgt.values()), save_button]))