Notebook Version: 1.0
Python Version: Python 3.7 (including Python 3.6 - AzureML)
Required Packages: kqlmagic, msticpy, pandas, numpy, matplotlib, networkx, ipywidgets, ipython
Platforms Supported:
Data Sources Required:
Log Analytics, ResourceGraph (Optional)
This notebook guides you through an investigation of an Azure resource of your choice and enables you to pivot using functionality from Azure Resource Graph. The notebook uses SecurityAlert, SignInLogs, and AzureActivity logs.
You can begin with a resource or a security alert you want to investigate or use our queries to find one of interest.
The goal of the notebook is to help you better understand potential malicious behavior in your Azure Resource Graph and to successfully pivot to resources of interest as you hunt.
The next cell checks the installed Python and msticpy versions and initializes msticpy for this notebook.
It should complete without errors. If you encounter errors or warnings, look at the following two notebooks:
If you are running in the Azure Sentinel Notebooks environment (Azure Notebooks or Azure ML) you can run live versions of these notebooks:
You may also need to do some additional configuration to successfully use functions such as Threat Intelligence service lookup and Geo IP lookup.
There are more details about this in the ConfiguringNotebookEnvironment notebook and in these documents:
from pathlib import Path
import os
import sys
import warnings

from IPython.display import display, HTML, Markdown

# Minimum versions required by this notebook
REQ_PYTHON_VER = (3, 6)
REQ_MSTICPY_VER = (0, 6, 0)

display(HTML("Checking for msticpy update"))
%pip install --upgrade msticpy

import msticpy
msticpy.init_notebook(namespace=globals())
Run the cells below to connect to your Log Analytics workspace. If you haven't already, fill in the relevant information in msticpyconfig.yaml. This file is in the same Azure Sentinel Notebooks folder as this notebook. There is more information on how to do this in the Notebook Setup section above. You may need to restart the kernel after editing the file and rerun any cells you've already run to pick up the new configuration.
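For reference, a minimal workspace entry in msticpyconfig.yaml looks like the following sketch. The GUIDs are placeholders; substitute your own workspace and tenant IDs.

```yaml
AzureSentinel:
  Workspaces:
    Default:
      WorkspaceId: "11111111-1111-1111-1111-111111111111"  # placeholder
      TenantId: "22222222-2222-2222-2222-222222222222"     # placeholder
```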
If you are unfamiliar with connecting to Log Analytics or want a more in-depth walkthrough, check out the Getting Started with Azure Sentinel Notebook.
If you are running this notebook locally, you may also need to install the Azure CLI. If you do, restart your machine and relaunch the notebook before continuing.
Log into your Azure account by running the following cell.
!az login
# See if we have an Azure Sentinel Workspace defined in our config file.
# If not, let the user specify Workspace and Tenant IDs
ws_config = WorkspaceConfig()
if not ws_config.config_loaded:
    ws_config.prompt_for_ws()
# Connect to Resource Graph
qp_RG = QueryProvider("ResourceGraph")
qp_RG.connect(ws_config)
# Connect to Log Analytics
qp_LA = QueryProvider("LogAnalytics")
qp_LA.connect(ws_config)
This time range will be used in all queries that follow in this notebook to retrieve any related alerts connected to your chosen resource.
q_times = nbwidgets.QueryTime(units='day', max_before=20, before=5, max_after=1)
q_times.display()
If you already know which resource you want to investigate, enter its resource ID in the text box after running the following cell.
Skip this cell if you would like to use related alerts to select a resource to investigate. The below cells will provide some context on related alerts and offer you a chance to select a resource directly.
selected_resourceName = widgets.Text(
placeholder='insert resource ID',
description='Resource ID:',
disabled=False
)
display(selected_resourceName)
Run the following cells for a summary table of alert activity in your workspace. Resources with more SecurityAlert results may be more likely to be victims of malicious activity.
alert_query = f"""
SecurityAlert
| where TimeGenerated >= datetime("{q_times.start}")
| where TimeGenerated <= datetime("{q_times.end}")
| where isnotempty(ResourceId)
| extend json_extendProp = parse_json(ExtendedProperties)
| extend UserName = json_extendProp['User Name'],
         ServiceId = json_extendProp['ServiceId'],
         WdatpTenantId = json_extendProp['WdatpTenantId'],
         FileName = json_extendProp['File Name'],
         resourceType = json_extendProp['resourceType'],
         AttackerSourceIP = json_extendProp['Attacker source IP'],
         numFailedAuthAttemptsToHost = json_extendProp['Number of failed authentication attempts to host'],
         numExistingAccountsUsedBySource = json_extendProp['Number of existing accounts used by source to sign in'],
         numNonExistentAccountsUsedBySource = json_extendProp['Number of nonexistent accounts used by source to sign in'],
         topAccountsWithFailedSignInAttempts = json_extendProp['Top accounts with failed sign in attempts (count)'],
         RDPSessionInitiated = json_extendProp['Was RDP session initiated'],
         attackerSourceComputerName = json_extendProp['Attacker source computer name']
| project-away json_extendProp
"""
alert_df = qp_LA.exec_query(alert_query)
sum_alert_query = f"""
SecurityAlert
| where TimeGenerated >= datetime("{q_times.start}")
| where TimeGenerated <= datetime("{q_times.end}")
| where isnotempty(ResourceId)
| extend json_extendProp = parse_json(ExtendedProperties)
| extend UserName = json_extendProp['User Name'],
         ServiceId = json_extendProp['ServiceId'],
         WdatpTenantId = json_extendProp['WdatpTenantId'],
         FileName = json_extendProp['File Name'],
         resourceType = json_extendProp['resourceType'],
         AttackerSourceIP = json_extendProp['Attacker source IP'],
         numFailedAuthAttemptsToHost = json_extendProp['Number of failed authentication attempts to host'],
         numExistingAccountsUsedBySource = json_extendProp['Number of existing accounts used by source to sign in'],
         numNonExistentAccountsUsedBySource = json_extendProp['Number of nonexistent accounts used by source to sign in'],
         topAccountsWithFailedSignInAttempts = json_extendProp['Top accounts with failed sign in attempts (count)'],
         RDPSessionInitiated = json_extendProp['Was RDP session initiated'],
         attackerSourceComputerName = json_extendProp['Attacker source computer name']
| project-away json_extendProp
| summarize count() by AlertName, AlertSeverity, CompromisedEntity, tostring(resourceType)
| sort by count_
"""
sum_alert_df = qp_LA.exec_query(sum_alert_query)
display(sum_alert_df)
Run the cell below to see a dropdown listing all resources involved in the alerts shown. Select one that you would like to investigate. Skip this section if you have already entered a ResourceID of interest above.
resource_types = [rtype if rtype else "N/A" for rtype in alert_df.resourceType]
resources = set(zip(alert_df.CompromisedEntity, resource_types))
# Render each unique (entity, type) pair as "name, type" and drop empty entities
resources = [f"{name}, {rtype}" for name, rtype in resources if name]
resource_dropdown = widgets.Dropdown(options = resources, description='Resource:')
display(resource_dropdown)
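The dropdown options above are the unique (entity, type) pairs rendered as strings. A small self-contained sketch of the same idea, using hypothetical resource names:

```python
# Hypothetical alert data: duplicate and empty entities should be dropped
entities = ["vm-1", "vm-1", "sql-1", ""]
types = ["virtualmachines", "virtualmachines", "sqlservers", "N/A"]

# Deduplicate (entity, type) pairs, drop empty names, format for a dropdown
pairs = set(zip(entities, types))
options = sorted(f"{name}, {rtype}" for name, rtype in pairs if name)
print(options)  # ['sql-1, sqlservers', 'vm-1, virtualmachines']
```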
This section of the notebook allows you to investigate resources related to the resource you have chosen and better understand your resource graph environment by generating a visual representation of the graph. You can reselect the resource you want to investigate in the sections above at any time. Rerun the below cells to generate a new graph if you select a different resource.
Run the following cells to generate the resource graph.
# Import libraries
import networkx as nx
from bokeh.io import output_notebook, show, save
from bokeh.models import (BoxSelectTool, Circle, EdgesAndLinkedNodes, HoverTool,
MultiLine, NodesAndLinkedEdges, Plot, Range1d, TapTool, ColumnDataSource, LabelSet)
from bokeh.plotting import figure
from bokeh.plotting import from_networkx
from bokeh.palettes import Blues8, Reds8, Purples8, Oranges8, Viridis8, Spectral8, Blues256
from bokeh.transform import linear_cmap, factor_cmap
from networkx.algorithms import community
from ipywidgets import interact, interactive, fixed, interact_manual
from bokeh.io import push_notebook, show, output_notebook
output_notebook()
The following cell confirms whether the resource you selected exists and is valid for generating the investigation graph. If the resource is not found, use the dropdown or text box above to select a different resource and rerun this cell.
# Query ResourceGraph for resource info
if selected_resourceName.value == '':
    print("SELECTED: ", resource_dropdown.value.split(',')[0])
    rg_query = f"""
    Resources
    | where name == "{resource_dropdown.value.split(',')[0]}"
    """
else:
    print("SELECTED: ", selected_resourceName.value)
    rg_query = f"""
    Resources
    | where name == "{selected_resourceName.value}"
    """
rg_df = qp_RG.exec_query(rg_query)
try:
    # Raises IndexError when the query returned no rows
    display(pd.DataFrame(rg_df.iloc[0].T))
    resource_id_list = [rg_df['id'][0]]
    rg = rg_df['resourceGroup'][0]
    print("Resource found!")
    related_rg_query = f"""
    Resources
    | where resourceGroup == "{rg}"
    """
    related_rg_df = qp_RG.exec_query(related_rg_query)
    resource_id_list.extend(list(related_rg_df['id']))
    related_rg_df['managedByVal'] = related_rg_df['managedBy'].str.split('/').str[-1]
except (IndexError, KeyError):
    print("No results for that resource. Please select a different resource above.")
The following cells will generate a NetworkX graph of your resource environment. Please run each cell to properly generate the graph. Confirmation that the cell you just ran worked properly will print out once each cell finishes running.
# Parse for relationships between resource types
network_rg_df = related_rg_df.loc[related_rg_df['managedByVal'] != '']
vm_rg_df = related_rg_df.loc[related_rg_df['type'] == 'microsoft.compute/virtualmachines']
nsg_rg_df = related_rg_df.loc[related_rg_df['type'] == 'microsoft.network/networksecuritygroups']
ip_rg_df = related_rg_df.loc[related_rg_df['type'] == 'microsoft.network/publicipaddresses']
# Get associated NIC to a given VM
def get_associated_nic(vm_name):
    nic_query = f"""Resources
    | where name == "{vm_name}"
    | extend d=parse_json(properties)
    | project result = d.networkProfile['networkInterfaces'][0]["id"]
    """
    nic_id = qp_RG.exec_query(nic_query)['result']
    # Some results return a null first row; fall back to the next one
    if nic_id[0] is None:
        final_nic_id = nic_id[1]
    else:
        final_nic_id = nic_id[0]
    nic_name_query = f"""Resources
    | where id == "{final_nic_id}"
    | project name
    """
    nic_name = qp_RG.exec_query(nic_name_query)['name'][0]
    return nic_name
# Get associated NIC to a given NSG
def get_associated_nic_nsg(nsg_name):
    nic_query = f"""Resources
    | where name == "{nsg_name}"
    | extend d=parse_json(properties)
    | project result = d.networkInterfaces[0]['id']
    """
    nic_id = qp_RG.exec_query(nic_query)['result'][0]
    nic_name_query = f"""Resources
    | where id == "{nic_id}"
    | project name
    """
    nic_name = qp_RG.exec_query(nic_name_query)['name'][0]
    return nic_name
vm_nic_pairs = []
vm_nic_dict = {}
for vm in vm_rg_df['name']:
    nic = get_associated_nic(vm)  # query once per VM instead of twice
    vm_nic_pairs.append((vm, nic))
    vm_nic_dict[vm] = nic
vm_nic_df = pd.DataFrame(vm_nic_pairs, columns =['name', 'nic'])
nic_nsg_pairs = []
nic_nsg_dict = {}
for nsg in nsg_rg_df['name']:
    nic = get_associated_nic_nsg(nsg)  # query once per NSG instead of twice
    nic_nsg_pairs.append((nsg, nic))
    nic_nsg_dict[nsg] = nic
nic_nsg_df = pd.DataFrame(nic_nsg_pairs, columns =['nsg', 'nic'])
# Get associated NIC to a given IP
def get_associated_nic_ip(ip_name):
    nic_query = f"""Resources
    | where name == "{ip_name}"
    | extend d=parse_json(properties)
    | project result = d.ipConfiguration['id']
    """
    result = qp_RG.exec_query(nic_query)['result']
    # The NIC name is the third-from-last segment of the ipConfiguration ID;
    # fall back to the second row if the first has no value
    try:
        nic_name = result[0].split('/')[-3]
    except (AttributeError, TypeError):
        nic_name = result[1].split('/')[-3]
    return nic_name
nic_ip_pairs = []
nic_ip_dict = {}
for ip in ip_rg_df['name']:
    nic = get_associated_nic_ip(ip)  # query once per IP instead of twice
    nic_ip_pairs.append((ip, nic))
    nic_ip_dict[ip] = nic
nic_ip_df = pd.DataFrame(nic_ip_pairs, columns =['ip', 'nic'])
storage_rg_df = related_rg_df.loc[related_rg_df['type'] == 'microsoft.storage/storageaccounts']
vnet_rg_df = related_rg_df.loc[related_rg_df['type'] == 'microsoft.network/virtualnetworks']
endpt_rg_df = related_rg_df.loc[related_rg_df['type'] == 'microsoft.network/privateendpoints']
# Get associated Vnet for a given Endpt
def get_associated_vnet(endpt_name):
    vnet_query = f"""Resources
    | where name == "{endpt_name}"
    | extend d=parse_json(properties)
    | project result = d.subnet['id']
    """
    result = qp_RG.exec_query(vnet_query)['result']
    # The VNet name is the third-from-last segment of the subnet ID;
    # fall back to the second row if the first has no value
    try:
        vnet_id = result[0].split('/')[-3]
    except (AttributeError, TypeError):
        vnet_id = result[1].split('/')[-3]
    return vnet_id
vnet_endpt_pairs = []
vnet_endpt_dict = {}
for endpt in endpt_rg_df['name']:
    vnet = get_associated_vnet(endpt)  # query once per endpoint instead of twice
    vnet_endpt_pairs.append((endpt, vnet))
    vnet_endpt_dict[endpt] = vnet
vnet_endpt_df = pd.DataFrame(vnet_endpt_pairs, columns =['vnet', 'endpt'])
print("Associations complete")
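The association helpers above all rely on the fixed shape of Azure resource IDs: splitting on `/` and taking the third-from-last segment yields the parent resource name. A minimal illustration with a hypothetical ipConfiguration ID:

```python
# Hypothetical ipConfiguration resource ID; the NIC name is the segment
# three from the end: .../networkInterfaces/<nic>/ipConfigurations/<cfg>
ip_config_id = (
    "/subscriptions/0000/resourceGroups/rg1/providers/Microsoft.Network"
    "/networkInterfaces/my-nic/ipConfigurations/ipconfig1"
)
nic_name = ip_config_id.split('/')[-3]
print(nic_name)  # my-nic
```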
# Create Networkx graph and add nodes
g = nx.MultiGraph()
resource_list_order = [(vm_rg_df, "resourceGroup", "name"), (vm_nic_df, "nic", "name"), (nic_nsg_df, "nsg", "nic"), (nic_ip_df, "ip", "nic"),
(network_rg_df, "resourceGroup", "managedByVal"), (storage_rg_df, "name", "resourceGroup"), (vnet_rg_df, "name", "resourceGroup"),
(vnet_endpt_df, "endpt", "vnet"), (related_rg_df, "resourceGroup", "name")]
for r in resource_list_order:
    # from_pandas_edgelist builds a graph; add_nodes_from copies only its
    # nodes (the edges are added explicitly in the next cell)
    g.add_nodes_from(nx.from_pandas_edgelist(r[0], r[1], r[2]))
# Add edges between nodes based on the hierarchical associations determined in the previous cell
for node in g:
    if node in vm_rg_df['name'].values:
        g.add_edge(node, vm_nic_dict[node])
    elif node in nic_nsg_df['nsg'].values:
        g.add_edge(node, nic_nsg_dict[node])
    elif node not in vm_nic_df['nic'].values and node in nic_nsg_df['nic'].values:
        g.add_edge(node, rg)
    elif node in network_rg_df['name'].values:
        g.add_edge(node, network_rg_df.loc[network_rg_df['name'] == node, 'managedByVal'].item())
    elif node in vnet_rg_df['name'].values:
        g.add_edge(node, rg)
    elif node in endpt_rg_df['name'].values:
        g.add_edge(node, vnet_endpt_dict[node])
    elif node in storage_rg_df['name'].values:
        g.add_edge(node, rg)
    elif node in nic_ip_df['ip'].values:
        g.add_edge(node, nic_ip_dict[node])
    elif node not in vm_nic_df['nic'].values and node not in nic_nsg_df['nic'].values:
        g.add_edge(node, rg)
#nx.draw(g)
print("NetworkX done")
# Set graph node (resource) attributes
def get_resource_alert_count(resource_name):
    resource_alert_sev_query = f"""
    SecurityAlert
    | where TimeGenerated >= datetime("{q_times.start}")
    | where TimeGenerated <= datetime("{q_times.end}")
    | where ResourceId contains "{resource_name}"
    | summarize count()
    """
    resource_alert_sev_df = qp_LA.exec_query(resource_alert_sev_query)
    return resource_alert_sev_df["count_"][0]
def get_resource_type(resource_name):
    resource_type_query = f"""
    Resources
    | where name == "{resource_name}"
    | project type
    """
    resource_type_df = qp_RG.exec_query(resource_type_query)
    return resource_type_df["type"][0]
num_alert_dict = {}
resource_type_dict = {}
selected_resource_dict = {}
selected_resource_color_dict = {}
show_or_hide_dict = {}
for node in g:
    show_or_hide_dict[node] = "show"
    # +20 keeps zero-alert nodes visible when the count is used as marker size
    num_alert_dict[node] = get_resource_alert_count(node) + 20
    if node != rg:
        if node == resource_dropdown.value.split(',')[0]:
            selected_resource_dict[node] = 1
            selected_resource_color_dict[node] = Spectral8[1]
        else:
            selected_resource_dict[node] = 0
            selected_resource_color_dict[node] = Spectral8[3]
        resource_type_dict[node] = get_resource_type(node)
    else:
        resource_type_dict[node] = "ResourceGroup"
nx.set_node_attributes(g, name='num_alerts', values=num_alert_dict)
nx.set_node_attributes(g, name='resource_type', values=resource_type_dict)
nx.set_node_attributes(g, name='selected_resource', values=selected_resource_dict)
nx.set_node_attributes(g, name='selected_resource_color', values=selected_resource_color_dict)
nx.set_node_attributes(g, name='show_or_hide', values=show_or_hide_dict)
print("Graph node attributes successfully generated")
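The attribute round-trip used above (`set_node_attributes` to store per-node values, `get_node_attributes` to read them back) can be sketched on a toy graph with hypothetical node names:

```python
import networkx as nx

# Toy graph mirroring the attribute pattern above (hypothetical names)
toy = nx.MultiGraph()
toy.add_nodes_from(["vm-1", "rg-1"])
toy.add_edge("vm-1", "rg-1")

# Store a per-node attribute, then read it back as a dict keyed by node
nx.set_node_attributes(toy, name="num_alerts", values={"vm-1": 25, "rg-1": 20})
alerts = nx.get_node_attributes(toy, "num_alerts")
print(alerts["vm-1"])  # 25
```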
The following cell renders the graph generated by the cells above. Keep the following in mind for optimal viewing:
# Create graph
# Define size and color attributes
size_by_this_attribute = 'num_alerts'
color_by_this_attribute = 'selected_resource_color'
color_palette = Blues8
#Choose colors for node and edge highlighting
node_highlight_color = 'white'
edge_highlight_color = 'black'
def create_graph(g_copy, show_graph):
    # Choose a title
    title = 'Azure Resource Graph'
    # Hover categories
    HOVER_TOOLTIPS = [("Resource Name", "@index"),
                      ("Num Alerts", "@num_alerts"),
                      ("Type", "@resource_type")]
    # Set dimensions, title, toolbar
    plot = figure(tooltips=HOVER_TOOLTIPS,
                  tools="pan,wheel_zoom,save,reset", active_scroll='wheel_zoom',
                  title=title, width=900, height=700)
    plot.add_tools(HoverTool(tooltips=None), TapTool(), BoxSelectTool())
    # Create the graph renderer from the NetworkX graph
    network_graph = from_networkx(g_copy, nx.spring_layout, scale=20, center=(0, 0))
    # Set node sizes and colors according to num alerts and selection
    network_graph.node_renderer.glyph = Circle(size=size_by_this_attribute, fill_color=color_by_this_attribute)
    # Set highlight colors
    network_graph.node_renderer.hover_glyph = Circle(size=size_by_this_attribute, fill_color=node_highlight_color, line_width=2)
    network_graph.node_renderer.selection_glyph = Circle(size=size_by_this_attribute, fill_color="black", line_width=2)
    # Set edge opacity and width
    network_graph.edge_renderer.glyph = MultiLine(line_alpha=0.5, line_width=1)
    # Set edge highlight colors
    network_graph.edge_renderer.selection_glyph = MultiLine(line_color=edge_highlight_color, line_width=2)
    network_graph.edge_renderer.hover_glyph = MultiLine(line_color=edge_highlight_color, line_width=2)
    # Highlight nodes and edges on selection/hover
    network_graph.selection_policy = NodesAndLinkedEdges()
    network_graph.inspection_policy = NodesAndLinkedEdges()
    # Add labels at the layout positions
    x, y = zip(*network_graph.layout_provider.graph_layout.values())
    node_labels = list(g_copy.nodes())
    source = ColumnDataSource({'x': x, 'y': y, 'name': node_labels})
    labels = LabelSet(x='x', y='y', text='name', source=source, background_fill_color='white',
                      text_font_size='10px', background_fill_alpha=.7)
    plot.renderers.append(labels)
    # Add the network graph to the plot
    plot.renderers.append(network_graph)
    show(plot)
output_notebook()
resource_names = set(resource_type_dict.values())
resource_names.discard("ResourceGroup")  # discard avoids KeyError if absent
sel_sub = nbwidgets.SelectSubset(source_items=resource_names, default_selected=["microsoft.compute/virtualmachines"])
def filter_graph(alert_limit):
    g_copy = g.copy()
    att_dict_alerts = nx.get_node_attributes(g_copy, 'num_alerts')
    att_dict_type = nx.get_node_attributes(g_copy, 'resource_type')
    # Keep nodes above the alert threshold and of a selected resource type
    kept_alerts = {k: v for k, v in att_dict_alerts.items() if v > alert_limit}
    kept_types = {k: v for k, v in att_dict_type.items() if v in sel_sub.selected_items}
    list_keep = [x for x in kept_alerts if x in kept_types]
    for node in g:
        if node != rg and node not in list_keep:
            g_copy.remove_node(node)
    create_graph(g_copy, False)
    push_notebook()
interact(filter_graph, alert_limit = (0, max(num_alert_dict.values())))
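The keep/drop decision inside `filter_graph` combines two per-node attribute dictionaries. The same logic on toy dictionaries with hypothetical nodes and a fixed threshold:

```python
# Hypothetical node attributes mirroring num_alerts and resource_type
att_alerts = {"vm-1": 45, "vm-2": 21, "storage-1": 20}
att_types = {"vm-1": "microsoft.compute/virtualmachines",
             "vm-2": "microsoft.compute/virtualmachines",
             "storage-1": "microsoft.storage/storageaccounts"}
selected_types = {"microsoft.compute/virtualmachines"}
alert_limit = 21

# A node survives only if it clears the alert threshold AND has a selected type
keep = [n for n in att_alerts
        if att_alerts[n] > alert_limit and att_types[n] in selected_types]
print(keep)  # ['vm-1']
```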
The following sections provide context around the resource you selected.
The following cell shows SecurityAlert event log entries that feature the selected resource. This includes alerts in which the compromised entity is the resource you selected, as well as alerts containing the same IP addresses that appear in alerts with the selected compromised entity. A TI lookup on available IOC data is run where possible.
# Alerts from the chosen resource
related_alerts_df = alert_df[alert_df['CompromisedEntity'] == resource_dropdown.value.split(',')[0]].copy()
# parse for IP address
def ip_splitter(ip):
    # Some alerts prefix the value with "IP Address:"; strip that prefix
    if ip is not None and "IP Address:" in ip:
        return ip.split(":")[1].strip()
    return ip
related_alerts_df["AttackerSourceIP"] = related_alerts_df["AttackerSourceIP"].apply(ip_splitter)
# add TI Data column
def getTIData(col):
    # Return [(ioc, severity)] if the IP has a TI hit, else a placeholder
    if col in ti_results["Ioc"].values:
        return [(col, ti_results.loc[ti_results['Ioc'] == col, 'Severity'].item())]
    return [("n/a", "n/a")]
severity_values = {'information': 0, 'high': 3}
def getHighestSev(col):
    sev = []
    for entry in col:
        if 'n/a' in entry[0]:
            sev.append('n/a')
        else:
            sev.append(entry[0][1])
    return sev
all_ips = set(related_alerts_df["AttackerSourceIP"].values)
def print_related_alerts(related_alerts_df):
    global ti_results  # getTIData (defined above) reads this variable
    attacker_source_ips = list(set(related_alerts_df['AttackerSourceIP'].values))
    attacker_source_ips_str = str(attacker_source_ips).replace('[', '(').replace(']', ')')
    ip_alert_query = f"""
    SecurityAlert
    | where TimeGenerated >= datetime("{q_times.start}")
    | where TimeGenerated <= datetime("{q_times.end}")
    | where isnotempty(ResourceId)
    | extend json_extendProp = parse_json(ExtendedProperties)
    | extend UserName = json_extendProp['User Name'],
             ServiceId = json_extendProp['ServiceId'],
             WdatpTenantId = json_extendProp['WdatpTenantId'],
             FileName = json_extendProp['File Name'],
             resourceType = json_extendProp['resourceType'],
             AttackerSourceIP = json_extendProp['Attacker source IP'],
             numFailedAuthAttemptsToHost = json_extendProp['Number of failed authentication attempts to host'],
             numExistingAccountsUsedBySource = json_extendProp['Number of existing accounts used by source to sign in'],
             numNonExistentAccountsUsedBySource = json_extendProp['Number of nonexistent accounts used by source to sign in'],
             topAccountsWithFailedSignInAttempts = json_extendProp['Top accounts with failed sign in attempts (count)'],
             RDPSessionInitiated = json_extendProp['Was RDP session initiated'],
             attackerSourceComputerName = json_extendProp['Attacker source computer name']
    | project-away json_extendProp
    | where AttackerSourceIP has_any {attacker_source_ips_str}
    """
    ip_alert_df = qp_LA.exec_query(ip_alert_query)
    related_alerts_df = pd.concat([ip_alert_df, related_alerts_df]).drop_duplicates().reset_index(drop=True)
    related_alerts_df["AttackerSourceIP"] = related_alerts_df["AttackerSourceIP"].apply(ip_splitter)
    ti_lookup = TILookup()
    ti_results = ti_lookup.lookup_iocs(data=attacker_source_ips)
    related_alerts_df["TIData"] = related_alerts_df['AttackerSourceIP'].apply(getTIData)
    related_alerts_df["TISeverity"] = getHighestSev(list(related_alerts_df['TIData'].values))
    display(related_alerts_df[['TimeGenerated', 'AlertName', 'AlertSeverity', 'TISeverity', 'AttackerSourceIP', 'ResourceId', 'TIData', 'ProductName', 'resourceType', 'numNonExistentAccountsUsedBySource', 'topAccountsWithFailedSignInAttempts', 'attackerSourceComputerName']])
if len(all_ips) == 0 or (len(all_ips) == 1 and None in all_ips):
    print("No data for TI search")
    display(related_alerts_df[['TimeGenerated', 'AlertName', 'AlertSeverity', 'ResourceId', 'ProductName', 'resourceType', 'numNonExistentAccountsUsedBySource', 'topAccountsWithFailedSignInAttempts', 'attackerSourceComputerName']])
else:
    print_related_alerts(related_alerts_df)
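The IP parsing applied above can be exercised on its own. A standalone copy of the same parser, run against hypothetical alert values:

```python
# Standalone copy of the ip_splitter parsing used in the cell above
def ip_splitter(ip):
    # Some alerts prefix the value with "IP Address:"; strip that prefix
    if ip is not None and "IP Address:" in ip:
        return ip.split(":")[1].strip()
    return ip

print(ip_splitter("IP Address: 10.0.0.4"))  # 10.0.0.4
print(ip_splitter("10.0.0.4"))              # 10.0.0.4
print(ip_splitter(None))                    # None
```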
If you would like to pivot further on a certain entity, please check out our Entity Explorer series:
# density timeline - all on one line, or at least high on top
if 'TISeverity' in related_alerts_df.columns:
    nbdisplay.display_timeline(related_alerts_df,
                               time_column="TimeGenerated",
                               group_by="TISeverity",
                               source_columns=["AlertName", "Description", "AlertSeverity", "TISeverity", "ProviderName"])
else:
    nbdisplay.display_timeline(related_alerts_df,
                               time_column="TimeGenerated",
                               group_by="AlertSeverity",
                               source_columns=["AlertName", "Description", "AlertSeverity", "ProviderName"])
From the dropdown below, pick a resource of interest from the resource graph then run the cell below it to view all information gathered on it.
rg = rg_df['resourceGroup'][0]
related_rg_query = f"""
Resources
| where resourceGroup == "{rg}"
"""
related_rg_df = qp_RG.exec_query(related_rg_query)
resource_id_list.extend(list(related_rg_df['id']))
all_resources = [i for i in g]
all_resource_dropdown = widgets.Dropdown(options = all_resources, description='Resources:')
display(all_resource_dropdown)
# Parse all info
chosen_resource_query = f"""
Resources
| where name == "{all_resource_dropdown.value}"
"""
try:
    chosen_resource_df = qp_RG.exec_query(chosen_resource_query)
    display(chosen_resource_df.transpose().style.set_properties(**{'text-align': 'left'}))
except (IndexError, KeyError):
    print("No results. Please select another resource.")
To further view a user's access, please check out our Guided Analysis - User Security Metadata notebook.
The following cell prints out summary information about all of the resources and their locations and types in your workspace.
print("LOCATIONS:")
print(related_rg_df['location'].value_counts())
print("\n\nRESOURCE TYPE COUNTS:")
print(related_rg_df['type'].value_counts())
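The summary above is pandas `value_counts` on the location and type columns. The same call on a toy frame with hypothetical resources:

```python
import pandas as pd

# Toy frame standing in for related_rg_df (hypothetical values)
related = pd.DataFrame({
    "location": ["eastus", "eastus", "westus"],
    "type": ["microsoft.compute/virtualmachines",
             "microsoft.storage/storageaccounts",
             "microsoft.compute/virtualmachines"],
})

# value_counts returns per-value frequencies, sorted descending
loc_counts = related["location"].value_counts()
type_counts = related["type"].value_counts()
print(loc_counts["eastus"])                                # 2
print(type_counts["microsoft.compute/virtualmachines"])    # 2
```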
In the following cell, we use a KQL query to see if there are any AzureActivity log entries related to the resource you selected. You can use the results to pivot and check for threat intelligence hits.
azure_activity_query = f"""
AzureActivity
//| where TimeGenerated >= datetime("{q_times.start}")
//| where TimeGenerated <= datetime("{q_times.end}")
| where Resource =~ "{resource_dropdown.value.split(',')[0]}"
| extend json_prop = parse_json(Properties)
| extend isComplianceCheck = json_prop['isComplianceCheck'], ancestors = json_prop['ancestors'], message = json_prop['message']
| extend json_auth = parse_json(Authorization)
| extend action = json_auth['action'], scope = json_auth['scope']
| extend json_http = parse_json(HTTPRequest)
| extend clientRequestId = json_http['clientRequestId'], clientIpAddress = json_http['clientIpAddress'], method = json_http['method']
| project-away json_prop, json_auth, json_http
| summarize count() by OperationName, Caller, CallerIpAddress, tostring(clientIpAddress)
| sort by count_
"""
azure_activity_df = qp_LA.exec_query(azure_activity_query)
# get TI data
callIpAddressList = list(azure_activity_df['CallerIpAddress'].unique())
cliIpAddressList = list(azure_activity_df['clientIpAddress'].unique())
callIpAddressList.extend(cliIpAddressList)
callIpAddressList = list(set([i for i in callIpAddressList if i]))
aa_full_list = callIpAddressList
#aa_results = ti_lookup.lookup_iocs(data=aa_full_list)
# add TI column
def getTIData(col):
    # Return [(ioc, severity)] if the IP has a TI hit, else a placeholder
    if col in aa_results["Ioc"].values:
        return [(col, aa_results.loc[aa_results['Ioc'] == col, 'Severity'].item())]
    return [("n/a", "n/a")]
severity_values = {'information': 0, 'high': 3}
def getHighestSev(call, cli):
    # For each row, keep the higher of the caller-IP and client-IP severities
    sev = []
    for i in range(len(call)):
        if 'n/a' in call[i][0] or 'n/a' in cli[i][0]:
            sev.append('n/a')
        elif severity_values[call[i][0][1]] > severity_values[cli[i][0][1]]:
            sev.append(call[i][0][1])
        else:
            sev.append(cli[i][0][1])
    return sev
if len(aa_full_list) == 0:
    print("No data for TI search")
    display(azure_activity_df)
else:
    ti_lookup = TILookup()
    aa_results = ti_lookup.lookup_iocs(data=aa_full_list)
    azure_activity_df["TIData_caller"] = azure_activity_df['CallerIpAddress'].apply(getTIData)
    azure_activity_df["TIData_client"] = azure_activity_df['clientIpAddress'].apply(getTIData)
    azure_activity_df["Severity"] = getHighestSev(list(azure_activity_df['TIData_caller'].values), list(azure_activity_df['TIData_client'].values))
    display(azure_activity_df)
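The per-row comparison inside `getHighestSev` ranks the caller-IP and client-IP severities numerically and keeps the higher one. The core of that comparison, on hypothetical severities:

```python
# Numeric ranking used above to compare two TI severities
severity_values = {'information': 0, 'high': 3}

caller_sev, client_sev = 'information', 'high'  # hypothetical TI results
highest = (caller_sev
           if severity_values[caller_sev] > severity_values[client_sev]
           else client_sev)
print(highest)  # high
```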
The following cells print a timeline of AzureActivity entries related to the resource you selected, putting the results into time context, and enrich them with results from connected TI sources.
all_azure_activity_query = f"""
AzureActivity
//| where TimeGenerated >= datetime("{q_times.start}")
//| where TimeGenerated <= datetime("{q_times.end}")
| where Resource =~ "{resource_dropdown.value.split(',')[0]}"
| extend json_prop = parse_json(Properties)
| extend isComplianceCheck = json_prop['isComplianceCheck'], ancestors = json_prop['ancestors'], message = json_prop['message']
| extend json_auth = parse_json(Authorization)
| extend action = json_auth['action'], scope = json_auth['scope']
| extend json_http = parse_json(HTTPRequest)
| extend clientRequestId = json_http['clientRequestId'], clientIpAddress = json_http['clientIpAddress'], method = json_http['method']
| project-away json_prop, json_auth, json_http
"""
all_azure_activity_df = qp_LA.exec_query(all_azure_activity_query)
if len(aa_full_list) == 0:
    print("No data for TI search")
    display(all_azure_activity_df)
else:
    ti_lookup = TILookup()
    aa_results = ti_lookup.lookup_iocs(data=aa_full_list)
    all_azure_activity_df["TIData_caller"] = all_azure_activity_df['CallerIpAddress'].apply(getTIData)
    all_azure_activity_df["TIData_client"] = all_azure_activity_df['clientIpAddress'].apply(getTIData)
    all_azure_activity_df["TISeverity"] = getHighestSev(list(all_azure_activity_df['TIData_caller'].values), list(all_azure_activity_df['TIData_client'].values))
    display(all_azure_activity_df[['TimeGenerated', 'OperationName', 'Level', 'ActivityStatus', 'TISeverity', 'TIData_caller', 'TIData_client', 'CorrelationId', 'Caller', 'clientRequestId']])
if 'TISeverity' in all_azure_activity_df.columns:
    nbdisplay.display_timeline(all_azure_activity_df,
                               time_column="TimeGenerated",
                               group_by="TISeverity",
                               source_columns=["OperationName", "Level", "CorrelationId", "Caller", "CallerIpAddress"])
else:
    nbdisplay.display_timeline(all_azure_activity_df,
                               time_column="TimeGenerated",
                               group_by="Level",
                               source_columns=["OperationName", "Level", "CorrelationId", "Caller", "CallerIpAddress"])