# Looking beyond the details: understand attack 'storylines' with MITRE tags

## Introduction

Today's compute environments are typically monitored by multiple network- and host-based sensors, which produce a large number of events and alerts even after careful tuning. In a complex attack chain, where an attacker breaks into an internal system, moves to other systems, gains privileges and accesses data, the attacker's activities trigger many different alerts. While detailed monitoring is great for catching and mitigating very specific attack steps, it can be hard to see the bigger picture of the attack operation and to understand its overall goals (and results).

In the following, we explore one possible way to abstract from the details to a higher-level understanding of an attack chain and to expose the 'storyline' of the attack with the help of MITRE tags, complementary to the lower-level details of alerts. The MITRE ATT&CK® framework defines a "globally-accessible knowledge base of adversary tactics and techniques based on real-world observations", i.e., it describes the different kinds of possible attack steps in a systematic way, independent of any particular tool. For each documented technique or sub-technique, MITRE ATT&CK includes textual context, examples, mitigations, detections and references. It is structured into a set of 14 high-level tactics, such as reconnaissance, initial access and privilege escalation, each of which comprises many techniques and sub-techniques in a hierarchical fashion.

One concrete example is Command and Scripting Interpreter: PowerShell, with the MITRE tag T1059.001. It describes the potential abuse of PowerShell by adversaries to execute malicious tasks. The tag T1059.001 indicates that this is sub-technique 001 of the technique T1059: Command and Scripting Interpreter, which in turn is part of the high-level tactic Execution.
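
The structure of such a tag can be made explicit with a few lines of Python. This is only a minimal sketch: the helper name `parse_mitre_tag` and the `MitreTag` tuple are illustrative and not part of any MITRE tooling, and the optional `mitre:` prefix is handled only because the alert tags shown later in this article carry it.

```python
import re
from typing import NamedTuple, Optional

# ATT&CK technique IDs look like "T1059" or, for a sub-technique, "T1059.001".
_TAG_PATTERN = re.compile(r"^T(\d{4})(?:\.(\d{3}))?$")


class MitreTag(NamedTuple):
    technique: str                # e.g. "T1059"
    sub_technique: Optional[str]  # e.g. "001", or None if the tag is a plain technique


def parse_mitre_tag(tag: str) -> MitreTag:
    """Split a tag such as 'mitre:t1059.001' into technique and sub-technique."""
    normalized = tag.strip().upper()
    if normalized.startswith("MITRE:"):
        normalized = normalized[len("MITRE:"):]
    match = _TAG_PATTERN.match(normalized)
    if match is None:
        raise ValueError(f"not a MITRE technique tag: {tag!r}")
    return MitreTag(technique=f"T{match.group(1)}", sub_technique=match.group(2))


print(parse_mitre_tag("T1059.001"))
```

Note that the tag itself does not encode the tactic; that has to be looked up in the ATT&CK data.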

Particularly important is that this framework is independent of any specific tool and can therefore be used to understand attack steps across multiple tools. Support for tagging tool-specific alerts with MITRE tags is in many cases under way or already available.

Intuition suggests that a more complex attack chain will follow some typical sequence of phases at a higher level, perhaps progressing like discovery → initial access → privilege escalation → data exfiltration → persistence. In general this will not be a strict sequence, as an attacker might skip or repeat phases, e.g., after further lateral movement.

To test and illustrate this approach, we have created a small test environment. In this setup, we run a known, scripted attack scenario while observing the environment with multiple sensors, and collect the alerts they generate. Instead of looking at the details of these alerts, we go in the opposite direction: we abstract from the low-level, detailed alerts and look only at the timeline of the MITRE tags associated with them.

## Experimental setup

The small environment that we use for testing is shown in the above figure: the blue parts are the 'productive' components, comprising the web application juice-shop, a MySQL server and an MS Windows server. The web application is Juice Shop, the intentionally vulnerable web shop created by the OWASP team for testing and learning. We run it on a Linux VM and on the Windows VM as an easy way to create an attack scenario targeting some of its known vulnerabilities. In addition, a small number of VMs act as simulated users of the web shop in order to create 'normal' background traffic.

The monitoring components are shown in green: we monitor the Linux VMs with SysFlow, the network with Snort, and the Windows server with a setup that feeds Sysmon events through Winlogbeat into an instance of HELK. In this article, we only look at alerts (rather than the base telemetry events produced by SysFlow/Sysmon), which are generated by a TTP ruleset on the SysFlow side and by a set of Sigma rules on the Sysmon side. In this setting, the alerts based on the Sysmon events are created by ElastAlert, which is part of the HELK package and is configured to use the Sigma rules. With regard to the current availability of MITRE tags:

• the TTP ruleset used by SysFlow already contains MITRE tags, and the team is continually extending the existing list
• Sigma rules are also already enriched with MITRE tags that we can use in ElastAlert
• Snort rules currently have no MITRE tags attached, so in this case we manually added MITRE tag information to selected rules.

Our setup also transforms the alerts into the Elastic Common Schema (ECS) format for normalisation as part of the pipeline, before they are collected in two instances of Elasticsearch.

For SysFlow, this capability to convert events and alerts to the ECS format and store them in Elasticsearch was added in version 0.3.0; you can find details about it in the blog, together with a detailed demonstration.

The figure below shows the attack scenario, which we scripted so that a complex attack can be run in a reproducible way: it starts with reconnaissance steps such as nmap scans, uses some of the known intentional vulnerabilities of the OWASP Juice-Shop to break into the web shop, attacks the database server from this foothold and exfiltrates data from it. As a next step, the attacker moves laterally to the Windows server, where the attack succeeds in installing an additional user and a persistent backdoor with OpenSSH.

## Create MITRE tactics timeline from ECS alert data

### Define mapping MITRE tag → MITRE tactic

The alerts received from our sensors contain MITRE tags like T1059.001, but from the tag alone we do not know which MITRE tactic it belongs to: there is no strict hierarchy, as a MITRE technique can belong to multiple tactics. Luckily, there is an API for the MITRE ATT&CK data that we can use to resolve a tag to the tactic(s) it belongs to.

To achieve this, we first download the information about all techniques via the API and then build a mapping tag2tactics from each tag to the tactics it belongs to.

In [1]:
# --- imports as required
import json, os, collections, datetime
import pandas as pd
from taxii2client.v20 import Server, Collection
from stix2 import TAXIICollectionSource, Filter
import plotly.graph_objects as go
import plotly as pl
import plotly.io as pio
pio.renderers.default = 'iframe'

In [2]:
# ----- download the ATT&CK data via its API
# see https://www.mitre.org/capabilities/cybersecurity/overview/cybersecurity-blog/attck%E2%84%A2-content-available-in-stix%E2%84%A2-20-via

# Instantiate server and get API Root
server = Server("https://cti-taxii.mitre.org/taxii/")
api_root = server.api_roots[0]

# Print name and ID of all ATT&CK technology-domains available as collections
for collection in api_root.collections:
    print(collection.title + ": " + collection.id)

# Establish TAXII2 Collection instance for Enterprise ATT&CK collection
collection = Collection("https://cti-taxii.mitre.org/stix/collections/95ecc380-afe9-11e4-9b6c-751b66dd541e/")

# Supply the collection to TAXIICollection
tc_source = TAXIICollectionSource(collection)

# Fetch information about the techniques
techniques = tc_source.query([Filter("type", "=", "attack-pattern")])

# ----- create a Python dict 'tag2tactics' containing the mapping
def get_mitre_tag(technique):
    external_refs = technique.get('external_references')
    mitre_tags = list(map(lambda x: x.get('external_id'),
                          filter(lambda x: x.get('source_name') == 'mitre-attack', external_refs)
                          ))
    if len(mitre_tags) == 1: return mitre_tags[0]
    print(f'WARN: more than one or no mitre_tags found: mitre_tags = {mitre_tags}')
    return None

def get_tactics(technique):
    kill_chain_phases = technique.get('kill_chain_phases', [])
    tactics = list(map(lambda x: x.get('phase_name'),
                       filter(lambda x: x.get('kill_chain_name') == 'mitre-attack', kill_chain_phases)
                       ))
    return tactics

def define_tag_tactic_mapping(techniques):
    tag2tactics = {}
    for it,technique in enumerate(techniques):
        mitre_tag = get_mitre_tag(technique)
        tactics = get_tactics(technique)
        tag2tactics[mitre_tag] = tactics
    return tag2tactics

# ----- create dictionary with mapping MITRE tag → MITRE tactics
tag2tactics = define_tag_tactic_mapping(techniques)
print(f'Found {len(tag2tactics)} different MITRE tags and created mapping to corresponding tactics in the dictionary tag2tactics.')

Enterprise ATT&CK: 95ecc380-afe9-11e4-9b6c-751b66dd541e
PRE-ATT&CK: 062767bd-02d2-4b72-84ba-56caef0f8658
Mobile ATT&CK: 2f669986-b40b-4423-b720-4396ca6a462b
ICS ATT&CK: 02c3ef24-9cd4-48f3-a99f-b74ce24f1d34

[taxii2client.v20] [WARNING ] [2022-02-03 17:12:18,042] TAXII Server Response did not include 'Content-Range' header - results could be incomplete.
[taxii2client.v20] [WARNING ] [2022-02-03 17:12:18,095] TAXII Server Response with different amount of objects! Setting per_request=707

Found 707 different MITRE tags and created mapping to corresponding tactics in the dictionary tag2tactics.


Please note that a MITRE tag can be associated with multiple tactics, so this is a one-to-many relationship.
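
To make the one-to-many relationship concrete, the following sketch works on a small hand-copied excerpt in the shape of `tag2tactics` rather than on the downloaded data. The names `sample_tag2tactics`, `multi_tactic_tags` and `tactics_for` are illustrative only; T1078 (Valid Accounts) is used as a well-known technique that is listed under four tactics.

```python
# Hand-copied excerpt in the same shape as tag2tactics (not fetched from the API).
sample_tag2tactics = {
    "T1059.001": ["execution"],
    "T1078": ["defense-evasion", "persistence", "privilege-escalation", "initial-access"],
    "T1018": ["discovery"],
}


def multi_tactic_tags(mapping):
    """Return only those tags that belong to more than one tactic."""
    return {tag: tactics for tag, tactics in mapping.items() if len(tactics) > 1}


def tactics_for(tag, mapping):
    """Look up the tactics for an alert tag like 'mitre:t1018'; unknown tags give []."""
    return mapping.get(tag.replace("mitre:", "").upper(), [])


print(multi_tactic_tags(sample_tag2tactics))
print(tactics_for("mitre:t1018", sample_tag2tactics))
```

Returning an empty list for unknown tags is a design choice of this sketch; it avoids a `KeyError` when a sensor emits a tag that is not present in the downloaded collection.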

As recreating our test environment is beyond the scope of this article, we provide the alert data that we collected during a run of the attack here as JSON files, which we load in this step into a combined list alert_data.

In [3]:
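# --- loading the alert data collected in our test environment while running the scripted attack
# NOTE: lines marked "reconstructed" were missing from this listing and are restored from context
data_dir = './data/mitre-tag-timeline/'
# reconstructed: file names taken from the 'source_file' column of the alert table
data_files = ['elastic_snort.json', 'elastic_sysflow.json', 'elastic_elastalert.json']

alert_data = []  # reconstructed
for data_file in data_files:
    with open(data_dir + data_file, 'r') as inp:
        _data = json.load(inp)  # reconstructed; assumes each file holds a JSON list of ECS alerts
    alert_data.extend(list(map(lambda x: {
        'tags': x.get('tags', []),  # reconstructed; assumes the ECS top-level 'tags' field
        'event.start': x['event'].get('start', None),
        'event.created': x['event'].get('created', None),
        'event.reason': x['event']['reason'],
        'event.severity': x['event']['severity'],
        'event.category': x['event']['category'],
        'event.action': x['event']['action'],
        'source_file': data_file}, _data)))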


### Combine alert data with MITRE tactics using the mapping

With the observed alerts in alert_data, we use our previously generated tag2tactics dictionary to add the tactic(s) corresponding to the MITRE tags to the alerts. As each alert can have multiple tags, and each MITRE tag in turn can belong to multiple tactics, we have to duplicate the alert information accordingly, once per related tactic. The enriched data is then converted to a pandas DataFrame alert_data_tactics, with the timestamps, which come from different columns, unified into a single column.

In [4]:
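# --- use tags to add tactics: as there can be multiple tags per alert as well as multiple tactics for a tag, create copies of the alert as required
# NOTE: lines marked "reconstructed" were missing from this listing and are restored from context
alert_data_tactics = []  # reconstructed
for alert in alert_data:  # reconstructed
    all_tactics = set()
    for tag in alert['tags']:
        tactics = tag2tactics[tag.replace('mitre:', '').upper()]
        for tactic in tactics:
            all_tactics.add(tactic)  # reconstructed
    for tactic in all_tactics:
        alert_data_tactics.append({**alert, 'tactic': tactic})  # reconstructed
# --- convert data to pandas DataFrame
alert_data_tactics = pd.DataFrame(alert_data_tactics)  # reconstructed
# --- depending on source file, the timestamps are in different columns ('event.start' vs 'event.created')
#     create a fixed 'timestamp' column instead (reconstructed; any time zone handling is not recoverable)
alert_data_tactics['timestamp'] = pd.to_datetime(alert_data_tactics['event.start'].fillna(alert_data_tactics['event.created']))
alert_data_tactics = alert_data_tactics.drop(columns=['event.start', 'event.created']).sort_values('timestamp').reset_index(drop=True)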
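
The timestamp unification mentioned in the comments above can also be sketched with the standard library alone. This is an illustration under assumptions, not the code used for this article: the helper names are hypothetical, the timestamps are assumed to be ISO 8601 strings, and preferring 'event.start' over 'event.created' is a choice made only for this sketch.

```python
from datetime import datetime
from typing import Optional


def parse_iso(value: str) -> datetime:
    """Parse an ISO 8601 string; a trailing 'Z' is mapped to '+00:00' for older Python versions."""
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


def alert_timestamp(alert: dict) -> Optional[datetime]:
    """Use 'event.start' if present, otherwise 'event.created'; None if both are missing."""
    raw = alert.get("event.start") or alert.get("event.created")
    return parse_iso(raw) if raw else None


# Two made-up alerts, each carrying its timestamp in a different field.
with_start = {"event.start": "2021-11-29T09:32:30.544Z", "event.created": None}
with_created = {"event.start": None, "event.created": "2021-11-29T10:34:35.068+01:00"}
print(alert_timestamp(with_start), alert_timestamp(with_created))
```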


At this point, the list of alerts with MITRE tactics contains 37 entries:

In [5]:
alert_data_tactics

Out[5]:
| | tags | event.reason | event.severity | event.category | event.action | source_file | tactic | timestamp |
|---|---|---|---|---|---|---|---|---|
| 0 | [mitre:T1018] | Possible Nmap ping sweep | 0 | network | Detection of a Network Scan | elastic_snort.json | discovery | 2021-11-29 10:32:30.544 |
| 1 | [mitre:T1018] | Possible Nmap ping sweep | 0 | network | Detection of a Network Scan | elastic_snort.json | discovery | 2021-11-29 10:32:30.544 |
| 2 | [mitre:T1018] | Possible Nmap ping sweep | 0 | network | Detection of a Network Scan | elastic_snort.json | discovery | 2021-11-29 10:32:30.832 |
| 3 | [mitre:T1018] | Possible Nmap ping sweep | 0 | network | Detection of a Network Scan | elastic_snort.json | discovery | 2021-11-29 10:32:30.832 |
| 4 | [mitre:T1046] | TCP Port Scanning - probing closed port | 0 | network | Detection of a Network Scan | elastic_snort.json | discovery | 2021-11-29 10:32:31.405 |
| 5 | [mitre:T1046] | TCP Port Scanning | 0 | network | Detection of a Network Scan | elastic_snort.json | discovery | 2021-11-29 10:32:31.408 |
| 6 | [mitre:T1190] | SQL Injection attempt | 2 | network | Web Application Attack | elastic_snort.json | initial-access | 2021-11-29 10:32:34.172 |
| 7 | [mitre:T1068] | Possible SSTI attack | 2 | network | Web Application Attack | elastic_snort.json | privilege-escalation | 2021-11-29 10:32:40.593 |
| 8 | [mitre:T1059.004] | Node process starts shell | 2 | process | process-start | elastic_sysflow.json | execution | 2021-11-29 10:32:40.803 |
| 9 | [mitre:T1059.004] | Reverse Unix shell started | 2 | process | process-start | elastic_sysflow.json | execution | 2021-11-29 10:32:40.805 |
| 10 | [mitre:T1059.004] | Node process starts shell | 2 | process | process-start | elastic_sysflow.json | execution | 2021-11-29 10:32:40.808 |
| 11 | [mitre:T1033] | System Owner/User Discovery | 1 | process | process-start | elastic_sysflow.json | discovery | 2021-11-29 10:32:40.982 |
| 12 | [mitre:T1082] | System Information Discovery | 0 | process | process-start | elastic_sysflow.json | discovery | 2021-11-29 10:32:45.073 |
| 13 | [mitre:T1057] | Process Discovery | 1 | process | process-start | elastic_sysflow.json | discovery | 2021-11-29 10:32:51.837 |
| 14 | [mitre:T1083] | File and Directory Discovery | 1 | process | process-start | elastic_sysflow.json | discovery | 2021-11-29 10:32:56.332 |
| 15 | [mitre:T1049] | System Network Connections Discovery | 1 | process | process-start | elastic_sysflow.json | discovery | 2021-11-29 10:33:02.389 |
| 16 | [mitre:T1087.001] | Account Discovery - Local Account | 2 | process | process-start | elastic_sysflow.json | discovery | 2021-11-29 10:33:08.324 |
| 17 | [mitre:T1018] | Remote System Discovery | 1 | process | process-start | elastic_sysflow.json | discovery | 2021-11-29 10:33:20.998 |
| 18 | [mitre:T1083] | File and Directory Discovery | 1 | process | process-start | elastic_sysflow.json | discovery | 2021-11-29 10:33:27.290 |
| 19 | [mitre:T1222.002] | Linux and Mac File and Directory Permissions M... | 1 | process | process-start | elastic_sysflow.json | defense-evasion | 2021-11-29 10:33:47.689 |
| 20 | [mitre:T1110.001] | MySQL: failed login attempt | 1 | network | Attempted User Privilege Gain | elastic_snort.json | credential-access | 2021-11-29 10:34:01.626 |
| 21 | [mitre:T1110.001] | MySQL: failed login attempt | 1 | network | Attempted User Privilege Gain | elastic_snort.json | credential-access | 2021-11-29 10:34:01.631 |
| 22 | [mitre:T1110.001] | MySQL: failed login attempt | 1 | network | Attempted User Privilege Gain | elastic_snort.json | credential-access | 2021-11-29 10:34:01.637 |
| 23 | [mitre:T1030] | Large network data transfer with database endp... | 2 | network | network-connection-traffic | elastic_sysflow.json | exfiltration | 2021-11-29 10:34:01.638 |
| 24 | [mitre:T1222.002] | Linux and Mac File and Directory Permissions M... | 1 | process | process-start | elastic_sysflow.json | defense-evasion | 2021-11-29 10:34:18.209 |
| 25 | [mitre:T1046] | TCP Port Scanning | 0 | network | Detection of a Network Scan | elastic_snort.json | discovery | 2021-11-29 10:34:26.719 |
| 26 | [mitre:T1190] | SQL Injection attempt | 2 | network | Web Application Attack | elastic_snort.json | initial-access | 2021-11-29 10:34:33.245 |
| 27 | [mitre:T1190] | SQL Injection attempt | 2 | network | Web Application Attack | elastic_snort.json | initial-access | 2021-11-29 10:34:33.932 |
| 28 | [mitre:T1190] | SQL Injection attempt | 2 | network | Web Application Attack | elastic_snort.json | initial-access | 2021-11-29 10:34:34.416 |
| 29 | [mitre:T1068] | Possible SSTI attack | 2 | network | Web Application Attack | elastic_snort.json | privilege-escalation | 2021-11-29 10:34:34.746 |
| 30 | [mitre:t1049, mitre:t1018, mitre:t1135, mitre:... | Sysmon Net.exe Execution | 0 | process | Process Create (rule: ProcessCreate) | elastic_elastalert.json | discovery | 2021-11-29 10:34:35.068 |
| 31 | [mitre:t1136, mitre:t1136.001] | Sysmon Net.exe User Account Creation | 1 | process | Process Create (rule: ProcessCreate) | elastic_elastalert.json | persistence | 2021-11-29 10:34:35.068 |
| 32 | [mitre:t1049, mitre:t1018, mitre:t1135, mitre:... | Sysmon Net.exe Execution | 0 | process | Process Create (rule: ProcessCreate) | elastic_elastalert.json | lateral-movement | 2021-11-29 10:34:35.068 |
| 33 | [mitre:t1049, mitre:t1018, mitre:t1135, mitre:... | Sysmon Net.exe Execution | 0 | process | Process Create (rule: ProcessCreate) | elastic_elastalert.json | discovery | 2021-11-29 10:34:35.068 |
| 34 | [mitre:t1049, mitre:t1018, mitre:t1135, mitre:... | Sysmon Net.exe Execution | 0 | process | Process Create (rule: ProcessCreate) | elastic_elastalert.json | lateral-movement | 2021-11-29 10:34:35.068 |
| 35 | [mitre:t1136, mitre:t1136.001] | Sysmon Net.exe User Account Creation | 1 | process | Process Create (rule: ProcessCreate) | elastic_elastalert.json | persistence | 2021-11-29 10:34:35.068 |
| 36 | [mitre:t1086, mitre:t1059.001] | Sysmon Non Interactive PowerShell | 1 | process | Process Create (rule: ProcessCreate) | elastic_elastalert.json | execution | 2021-11-29 10:34:35.069 |

### Draw timeline of observed alerts

Now we finally show the observed alerts as markers over the timeline of the observation, with the tactic each alert belongs to on the categorical y-axis.

In [6]:
# --- define a useful coloring of the markers
tactic2color = {
'Discovery':pl.colors.qualitative.D3[0],
'Initial Access':pl.colors.qualitative.D3[1],
'Execution':pl.colors.qualitative.D3[2],
'Privilege Escalation':pl.colors.qualitative.D3[3],
'Defense Evasion':pl.colors.qualitative.D3[4],
'Exfiltration':pl.colors.qualitative.D3[5],
'Credential Access':pl.colors.qualitative.D3[6],
'Persistence':pl.colors.qualitative.D3[7],
'Lateral Movement':pl.colors.qualitative.D3[8],
'discovery':pl.colors.qualitative.D3[0],
'initial-access':pl.colors.qualitative.D3[1],
'execution':pl.colors.qualitative.D3[2],
'privilege-escalation':pl.colors.qualitative.D3[3],
'defense-evasion':pl.colors.qualitative.D3[4],
'exfiltration':pl.colors.qualitative.D3[5],
'credential-access':pl.colors.qualitative.D3[6],
'persistence':pl.colors.qualitative.D3[7],
'lateral-movement':pl.colors.qualitative.D3[8],
}

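# --- show the alerts as a scatter plot
# NOTE: the opening of the go.Figure/go.Scatter call and its x, y and marker_color arguments
#       were missing from this listing and are reconstructed from context
fig = go.Figure(data=go.Scatter(
    x=alert_data_tactics['timestamp'], y=alert_data_tactics['tactic'],
    mode='markers', marker_size=18,
    marker_line=dict(color = 'rgb(255, 255, 255)', width = 1),
    marker_color=alert_data_tactics['tactic'].map(tactic2color)))
fig.update_layout(height=800)
fig.update_layout(xaxis_title="time",
                  yaxis_title="MITRE tactic")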

# --- annotate the alerts with the 'event.reason' coming from the monitoring sensors
fig.update_layout(annotations=[
go.layout.Annotation(x=row['timestamp'], y=row['tactic'], xref="x", yref="y",
text=row['event.reason'],
showarrow=False, xanchor='left', yanchor='bottom', textangle=-27) for i,row in alert_data_tactics.iterrows()])
fig.show()


In this view we can see a progression from discovery to access and execution, with a potential privilege escalation. In the later phase we also find some failed attempts to log into a database followed by a large data transfer, indicating a potentially successful break-in with data exfiltration. Another round of discovery operations precedes more alerts with a similar sequence of access, escalation and execution, together with alerts tagged as lateral movement and persistence. With that high-level view in mind, looking into the details of selected alerts confirms that the first set of alerts relates to the Linux machines in our setup, whereas the later alerts come from the monitoring of our Windows machine, confirming the lateral movement hinted at by the timeline of the tactics.
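
The progression read off the plot can also be derived programmatically by collapsing consecutive alerts of the same tactic into phases. The following is a minimal sketch: the function name `tactic_phases` is illustrative, and the input is hand-copied from the 'tactic' column of rows 0 to 10 of the alert table rather than read from alert_data_tactics.

```python
from itertools import groupby


def tactic_phases(tactics_in_time_order):
    """Collapse consecutive repeats of a tactic into (tactic, number_of_alerts) phases."""
    return [(tactic, sum(1 for _ in group))
            for tactic, group in groupby(tactics_in_time_order)]


# 'tactic' column of rows 0-10 of the alert table, in timestamp order.
observed = ["discovery"] * 6 + ["initial-access", "privilege-escalation"] + ["execution"] * 3

for tactic, count in tactic_phases(observed):
    print(f"{tactic} x{count}")
```

On the full DataFrame, the same function could be applied to the 'tactic' column after sorting by timestamp. Alerts with identical timestamps but different tactics would then produce short alternating phases, which is one source of the fuzziness of such a storyline.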

In the next section, we compare this timeline with the ground truth information of our scripted attack.

## Compare timeline with ground truth information

As the attack itself is scripted, we know exactly what happens during this attack operation. We can use this ground truth to annotate the timeline and compare with the observed alerts.

In [7]:
# --- define the ground truth data as observed in the log of our attack script
ground_truth_attack_data = [
(datetime.datetime.fromisoformat('2021-11-29T10:32:30.387+01:00'), "Network scans with nmap", 0),
(datetime.datetime.fromisoformat('2021-11-29T10:32:31.554+01:00'), "use of SQL injection attack on web shop", 0),
(datetime.datetime.fromisoformat('2021-11-29T10:32:40.435+01:00'), "use of SSTI vulnerability of jshop", 0),
(datetime.datetime.fromisoformat('2021-11-29T10:32:40.804+01:00'), "reverse shell back to attacker", 10),
(datetime.datetime.fromisoformat('2021-11-29T10:32:40.979+01:00'), "sequence of discovery commands by attacker", 20),
(datetime.datetime.fromisoformat('2021-11-29T10:34:01.211+01:00'), "failed login attempts to MySQL server", 0),
(datetime.datetime.fromisoformat('2021-11-29T10:34:01.637+01:00'), "successful login to MySQL server, exfiltrating data", 10),
(datetime.datetime.fromisoformat('2021-11-29T10:34:33.152+01:00'), "lateral movement: attacker uses SQL injection", 0),
(datetime.datetime.fromisoformat('2021-11-29T10:34:34.730+01:00'), "SSTI attack on Windows server", 0),
(datetime.datetime.fromisoformat('2021-11-29T10:34:35.014+01:00'), "attacker creates backdoor user", 10),
(datetime.datetime.fromisoformat('2021-11-29T10:34:35.031+01:00'), "attacker installs OpenSSH", 20)
]

In [8]:
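# --- show the alerts as a scatter plot
# NOTE: the opening of the go.Figure/go.Scatter call and its x, y and marker_color arguments
#       were missing from this listing and are reconstructed from context
fig = go.Figure(data=go.Scatter(
    x=alert_data_tactics['timestamp'], y=alert_data_tactics['tactic'],
    mode='markers', marker_size=18,
    marker_line=dict(color = 'rgb(255, 255, 255)', width = 1),
    marker_color=alert_data_tactics['tactic'].map(tactic2color)))
fig.update_layout(height=1000)
fig.update_layout(xaxis_title="time",
                  yaxis_title="MITRE tactic")
fig.update_layout(margin=dict(l=20, r=20, t=300, b=20))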

# --- annotate the alerts with the 'event.reason' coming from the monitoring sensors
annotations = [
go.layout.Annotation(x=row['timestamp'], y=row['tactic'], xref="x", yref="y",
text=row['event.reason'],
showarrow=False, xanchor='left', yanchor='bottom', textangle=-27) for i,row in alert_data_tactics.iterrows()]

# --- add annotations from ground truth of known attack for comparison
for gt in ground_truth_attack_data:
    fig.add_shape(dict(type="line", x0=gt[0], y0=0.1, x1=gt[0], y1=0.88, xref='x', yref='paper', line=dict(color="darkblue", width=1), layer='below'))
annotations_ground_truth = [
go.layout.Annotation(x=gt[0], y=0.88, xref="x", yref="paper", ax=gt[2],
text=gt[1], arrowcolor='darkblue', font_color='darkblue',
showarrow=True, xanchor='left', textangle=-65) for gt in ground_truth_attack_data
]
fig.update_layout(annotations=annotations + annotations_ground_truth)
fig.show()


When we compare the timeline of tactics with the annotated ground-truth information from the log of the attack script, we find that many steps are nicely visible in the collected alerts, i.e., the scanning 'discovery' actions, the break-in via the SSTI attack followed by the reverse shell; the attempted database logins followed by a successful login and data exfiltration; the lateral movement to the Windows server with the concluding installation of a backdoor user.

Of course, not all attack steps leave traces in the form of alerts: for example, the attacker manages in the end to install and activate OpenSSH on the Windows VM and to use the created user to exfiltrate files from that machine. Though this is not visible at the level of the alerts, the collected alerts make us aware of a larger attack scenario and raise enough suspicion to trigger further investigation. In the case at hand, looking at the detailed telemetry (the SysFlow and Sysmon events) will show the details, i.e., we will find that the attacker used the non-interactive PowerShell to install the OpenSSH server, and that this was later used as the access point for the backdoor user to collect files from the victim machine.
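
The visual comparison above can be approximated in code by attributing each alert to the most recent ground-truth step that precedes it. This is a rough sketch under assumptions: the function name `assign_alerts_to_steps` is illustrative, the sample values are hand-copied from the ground-truth list and the alert table, and the time zone offset is dropped on the assumption that both sources show the same local time.

```python
from bisect import bisect_right
from datetime import datetime


def assign_alerts_to_steps(steps, alerts):
    """Attribute each alert to the latest step that started at or before it.

    steps:  list of (datetime, label) tuples, sorted by time, with unique labels
    alerts: list of (datetime, reason) tuples
    """
    step_times = [time for time, _ in steps]
    assigned = {label: [] for _, label in steps}
    for alert_time, reason in alerts:
        idx = bisect_right(step_times, alert_time) - 1
        if idx >= 0:  # alerts before the first step are ignored
            assigned[steps[idx][1]].append(reason)
    return assigned


t = datetime.fromisoformat
steps = [
    (t("2021-11-29T10:32:30.387"), "Network scans with nmap"),
    (t("2021-11-29T10:32:31.554"), "use of SQL injection attack on web shop"),
    (t("2021-11-29T10:32:40.435"), "use of SSTI vulnerability of jshop"),
]
alerts = [
    (t("2021-11-29T10:32:30.544"), "Possible Nmap ping sweep"),
    (t("2021-11-29T10:32:31.405"), "TCP Port Scanning - probing closed port"),
    (t("2021-11-29T10:32:34.172"), "SQL Injection attempt"),
    (t("2021-11-29T10:32:40.593"), "Possible SSTI attack"),
]

for label, reasons in assign_alerts_to_steps(steps, alerts).items():
    print(f"{label}: {reasons}")
```

A step that ends up with an empty list is a candidate for an attack step that left no alert. Proximity in time does not prove that an alert was caused by a given step, so such a mapping is only a starting point for looking at the underlying telemetry.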

## Conclusions and future work

This demonstration has shown that the temporal succession of MITRE tactics describes the attack sequence at a very high level of abstraction and in correspondence with our ground-truth knowledge of what is actually going on. In this case, it indicates a progression from discovery to access and exploitation, via privilege escalation to data exfiltration, followed by lateral movement with more discovery and a final persistence phase towards the end.

Potential advantages of such a high-level representation include a faster human understanding of the bigger picture, cutting through the noise of the details while remaining complementary to them. Such a view could reveal parts of a longer-running attack and prompt an analyst to query more data to fill gaps visible at that level, in order to further understand the consequences of the attack.

On the other hand, this sequence of tactics will most often be quite fuzzy, as an attacker has many different ways to progress with the attack, and different phases/tactics will be repeated, missing or reordered.

Further work is required, especially testing this approach with a broader set of attack scenarios in a more complex environment (including more background traffic), to evaluate whether it can help a human operator identify the 'storyline' of an attack. If this approach proves to be useful, adding some details back to the high-level timeline, such as visualising the events in the context of the environment, should help to better understand the 'path' of the attack, i.e., from the entry point through the various stepping stones along the journey of the attack operation.