Starting with version 0.5.0, SysFlow records contain more information related to containers in case they are part of a Kubernetes (k8s) or OpenShift environment. Specifically, there is a new record type, KE, that captures and exposes Kubernetes events in the new k8s.* attributes. Furthermore, all other record types are extended with Kubernetes pod information in the new pod.* attributes.
These new attributes are, in more detail:
| attribute | description |
|---|---|
| k8s.action | K8s Event Action |
| k8s.kind | K8s Event Component Type |
| k8s.msg | K8s Event Message |
| pod.id | Pod Identifier |
| pod.name | Pod Name |
| pod.nname | Pod Node Name |
| pod.hostip | Pod Host IP |
| pod.internalip | Pod Internal IP |
| pod.ns | Pod Namespace |
| pod.rstrtcnt | Pod Restart Count |
| pod.services | Pod Services |
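As an illustration, a flattened SysFlow record could carry these attributes roughly like this (a sketch with made-up values; the actual layout is defined by the SysFlow schema):

```python
# Sketch of the new k8s.* / pod.* attributes on a flattened record.
# All values below are made up for illustration only.
record = {
    "type": "KE",
    "k8s.action": "K8S_COMPONENT_ADDED",        # event action
    "k8s.kind": "K8S_PODS",                     # component type the event refers to
    "k8s.msg": '{"kind": "Pod", "items": []}',  # raw JSON payload of the event
    "pod.id": "pod-uid-1234",                   # hypothetical pod identifier
    "pod.name": "example-pod",
    "pod.nname": "minikube",                    # node the pod runs on
    "pod.hostip": ("192.168.59.100",),
    "pod.internalip": ("172.17.0.99",),
    "pod.ns": "robot-shop",
    "pod.rstrtcnt": 0,
    "pod.services": ["example-service"],
}

# The pod.* attributes appear on all record types; k8s.* only on KE records.
pod_attrs = [k for k in record if k.startswith("pod.")]
print(sorted(pod_attrs))
```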
In this notebook we look into this new information using data from a test setup running Instana's robot-shop application, with a special eye on the cluster-relevant IP addresses that are now available.
First, we describe the experimental setup used to create our test data. Next, after loading the experimental SysFlow data, we look at the new k8s event data and the new pod information available per SysFlow record, and compare the newly available cluster-level IP addresses that augment the observed network activity in the regular NF objects.
The experimental setup is based on the installation of minikube as a small base Kubernetes test environment.
The experiment consists of the installation of Instana's robot-shop application. To have sufficient resources to run this multi-container application, slightly more than the default minimal configuration should be used; here we are using 4 CPUs and 16GB of memory for a virtualbox VM.
$ minikube start
* minikube v1.25.2 on Ubuntu 18.04 (kvm/amd64)
* Using the virtualbox driver based on user configuration
* Starting control plane node minikube in cluster minikube
* Creating virtualbox VM (CPUs=4, Memory=16000MB, Disk=100000MB) ...
* Preparing Kubernetes v1.23.3 on Docker 20.10.12 ...
- kubelet.housekeeping-interval=5m
- Generating certificates and keys ...
- Booting up control plane ...
- Configuring RBAC rules ...
- Using image gcr.io/k8s-minikube/storage-provisioner:v5
* Verifying Kubernetes components...
* Enabled addons: storage-provisioner, default-storageclass
* Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default
The next section contains information about the experiment used to collect the SysFlow data with sf-collector 0.5.0.
The experiment used to create our data consists of:

- the creation of a new project/namespace robot-shop
- the installation of the robot-shop application with the helm charts provided by the application
- the forced killing of selected containers (mongodb, web, user) - which will be automatically recreated by the application
- the deletion of the robot-shop application and its project

The scripted experiment logs the timestamps of these events so that we have a baseline of the events to compare with the collected SysFlow data.
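Such a driver can be sketched as a simple timed sequence of steps (hypothetical step list mirroring the experiment log; the real script additionally issues the kubectl/helm commands):

```python
import datetime

# Hypothetical step list: (action, seconds to wait afterwards).
steps = [
    ("create project robot-shop", 60),
    ("install robot shop", 900),
    ("kill container mongodb", 300),
    ("kill container web", 300),
    ("kill container user", 300),
    ("delete robot shop", 300),
    ("delete project robot-shop", 300),
]

def run(steps, now=None, sleep=None):
    """Log each action with a timestamp; `sleep` is injectable so tests run fast."""
    now = now or datetime.datetime.now
    log = []
    for action, wait in steps:
        log.append((now(), action))
        if sleep:
            sleep(wait)
    return log

log = run(steps)
print(len(log))  # one log entry per step
```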
from sysflow.reader import FlattenedSFReader, SFReader
from sysflow.formatter import SFFormatter
import json
import os
import pprint
import pickle
import gzip
import pandas as pd
import numpy as np
import datetime
import tabulate
import textwrap
import plotly.graph_objects as go
import plotly as pl
import plotly.io as pio
pio.renderers.default = 'iframe'
pd.set_option('display.max_rows', 50)
data_dir = 'data/'
log_file = data_dir + 'experiment.log'
log_content_selection = ['----- ', 'waiting']
log_selected_lines = []
with open(log_file, 'r') as inp:
    for line in inp:
        if any([p in line for p in log_content_selection]):
            log_selected_lines.append(line.rstrip())
log_events = []
for line in log_selected_lines:
    time_str = ' '.join(line.split()[0:2])
    tdt = datetime.datetime.strptime(time_str, '%Y-%m-%d %H:%M:%S,%f') #.replace(tzinfo=localtz)
    rest = ' '.join(line.split()[4:])
    event = rest.replace('----- ', '').replace('... ', '').replace(' seconds', 's')
    log_events.append([tdt, event])
print(tabulate.tabulate(log_events))
--------------------------  -------------------------
2022-03-17 18:48:40.736000  starting experiment
2022-03-17 18:48:40.736000  create project robot-shop
2022-03-17 18:48:40.991000  waiting for 60s
2022-03-17 18:49:41.047000  install robot shop
2022-03-17 18:49:42.531000  waiting for 900s
2022-03-17 19:04:47.423000  kill container mongodb
2022-03-17 19:04:53.940000  waiting for 300s
2022-03-17 19:09:53.947000  kill container web
2022-03-17 19:10:02.620000  waiting for 300s
2022-03-17 19:15:02.660000  kill container user
2022-03-17 19:15:36.016000  waiting for 300s
2022-03-17 19:20:36.117000  delete robot shop
2022-03-17 19:20:37.534000  waiting for 300s
2022-03-17 19:25:37.627000  delete project robot-shop
2022-03-17 19:25:50.305000  waiting for 300s
2022-03-17 19:30:50.346000  experiment ends
--------------------------  -------------------------
The collected SysFlow data is combined into the accompanying experiment.sf (as SysFlow trace files are essentially AVRO files, AVRO tools like avro-tools concat can be used to combine multiple SysFlow traces into one file).
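Assuming the avro-tools jar is on the path, the concatenation could look like this (the input file names are illustrative, not the actual trace names):

```shell
# Combine multiple SysFlow (AVRO container) traces into one file;
# avro-tools concat takes the output file as the last argument.
avro-tools concat trace1.sf trace2.sf experiment.sf
```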
This file is read into a Pandas DataFrame for our further evaluation.
sf_file = data_dir + 'experiment.sf'
# reading of the SysFlow trace file and conversion to a Pandas DataFrame (with caching onto disk)
df_file = data_dir + 'experiment_df.pkl.gz'
if os.path.exists(df_file):
    with gzip.open(df_file, 'rb') as inp:
        df = pickle.load(inp)
else:
    reader = FlattenedSFReader(sf_file, False)
    formatter = SFFormatter(reader)
    df = formatter.toDataframe()
    # applying some functions to allow for hashing of the more complex data types
    df['pod.internalip'] = df['pod.internalip'].apply(tuple)
    df['pod.hostip'] = df['pod.hostip'].apply(tuple)
    df['pod.services_str'] = df['pod.services'].apply(str)
    with gzip.open(df_file, 'wb') as out:
        pickle.dump(df, out)
print(f'The captured data contains {df.shape[0]} SysFlow records, describing the activity of {len(df["container.id"].unique())} containers.')
The captured data contains 169324 SysFlow records, describing the activity of 41 containers.
The new KE record type

Let us first look at the new KE record type. For this, we subselect the entries of interest into a new DataFrame df_ke.
# k8s.msg fields still have a spurious line-ending in some cases
def fix_k8s_msg(msg):
    if msg.endswith('\n\u0000'):
        msg = msg[:-2]
    return msg
# select the KE records, drop all irrelevant columns (empty string or NaN)
df_ke = df[df.type == 'KE'].replace('', np.nan).dropna(axis=1, how='all').reset_index()
# fix k8s.msg
df_ke['k8s.msg'] = df_ke['k8s.msg'].apply(fix_k8s_msg)
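A quick sanity check of the cleanup helper (re-defined here so the snippet stands alone):

```python
def fix_k8s_msg(msg):
    # strip the trailing newline + NUL byte some k8s.msg payloads carry
    if msg.endswith('\n\u0000'):
        msg = msg[:-2]
    return msg

assert fix_k8s_msg('{"kind": "Pod"}\n\u0000') == '{"kind": "Pod"}'
assert fix_k8s_msg('{"kind": "Pod"}') == '{"kind": "Pod"}'  # clean input untouched
print("ok")
```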
The relevant information gathered from the events is shown in the fields:

- k8s.kind: the kind of the K8s infrastructure that this event is concerned with (like "K8S_NODES", "K8S_NAMESPACES", "K8S_PODS", "K8S_REPLICATIONCONTROLLERS", "K8S_SERVICES", "K8S_EVENTS", "K8S_REPLICASETS", "K8S_DAEMONSETS", "K8S_DEPLOYMENTS", "K8S_UNKNOWN")
- k8s.action: the action type (like "K8S_COMPONENT_ADDED", "K8S_COMPONENT_MODIFIED", "K8S_COMPONENT_DELETED", "K8S_COMPONENT_ERROR", "K8S_COMPONENT_NONEXISTENT", "K8S_COMPONENT_UNKNOWN")
- k8s.msg: the JSON string of the K8s event

df_ke
index | version | type | ts | ts_uts | pod.hostip | pod.internalip | node.id | node.ip | filename | schema | tags | k8s.action | k8s.kind | k8s.msg | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0 | 4 | KE | 2022-03-17T18:47:15.845631 | 1647542835845631000 | () | () | minikube | 192.168.59.100 | /mnt/data/1647542836 | 4 | () | K8S_COMPONENT_ADDED | K8S_NODES | {"apiVersion":"v1","items":[{"addresses":["192... |
1 | 1 | 4 | KE | 2022-03-17T18:47:15.845631 | 1647542835845631000 | () | () | minikube | 192.168.59.100 | /mnt/data/1647542836 | 4 | () | K8S_COMPONENT_ADDED | K8S_NAMESPACES | {"apiVersion":"v1","items":[{"labels":{"kubern... |
2 | 2 | 4 | KE | 2022-03-17T18:47:15.845631 | 1647542835845631000 | () | () | minikube | 192.168.59.100 | /mnt/data/1647542836 | 4 | () | K8S_COMPONENT_ADDED | K8S_PODS | {"apiVersion":"v1","items":[{"containerStatuse... |
3 | 3 | 4 | KE | 2022-03-17T18:47:15.845631 | 1647542835845631000 | () | () | minikube | 192.168.59.100 | /mnt/data/1647542836 | 4 | () | K8S_COMPONENT_ADDED | K8S_REPLICATIONCONTROLLERS | {"apiVersion":"v1","items":[],"kind":"Replicat... |
4 | 4 | 4 | KE | 2022-03-17T18:47:15.845631 | 1647542835845631000 | () | () | minikube | 192.168.59.100 | /mnt/data/1647542836 | 4 | () | K8S_COMPONENT_ADDED | K8S_SERVICES | {"apiVersion":"v1","items":[{"clusterIP":"10.9... |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
168 | 157707 | 4 | KE | 2022-03-17T19:25:50.692444 | 1647545150692443689 | () | () | minikube | 192.168.59.100 | /mnt/data/1647545115 | 4 | () | K8S_COMPONENT_MODIFIED | K8S_NAMESPACES | {"apiVersion":"v1","items":[{"labels":{"kubern... |
169 | 157708 | 4 | KE | 2022-03-17T19:25:50.692444 | 1647545150692443689 | () | () | minikube | 192.168.59.100 | /mnt/data/1647545115 | 4 | () | K8S_COMPONENT_DELETED | K8S_NAMESPACES | {"apiVersion":"v1","items":[{"labels":{"kubern... |
170 | 159405 | 4 | KE | 2022-03-17T19:26:56.746972 | 1647545216746972249 | () | () | minikube | 192.168.59.100 | /mnt/data/1647545175 | 4 | () | K8S_COMPONENT_MODIFIED | K8S_NODES | {"apiVersion":"v1","items":[{"addresses":["192... |
171 | 164836 | 4 | KE | 2022-03-17T19:30:57.880767 | 1647545457880767012 | () | () | minikube | 192.168.59.100 | /mnt/data/1647545415 | 4 | () | K8S_COMPONENT_ADDED | K8S_NODES | {"apiVersion":"v1","items":[{"addresses":["192... |
172 | 166372 | 4 | KE | 2022-03-17T19:32:04.931546 | 1647545524931546223 | () | () | minikube | 192.168.59.100 | /mnt/data/1647545475 | 4 | () | K8S_COMPONENT_MODIFIED | K8S_NODES | {"apiVersion":"v1","items":[{"addresses":["192... |
173 rows × 15 columns
As is to be expected given the experiment, most of the activity centers on changes to the Pods:
df_ke.value_counts(['k8s.kind', 'k8s.action'], sort=False)
k8s.kind                    k8s.action
K8S_NAMESPACES              K8S_COMPONENT_ADDED       13
                            K8S_COMPONENT_DELETED      1
                            K8S_COMPONENT_MODIFIED     3
K8S_NODES                   K8S_COMPONENT_ADDED        3
                            K8S_COMPONENT_MODIFIED     9
K8S_PODS                    K8S_COMPONENT_ADDED       24
                            K8S_COMPONENT_DELETED     15
                            K8S_COMPONENT_MODIFIED    77
K8S_REPLICATIONCONTROLLERS  K8S_COMPONENT_ADDED        1
K8S_SERVICES                K8S_COMPONENT_ADDED       15
                            K8S_COMPONENT_DELETED     12
dtype: int64
Expanding the k8s.msg data

A deeper understanding of what the KE records tell us about the cluster activity can be gained by expanding the k8s.msg field of the records.
The JSON-formatted k8s.msg contains a list of items to which the event relates. Usually this is only one item, but in some cases the event relates to multiple items, e.g., when multiple items of the same type are added or deleted.
For this reason, we create a new DataFrame in which events are duplicated for each item if there are multiple. Special consideration is given to extracting IP-related data out of k8s.msg where available. The resulting information is stored in the new DataFrame df_ke_ext.
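As a minimal preview of what the expansion cell below does, here is the item-extraction logic applied to a hand-written k8s.msg payload (the JSON is heavily simplified; real payloads carry many more fields):

```python
import json

# Simplified, hand-written k8s.msg payload for illustration.
msg = json.loads(
    '{"kind": "Pod", "type": "ADDED",'
    ' "items": [{"name": "example-pod", "namespace": "robot-shop",'
    '            "podIP": "172.17.0.99"}]}'
)

# One output row per item, annotated with the IP found (if any).
rows = []
for item in msg["items"]:
    row = {"kind": msg["kind"], "typ": msg["type"], "name": item.get("name")}
    if item.get("podIP"):
        row["ip"], row["iptype"] = item["podIP"], "podIP"
    rows.append(row)

print(rows[0]["iptype"], rows[0]["ip"])  # → podIP 172.17.0.99
```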
table = []
itemcols = ['name', 'namespace', 'podIP', 'hostIP', 'clusterIP']
for ie, e in df_ke.iterrows():
    msg = json.loads(e['k8s.msg'])
    for item in msg['items']:
        d = e.to_dict()
        d['msg_hash'] = hash(str(msg))
        d['kind'] = msg['kind']
        d['typ'] = msg['type']
        d['name'] = item.get('name')
        d['namespace'] = item.get('namespace')
        d['ts_item'] = item.get('timestamp')
        if item.get('podIP'):
            d['ip'] = item.get('podIP')
            d['iptype'] = 'podIP'
            table.append(d)
        elif item.get('hostIP'):
            d['ip'] = item.get('hostIP')
            d['iptype'] = 'hostIP'
            table.append(d)
        elif item.get('clusterIP') and not item.get('clusterIP') == 'None':
            if msg['kind'] != 'Service':
                print(f'>>>> WARNING: clusterIP but not a service - investigate! msg: {msg}')
                continue
            d['ip'] = item.get('clusterIP')
            d['iptype'] = 'clusterIP'
            for port in item.get('ports'):
                # append a copy per port: appending the shared dict would leave
                # every row of this service pointing at the last port's data
                dp = d.copy()
                port = dict(port)
                port['portname'] = port.pop('name')  # avoid overwriting the item name
                port['proto'] = port.pop('protocol')  # fix naming of proto
                dp.update(port)
                table.append(dp)
        else:
            table.append(d)
df_ke_ext = pd.DataFrame(table).reset_index(drop=True).sort_values(['ts','kind'])
df_ke_ext
index | version | type | ts | ts_uts | pod.hostip | pod.internalip | node.id | node.ip | filename | ... | name | namespace | ts_item | ip | iptype | port | targetPort | portname | proto | nodePort | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 1 | 4 | KE | 2022-03-17T18:47:15.845631 | 1647542835845631000 | () | () | minikube | 192.168.59.100 | /mnt/data/1647542836 | ... | default | None | 2022-03-17T15:19:35Z | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2 | 1 | 4 | KE | 2022-03-17T18:47:15.845631 | 1647542835845631000 | () | () | minikube | 192.168.59.100 | /mnt/data/1647542836 | ... | kube-node-lease | None | 2022-03-17T15:19:33Z | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
3 | 1 | 4 | KE | 2022-03-17T18:47:15.845631 | 1647542835845631000 | () | () | minikube | 192.168.59.100 | /mnt/data/1647542836 | ... | kube-public | None | 2022-03-17T15:19:33Z | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
4 | 1 | 4 | KE | 2022-03-17T18:47:15.845631 | 1647542835845631000 | () | () | minikube | 192.168.59.100 | /mnt/data/1647542836 | ... | kube-system | None | 2022-03-17T15:19:33Z | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
5 | 1 | 4 | KE | 2022-03-17T18:47:15.845631 | 1647542835845631000 | () | () | minikube | 192.168.59.100 | /mnt/data/1647542836 | ... | sysflow | None | 2022-03-17T15:32:12Z | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
187 | 157707 | 4 | KE | 2022-03-17T19:25:50.692444 | 1647545150692443689 | () | () | minikube | 192.168.59.100 | /mnt/data/1647545115 | ... | robot-shop | None | 2022-03-17T18:48:40Z | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
188 | 157708 | 4 | KE | 2022-03-17T19:25:50.692444 | 1647545150692443689 | () | () | minikube | 192.168.59.100 | /mnt/data/1647545115 | ... | robot-shop | None | 2022-03-17T18:48:40Z | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
189 | 159405 | 4 | KE | 2022-03-17T19:26:56.746972 | 1647545216746972249 | () | () | minikube | 192.168.59.100 | /mnt/data/1647545175 | ... | minikube | None | 2022-03-17T15:19:33Z | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
190 | 164836 | 4 | KE | 2022-03-17T19:30:57.880767 | 1647545457880767012 | () | () | minikube | 192.168.59.100 | /mnt/data/1647545415 | ... | minikube | None | 2022-03-17T15:19:33Z | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
191 | 166372 | 4 | KE | 2022-03-17T19:32:04.931546 | 1647545524931546223 | () | () | minikube | 192.168.59.100 | /mnt/data/1647545475 | ... | minikube | None | 2022-03-17T15:19:33Z | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
192 rows × 28 columns
Finally, we focus on the subset of the data concerned with the robot-shop application. The final DataFrame df_ke_ext_sel restricts the KE event information to events related to robot-shop, concentrating on addition/deletion events.
df_ke_ext_sel = df_ke_ext[((df_ke_ext.name == 'robot-shop') | (df_ke_ext.namespace =='robot-shop')) & ((df_ke_ext.typ == 'ADDED') | (df_ke_ext.typ == 'DELETED'))]
df_ke_ext_sel.value_counts(['kind', 'typ'])
kind       typ
Pod        ADDED      15
           DELETED    15
Service    ADDED      14
           DELETED    14
Namespace  ADDED       2
           DELETED     1
dtype: int64
Next, we compare the timeline of the experiment with the Kubernetes events seen in the SysFlow data.
fig = go.Figure()
pp = pprint.PrettyPrinter(indent=4, width=80, compact=True)
log_events_cleaned = list(filter(lambda x: 'waiting' not in x[1] and 'starting' not in x[1], log_events))
# fig.add_trace(go.Scatter(x=[e[0] for e in log_events_cleaned], y=['LOGEVENT' for e in log_events_cleaned], text=[e[1] for e in log_events_cleaned], mode='markers', marker_size=10, marker_symbol='diamond-open'))
for e in log_events_cleaned:
    fig.add_annotation(x=e[0], xref='x', yref='paper', y=1., text=e[1], xanchor='left', showarrow=True, textangle=-35, arrowwidth=2)
    fig.add_shape(dict(type="line", x0=e[0], y0=0, x1=e[0], y1=1, xref='x', yref='paper', line=dict(color="RoyalBlue", width=2)))
texts = ['<span style="font-size:x-small">' +
df_ke_ext_sel.iloc[i]['k8s.kind'] +'<br>' +
df_ke_ext_sel.iloc[i]['k8s.action'] +'<br>' +
pp.pformat(json.loads(df_ke_ext_sel.iloc[i]['k8s.msg'])).replace('\n', ' <br> ') +
'</span>'
for i in range(df_ke_ext_sel.shape[0])]
colors = ['green' if df_ke_ext_sel.iloc[i]['k8s.action'] == 'K8S_COMPONENT_ADDED' else 'red'
for i in range(df_ke_ext_sel.shape[0])]
symbols = ['triangle-up' if df_ke_ext_sel.iloc[i]['k8s.action'] == 'K8S_COMPONENT_ADDED' else 'triangle-down'
for i in range(df_ke_ext_sel.shape[0])]
fig.add_trace(go.Scatter(x=df_ke_ext_sel.ts,
y=[df_ke_ext_sel.iloc[i]['k8s.kind'] for i in range(df_ke_ext_sel.shape[0])],
text=texts,
mode='markers', marker_size=20, marker_color=colors, marker_symbol=symbols, marker_line_color='black', marker_line_width=1))
fig.update_layout(height=900, margin=dict(t=200, pad=4), showlegend=False)
fig.show()
Green triangles represent the creation of a component (K8S_COMPONENT_ADDED), whereas red triangles represent the deletion of a component (K8S_COMPONENT_DELETED). We clearly find KE events for all the changes related to the robot-shop application during the experiment: creation/deletion of the namespace robot-shop, of its services, and of its pods at the beginning and end of the experiment, but also when we forcibly killed some of the containers of the robot-shop application.
IP data in the KE records

Let's now take a quick look at the IP data gathered from the KE records.
df_ke_ext_sel.dropna(subset=['iptype'])[['kind', 'name', 'namespace', 'iptype', 'ip', 'proto', 'port', 'targetPort', 'portname']].sort_values('ip').drop_duplicates()
kind | name | namespace | iptype | ip | proto | port | targetPort | portname | |
---|---|---|---|---|---|---|---|---|---|
63 | Service | shipping | robot-shop | clusterIP | 10.103.83.70 | TCP | 8080.0 | 8080.0 | http |
64 | Service | ratings | robot-shop | clusterIP | 10.104.84.135 | TCP | 80.0 | 80.0 | http |
58 | Service | user | robot-shop | clusterIP | 10.105.106.134 | TCP | 8080.0 | 8080.0 | http |
62 | Service | mysql | robot-shop | clusterIP | 10.107.7.181 | TCP | 3306.0 | 3306.0 | mysql |
147 | Service | payment | robot-shop | clusterIP | 10.108.220.250 | TCP | 8080.0 | 8080.0 | http |
69 | Service | mongodb | robot-shop | clusterIP | 10.109.105.252 | TCP | 27017.0 | 27017.0 | mongo |
146 | Service | cart | robot-shop | clusterIP | 10.109.213.103 | TCP | 8080.0 | 8080.0 | http |
141 | Service | rabbitmq | robot-shop | clusterIP | 10.109.218.161 | TCP | 4369.0 | 4369.0 | tcp-epmd |
60 | Service | redis | robot-shop | clusterIP | 10.111.214.104 | TCP | 6379.0 | 6379.0 | redis |
137 | Service | catalogue | robot-shop | clusterIP | 10.96.58.129 | TCP | 8080.0 | 8080.0 | http |
136 | Service | web | robot-shop | clusterIP | 10.99.3.147 | TCP | 8080.0 | 8080.0 | http |
165 | Pod | mysql-6d778f4c8f-4bcr7 | robot-shop | podIP | 172.17.0.10 | NaN | NaN | NaN | NaN |
162 | Pod | shipping-7f6dfbf46f-94trr | robot-shop | podIP | 172.17.0.11 | NaN | NaN | NaN | NaN |
156 | Pod | ratings-7ccf67b49f-6qckr | robot-shop | podIP | 172.17.0.12 | NaN | NaN | NaN | NaN |
174 | Pod | dispatch-69b65d89b9-4lgl7 | robot-shop | podIP | 172.17.0.13 | NaN | NaN | NaN | NaN |
171 | Pod | payment-5465d9cc79-8ln4b | robot-shop | podIP | 172.17.0.14 | NaN | NaN | NaN | NaN |
150 | Pod | redis-0 | robot-shop | podIP | 172.17.0.15 | NaN | NaN | NaN | NaN |
168 | Pod | catalogue-998b69bc9-bfnr7 | robot-shop | podIP | 172.17.0.4 | NaN | NaN | NaN | NaN |
180 | Pod | rabbitmq-785b678f74-mhhtg | robot-shop | podIP | 172.17.0.5 | NaN | NaN | NaN | NaN |
114 | Pod | user-899b6c7ff-c7wnj | robot-shop | podIP | 172.17.0.6 | NaN | NaN | NaN | NaN |
183 | Pod | cart-7d7745696b-qgb99 | robot-shop | podIP | 172.17.0.8 | NaN | NaN | NaN | NaN |
153 | Pod | web-77486f858f-jnf9r | robot-shop | podIP | 172.17.0.9 | NaN | NaN | NaN | NaN |
97 | Pod | mongodb-67c5456f4-d4bgv | robot-shop | podIP | 172.17.0.9 | NaN | NaN | NaN | NaN |
105 | Pod | web-77486f858f-gpfrn | robot-shop | hostIP | 192.168.59.100 | NaN | NaN | NaN | NaN |
159 | Pod | mongodb-67c5456f4-ddhnf | robot-shop | hostIP | 192.168.59.100 | NaN | NaN | NaN | NaN |
177 | Pod | user-899b6c7ff-qxd47 | robot-shop | hostIP | 192.168.59.100 | NaN | NaN | NaN | NaN |
In the data, we can recognize quite a bit of IP address information:

- clusterIP for the services, together with the associated port information
- podIP for the pods
- hostIP (that corresponds to the node.ip data in the records)

Let us keep track of the pod IPs gleaned from this data for later use in the comparison with the observed network traffic.
podips = {}
for irow, row in df_ke_ext_sel.dropna(subset=['iptype'])[['kind', 'name', 'namespace', 'iptype', 'ip', 'proto', 'port', 'targetPort', 'portname']].sort_values('ip').drop_duplicates().iterrows():
    if not row['iptype'] == 'podIP':
        continue
    podips.setdefault(row['ip'], set()).add(row['name'])
podips
{'172.17.0.10': {'mysql-6d778f4c8f-4bcr7'},
 '172.17.0.11': {'shipping-7f6dfbf46f-94trr'},
 '172.17.0.12': {'ratings-7ccf67b49f-6qckr'},
 '172.17.0.13': {'dispatch-69b65d89b9-4lgl7'},
 '172.17.0.14': {'payment-5465d9cc79-8ln4b'},
 '172.17.0.15': {'redis-0'},
 '172.17.0.4': {'catalogue-998b69bc9-bfnr7'},
 '172.17.0.5': {'rabbitmq-785b678f74-mhhtg'},
 '172.17.0.6': {'user-899b6c7ff-c7wnj'},
 '172.17.0.8': {'cart-7d7745696b-qgb99'},
 '172.17.0.9': {'mongodb-67c5456f4-d4bgv', 'web-77486f858f-jnf9r'}}
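Note that one IP shows up for two different pods. A quick way to spot such reuse (using a hand-copied subset of the mapping above):

```python
# Subset of the pod-IP mapping shown above, copied by hand for illustration.
podips = {
    "172.17.0.10": {"mysql-6d778f4c8f-4bcr7"},
    "172.17.0.9": {"mongodb-67c5456f4-d4bgv", "web-77486f858f-jnf9r"},
}

# IPs that were assigned to more than one pod over the experiment's lifetime.
reused = {ip: pods for ip, pods in podips.items() if len(pods) > 1}
print(sorted(reused))  # → ['172.17.0.9']
```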
The new pod.* fields

SysFlow records identify the containers they belong to. In the context of a Kubernetes/OpenShift cluster, each container belongs to a pod, which in turn is part of a namespace. Every SysFlow record now contains this metadata, which helps put the low-level, container-related information into the context of the cluster.
Let us first look into the relationship between containers and pods. To make this easier, let's focus again on the containers related to the robot-shop application.
df[df['container.name'].str.contains('robot-shop')].value_counts(['container.name', 'container.id', 'container.image', 'pod.name'], sort=False).reset_index()
container.name | container.id | container.image | pod.name | 0 | |
---|---|---|---|---|---|
0 | k8s_POD_cart-7d7745696b-qgb99_robot-shop_06111... | 9f2d50473bb4 | k8s.gcr.io/pause:3.6:3.6 | 2 | |
1 | k8s_POD_catalogue-998b69bc9-bfnr7_robot-shop_4... | eb32dd737f52 | k8s.gcr.io/pause:3.6:3.6 | 2 | |
2 | k8s_POD_dispatch-69b65d89b9-4lgl7_robot-shop_b... | 63e3c777b5c7 | k8s.gcr.io/pause:3.6:3.6 | 2 | |
3 | k8s_POD_mongodb-67c5456f4-d4bgv_robot-shop_646... | 9d90554ce0a6 | k8s.gcr.io/pause:3.6:3.6 | 2 | |
4 | k8s_POD_mongodb-67c5456f4-ddhnf_robot-shop_423... | a1397d34b86e | k8s.gcr.io/pause:3.6:3.6 | 2 | |
5 | k8s_POD_mysql-6d778f4c8f-4bcr7_robot-shop_93bb... | cabf4edbc827 | k8s.gcr.io/pause:3.6:3.6 | 2 | |
6 | k8s_POD_payment-5465d9cc79-8ln4b_robot-shop_39... | f4631d398156 | k8s.gcr.io/pause:3.6:3.6 | 2 | |
7 | k8s_POD_rabbitmq-785b678f74-mhhtg_robot-shop_b... | ba13946ea53f | k8s.gcr.io/pause:3.6:3.6 | 2 | |
8 | k8s_POD_ratings-7ccf67b49f-6qckr_robot-shop_ee... | 4d4b1cf9894d | k8s.gcr.io/pause:3.6:3.6 | 2 | |
9 | k8s_POD_redis-0_robot-shop_29c707fd-318a-4af4-... | b3f1af98e495 | k8s.gcr.io/pause:3.6:3.6 | 2 | |
10 | k8s_POD_shipping-7f6dfbf46f-94trr_robot-shop_5... | c661019b5785 | k8s.gcr.io/pause:3.6:3.6 | 2 | |
11 | k8s_POD_user-899b6c7ff-c7wnj_robot-shop_6fad1f... | 7eca18b706ef | k8s.gcr.io/pause:3.6:3.6 | 2 | |
12 | k8s_POD_user-899b6c7ff-qxd47_robot-shop_b9b155... | b59379f4b206 | k8s.gcr.io/pause:3.6:3.6 | 2 | |
13 | k8s_POD_web-77486f858f-gpfrn_robot-shop_cb8700... | 43861efc9ee3 | k8s.gcr.io/pause:3.6:3.6 | 2 | |
14 | k8s_POD_web-77486f858f-jnf9r_robot-shop_9f6687... | 96f8362d2e24 | k8s.gcr.io/pause:3.6:3.6 | 2 | |
15 | k8s_cart_cart-7d7745696b-qgb99_robot-shop_0611... | 668cdd1c7d0c | sha256:791bdeb9c40842f51f9037086ac06b6e630d6b0... | 2658 | |
16 | k8s_catalogue_catalogue-998b69bc9-bfnr7_robot-... | e9b881dd05e5 | sha256:7cdadb2368155c32906c789493f7481975a1ccc... | 886 | |
17 | k8s_catalogue_catalogue-998b69bc9-bfnr7_robot-... | e9b881dd05e5 | sha256:7cdadb2368155c32906c789493f7481975a1ccc... | catalogue-998b69bc9-bfnr7 | 1962 |
18 | k8s_dispatch_dispatch-69b65d89b9-4lgl7_robot-s... | e04525cc3464 | sha256:7e63803d2a06df53549d0d089b88b56d242920f... | 59 | |
19 | k8s_dispatch_dispatch-69b65d89b9-4lgl7_robot-s... | e04525cc3464 | sha256:7e63803d2a06df53549d0d089b88b56d242920f... | dispatch-69b65d89b9-4lgl7 | 5716 |
20 | k8s_mongodb_mongodb-67c5456f4-d4bgv_robot-shop... | cd4ea3f3e8ae | sha256:621ddd7848a2327f471de8541d8b020d65a58a1... | 1045 | |
21 | k8s_mongodb_mongodb-67c5456f4-d4bgv_robot-shop... | cd4ea3f3e8ae | sha256:621ddd7848a2327f471de8541d8b020d65a58a1... | mongodb-67c5456f4-d4bgv | 9311 |
22 | k8s_mongodb_mongodb-67c5456f4-ddhnf_robot-shop... | 4269f06be75a | sha256:621ddd7848a2327f471de8541d8b020d65a58a1... | 1315 | |
23 | k8s_mongodb_mongodb-67c5456f4-ddhnf_robot-shop... | 4269f06be75a | sha256:621ddd7848a2327f471de8541d8b020d65a58a1... | mongodb-67c5456f4-ddhnf | 9721 |
24 | k8s_mysql_mysql-6d778f4c8f-4bcr7_robot-shop_93... | 38af625d9ec8 | sha256:1a9332d91b6161f822552e2c9b3411b9ca75fad... | 877 | |
25 | k8s_mysql_mysql-6d778f4c8f-4bcr7_robot-shop_93... | 38af625d9ec8 | sha256:1a9332d91b6161f822552e2c9b3411b9ca75fad... | mysql-6d778f4c8f-4bcr7 | 5429 |
26 | k8s_payment_payment-5465d9cc79-8ln4b_robot-sho... | eed324f483a1 | sha256:d7df366a05b47922b99cb6cfe58384f28edf79c... | 659 | |
27 | k8s_payment_payment-5465d9cc79-8ln4b_robot-sho... | eed324f483a1 | sha256:d7df366a05b47922b99cb6cfe58384f28edf79c... | payment-5465d9cc79-8ln4b | 363 |
28 | k8s_rabbitmq_rabbitmq-785b678f74-mhhtg_robot-s... | 27a319d7a44d | sha256:f6219c413094ec0fbaa87f134dbfb3bc4a28b05... | 1184 | |
29 | k8s_rabbitmq_rabbitmq-785b678f74-mhhtg_robot-s... | 27a319d7a44d | sha256:f6219c413094ec0fbaa87f134dbfb3bc4a28b05... | rabbitmq-785b678f74-mhhtg | 14885 |
30 | k8s_ratings_ratings-7ccf67b49f-6qckr_robot-sho... | 2bcca59fd11c | sha256:a1676b19d3f9142ab46185efe631d1dd9918bc7... | 1114 | |
31 | k8s_ratings_ratings-7ccf67b49f-6qckr_robot-sho... | 2bcca59fd11c | sha256:a1676b19d3f9142ab46185efe631d1dd9918bc7... | ratings-7ccf67b49f-6qckr | 5777 |
32 | k8s_redis_redis-0_robot-shop_29c707fd-318a-4af... | 9e71b7187518 | sha256:1e70071f4af45af2cc9e1d1300c675c1ce37ee2... | 123 | |
33 | k8s_redis_redis-0_robot-shop_29c707fd-318a-4af... | 9e71b7187518 | sha256:1e70071f4af45af2cc9e1d1300c675c1ce37ee2... | redis-0 | 18127 |
34 | k8s_shipping_shipping-7f6dfbf46f-94trr_robot-s... | 34bdedd4f86d | sha256:3bb300680fb6c3257ba84e8a65715ad621dab14... | 78 | |
35 | k8s_shipping_shipping-7f6dfbf46f-94trr_robot-s... | 34bdedd4f86d | sha256:3bb300680fb6c3257ba84e8a65715ad621dab14... | shipping-7f6dfbf46f-94trr | 1894 |
36 | k8s_user_user-899b6c7ff-c7wnj_robot-shop_6fad1... | 942c80fd11ea | sha256:8c1e369ddb3a21c7d2b66ab46d66d27ce3a6200... | 874 | |
37 | k8s_user_user-899b6c7ff-c7wnj_robot-shop_6fad1... | 942c80fd11ea | sha256:8c1e369ddb3a21c7d2b66ab46d66d27ce3a6200... | user-899b6c7ff-c7wnj | 1541 |
38 | k8s_user_user-899b6c7ff-qxd47_robot-shop_b9b15... | 35ef3651eb95 | sha256:8c1e369ddb3a21c7d2b66ab46d66d27ce3a6200... | 509 | |
39 | k8s_user_user-899b6c7ff-qxd47_robot-shop_b9b15... | 35ef3651eb95 | sha256:8c1e369ddb3a21c7d2b66ab46d66d27ce3a6200... | user-899b6c7ff-qxd47 | 917 |
40 | k8s_web_web-77486f858f-gpfrn_robot-shop_cb8700... | 69ca25df3460 | sha256:e1d19c905d3ba8267a5a41cc19230a26f4bf967... | 992 | |
41 | k8s_web_web-77486f858f-jnf9r_robot-shop_9f6687... | 6765f016049a | sha256:e1d19c905d3ba8267a5a41cc19230a26f4bf967... | 660 | |
42 | k8s_web_web-77486f858f-jnf9r_robot-shop_9f6687... | 6765f016049a | sha256:e1d19c905d3ba8267a5a41cc19230a26f4bf967... | web-77486f858f-jnf9r | 39 |
To understand this better, let's take a look at one specific container of the robot-shop setup, picking the data for the mongodb container, as this is also one of the containers that gets killed as part of the experiment and subsequently gets restarted by Kubernetes.
df[df['container.name'].str.contains('mongodb')].sort_values('ts_uts').groupby(['container.name', 'container.id', 'pod.name']).agg({'container.image': 'first', 'ts': ['min', 'max']}).reset_index().sort_values(by=[('ts', 'min')])
container.name | container.id | pod.name | container.image | ts | ||
---|---|---|---|---|---|---|
first | min | max | ||||
2 | k8s_mongodb_mongodb-67c5456f4-d4bgv_robot-shop... | cd4ea3f3e8ae | sha256:621ddd7848a2327f471de8541d8b020d65a58a1... | 2022-03-17T18:50:01.173124 | 2022-03-17T18:50:14.874146 | |
3 | k8s_mongodb_mongodb-67c5456f4-d4bgv_robot-shop... | cd4ea3f3e8ae | mongodb-67c5456f4-d4bgv | sha256:621ddd7848a2327f471de8541d8b020d65a58a1... | 2022-03-17T18:50:01.276773 | 2022-03-17T19:04:48.912214 |
0 | k8s_POD_mongodb-67c5456f4-d4bgv_robot-shop_646... | 9d90554ce0a6 | k8s.gcr.io/pause:3.6:3.6 | 2022-03-17T19:04:49.484837 | 2022-03-17T19:04:49.485062 | |
4 | k8s_mongodb_mongodb-67c5456f4-ddhnf_robot-shop... | 4269f06be75a | sha256:621ddd7848a2327f471de8541d8b020d65a58a1... | 2022-03-17T19:04:55.566971 | 2022-03-17T19:05:13.797194 | |
5 | k8s_mongodb_mongodb-67c5456f4-ddhnf_robot-shop... | 4269f06be75a | mongodb-67c5456f4-ddhnf | sha256:621ddd7848a2327f471de8541d8b020d65a58a1... | 2022-03-17T19:04:55.568221 | 2022-03-17T19:20:39.186520 |
1 | k8s_POD_mongodb-67c5456f4-ddhnf_robot-shop_423... | a1397d34b86e | k8s.gcr.io/pause:3.6:3.6 | 2022-03-17T19:20:40.328251 | 2022-03-17T19:20:40.328324 |
From this listing we observe that:

- there are 4 container.ids involved, 2 each sharing the same container.image - corresponding to our killing of the first mongodb container and its restart by Kubernetes
- the records carrying the pod information (pod.name here) appear only slightly after the creation of the container
- in addition, there are the sandbox containers k8s_POD_mongodb-... using the pause image

As long as we have all information, including the pod data, the relationship between container and pod is unique:
df[df['pod.name'].astype(bool)][['container.name', 'pod.name']].drop_duplicates().sort_values('container.name').reset_index(drop=True)
container.name | pod.name | |
---|---|---|
0 | k8s_catalogue_catalogue-998b69bc9-bfnr7_robot-... | catalogue-998b69bc9-bfnr7 |
1 | k8s_coredns_coredns-64897985d-n4jjl_kube-syste... | coredns-64897985d-n4jjl |
2 | k8s_dispatch_dispatch-69b65d89b9-4lgl7_robot-s... | dispatch-69b65d89b9-4lgl7 |
3 | k8s_etcd_etcd-minikube_kube-system_fc45a20ce68... | etcd-minikube |
4 | k8s_kube-apiserver_kube-apiserver-minikube_kub... | kube-apiserver-minikube |
5 | k8s_kube-controller-manager_kube-controller-ma... | kube-controller-manager-minikube |
6 | k8s_kube-proxy_kube-proxy-9g9kt_kube-system_e4... | kube-proxy-9g9kt |
7 | k8s_kube-scheduler_kube-scheduler-minikube_kub... | kube-scheduler-minikube |
8 | k8s_mongodb_mongodb-67c5456f4-d4bgv_robot-shop... | mongodb-67c5456f4-d4bgv |
9 | k8s_mongodb_mongodb-67c5456f4-ddhnf_robot-shop... | mongodb-67c5456f4-ddhnf |
10 | k8s_mysql_mysql-6d778f4c8f-4bcr7_robot-shop_93... | mysql-6d778f4c8f-4bcr7 |
11 | k8s_payment_payment-5465d9cc79-8ln4b_robot-sho... | payment-5465d9cc79-8ln4b |
12 | k8s_rabbitmq_rabbitmq-785b678f74-mhhtg_robot-s... | rabbitmq-785b678f74-mhhtg |
13 | k8s_ratings_ratings-7ccf67b49f-6qckr_robot-sho... | ratings-7ccf67b49f-6qckr |
14 | k8s_redis_redis-0_robot-shop_29c707fd-318a-4af... | redis-0 |
15 | k8s_sfcollector_sysflowagent-9wslt_sysflow_884... | sysflowagent-9wslt |
16 | k8s_sfexporter_sysflowagent-9wslt_sysflow_8845... | sysflowagent-9wslt |
17 | k8s_shipping_shipping-7f6dfbf46f-94trr_robot-s... | shipping-7f6dfbf46f-94trr |
18 | k8s_sidecar_sysflowagent-9wslt_sysflow_8845ce3... | sysflowagent-9wslt |
19 | k8s_storage-provisioner_storage-provisioner_ku... | storage-provisioner |
20 | k8s_user_user-899b6c7ff-c7wnj_robot-shop_6fad1... | user-899b6c7ff-c7wnj |
21 | k8s_user_user-899b6c7ff-qxd47_robot-shop_b9b15... | user-899b6c7ff-qxd47 |
22 | k8s_web_web-77486f858f-jnf9r_robot-shop_9f6687... | web-77486f858f-jnf9r |
IP information in the pod.* fields

pod.internalip

Let us start by looking specifically at the pod.internalip field, limiting ourselves to SysFlow records for the robot-shop namespace:
df_rs = df[df['pod.ns'] == 'robot-shop']
df_rs.sort_values('ts_uts').groupby(['pod.name', 'pod.internalip']).agg({'ts': ['min', 'max']}).reset_index().sort_values('pod.name')
pod.name | pod.internalip | ts (min) | ts (max) | |
---|---|---|---|---|
0 | catalogue-998b69bc9-bfnr7 | (172.17.0.4,) | 2022-03-17T18:49:59.867636 | 2022-03-17T19:21:08.694894 |
1 | dispatch-69b65d89b9-4lgl7 | (172.17.0.13,) | 2022-03-17T18:50:02.550097 | 2022-03-17T19:21:08.963347 |
2 | mongodb-67c5456f4-d4bgv | (172.17.0.9,) | 2022-03-17T18:50:01.276773 | 2022-03-17T19:04:48.912214 |
3 | mongodb-67c5456f4-ddhnf | (172.17.0.16,) | 2022-03-17T19:04:55.568221 | 2022-03-17T19:20:39.186520 |
4 | mysql-6d778f4c8f-4bcr7 | (172.17.0.10,) | 2022-03-17T18:50:01.452982 | 2022-03-17T19:20:41.989241 |
5 | payment-5465d9cc79-8ln4b | (172.17.0.14,) | 2022-03-17T18:51:37.241520 | 2022-03-17T19:21:09.370607 |
6 | rabbitmq-785b678f74-mhhtg | (172.17.0.5,) | 2022-03-17T18:51:01.009719 | 2022-03-17T19:21:09.582437 |
7 | ratings-7ccf67b49f-6qckr | (172.17.0.12,) | 2022-03-17T18:50:02.401834 | 2022-03-17T19:20:40.260521 |
8 | redis-0 | (172.17.0.15,) | 2022-03-17T18:50:02.520990 | 2022-03-17T19:20:37.733984 |
9 | shipping-7f6dfbf46f-94trr | (172.17.0.11,) | 2022-03-17T18:50:02.571352 | 2022-03-17T19:20:40.710991 |
10 | user-899b6c7ff-c7wnj | (172.17.0.6,) | 2022-03-17T18:50:00.497336 | 2022-03-17T19:15:33.901413 |
11 | user-899b6c7ff-qxd47 | (172.17.0.7,) | 2022-03-17T19:15:12.484746 | 2022-03-17T19:21:08.707541 |
12 | web-77486f858f-jnf9r | (172.17.0.9,) | 2022-03-17T19:20:13.027331 | 2022-03-17T19:20:38.690758 |
We can see here that IP addresses are allocated per pod instance: when a new pod is spawned after the old one is killed (e.g., for the mongodb-*
pods), it receives a new IP address, while old addresses may later be reused by other pods (e.g., 172.17.0.9
).
We use this information to update our list of podips
with the additional data found here.
for irow, row in df_rs.sort_values('ts_uts')[['pod.name', 'pod.internalip']].drop_duplicates().iterrows():
for ip in row['pod.internalip']:
podips.setdefault(ip, set()).add(row['pod.name'])
podips
{'172.17.0.10': {'mysql-6d778f4c8f-4bcr7'}, '172.17.0.11': {'shipping-7f6dfbf46f-94trr'}, '172.17.0.12': {'ratings-7ccf67b49f-6qckr'}, '172.17.0.13': {'dispatch-69b65d89b9-4lgl7'}, '172.17.0.14': {'payment-5465d9cc79-8ln4b'}, '172.17.0.15': {'redis-0'}, '172.17.0.4': {'catalogue-998b69bc9-bfnr7'}, '172.17.0.5': {'rabbitmq-785b678f74-mhhtg'}, '172.17.0.6': {'user-899b6c7ff-c7wnj'}, '172.17.0.8': {'cart-7d7745696b-qgb99'}, '172.17.0.9': {'mongodb-67c5456f4-d4bgv', 'web-77486f858f-jnf9r'}, '172.17.0.16': {'mongodb-67c5456f4-ddhnf'}, '172.17.0.7': {'user-899b6c7ff-qxd47'}}
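The IP reuse is directly visible in this mapping: any IP associated with more than one pod name was reused during the capture window. A quick check on an excerpt of the mapping (assuming the dict-of-sets structure shown above):

```python
# Excerpt of the podips mapping above (dict of IP -> set of pod names).
podips = {
    '172.17.0.9': {'mongodb-67c5456f4-d4bgv', 'web-77486f858f-jnf9r'},
    '172.17.0.16': {'mongodb-67c5456f4-ddhnf'},
    '172.17.0.4': {'catalogue-998b69bc9-bfnr7'},
}

# IPs that were reused, i.e., mapped to more than one pod over time:
reused = {ip: pods for ip, pods in podips.items() if len(pods) > 1}
print(reused)
# -> {'172.17.0.9': {'mongodb-67c5456f4-d4bgv', 'web-77486f858f-jnf9r'}}
```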
The information in the pod.services
attribute shows us the services running in the robot-shop application and gives us details like IP address and port for each service:
table = []
for irow, row in df_rs.drop_duplicates(subset=['pod.name', 'pod.services_str']).iterrows():
for service in row['pod.services']:
# resolve portList x clusterIP
for cip in service['clusterIP']:
for port in service['portList']:
svc = service.copy()
del svc['portList']
svc.update(port)
del svc['clusterIP']
svc['clusterIP'] = cip
svc['pod.name'] = row['pod.name']
# svc.update(row)
table.append(svc)
df_services = pd.DataFrame(table)
df_services
name | id | namespace | port | targetPort | nodePort | proto | clusterIP | pod.name | |
---|---|---|---|---|---|---|---|---|---|
0 | redis | 81bc1a3e-f067-4b37-a077-54061050c1cb | robot-shop | 6379 | 6379 | 0 | TCP | 10.111.214.104 | redis-0 |
1 | mongodb | 09131df1-404c-4a02-bba2-ebe1e4393caa | robot-shop | 27017 | 27017 | 0 | TCP | 10.109.105.252 | mongodb-67c5456f4-d4bgv |
2 | user | f1686819-1201-4e1b-8d99-8579007104de | robot-shop | 8080 | 8080 | 0 | TCP | 10.105.106.134 | user-899b6c7ff-c7wnj |
3 | catalogue | d34b8f78-0865-4b54-a600-c011fe449545 | robot-shop | 8080 | 8080 | 0 | TCP | 10.96.58.129 | catalogue-998b69bc9-bfnr7 |
4 | shipping | e11145c8-1fe9-4f2d-a9a0-07d039d519ff | robot-shop | 8080 | 8080 | 0 | TCP | 10.103.83.70 | shipping-7f6dfbf46f-94trr |
5 | ratings | 1f438794-721a-4641-8643-8964cba70095 | robot-shop | 80 | 80 | 0 | TCP | 10.104.84.135 | ratings-7ccf67b49f-6qckr |
6 | mysql | d92b4e08-03bf-49a5-954b-026f7744482e | robot-shop | 3306 | 3306 | 0 | TCP | 10.107.7.181 | mysql-6d778f4c8f-4bcr7 |
7 | rabbitmq | 9a209f55-ed6f-4224-b6d3-2a056f9c7783 | robot-shop | 5672 | 5672 | 0 | TCP | 10.109.218.161 | rabbitmq-785b678f74-mhhtg |
8 | rabbitmq | 9a209f55-ed6f-4224-b6d3-2a056f9c7783 | robot-shop | 15672 | 15672 | 0 | TCP | 10.109.218.161 | rabbitmq-785b678f74-mhhtg |
9 | rabbitmq | 9a209f55-ed6f-4224-b6d3-2a056f9c7783 | robot-shop | 4369 | 4369 | 0 | TCP | 10.109.218.161 | rabbitmq-785b678f74-mhhtg |
10 | payment | 9eb039d4-dc0d-4837-9d8e-857cb8e5d8eb | robot-shop | 8080 | 8080 | 0 | TCP | 10.108.220.250 | payment-5465d9cc79-8ln4b |
Collect the high-level information for later identification in the observed network traffic:
services = {}
for irow, row in df_services.iterrows():
services[(row['clusterIP'], row['port'])] = f"{row['name']}-{row['port']}"
services
{('10.111.214.104', 6379): 'redis-6379', ('10.109.105.252', 27017): 'mongodb-27017', ('10.105.106.134', 8080): 'user-8080', ('10.96.58.129', 8080): 'catalogue-8080', ('10.103.83.70', 8080): 'shipping-8080', ('10.104.84.135', 80): 'ratings-80', ('10.107.7.181', 3306): 'mysql-3306', ('10.109.218.161', 5672): 'rabbitmq-5672', ('10.109.218.161', 15672): 'rabbitmq-15672', ('10.109.218.161', 4369): 'rabbitmq-4369', ('10.108.220.250', 8080): 'payment-8080'}
Let us take a look at the actually observed network traffic (i.e., the SysFlow NF records) for the robot-shop application, and collect this subset of SysFlow records into df_rs_traffic
:
df_nf = df[df.type == 'NF']
df_rs_traffic = (
    df_nf[df_nf['pod.ns'] == 'robot-shop']
    .groupby(['pod.name', 'net.sip', 'net.dip', 'net.dport'])
    .agg({'flow.rops': 'sum', 'flow.rbytes': 'sum',
          'flow.wops': 'sum', 'flow.wbytes': 'sum'})
    .reset_index()
)
... and compare these records with the knowledge about the IP address background that we gathered from the new cluster metadata attributes above (while also adding some 'well-known' background IP information manually):
types_source = []
types_destination = []
for irow, row in df_rs_traffic.iterrows():
sip = row['net.sip']
dip = row['net.dip']
dport = row['net.dport']
type_source = ''
if sip == '0.0.0.0': type_source = '"localhost"'
elif sip == '127.0.0.1': type_source = '"localhost"'
elif sip == '172.17.0.1': type_source = '"docker network gateway"'
elif podips.get(sip): type_source = f'POD {podips[sip]}'
type_destination = ''
if dip == '0.0.0.0': type_destination = '"localhost"'
elif dip == '127.0.0.1': type_destination = '"localhost"'
elif dip == '172.17.0.1': type_destination = '"docker network gateway"'
elif podips.get(dip): type_destination = f'POD {podips[dip]}'
elif dport == 42699: type_destination = '"instana agent"'
elif sip == dip:
type_source = 'local'
type_destination = 'local'
else:
service = services.get((dip, dport), '')
if service != '': type_destination = f'SERVICE {service}'
if dip == '10.96.0.10' and dport == 53: type_destination = '"cluster DNS"'
types_source.append(type_source)
types_destination.append(type_destination)
df_rs_traffic['type_source'] = types_source
df_rs_traffic['type_destination'] = types_destination
# for readability
for col in ('net.dport', 'flow.rops', 'flow.rbytes', 'flow.wops', 'flow.wbytes'):
df_rs_traffic[col] = df_rs_traffic[col].apply(int)
df_rs_traffic[df_rs_traffic['flow.rbytes']>0] #[['pod.name', 'net.sip', 'net.dip', 'net.dport', 'type_source', 'type_destination']]
pod.name | net.sip | net.dip | net.dport | flow.rops | flow.rbytes | flow.wops | flow.wbytes | type_source | type_destination | |
---|---|---|---|---|---|---|---|---|---|---|
0 | catalogue-998b69bc9-bfnr7 | 172.17.0.4 | 10.109.105.252 | 27017 | 359 | 108558 | 1069 | 18992 | POD {'catalogue-998b69bc9-bfnr7'} | SERVICE mongodb-27017 |
1 | catalogue-998b69bc9-bfnr7 | 172.17.0.4 | 10.96.0.10 | 53 | 148 | 16136 | 148 | 6452 | POD {'catalogue-998b69bc9-bfnr7'} | "cluster DNS" |
4 | dispatch-69b65d89b9-4lgl7 | 172.17.0.13 | 10.109.218.161 | 5672 | 473 | 2469 | 123 | 1287 | POD {'dispatch-69b65d89b9-4lgl7'} | SERVICE rabbitmq-5672 |
5 | dispatch-69b65d89b9-4lgl7 | 172.17.0.13 | 10.96.0.10 | 53 | 2779 | 180224 | 1408 | 77440 | POD {'dispatch-69b65d89b9-4lgl7'} | "cluster DNS" |
9 | mongodb-67c5456f4-d4bgv | 172.17.0.1 | 172.17.0.9 | 27017 | 921 | 18316 | 344 | 104606 | "docker network gateway" | POD {'mongodb-67c5456f4-d4bgv', 'web-77486f858... |
10 | mongodb-67c5456f4-ddhnf | 172.17.0.1 | 172.17.0.16 | 27017 | 1026 | 19986 | 372 | 113133 | "docker network gateway" | POD {'mongodb-67c5456f4-ddhnf'} |
11 | mysql-6d778f4c8f-4bcr7 | 172.17.0.1 | 172.17.0.10 | 3306 | 4135 | 64937 | 1161 | 73956 | "docker network gateway" | POD {'mysql-6d778f4c8f-4bcr7'} |
15 | rabbitmq-785b678f74-mhhtg | 127.0.0.1 | 127.0.0.1 | 4369 | 35 | 470 | 24 | 470 | "localhost" | "localhost" |
16 | rabbitmq-785b678f74-mhhtg | 172.17.0.1 | 172.17.0.5 | 5672 | 245 | 1287 | 236 | 2469 | "docker network gateway" | POD {'rabbitmq-785b678f74-mhhtg'} |
17 | rabbitmq-785b678f74-mhhtg | 172.17.0.5 | 172.17.0.5 | 4369 | 39 | 375 | 27 | 375 | POD {'rabbitmq-785b678f74-mhhtg'} | POD {'rabbitmq-785b678f74-mhhtg'} |
18 | ratings-7ccf67b49f-6qckr | 172.17.0.1 | 172.17.0.12 | 80 | 680 | 40068 | 479 | 12744694 | "docker network gateway" | POD {'ratings-7ccf67b49f-6qckr'} |
19 | ratings-7ccf67b49f-6qckr | 172.17.0.12 | 10.107.7.181 | 3306 | 1068 | 56871 | 1335 | 51264 | POD {'ratings-7ccf67b49f-6qckr'} | SERVICE mysql-3306 |
20 | ratings-7ccf67b49f-6qckr | 172.17.0.12 | 10.96.0.10 | 53 | 742 | 91637 | 0 | 0 | POD {'ratings-7ccf67b49f-6qckr'} | "cluster DNS" |
21 | redis-0 | 172.17.0.1 | 172.17.0.15 | 6379 | 4 | 42 | 3 | 7914 | "docker network gateway" | POD {'redis-0'} |
23 | shipping-7f6dfbf46f-94trr | 172.17.0.1 | 172.17.0.11 | 8080 | 258 | 28122 | 258 | 35862 | "docker network gateway" | POD {'shipping-7f6dfbf46f-94trr'} |
24 | shipping-7f6dfbf46f-94trr | 172.17.0.11 | 10.107.7.181 | 3306 | 173 | 17085 | 93 | 13723 | POD {'shipping-7f6dfbf46f-94trr'} | SERVICE mysql-3306 |
25 | shipping-7f6dfbf46f-94trr | 172.17.0.11 | 10.96.0.10 | 53 | 10 | 1020 | 10 | 520 | POD {'shipping-7f6dfbf46f-94trr'} | "cluster DNS" |
27 | user-899b6c7ff-c7wnj | 172.17.0.6 | 10.109.105.252 | 27017 | 296 | 89710 | 883 | 15768 | POD {'user-899b6c7ff-c7wnj'} | SERVICE mongodb-27017 |
28 | user-899b6c7ff-c7wnj | 172.17.0.6 | 10.111.214.104 | 6379 | 1 | 2635 | 1 | 14 | POD {'user-899b6c7ff-c7wnj'} | SERVICE redis-6379 |
29 | user-899b6c7ff-c7wnj | 172.17.0.6 | 10.96.0.10 | 53 | 36 | 3816 | 36 | 1944 | POD {'user-899b6c7ff-c7wnj'} | "cluster DNS" |
33 | user-899b6c7ff-qxd47 | 172.17.0.7 | 10.109.105.252 | 27017 | 65 | 19471 | 191 | 3542 | POD {'user-899b6c7ff-qxd47'} | SERVICE mongodb-27017 |
34 | user-899b6c7ff-qxd47 | 172.17.0.7 | 10.111.214.104 | 6379 | 2 | 2641 | 1 | 14 | POD {'user-899b6c7ff-qxd47'} | SERVICE redis-6379 |
35 | user-899b6c7ff-qxd47 | 172.17.0.7 | 10.96.0.10 | 53 | 146 | 15984 | 146 | 5838 | POD {'user-899b6c7ff-qxd47'} | "cluster DNS" |
In this table we can see the summarized network flows inside the robot-shop application (how many operations and how many bytes have been read and written), put into the context of the application, i.e., recognizing which pods and services are involved!
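Such a classified flow table can be condensed further, e.g., into a pod-to-service edge list giving a dependency view of the application. A sketch, using a few sample rows modeled on the table above (column names as in df_rs_traffic):

```python
import pandas as pd

# Sample rows modeled on the classified flow table above.
df_rs_traffic = pd.DataFrame([
    {'pod.name': 'catalogue-998b69bc9-bfnr7',
     'type_destination': 'SERVICE mongodb-27017',
     'flow.rbytes': 108558, 'flow.wbytes': 18992},
    {'pod.name': 'user-899b6c7ff-c7wnj',
     'type_destination': 'SERVICE mongodb-27017',
     'flow.rbytes': 89710, 'flow.wbytes': 15768},
    {'pod.name': 'catalogue-998b69bc9-bfnr7',
     'type_destination': '"cluster DNS"',
     'flow.rbytes': 16136, 'flow.wbytes': 6452},
])

# Keep only flows to recognized services and sum bytes per (pod, service) edge.
edges = (
    df_rs_traffic[df_rs_traffic['type_destination'].str.startswith('SERVICE')]
    .groupby(['pod.name', 'type_destination'])[['flow.rbytes', 'flow.wbytes']]
    .sum()
    .reset_index()
)
print(edges)
```

Each row of `edges` is one pod-to-service dependency with its aggregated traffic volume, which could be fed directly into a graph visualization.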
Small further remarks:

* The mapping from IP addresses to pod names is not unique (note the two pod names recorded for 172.17.0.9 in the podips set). In order to make this exact, one would have to keep track of the time information in addition, e.g., which IP address is used by which pod during what time interval.
* The NF records themselves still bind the flows, as before, to the lower-level details like the process involved, etc.

## Conclusion

In this notebook, we have taken a first look at the new cluster metadata for Kubernetes/OpenShift clusters that is available with the recent SysFlow 0.5.0 release.
In an experiment with Instana's robot-shop, we have seen this new cluster metadata at work and made an initial investigation into the collected data, especially into the IP-related information that the new metadata makes available.
We find that with the new data, the lower-level SysFlow data related to containers are put into the context of the cluster structure, namely pods, nodes and namespaces.
The new KE
records are in that respect complementary to the existing records, as they report data driven by cluster events like the creation of a new pod. Conversely, the standard SysFlow records now contain the cluster metadata directly attached via the pod.*
attributes.
With respect to observed IP addresses, the availability of the endpoint IPs/ports connected to services is especially interesting as it can be used, together with the IP information for the pods, to understand network flows internal to a cluster application.
Further extensions of these cluster-related metadata are planned, stay tuned!