In the Jupyter server API, a session
is a relationship between a document (or console) and a kernel.
In practice, it is typical for each kernel to be associated with one document.
Caveat: querying kernels directly may discover more kernels than the sessions API,
but this is rare.
import requests
token = "abc123" # get this somewhere, e.g. $JUPYTERHUB_API_TOKEN or via a JupyterHub service token
s = requests.Session()
s.headers["Authorization"] = f"token {token}"  # reuse the token variable and keep the default headers
server_url = "http://127.0.0.1:8888"
session_list = s.get(f"{server_url}/api/sessions").json()
session_list
[{'id': '0aa93fcc-0001-4de8-bb31-9aa5c9e00ffe', 'path': 'notebook-a.ipynb', 'name': 'notebook-a.ipynb', 'type': 'notebook', 'kernel': {'id': '2efac3d3-4adb-4193-9d52-a68940660347', 'name': 'python3', 'last_activity': '2022-08-03T12:15:36.181996Z', 'execution_state': 'busy', 'connections': 1}, 'notebook': {'path': 'notebook-a.ipynb', 'name': 'notebook-a.ipynb'}}, {'id': 'ca73e455-7593-4633-aebd-b080c9bbcde1', 'path': 'other notebook.ipynb', 'name': 'other notebook.ipynb', 'type': 'notebook', 'kernel': {'id': '1482c156-1abc-44fd-8c35-74805a3c8e98', 'name': 'python3', 'last_activity': '2022-08-03T12:15:30.509980Z', 'execution_state': 'idle', 'connections': 2}, 'notebook': {'path': 'other notebook.ipynb', 'name': 'other notebook.ipynb'}}, {'id': '237f6734-4a16-46b8-ae82-bc65e678d96d', 'path': 'console-1-49ab0dc9-a394-4a25-a4e3-f34cc22a5c6b', 'name': 'Console 1', 'type': 'console', 'kernel': {'id': '0518a5da-38d3-4dcd-8e1c-ddb7fbc494b3', 'name': 'python3', 'last_activity': '2022-08-03T12:15:32.717393Z', 'execution_state': 'idle', 'connections': 3}}]
kernel_list = s.get(f"{server_url}/api/kernels").json()
kernel_list
[{'id': '2efac3d3-4adb-4193-9d52-a68940660347', 'name': 'python3', 'last_activity': '2022-08-03T12:15:36.981129Z', 'execution_state': 'busy', 'connections': 1}, {'id': '1482c156-1abc-44fd-8c35-74805a3c8e98', 'name': 'python3', 'last_activity': '2022-08-03T12:15:30.509980Z', 'execution_state': 'idle', 'connections': 2}, {'id': '0518a5da-38d3-4dcd-8e1c-ddb7fbc494b3', 'name': 'python3', 'last_activity': '2022-08-03T12:15:32.717393Z', 'execution_state': 'idle', 'connections': 3}]
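To check whether the caveat about extra kernels applies to a given server, the two responses can be compared. A minimal sketch, assuming both lists have the shape shown above; kernels_without_sessions is a hypothetical helper name and the example data is made up for illustration:

```python
def kernels_without_sessions(session_list, kernel_list):
    """Return the kernels that no session refers to."""
    session_kernel_ids = {session["kernel"]["id"] for session in session_list}
    return [kernel for kernel in kernel_list if kernel["id"] not in session_kernel_ids]


# Illustrative data (not real output), shaped like the API responses
example_sessions = [
    {"id": "s1", "type": "notebook", "name": "a.ipynb", "kernel": {"id": "k1"}},
]
example_kernels = [
    {"id": "k1", "name": "python3"},
    {"id": "k2", "name": "python3"},
]

orphans = kernels_without_sessions(example_sessions, example_kernels)
print([kernel["id"] for kernel in orphans])
```

With the session_list and kernel_list fetched above, the result would be empty, since every kernel id appears in some session.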
Jupyter doesn't include PIDs in any part of the API because kernels need not be local to the Jupyter server
(e.g. when they are run via a gateway server).
However, they are local often enough that the IPython kernel has a prototype message handler called usage_request.
We can use this to ask for information, including the PID of the process.
Note: this step only works for recent versions of ipykernel. This is a nonstandard message, and not supported by other kernels.
An alternative would be to use an execute_request to run os.getpid(). This, in turn, is Python-specific and would need to be implemented for each kernel language.
To do this, we connect a websocket to the Jupyter server to send messages directly to the kernel:
import json
from tornado.websocket import websocket_connect
from jupyter_client.session import Session, json_packer
client_session = Session()
kernel_id = session_list[0]["kernel"]["id"]
ws = await websocket_connect(f"ws{server_url[4:]}/api/kernels/{kernel_id}/channels?token={token}")
msg = client_session.msg("usage_request", content={})
msg["channel"] = "control"
json_msg = json_packer(msg)
await ws.write_message(json_msg)
while True:
    reply = json.loads(await ws.read_message())
    if reply["channel"] == "control":
        break
reply["content"]
{'hostname': 'heavy', 'pid': 94479, 'kernel_cpu': 1.1, 'kernel_memory': 112164864, 'host_cpu_percent': 8.3, 'cpu_count': 10, 'host_virtual_memory': {'total': 17179869184, 'available': 4527505408, 'percent': 73.6, 'used': 7586217984, 'free': 165265408, 'active': 4448583680, 'inactive': 4358438912, 'wired': 3137634304}}
We can use this information to produce a map of PID to notebook document / console:
async def get_kernel_pid(kernel_id):
    """Ask a kernel for its PID via a usage_request on the control channel."""
    client_session = Session()
    ws = await websocket_connect(f"ws{server_url[4:]}/api/kernels/{kernel_id}/channels?token={token}")
    msg = client_session.msg("usage_request", content={})
    msg["channel"] = "control"
    json_msg = json_packer(msg)
    await ws.write_message(json_msg)
    # discard messages on other channels until the control reply arrives
    while True:
        reply = json.loads(await ws.read_message())
        if reply["channel"] == "control":
            break
    ws.close()
    return reply["content"]["pid"]

for session in session_list:
    pid = await get_kernel_pid(session["kernel"]["id"])
    print(f"{pid}: {session['type']} {session['name']!r}")
94479: notebook 'notebook-a.ipynb'
94480: notebook 'other notebook.ipynb'
94481: console 'Console 1'
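For kernels that lack usage_request, the execute_request alternative mentioned earlier could be sketched roughly as follows. This is an assumption-laden sketch, not something tested against every ipykernel version: it builds the message by hand in the standard Jupyter message layout, asks for os.getpid() through user_expressions, and parses the reply. make_execute_request and pid_from_execute_reply are hypothetical helper names, and example_reply is hand-written to show the expected shape rather than captured output.

```python
import uuid
from datetime import datetime, timezone


def make_execute_request(session_id):
    """Build an execute_request that evaluates os.getpid() as a user expression."""
    return {
        "header": {
            "msg_id": uuid.uuid4().hex,
            "session": session_id,
            "username": "",
            "date": datetime.now(timezone.utc).isoformat(),
            "msg_type": "execute_request",
            "version": "5.3",
        },
        "parent_header": {},
        "metadata": {},
        "content": {
            "code": "import os",
            "silent": True,  # assumption: avoids output and history entries
            "store_history": False,
            "user_expressions": {"pid": "os.getpid()"},
            "allow_stdin": False,
            "stop_on_error": True,
        },
        "channel": "shell",  # execute_request goes over shell, not control
    }


def pid_from_execute_reply(reply):
    """Extract the PID from an execute_reply, or None if the expression failed."""
    result = reply["content"].get("user_expressions", {}).get("pid", {})
    if result.get("status") != "ok":
        return None
    return int(result["data"]["text/plain"])


# Hand-written reply in the expected shape (illustrative, not captured output)
example_reply = {
    "channel": "shell",
    "header": {"msg_type": "execute_reply"},
    "content": {
        "status": "ok",
        "user_expressions": {
            "pid": {"status": "ok", "data": {"text/plain": "94479"}, "metadata": {}}
        },
    },
}
print(pid_from_execute_reply(example_reply))
```

To use it, the message would be sent with json.dumps over the same websocket, and the read loop would wait for a message on the "shell" channel whose header msg_type is "execute_reply" instead of the first "control" message.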
In the question, the runtime file is also mentioned as a valid input. This may be simpler and more general to use.
By convention, Jupyter servers put the kernel id in the connection file name as well. So a simpler approach that works for all kernel languages is to check for the kernel id in the connection file name.
You could rely on knowing exactly how these file names are produced and parse the id out, but simple substring matching is perhaps simpler and more robust to changes in internal details:
from jupyter_core.paths import jupyter_runtime_dir
import glob
connection_files = glob.glob(f"{jupyter_runtime_dir()}/kernel-*.json")
connection_files
['/Users/minrk/Library/Jupyter/runtime/kernel-2efac3d3-4adb-4193-9d52-a68940660347.json', '/Users/minrk/Library/Jupyter/runtime/kernel-1482c156-1abc-44fd-8c35-74805a3c8e98.json', '/Users/minrk/Library/Jupyter/runtime/kernel-0518a5da-38d3-4dcd-8e1c-ddb7fbc494b3.json']
import os
for session in session_list:
    kernel_id = session["kernel"]["id"]
    for connection_file in connection_files:
        fname = os.path.basename(connection_file)
        if kernel_id in fname:
            print(f"{fname}: {session['type']} {session['name']!r}")
            break
kernel-2efac3d3-4adb-4193-9d52-a68940660347.json: notebook 'notebook-a.ipynb'
kernel-1482c156-1abc-44fd-8c35-74805a3c8e98.json: notebook 'other notebook.ipynb'
kernel-0518a5da-38d3-4dcd-8e1c-ddb7fbc494b3.json: console 'Console 1'
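If the starting point is a single runtime file rather than the full list, the same substring match can be run in the other direction. A small sketch, assuming the kernel id appears in the file name as in the listing above; session_for_connection_file is a hypothetical helper name, and the example sessions are trimmed copies of the entries shown earlier:

```python
import os


def session_for_connection_file(connection_file, session_list):
    """Return the session whose kernel id appears in the file name, or None."""
    fname = os.path.basename(connection_file)
    for session in session_list:
        if session["kernel"]["id"] in fname:
            return session
    return None


# Trimmed copies of two sessions from the listing above
example_sessions = [
    {
        "type": "notebook",
        "name": "notebook-a.ipynb",
        "kernel": {"id": "2efac3d3-4adb-4193-9d52-a68940660347"},
    },
    {
        "type": "notebook",
        "name": "other notebook.ipynb",
        "kernel": {"id": "1482c156-1abc-44fd-8c35-74805a3c8e98"},
    },
]

match = session_for_connection_file(
    "/Users/minrk/Library/Jupyter/runtime/kernel-1482c156-1abc-44fd-8c35-74805a3c8e98.json",
    example_sessions,
)
print(match["name"] if match else None)
```

Returning None for an unmatched file keeps the caller in charge of deciding whether a kernel without a session is an error.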