
Gett

Step 1: list kernels and sessions

In the Jupyter server API, a session is a relationship between a document (or console) and a kernel. In practice, it is typical for each kernel to be associated with one document.

Caveats

Kernels do not need to be associated with a document, so listing kernels directly may find kernels that the sessions API does not report, but this is rare.
Technically, one kernel can be associated with multiple documents, but this is also exceedingly rare and can be ignored here.

In [1]:

import requests

token = "abc123"  # get this somewhere, e.g. $JUPYTERHUB_API_TOKEN or via a JupyterHub service token

s = requests.Session()
s.headers.update({"Authorization": f"token {token}"})
server_url = "http://127.0.0.1:8888"

In [2]:

session_list = s.get(f"{server_url}/api/sessions").json()
session_list

Out[2]:

[{'id': '0aa93fcc-0001-4de8-bb31-9aa5c9e00ffe',
  'path': 'notebook-a.ipynb',
  'name': 'notebook-a.ipynb',
  'type': 'notebook',
  'kernel': {'id': '2efac3d3-4adb-4193-9d52-a68940660347',
   'name': 'python3',
   'last_activity': '2022-08-03T12:15:36.181996Z',
   'execution_state': 'busy',
   'connections': 1},
  'notebook': {'path': 'notebook-a.ipynb', 'name': 'notebook-a.ipynb'}},
 {'id': 'ca73e455-7593-4633-aebd-b080c9bbcde1',
  'path': 'other notebook.ipynb',
  'name': 'other notebook.ipynb',
  'type': 'notebook',
  'kernel': {'id': '1482c156-1abc-44fd-8c35-74805a3c8e98',
   'name': 'python3',
   'last_activity': '2022-08-03T12:15:30.509980Z',
   'execution_state': 'idle',
   'connections': 2},
  'notebook': {'path': 'other notebook.ipynb',
   'name': 'other notebook.ipynb'}},
 {'id': '237f6734-4a16-46b8-ae82-bc65e678d96d',
  'path': 'console-1-49ab0dc9-a394-4a25-a4e3-f34cc22a5c6b',
  'name': 'Console 1',
  'type': 'console',
  'kernel': {'id': '0518a5da-38d3-4dcd-8e1c-ddb7fbc494b3',
   'name': 'python3',
   'last_activity': '2022-08-03T12:15:32.717393Z',
   'execution_state': 'idle',
   'connections': 3}}]

In [3]:

kernel_list = s.get(f"{server_url}/api/kernels").json()
kernel_list

Out[3]:

[{'id': '2efac3d3-4adb-4193-9d52-a68940660347',
  'name': 'python3',
  'last_activity': '2022-08-03T12:15:36.981129Z',
  'execution_state': 'busy',
  'connections': 1},
 {'id': '1482c156-1abc-44fd-8c35-74805a3c8e98',
  'name': 'python3',
  'last_activity': '2022-08-03T12:15:30.509980Z',
  'execution_state': 'idle',
  'connections': 2},
 {'id': '0518a5da-38d3-4dcd-8e1c-ddb7fbc494b3',
  'name': 'python3',
  'last_activity': '2022-08-03T12:15:32.717393Z',
  'execution_state': 'idle',
  'connections': 3}]
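
The two caveats above can be checked by comparing the two listings. The following is a minimal sketch, not part of the original notebook: `group_sessions_by_kernel` and `kernels_without_session` are hypothetical helper names, and the sample data uses shortened, made-up ids in the same shape as the responses shown above.

```python
from collections import defaultdict


def group_sessions_by_kernel(session_list):
    """Map each kernel id to the list of sessions that reference it."""
    by_kernel = defaultdict(list)
    for session in session_list:
        by_kernel[session["kernel"]["id"]].append(session)
    return dict(by_kernel)


def kernels_without_session(kernel_list, session_list):
    """Return the ids of kernels that no session refers to."""
    session_kernel_ids = {s["kernel"]["id"] for s in session_list}
    return [k["id"] for k in kernel_list if k["id"] not in session_kernel_ids]


# Made-up, shortened sample data in the shape of /api/sessions and /api/kernels
sample_sessions = [
    {"name": "notebook-a.ipynb", "type": "notebook", "kernel": {"id": "2efac3d3"}},
    {"name": "Console 1", "type": "console", "kernel": {"id": "0518a5da"}},
]
sample_kernels = [{"id": "2efac3d3"}, {"id": "0518a5da"}, {"id": "ffffffff"}]

# Kernels visible only through /api/kernels (first caveat)
orphans = kernels_without_session(sample_kernels, sample_sessions)

# Kernels attached to more than one session (second caveat)
shared = {
    kernel_id: sessions
    for kernel_id, sessions in group_sessions_by_kernel(sample_sessions).items()
    if len(sessions) > 1
}
print(orphans)
print(shared)
```

With the real `kernel_list` and `session_list` from the cells above, both results would be empty, since every kernel there appears in exactly one session.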

Step 2: look up kernel PIDs

Jupyter doesn't include PIDs in any part of the API because kernels need not be local to the Jupyter server (e.g. when they are run via a gateway server). However, kernels are local often enough that the IPython kernel has a prototype message handler for a usage_request message. We can use this to ask for information about the kernel, including the PID of its process.

Note: this step only works with recent versions of ipykernel. usage_request is a nonstandard message and is not supported by other kernels. An alternative would be to use an execute_request to run os.getpid(). That approach is Python-specific, so equivalent code would need to be written for each kernel language.
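
The execute_request alternative mentioned in the note could look roughly like the sketch below. It is not from the original notebook and makes several assumptions: `build_pid_request` and `parse_pid_reply` are hypothetical helper names, the message is built by hand rather than with jupyter_client, and the kernel is assumed to evaluate `user_expressions` as described in the Jupyter messaging specification. The message would be sent over the same websocket as any other kernel message, on the shell channel.

```python
import uuid
from datetime import datetime, timezone


def build_pid_request(session_id=None):
    """Build an execute_request that only evaluates os.getpid() in the kernel."""
    return {
        "header": {
            "msg_id": uuid.uuid4().hex,
            "session": session_id or uuid.uuid4().hex,
            "username": "",
            "date": datetime.now(timezone.utc).isoformat(),
            "msg_type": "execute_request",
            "version": "5.3",
        },
        "parent_header": {},
        "metadata": {},
        "content": {
            "code": "",
            "silent": True,
            "store_history": False,
            # Python-specific expression; __import__ avoids adding
            # a name to the user's namespace
            "user_expressions": {"pid": "__import__('os').getpid()"},
            "allow_stdin": False,
            "stop_on_error": True,
        },
        "buffers": [],
        # execute_request goes on the shell channel, not control
        "channel": "shell",
    }


def parse_pid_reply(reply):
    """Extract the PID from an execute_reply, or return None on any error."""
    content = reply.get("content", {})
    if content.get("status") != "ok":
        return None
    result = content.get("user_expressions", {}).get("pid", {})
    if result.get("status") != "ok":
        return None
    # The value arrives as its text/plain repr, e.g. "94479"
    return int(result["data"]["text/plain"])
```

When reading replies for this message, the loop would wait for a message on the "shell" channel instead of "control".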

To send the usage_request, we connect a websocket to the Jupyter server, which lets us send messages directly to the kernel:

In [7]:

import json

from tornado.websocket import websocket_connect
from jupyter_client.session import Session, json_packer
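
The next cell builds the websocket URL by slicing the `http` prefix off `server_url`, which works for both `http` and `https` as long as the URL has no trailing slash. A slightly more explicit variant is sketched here, using only the standard library; `kernel_channels_url` is a hypothetical helper name, not part of Jupyter.

```python
from urllib.parse import quote, urlencode, urlsplit, urlunsplit


def kernel_channels_url(server_url, kernel_id, token):
    """Build the websocket URL for a kernel's channels endpoint."""
    parts = urlsplit(server_url)
    # http -> ws, https -> wss
    scheme = "wss" if parts.scheme == "https" else "ws"
    # Keep any base path of the server, without a doubled slash
    base_path = parts.path.rstrip("/")
    path = f"{base_path}/api/kernels/{quote(kernel_id, safe='')}/channels"
    query = urlencode({"token": token})
    return urlunsplit((scheme, parts.netloc, path, query, ""))


print(
    kernel_channels_url(
        "http://127.0.0.1:8888",
        "2efac3d3-4adb-4193-9d52-a68940660347",
        "abc123",
    )
)
```

The result can be passed to `websocket_connect` in place of the f-string used below.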

In [9]:

client_session = Session()

kernel_id = session_list[0]["kernel"]["id"]

ws = await websocket_connect(f"ws{server_url[4:]}/api/kernels/{kernel_id}/channels?token={token}")
msg = client_session.msg("usage_request", content={})
msg["channel"] = "control"
json_msg = json_packer(msg)
await ws.write_message(json_msg)
while True:
    reply = json.loads(await ws.read_message())
    if reply["channel"] == "control":
        break
reply["content"]

Out[9]:

{'hostname': 'heavy',
 'pid': 94479,
 'kernel_cpu': 1.1,
 'kernel_memory': 112164864,
 'host_cpu_percent': 8.3,
 'cpu_count': 10,
 'host_virtual_memory': {'total': 17179869184,
  'available': 4527505408,
  'percent': 73.6,
  'used': 7586217984,
  'free': 165265408,
  'active': 4448583680,
  'inactive': 4358438912,
  'wired': 3137634304}}

We can use this information to produce a map from PID to notebook document or console:

In [10]:

async def get_kernel_pid(kernel_id):
    """Ask a kernel for its PID via the nonstandard usage_request message."""
    client_session = Session()
    ws = await websocket_connect(f"ws{server_url[4:]}/api/kernels/{kernel_id}/channels?token={token}")
    msg = client_session.msg("usage_request", content={})
    # usage_request is sent on the control channel
    msg["channel"] = "control"
    json_msg = json_packer(msg)
    await ws.write_message(json_msg)
    # skip messages on other channels until the control reply arrives
    while True:
        reply = json.loads(await ws.read_message())
        if reply["channel"] == "control":
            break
    ws.close()
    return reply["content"]["pid"]

for session in session_list:
    pid = await get_kernel_pid(session["kernel"]["id"])
    print(f"{pid}: {session['type']} {session['name']!r}")

94479: notebook 'notebook-a.ipynb'
94480: notebook 'other notebook.ipynb'
94481: console 'Console 1'
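
Because usage_request is nonstandard, a kernel that does not implement it may never reply, and the `while True` loop in `get_kernel_pid` would then wait indefinitely. One way to guard against that is sketched below. `read_control_reply` and its `timeout` parameter are hypothetical names, and `ws` is assumed to be a connection returned by tornado's `websocket_connect`, whose `read_message()` resolves to `None` when the connection closes.

```python
import asyncio
import json


async def read_control_reply(ws, timeout=5.0):
    """Return the first control-channel message, or None on timeout or close."""

    async def _read():
        while True:
            raw = await ws.read_message()
            if raw is None:
                # connection closed before a control reply arrived
                return None
            reply = json.loads(raw)
            if reply.get("channel") == "control":
                return reply

    try:
        return await asyncio.wait_for(_read(), timeout)
    except asyncio.TimeoutError:
        # no reply within `timeout` seconds; the kernel probably
        # does not handle this message type
        return None
```

Inside `get_kernel_pid`, the loop could be replaced with `reply = await read_control_reply(ws)`, returning `None` for the PID when no reply is received.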

Alternative: using kernel ids from connection files

In the question, the runtime file is also mentioned as a valid input. This may be simpler and more general to use.

By convention, Jupyter servers put the kernel id in the name of the connection file as well. So a simpler approach that works for all kernel languages is to check for the kernel id in the connection file name.

You could rely on exactly how these file names are produced and parse the kernel id out of them, but simple substring matching is perhaps simpler and more robust to changes in internal details:

In [13]:

from jupyter_core.paths import jupyter_runtime_dir
import glob

In [14]:

connection_files = glob.glob(f"{jupyter_runtime_dir()}/kernel-*.json")
connection_files

Out[14]:

['/Users/minrk/Library/Jupyter/runtime/kernel-2efac3d3-4adb-4193-9d52-a68940660347.json',
 '/Users/minrk/Library/Jupyter/runtime/kernel-1482c156-1abc-44fd-8c35-74805a3c8e98.json',
 '/Users/minrk/Library/Jupyter/runtime/kernel-0518a5da-38d3-4dcd-8e1c-ddb7fbc494b3.json']

In [16]:

import os

for session in session_list:
    kernel_id = session["kernel"]["id"]
    for connection_file in connection_files:
        fname = os.path.basename(connection_file)
        if kernel_id in fname:
            print(f"{fname}: {session['type']} {session['name']!r}")
            break

kernel-2efac3d3-4adb-4193-9d52-a68940660347.json: notebook 'notebook-a.ipynb'
kernel-1482c156-1abc-44fd-8c35-74805a3c8e98.json: notebook 'other notebook.ipynb'
kernel-0518a5da-38d3-4dcd-8e1c-ddb7fbc494b3.json: console 'Console 1'
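
For comparison, the stricter approach of parsing the kernel id out of the file name might look like the sketch below. It assumes the `kernel-<kernel id>.json` naming seen in the listing above, with a UUID as the id. That naming is a convention rather than a guarantee, and `kernel_id_from_connection_file` is a hypothetical helper name.

```python
import os
import re

# Assumed naming convention: kernel-<uuid>.json
CONNECTION_FILE_RE = re.compile(
    r"^kernel-([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})\.json$"
)


def kernel_id_from_connection_file(path):
    """Return the kernel id embedded in a connection file name, or None."""
    match = CONNECTION_FILE_RE.match(os.path.basename(path))
    return match.group(1) if match else None


def connection_files_by_kernel_id(paths):
    """Map kernel id -> connection file path, skipping names that do not match."""
    result = {}
    for path in paths:
        kernel_id = kernel_id_from_connection_file(path)
        if kernel_id is not None:
            result[kernel_id] = path
    return result
```

Given the runtime file as input, `kernel_id_from_connection_file(path)` yields the kernel id, which can then be looked up among the `session["kernel"]["id"]` values from Step 1. Files that do not follow the pattern, such as those of kernels started outside a Jupyter server, return `None` and cannot be matched this way.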