Pre-proposal: Kernel messages to get and set variables #115

Open
@fcollonval

Summary

Add two kernel messages to get and set variables fully or partially (e.g. slice of a data frame).

Motivation

We should aim for a user experience closer to MATLAB/Spyder than to a traditional IDE when it comes to interaction between the user and kernel variables; that is, high flexibility to introspect and modify variable values on the fly.

Looking, for example, at the Spyder variable explorer, we can see that it is easy to edit simple variables and to add new ones (from scratch or by duplication).

Requirements

  • This should be an optional feature.
  • The message contents and replies must be independent of the kernel language.
  • The get message reply should contain enough information to display a variable meaningfully to the user.
  • The get and set messages should support partial introspection and modification of a variable.
  • The set message can be used to create and to update a variable.

[Draft] Implementation details

  • This feature will be flagged as optional building on top of provisional JEP 92.

  • New kernel message get_variables:

    • Request content:
{
  "type": "object",
  "properties": {
    "variables": {
      "type": "array",
      "uniqueItems": true,
      "items": {
        "type": "object",
        "properties": {
          // Variable unique identifier
          "name": {"type": "string", "minLength": 1},
          // TODO how to specify slicing/pagination
          // ...
          // Variable type
          "mimetype": {"type": "string", "minLength": 1}
        },
        "required": ["name", "mimetype"]
      }
    },
    "page": {
      "type": "integer",
      "minimum": 1
    },
    "per_page": {
      "type": "integer",
      "minimum": 1
    }
  }
}

Notes:

  • If no variables are specified, the kernel must return all available variables.

  • Pagination is optional.
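To make the request shape concrete, here is a minimal Python sketch that builds the content of a `get_variables` request following the draft schema above. The helper name `build_get_variables_content` is hypothetical, and the page-size field is written `per_page` on the assumption that this is what the draft intends. Slicing is left out because the draft has not specified it yet.

```python
from __future__ import annotations

from typing import Any, Optional


def build_get_variables_content(
    variables: Optional[list[dict[str, str]]] = None,
    page: Optional[int] = None,
    per_page: Optional[int] = None,
) -> dict[str, Any]:
    """Build the content of a (hypothetical) get_variables request."""
    content: dict[str, Any] = {}
    seen: set[tuple[str, str]] = set()
    items: list[dict[str, str]] = []
    for var in variables or []:
        name = var.get("name", "")
        mimetype = var.get("mimetype", "")
        # The draft requires both fields, each with minLength 1.
        if not name or not mimetype:
            raise ValueError("each variable needs a non-empty name and mimetype")
        # The draft sets uniqueItems, so reject repeated entries.
        if (name, mimetype) in seen:
            raise ValueError(f"duplicate variable entry: {name!r}")
        seen.add((name, mimetype))
        items.append({"name": name, "mimetype": mimetype})
    # Omitting "variables" asks the kernel for all available variables.
    if items:
        content["variables"] = items
    # Pagination is optional; both fields must be >= 1 when present.
    for field, value in (("page", page), ("per_page", per_page)):
        if value is not None:
            if value < 1:
                raise ValueError(f"{field} must be >= 1")
            content[field] = value
    return content
```

Calling the helper with no arguments yields an empty content object, which under the notes above asks the kernel for every variable.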

    • Reply content:
{
  "type": "object",
  "properties": {
    // Request status - 'ok' does not mean every variable was retrieved, only that the request was processed
    // if status is 'error', the fields `ename`, `evalue` and `traceback` will be present but no `variables`
    // if status is 'abort', nothing else will be provided.
    "status": {
      "const": "ok"
    },
    "variables": {
      "type": "array",
      "items": {
        "oneOf": [
          {
            "type": "object",
            "properties": {
              // Variable unique identifier
              "name": {"type": "string", "minLength": 1},
              // TODO how to specify slicing/pagination
              // ...
              // Status of get per variable
              "status": {
                "const": "ok"
              },
              // Variable type
              "mimetype": {"type": "string", "minLength": 1},
              // TODO Variable value
              // ... -> JSON, binary, something else (?)
              "value": {},
              "schema": {
                "description": "Text/plain schema that could be used to validate the value.",
                "type": "string"
              }
            },
            "required": ["name", "status", "mimetype", "value"]
          },
          // if status is 'error', the fields `ename`, `evalue` and `traceback` will be present
          // if status is 'abort', only the `name` will be provided.
        ]
      }
    },
    "page": {
      "type": "integer",
      "minimum": 1
    },
    "last_page": {
      "type": "integer",
      "minimum": 1
    }
  },
  "required": ["status"]
}
  • There is no execution order
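Because an overall status of 'ok' does not imply that every variable was retrieved, a client has to inspect each entry individually. The sketch below shows one way a front end might do that in Python. The function name `split_get_variables_reply` and the "aborted" label are assumptions made for illustration; only the status values and the `ename`/`evalue` fields come from the draft.

```python
from __future__ import annotations

from typing import Any


def split_get_variables_reply(
    content: dict[str, Any],
) -> tuple[dict[str, Any], dict[str, str]]:
    """Split a get_variables reply into retrieved values and per-variable failures."""
    status = content.get("status")
    if status == "error":
        # Request-level error: ename/evalue/traceback are present, no variables.
        raise RuntimeError(f"{content.get('ename')}: {content.get('evalue')}")
    if status == "abort":
        # Nothing else is provided for an aborted request.
        return {}, {}
    if status != "ok":
        raise ValueError(f"unexpected reply status: {status!r}")

    values: dict[str, Any] = {}
    failures: dict[str, str] = {}
    for entry in content.get("variables", []):
        name = entry["name"]
        entry_status = entry.get("status")
        if entry_status == "ok":
            values[name] = entry["value"]
        elif entry_status == "error":
            failures[name] = f"{entry.get('ename')}: {entry.get('evalue')}"
        else:
            # For 'abort', only the name is provided.
            failures[name] = "aborted"
    return values, failures
```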
  • New kernel message set_variables:

    • Request content:
{
  "type": "array",
  "uniqueItems": true,
  "items": {
    "type": "object",
    "properties": {
      // Variable unique identifier
      "name": {"type": "string", "minLength": 1},
      // TODO how to specify slicing/pagination
      // ...
      // Variable type
      "mimetype": {"type": "string", "minLength": 1},
      // Variable value
      "value": // TODO ...
      // Value text/plain schema
      "schema": { "type": "string" }
    }
  }
}
    • Reply content:
{
  "type": "object",
  "properties": {
    // Request status - 'ok' does not mean every set succeeded, only that the request was processed
    // if status is 'error', the fields `ename`, `evalue` and `traceback` will be present but no `variables`
    // if status is 'abort', nothing else will be provided.
    "status": {
      "const": "ok"
    },
    "variables": {
      "type": "array",
      "items": {
        "oneOf": [
          {
            "type": "object",
            "properties": {
              // Variable unique identifier
              "name": {"type": "string", "minLength": 1},
              // TODO how to specify slicing/pagination
              // ...
              // Status of set per variable
              "status": {
                "const": "ok"
              }
            }
          },
          // if status is 'error', the fields `ename`, `evalue` and `traceback` will be present
          // if status is 'abort', only the `name` will be provided.
        ]
      }
    }
  }
}
  • There is no execution order
  • These messages should be executed as quietly as possible (like a silent execute request) and should not populate the history.
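On the kernel side, a `set_variables` handler could look roughly like the following Python sketch. It is only an illustration under several assumptions: the handler name `handle_set_variables` is hypothetical, values are assumed to arrive as already-decoded JSON under the `application/json` mimetype (the draft leaves value serialization open), and the user namespace is modeled as a plain dict. Assigning directly into the namespace runs no user code, so nothing is added to the history.

```python
from __future__ import annotations

import traceback
from typing import Any

# Assumption: the only mimetype this sketch understands.
JSON_MIME = "application/json"


def handle_set_variables(
    request_items: list[dict[str, Any]], namespace: dict[str, Any]
) -> dict[str, Any]:
    """Create or update variables in `namespace` and report a per-variable status."""
    results: list[dict[str, Any]] = []
    for item in request_items:
        name = item.get("name", "")
        try:
            if not name.isidentifier():
                raise ValueError(f"invalid variable name: {name!r}")
            if item.get("mimetype") != JSON_MIME:
                raise TypeError(f"unsupported mimetype: {item.get('mimetype')!r}")
            # Creates the variable if it does not exist, updates it otherwise.
            namespace[name] = item["value"]
            results.append({"name": name, "status": "ok"})
        except Exception as exc:
            # One failing variable does not prevent the others from being set.
            results.append(
                {
                    "name": name,
                    "status": "error",
                    "ename": type(exc).__name__,
                    "evalue": str(exc),
                    "traceback": traceback.format_exception(
                        type(exc), exc, exc.__traceback__
                    ),
                }
            )
    # 'ok' only means the request was processed, as stated in the draft.
    return {"status": "ok", "variables": results}
```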

Rationale and alternatives

  • We could expand on the debugger adapter protocol. But the debugger feature has the following disadvantages:

    • Implementing DAP for a new kernel is tough
    • DAP offers variable introspection but is not meant to create or edit variables
    • Running code with active debugging capabilities reduces code execution performance; what we want here is to be able to get/set variables mainly between cell execution requests - there is no need to execute the code in debug mode.
    • A counterpoint to the previous item is that we could reuse some DAP messages even when the debugger is not active. The trouble is that this would create confusion: what should a kernel author do if the get/set feature is wanted but not DAP?
  • We could use silent execute request:

    • This is the path used to get variables by the Classical notebook and JupyterLab extensions.
    • The problem is that the syntax is not language-agnostic
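The language dependence of the silent execute approach can be illustrated with a short Python sketch. The content fields below are those of a standard `execute_request`. The snippets in `PREVIEW_EXPRESSIONS` and the helper name are illustrative assumptions: they presume a data frame named `df`, with pandas available in a Python kernel and jsonlite in an R kernel.

```python
from __future__ import annotations

from typing import Any

# Illustrative snippets for "first 10 rows of df as JSON"; each kernel
# language needs its own code, which is the problem described above.
PREVIEW_EXPRESSIONS = {
    "python": "df.head(10).to_json()",
    "r": "jsonlite::toJSON(head(df, 10))",
}


def silent_execute_content(language: str) -> dict[str, Any]:
    """Build execute_request content that inspects `df` without touching history."""
    return {
        "code": PREVIEW_EXPRESSIONS[language],
        "silent": True,
        "store_history": False,
        "user_expressions": {},
        "allow_stdin": False,
        "stop_on_error": True,
    }
```

A front end using this approach has to maintain one such snippet per supported kernel language, whereas the proposed messages would keep the request identical across kernels.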

Prior art

Variable explorers are an important feature of exploratory scientific software; here are some examples:

Extensions for existing Jupyter UI:

Unresolved questions

  • How should this feature interact with the debugger?

    • Should it be turned off or available? - I would vote for the second possibility for consistency
  • How to express variable type and its serialization over the wire?

  • How to specify array and data frame slicing?

  • Should we support slicing for more data types?

Future possibilities

A nice side effect of these new messages will be the ability to introduce no-code/low-code cells such as Colab forms or SQL cells (the result of the cell would be a set message to the kernel). It could also help in developing polyglot kernels.


Edited:

  • add optional value text/plain schema
  • rename type to mimetype
