
Pre-proposal: standardize object representations for ai and a protocol to retrieve them #128

Open
@mlucool

Description

Summary

To deeply integrate AI into Jupyter, we should standardize both a method on objects to represent themselves and a messaging protocol for retrieving these representations. We propose using _ai_repr_(**kwargs) -> str | dict for objects to return representations. Additionally, we suggest creating a registry in kernels (e.g. IPython) for users to set representations for objects that do not define this method, along with a new message type for retrieving these representations.

Motivation

Users should be able to include representations of object instances from their kernel as they interact with AI. This capability is what sets a productive Jupyter experience apart from other IDE-based approaches. For example, you should be able to use Jupyter AI and ask "Given @myvar, how do I add another row?" or "What's the best way to do X with @myvar?"

While reusing something like _repr_* might have been sufficient, it can slow down display requests and does not let the caller pass hints about the shape of the representation. For example, imagine a Chart. For a multimodal model, we may want to use a rendered image, but for a text-only model, we may want to pass only a description. Other model parameters or user preferences may also matter, such as the size of the context window or how verbose the user wants the representation to be.

Because of this, we suggest defining a new standard called _ai_repr_(**kwargs) -> str | dict. This method should return either a string or a MIME bundle. Additionally, since many libraries will not have this defined initially, there should be a registry where users can create a set of defaults and/or overrides, allowing them to use this feature without waiting for libraries to define it themselves.
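
A minimal sketch of what such a registry could look like is below. The names `register_ai_repr` and `get_ai_repr` are hypothetical and not part of IPython or any existing library, and the lookup order (registry first, then the object's own `_ai_repr_`, then plain `repr`) is only one possible choice for how overrides and defaults interact.

```python
from typing import Any, Callable, Dict, Union

# A representation is either a plain string or a MIME bundle (dict).
AIRepr = Union[str, Dict[str, Any]]

# Hypothetical kernel-side registry: maps a type to a user-supplied function.
_AI_REPR_REGISTRY: Dict[type, Callable[..., AIRepr]] = {}


def register_ai_repr(cls: type, func: Callable[..., AIRepr]) -> None:
    """Register a default/override representation for instances of `cls`."""
    _AI_REPR_REGISTRY[cls] = func


def get_ai_repr(obj: Any, **kwargs: Any) -> AIRepr:
    """Resolve a representation for `obj`.

    Assumed precedence: a registered function (searched along the MRO),
    then the object's own `_ai_repr_`, then the built-in `repr`.
    """
    for klass in type(obj).__mro__:
        func = _AI_REPR_REGISTRY.get(klass)
        if func is not None:
            return func(obj, **kwargs)
    method = getattr(obj, "_ai_repr_", None)
    if callable(method):
        return method(**kwargs)
    return repr(obj)


# Usage: a third-party class that does not define `_ai_repr_` itself.
class Point:
    def __init__(self, x: int, y: int) -> None:
        self.x = x
        self.y = y


register_ai_repr(Point, lambda p, **kwargs: f"Point at ({p.x}, {p.y})")
point_repr = get_ai_repr(Point(1, 2))
```

Walking the MRO lets a single registration cover subclasses as well; whether user overrides should win over a library-defined `_ai_repr_` is an open design question.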

Finally, the UI (e.g., jupyter-ai) needs a way to retrieve these representations for a given object. This is best done by introducing a new message type that can include the object and the kwargs. We expect this process to be slow at times (e.g., generating an image for a chart), so the control channel should not be used. Instead, a normal comms message can be used today, and as support for subshells improves, we can use that to avoid blocking while kernels are busy.
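
To make the request/reply shape concrete, here is a rough sketch of a kernel-side handler written as a plain function. The field names (`object_name`, `kwargs`, `status`, `data`) are assumptions made for illustration and are not part of the Jupyter messaging specification; how the handler would be wired to a comm or a future message type is left out.

```python
from typing import Any, Dict


def handle_ai_repr_request(
    content: Dict[str, Any], namespace: Dict[str, Any]
) -> Dict[str, Any]:
    """Build reply content for a hypothetical AI-repr request.

    `content` carries the variable name and caller-supplied kwargs;
    `namespace` stands in for the kernel's user namespace.
    """
    name = content["object_name"]
    kwargs = content.get("kwargs", {})

    if name not in namespace:
        return {
            "status": "error",
            "ename": "NameError",
            "evalue": f"name {name!r} is not defined",
        }

    obj = namespace[name]
    method = getattr(obj, "_ai_repr_", None)
    result = method(**kwargs) if callable(method) else repr(obj)

    # Normalize a bare string into a MIME bundle so callers see one shape.
    data = {"text/plain": result} if isinstance(result, str) else result
    return {"status": "ok", "data": data}


# Usage with a toy namespace.
class Greeting:
    def _ai_repr_(self, **kwargs: Any) -> str:
        return "A greeting object"


reply = handle_ai_repr_request(
    {"object_name": "g", "kwargs": {"multimodal": False}},
    {"g": Greeting()},
)
```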

Example

Continuing with the chart object example, we may want to add something like the code below. Typically this fictional chart returns structured data for its JS display to render, but now we want an image for the context, which we expect to be slow to compute (e.g., a headless browser may need to be launched to do this):

class JSBasedChart:
    ...

    def _ai_repr_(self, **kwargs):
        return {
            "text/plain": f"A chart titled {self.title} with series named {self.series_names}",
            "image/png": self.get_image()
        }

Other MIME types can also be used to let the caller represent the object in a way that is optimized for the model they are using (e.g., XML). For example, we could imagine a pandas DataFrame defining this method:

import io


class DataFrame:
    ...
    def _ai_repr_(self, **kwargs):
        info_buf = io.StringIO()
        self.info(buf=info_buf, memory_usage=False, show_counts=False)

        return {
            "text/plain": self.to_string(),
            "application/foo": {
                "type": "pandas.DataFrame",
                "value": f"Some random rows from the dataframe:\n{self.sample(min(5, len(self)))}",
                "structure": info_buf.getvalue()
            }
        }

Now the caller can use this MIME type to render the object in the context window as XML if it chooses (see here):

<variable>
    <name>{name}</name>
    <type>{type}</type>
    <value>{value}</value>
    <structure>{structure}</structure>
</variable>
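
As an illustration, a caller could fill this template from the returned bundle roughly as follows. The helper `render_variable` is hypothetical, `application/foo` is the placeholder MIME type from the example above, and escaping the values is an added assumption so that table text containing `<` or `&` does not break the markup.

```python
from typing import Any, Dict
from xml.sax.saxutils import escape

# The template from the proposal, with one field per line.
VARIABLE_TEMPLATE = (
    "<variable>\n"
    "    <name>{name}</name>\n"
    "    <type>{type}</type>\n"
    "    <value>{value}</value>\n"
    "    <structure>{structure}</structure>\n"
    "</variable>"
)


def render_variable(
    name: str, bundle: Dict[str, Any], mime: str = "application/foo"
) -> str:
    """Render the structured part of a MIME bundle into the XML template."""
    payload = bundle[mime]
    return VARIABLE_TEMPLATE.format(
        name=escape(name),
        type=escape(payload["type"]),
        value=escape(payload["value"]),
        structure=escape(payload["structure"]),
    )


# Usage with a hand-written bundle shaped like the DataFrame example.
example_bundle = {
    "text/plain": "a  b\n1  2",
    "application/foo": {
        "type": "pandas.DataFrame",
        "value": "rows where a < b",
        "structure": "2 columns: a, b",
    },
}
xml_text = render_variable("myvar", example_bundle)
```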

This approach intentionally mirrors how repr works in the Jupyter ecosystem, but it is focused on non-displayed reprs. In a similar fashion, we don't want to over-specify return types because we want to encourage innovation in this area.

Given the desire to query for this from the front end, we also propose a new message type similar to inspect_request, but allowing kwargs to be passed in by the caller. We intentionally do not want to define what these kwargs are at this early stage, preferring to let extension providers innovate and reach a consensus on what is useful. In the example above, we may pass multimodal=False and update the code in JSBasedChart so that it does not render an image. Alternatively, we may pass context_window=1_000_000 and let the DataFrame repr include per-column statistics, or maybe even put small tables into the context window as is.
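
For instance, the chart class could honor such a hint along the following lines. This is only a sketch: `multimodal` is one of the illustrative kwargs mentioned above and not an agreed-upon name, the constructor is invented for the example, and `get_image` returns placeholder bytes in place of the slow rendering step.

```python
from typing import Any, Dict, List


class JSBasedChart:
    def __init__(self, title: str, series_names: List[str]) -> None:
        self.title = title
        self.series_names = series_names

    def get_image(self) -> bytes:
        # Placeholder for the slow step (e.g. launching a headless browser);
        # these bytes are not a real image.
        return b"placeholder-image-bytes"

    def _ai_repr_(self, multimodal: bool = True, **kwargs: Any) -> Dict[str, Any]:
        """Return a MIME bundle; unrecognized kwargs are accepted and ignored."""
        bundle: Dict[str, Any] = {
            "text/plain": (
                f"A chart titled {self.title} "
                f"with series named {self.series_names}"
            ),
        }
        # Only pay the rendering cost when the caller can use an image.
        if multimodal:
            bundle["image/png"] = self.get_image()
        return bundle


chart = JSBasedChart("Sales", ["2023", "2024"])
text_only = chart._ai_repr_(multimodal=False)
with_image = chart._ai_repr_(context_window=1_000_000)
```

Accepting `**kwargs` and ignoring unknown keys lets callers experiment with new hints without breaking objects that have not adopted them yet.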

CC @Carreau @krassowski @SylvainCorlay
