[v3-beta] Add glossary to developer docs #1123

Open
@dlqqq


Problem

It's unclear how local variables, functions, and classes should be named, because we lack established and documented naming conventions. Some of the existing naming conventions were poorly chosen and make existing code difficult to read.

Although this issue may seem trivial, I believe that good, well-documented naming conventions can save days of effort for contributors when measured across years of development.

This issue serves two purposes:

  1. To track progress on adding contributor documentation regarding naming conventions in v3.
  2. To track proposals for new naming conventions in v3.

Contributors are absolutely welcome to offer feedback and contribute suggestions! Please leave them as comments here.

Proposed name changes

New term: chat model

In v2, "language model" generally referred to the model used in the chat. With the introduction of completion models, we need to reconsider the name "language model", as it's ambiguous whether the term refers to the LLM used in chat or the LLM used in completions.

For v3, we should prefer "chat model" as much as possible for the sake of clarity.

New terms: model IDs and model UIDs

In v2, "model ID" ambiguously refers to either the value used by Jupyter AI (e.g. openai-chat:gpt-4o) or the argument accepted by a provider class (e.g. gpt-4o). To distinguish these, we previously referred to the former as global model IDs (abbreviated as gmid or gid) and the latter as local model IDs (abbreviated as lid or lmid).

  • Furthermore, in v2, variables were named lm_id, lm_lid, and lm_gid to indicate that a model ID referred to a language model. The same pattern applied to embedding models (em_id, em_gid, em_lid) and completion models (cm_id, cm_gid, cm_lid).

I have found this very confusing (even though I set these conventions). The v2 definitions produce 9 different ways to label model IDs.

For v3, I propose new definitions to eliminate ambiguity in the term "model ID":

  • Model name: the argument which identifies a model to a provider (e.g. gpt-4o).
  • Model ID: the argument which identifies a model to Jupyter AI (e.g. openai-chat:gpt-4o).
  • The definition of provider ID remains unchanged.
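To make the distinction concrete, here is a minimal sketch of how the proposed terms relate. It assumes a model ID has the layout `<provider ID>:<model name>`, which is inferred from the openai-chat:gpt-4o example and not stated explicitly above. `ParsedModelId` and `parse_model_id` are hypothetical names used for illustration only, not existing Jupyter AI APIs.

```python
from typing import NamedTuple


class ParsedModelId(NamedTuple):
    """Hypothetical container for the two parts of a model ID."""

    provider_id: str  # e.g. "openai-chat"
    model_name: str  # e.g. "gpt-4o"


def parse_model_id(model_id: str) -> ParsedModelId:
    """Split a model ID into its provider ID and model name.

    Assumption: a model ID looks like "<provider ID>:<model name>".
    Only the first colon is treated as the separator, so a model name
    that itself contains a colon is kept whole.
    """
    provider_id, sep, model_name = model_id.partition(":")
    if not sep or not provider_id or not model_name:
        raise ValueError(f"Not a valid model ID: {model_id!r}")
    return ParsedModelId(provider_id, model_name)


# Variable names follow the proposed v3 convention.
chat_model_id = "openai-chat:gpt-4o"
chat_model_name = parse_model_id(chat_model_id).model_name
```

With this layout, a bare value such as gpt-4o is a model name, not a model ID, so passing it to `parse_model_id` raises a `ValueError`.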

Local variables should be renamed accordingly:

  • lm_gid => chat_model_id
  • lm_lid => chat_model_name
  • em_gid => embedding_model_id
  • em_lid => embedding_model_name
  • etc.
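As a rough illustration of how the renames could be applied mechanically, the sketch below rewrites whole identifiers in a string of source code. The mapping contains only the four renames listed above. `RENAMES` and `rename_identifiers` are hypothetical names, and a plain regex pass like this also touches matches inside comments and string literals, so any real migration would still need manual review.

```python
import re

# v2 name -> v3 name, taken only from the renames listed in this issue.
RENAMES = {
    "lm_gid": "chat_model_id",
    "lm_lid": "chat_model_name",
    "em_gid": "embedding_model_id",
    "em_lid": "embedding_model_name",
}

# \b restricts matches to whole identifiers, so a longer name that merely
# contains "lm_gid" (e.g. "film_gid") is left untouched.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, RENAMES)) + r")\b")


def rename_identifiers(source: str) -> str:
    """Return `source` with every listed v2 identifier replaced by its v3 name."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(1)], source)


old_line = "lm_lid = lm_gid.split(':', 1)[1]"
new_line = rename_identifiers(old_line)
```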

Labels: documentation, enhancement