Skip to content

pass globally unique conversation identifier as sessionId in databricks api call#8576

Merged
lifeizhou-ap merged 2 commits intomainfrom
lifei/databricks-api-user-data
Apr 16, 2026
Merged

pass globally unique conversation identifier as sessionId in databricks api call#8576
lifeizhou-ap merged 2 commits intomainfrom
lifei/databricks-api-user-data

Conversation

@lifeizhou-ap
Copy link
Copy Markdown
Collaborator

@lifeizhou-ap lifeizhou-ap commented Apr 16, 2026

Summary

Why

There is some requirements for the org that would like to pass session id (conversation id) to databricks api call in the client_request_id

What

  • Generate a persistent instance ID (UUID) per machine to combine with the local session ID, forming a globally unique conversation identifier

  • Send this identifier as a client_request_id JSON field in Databricks API requests, gated behind the GOOSE_DATABRICKS_CLIENT_REQUEST_ID config flag or env

Testing

Checked that data is in the databricks table

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0e31b24ca1

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +261 to +263
let enabled = std::env::var("GOOSE_DATABRICKS_CLIENT_REQUEST_ID")
.map(|v| v == "true")
.unwrap_or_else(|_| {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Parse env flag value as boolean

resolve_instance_id treats GOOSE_DATABRICKS_CLIENT_REQUEST_ID as enabled only when the env var is exactly "true", so common boolean spellings like "TRUE", "True", or "1" silently disable the feature. This makes the new client_request_id behavior appear flaky in real deployments where env values are often normalized differently by shells/CI, and it is inconsistent with other boolean flag parsing in this codebase.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

used config getParam only as it also handles the env var

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4956a604c1

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +24 to +27
if let Some(parent) = path.parent() {
let _ = fs::create_dir_all(parent);
}
let _ = fs::write(&path, &id);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Fail closed when persisting instance ID

load_or_create silently ignores both directory-creation and file-write failures, then still returns a newly generated UUID. In environments where Paths::state_dir() is not writable (for example, locked-down containers or read-only home/state mounts), this causes a different instance_id to be generated on each process restart, so the new client_request_id prefix is not actually stable across runs even though the feature depends on persistence.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If state_dir() is not writable, session storage and config would also be broken. Goose wouldn't function at all. This is an edge case

/// Returns a stable, globally unique identifier for this Goose installation.
/// The ID is generated once and persisted to disk, surviving restarts.
pub fn get_instance_id() -> &'static str {
&INSTANCE_ID
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we did one of these for posthog (or maybe that lib did it, we didn't need to?)

Copy link
Copy Markdown
Collaborator Author

@lifeizhou-ap lifeizhou-ap Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, posthog had the same thing. Initially I want to reuse the posthog, but I found it has more than instance id and has state in the file, and it also creates the file only when telemetry is enabled. We also need to consider backward compatibility.

So I decided to create the instance id and it can be used for other scenarios in the future

Copy link
Copy Markdown
Collaborator

@michaelneale michaelneale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think seems fine to do - just wasn't sure if there was a previous existing fingerprint which was used for other monitoring in the past

@lifeizhou-ap
Copy link
Copy Markdown
Collaborator Author

lifeizhou-ap commented Apr 16, 2026

I think seems fine to do - just wasn't sure if there was a previous existing fingerprint which was used for other monitoring in the past

Thanks @michaelneale.

In Goose, we have 2 types of trackings. Otle (flow to the targets of Snowflakes and Datadog in our org) and Posthog, mainly for user behaviours.

Also on the Databrick side we have tables tracking for internal org usage especially on the LLM usage via Databricks local inference. This PR is to send the data via databricks api call to populate them in the Databricks table

@lifeizhou-ap lifeizhou-ap added this pull request to the merge queue Apr 16, 2026
Merged via the queue into main with commit a5a497b Apr 16, 2026
21 checks passed
@lifeizhou-ap lifeizhou-ap deleted the lifei/databricks-api-user-data branch April 16, 2026 04:45
jh-block added a commit to sunilkumarvalmiki/goose that referenced this pull request Apr 16, 2026
…l-placeholder

* origin/main: (64 commits)
  fix: expand tool calls by default when Response Style is Detailed (aaif-goose#8478)
  fix: create logs dir before writing llm request log (aaif-goose#8522)
  fix: enable token usage tracking and configurable stream timeout for Ollama provider (aaif-goose#8493)
  fix tauri-plugin-dialog version constraint to match other plugins (aaif-goose#8542)
  call goose serve from tauri frontend via goose-acp client (aaif-goose#8549)
  failed the script when bundle:default fails and cleanup "alpha"  (aaif-goose#8580)
  pass globally unique conversation identifier as sessionId in databricks api call (aaif-goose#8576)
  fix: use sqlx chrono decode for thread timestamps instead of manual parsing (aaif-goose#8575)
  docs: remove stale gemini-acp references (aaif-goose#8572)
  show individual untracked files in git changes widget (aaif-goose#8574)
  fix: update publishing flow to include new sdk dir (aaif-goose#8573)
  fix: remove double border on content in chat (aaif-goose#8545)
  chore(goose2): `just goose2 <command>` with the addition of `just goose2 kill` (aaif-goose#8570)
  Lifei/oltp data (aaif-goose#8458)
  Sidebar polish: search copy, dividers, project reorder, fix DnD (aaif-goose#8568)
  remove the workflow_dispatch check (aaif-goose#8563)
  fix: one more rename (aaif-goose#8562)
  fix(desktop): accept self-signed certs from configured external goosed host (aaif-goose#8400)
  alexhancock/npm-bumps (aaif-goose#8557)
  Remove npm publish from release for now (aaif-goose#8558)
  ...
michaelneale added a commit that referenced this pull request Apr 17, 2026
* main: (37 commits)
  polish: refine sidebar activity indicators, add placeholder token, and tidy search field (#8606)
  feat: add /edit command to cli for on-demand prompt editing (#8566)
  docs(mcp): add Rendex MCP Server extension tutorial (#8541)
  Lifei/delete tauri backend acp (#8582)
  chore: set goose binaries as executable in package.json (#8589)
  feat: add Novita AI as declarative provider (#8432)
  feat: add Kimi Code provider with OAuth device flow authentication (#8466)
  fix: chat loading-state model placeholder (#8431)
  fix: expand tool calls by default when Response Style is Detailed (#8478)
  fix: create logs dir before writing llm request log (#8522)
  fix: enable token usage tracking and configurable stream timeout for Ollama provider (#8493)
  fix tauri-plugin-dialog version constraint to match other plugins (#8542)
  call goose serve from tauri frontend via goose-acp client (#8549)
  failed the script when bundle:default fails and cleanup "alpha"  (#8580)
  pass globally unique conversation identifier as sessionId in databricks api call (#8576)
  fix: use sqlx chrono decode for thread timestamps instead of manual parsing (#8575)
  docs: remove stale gemini-acp references (#8572)
  show individual untracked files in git changes widget (#8574)
  fix: update publishing flow to include new sdk dir (#8573)
  fix: remove double border on content in chat (#8545)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants