pass globally unique conversation identifier as sessionId in databricks api call #8576
lifeizhou-ap merged 2 commits into main
Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0e31b24ca1
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
let enabled = std::env::var("GOOSE_DATABRICKS_CLIENT_REQUEST_ID")
    .map(|v| v == "true")
    .unwrap_or_else(|_| {
Parse env flag value as boolean
resolve_instance_id treats GOOSE_DATABRICKS_CLIENT_REQUEST_ID as enabled only when the env var is exactly "true", so common boolean spellings like "TRUE", "True", or "1" silently disable the feature. This makes the new client_request_id behavior appear flaky in real deployments where env values are often normalized differently by shells/CI, and it is inconsistent with other boolean flag parsing in this codebase.
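A tolerant check along the lines Codex suggests could look like the sketch below. This is a minimal illustration, not the codebase's actual config API; `flag_enabled` is a hypothetical helper name, and the accepted truthy spellings are assumptions:

```rust
/// Hypothetical helper: treats common truthy spellings of a flag value as
/// enabled, instead of matching only the exact string "true".
fn flag_enabled(value: Option<&str>) -> bool {
    value
        .map(|v| matches!(v.trim().to_ascii_lowercase().as_str(), "true" | "1" | "yes" | "on"))
        .unwrap_or(false)
}
```

It would be called with the env lookup result, e.g. `flag_enabled(std::env::var("GOOSE_DATABRICKS_CLIENT_REQUEST_ID").ok().as_deref())`, so `"TRUE"`, `"True"`, and `"1"` all enable the feature while an unset or unrecognized value disables it.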
Used the config `getParam` only, as it also handles the env var.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4956a604c1
if let Some(parent) = path.parent() {
    let _ = fs::create_dir_all(parent);
}
let _ = fs::write(&path, &id);
Fail closed when persisting instance ID
load_or_create silently ignores both directory-creation and file-write failures, then still returns a newly generated UUID. In environments where Paths::state_dir() is not writable (for example, locked-down containers or read-only home/state mounts), this causes a different instance_id to be generated on each process restart, so the new client_request_id prefix is not actually stable across runs even though the feature depends on persistence.
If state_dir() is not writable, session storage and config would also be broken, and Goose wouldn't function at all. This is an edge case.
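For contrast, a load-or-create helper that propagates persistence failures (rather than silently regenerating the ID each run) could be sketched as follows. `load_or_create_id` is a hypothetical stand-in for the PR's helper, and the timestamp-based id is a dependency-free placeholder for the real UUID generation:

```rust
use std::fs;
use std::path::Path;
use std::time::{SystemTime, UNIX_EPOCH};

/// Sketch: read a persisted id if present; otherwise generate one and persist
/// it, surfacing I/O errors instead of returning a fresh id on every restart.
fn load_or_create_id(path: &Path) -> std::io::Result<String> {
    if let Ok(existing) = fs::read_to_string(path) {
        let existing = existing.trim().to_string();
        if !existing.is_empty() {
            return Ok(existing);
        }
    }
    // Placeholder for uuid::Uuid::new_v4(); a nanosecond timestamp keeps the
    // sketch free of external crates.
    let id = format!(
        "{:x}",
        SystemTime::now().duration_since(UNIX_EPOCH).unwrap().as_nanos()
    );
    if let Some(parent) = path.parent() {
        fs::create_dir_all(parent)?;
    }
    fs::write(path, &id)?;
    Ok(id)
}
```

With this shape the caller can decide whether an unwritable state dir is fatal or merely worth logging, rather than the failure being invisible.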
/// Returns a stable, globally unique identifier for this Goose installation.
/// The ID is generated once and persisted to disk, surviving restarts.
pub fn get_instance_id() -> &'static str {
    &INSTANCE_ID
I wonder if we did one of these for posthog (or maybe that lib did it, we didn't need to?)
Yes, posthog had the same thing. Initially I wanted to reuse the posthog one, but I found it holds more than the instance id, keeps state in the file, and only creates the file when telemetry is enabled. We also need to consider backward compatibility.
So I decided to create the instance id separately, so it can also be used for other scenarios in the future.
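The `&'static str` return type in the snippet above suggests a process-wide lazy static. One std-only way to get that shape is `OnceLock`; this is a sketch, and the init closure is a placeholder for the real load-or-create logic:

```rust
use std::sync::OnceLock;

static INSTANCE_ID: OnceLock<String> = OnceLock::new();

/// Returns the same &'static str for the lifetime of the process; the id is
/// computed exactly once, on the first call.
pub fn get_instance_id() -> &'static str {
    INSTANCE_ID.get_or_init(|| {
        // Placeholder: the real implementation loads or creates a persisted UUID.
        "example-instance-id".to_string()
    })
}
```

`get_or_init` guarantees the closure runs at most once even under concurrent first calls, so every caller sees the same pointer.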
michaelneale
left a comment
I think it seems fine to do - I just wasn't sure if there was a previously existing fingerprint that was used for other monitoring in the past.
Thanks @michaelneale. In Goose, we have 2 types of tracking: OTel (flowing to Snowflake and Datadog targets in our org) and posthog, mainly for user behaviour. Also, on the Databricks side we have tables tracking internal org usage, especially LLM usage via Databricks local inference. This PR sends the data via the Databricks API call to populate the Databricks tables.
Summary
Why
There is a requirement from the org to pass the session id (conversation id) to the Databricks API call in the client_request_id field.
What
- Generate a persistent instance ID (UUID) per machine to combine with the local session ID, forming a globally unique conversation identifier
- Send this identifier as a client_request_id JSON field in Databricks API requests, gated behind the GOOSE_DATABRICKS_CLIENT_REQUEST_ID config flag or env var
Testing
Checked that the data is in the Databricks table
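Putting the pieces together, the composed identifier might be built roughly like this. The hyphen join and the helper name `client_request_id` are illustrative assumptions, since the PR does not spell out the exact format:

```rust
/// Sketch: combine the per-machine instance id with the local session id into
/// one globally unique conversation identifier for the client_request_id field.
fn client_request_id(instance_id: &str, session_id: &str) -> String {
    format!("{instance_id}-{session_id}")
}
```

Because the instance id is unique per machine and the session id is unique within a machine, the combined value is globally unique across installations.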