[AgentProfile][sdk] AgentProfileStore + FK lifecycle#3775
Conversation
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
Verified the new SDK profile-store API, secret-at-rest handling, LLM-profile FK guard/cascade behavior, and concurrent cascade path by running real SDK scripts against the PR checkout.
Does this PR achieve its stated goal?
Yes. The PR set out to add an AgentProfileStore plus FK lifecycle helpers that prevent existing llm_profile_ref values from dangling when LLM profiles are deleted or renamed; exercising the SDK shows the new APIs are absent on the base branch, available on the PR branch, and work for persistence, summaries, guarded deletion, cascade rename, and secret-preserving rewrites. I also verified encrypted skill MCP secrets stay out of cleartext at rest and remain decryptable after a cascade rewrite.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed successfully. |
| CI Status | gh pr checks: sdk-tests and qa-changes were still in progress; the other listed checks were passing at observation time. |
| Functional Verification | ✅ Real SDK scripts completed successfully on the PR head; no test suite, linters, or type checkers were run. |
Functional Verification
Test 1: API availability and basic AgentProfileStore lifecycle
Step 1 — Establish baseline on origin/agent-profile-model:
Ran:
git checkout --detach origin/agent-profile-model
OPENHANDS_SUPPRESS_BANNER=1 uv run python - <<'BASELINE_PY'
try:
from openhands.sdk.profiles import AgentProfileStore, delete_llm_profile, rename_llm_profile
except Exception as exc:
print(type(exc).__name__ + ': ' + str(exc))
else:
print('UNEXPECTED_AVAILABLE')
BASELINE_PYObserved:
ImportError: cannot import name 'AgentProfileStore' from 'openhands.sdk.profiles'
This establishes the new user-facing SDK store/FK APIs are not present on the base branch.
Step 2 — Apply PR changes: checked out agent-profile-store at c92976923191d1c1028a845f14719e43d5297b97.
Step 3 — Run real store operations on the PR head:
Ran a Python script that created temporary stores, saved OpenHands and ACP profiles, loaded them, renamed one, listed summaries, exercised max_profiles, and tried an invalid profile name. Key output:
basic.list.after_save=acp.json,oh.json
basic.loaded={"id_preserved": true, "llm_profile_ref": "default", "mcp_server_refs": ["fetch"], "name": "oh", "revision": 7, "type": "OpenHandsAgentProfile"}
basic.renamed={"files": ["acp.json", "oh-renamed.json"], "id_preserved": true, "name": "oh-renamed"}
basic.summaries=[{"agent_kind": "acp", ... "llm_profile_ref": null, ...}, {"agent_kind": "openhands", ... "llm_profile_ref": "default", "mcp_server_refs": ["fetch"], "name": "oh-renamed", "revision": 7}]
basic.max_profiles=blocked:Profile limit reached (2).
basic.invalid_name=blocked:Invalid profile name: 'bad/name'
This confirms the new store is usable through the exported SDK API and performs the expected lifecycle operations.
Test 2: Secret-at-rest and summary projection behavior
Baseline: same as Test 1 — the store API is unavailable on the base branch, so this behavior cannot be exercised there.
PR head verification:
Ran real SDK scripts saving Skill.mcp_tools secrets with and without a cipher, then inspected the persisted profile JSON and summaries. Key output:
secret.encrypted_at_rest={"cleartext_absent": true, "decrypts_to_secret": true, "has_encrypted_token": true, "summary_contains_skills": false}
default_redaction={"cleartext_absent": true, "redacted_present": false}
cascade_secret_preservation={"cleartext_absent": true, "decrypts_after_cascade": true, "llm_profile_ref": "renamed"}
secret_summary_projection={"contains_secret": false, "contains_skills_key": false, "summary_count": 1, "summary_keys": ["agent_kind", "id", "llm_profile_ref", "mcp_server_refs", "name", "revision"]}
This confirms no cleartext MCP secret was written, cipher-backed values were recoverable, cascade rename did not corrupt encrypted MCP tool data, and summaries omit skill/secret payloads.
Test 3: LLM-profile FK delete guard and rename cascade
Baseline: same as Test 1 — delete_llm_profile / rename_llm_profile are unavailable on the base branch.
PR head verification:
Ran a Python script that created LLM profiles plus agent profiles referencing them, attempted deletion of a referenced LLM profile, then renamed the LLM profile and reloaded affected agent profiles. Key output:
fk.referrers.default=fk-a,fk-b,oh-renamed,secretful
fk.delete_referenced=blocked:fk-a,fk-b,oh-renamed,secretful
fk.llm_after_blocked_delete=default.json,unused.json
fk.rename_result={"llm_files": ["renamed.json", "unused.json"], "old_referrers": [], "refs_after": {"fk-a": "renamed", "fk-b": "renamed", "fk-c": "other"}, "rewritten": ["fk-a", "fk-b", "oh-renamed", "secretful"]}
fk.delete_unreferenced=renamed.json
This shows deletion is blocked while references exist, the LLM profile remains after the blocked delete, rename cascades to all matching agent profiles, unrelated refs remain unchanged, and deleting an unreferenced LLM profile succeeds.
Test 4: Concurrent cascade access
PR head verification:
Ran concurrent reader threads calling find_referrers while another thread ran cascade_rename over 30 profiles. Key output:
concurrency={"errors": [], "final_new_count": 30, "final_old_count": 0, "rewritten_count": 30, "sample_snapshots": [[30, 0], [30, 0], [30, 0], [30, 0], [30, 0]]}
This confirms the concurrent path completed without runtime errors and all profiles ended up referencing the renamed LLM profile.
Issues Found
None.
This review was created by an AI agent (OpenHands) on behalf of the user.
|
✅ Review complete. This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here. |
all-hands-bot
left a comment
There was a problem hiding this comment.
Code Review: [AgentProfile][sdk] AgentProfileStore + FK lifecycle
🟢 Good taste — The implementation follows the established LLMProfileStore patterns, and the FK lifecycle design is clean and correct.
[IMPROVEMENT OPPORTUNITIES] (Should fix - violates good taste)
-
[
openhands-sdk/openhands/sdk/profiles/agent_profile_store.py, Line ~215] Docstring Mismatch: Thedelete_llm_profiledocstring states "Lock order: agent-profiles (held across check + delete), then llm-profiles (insidellm_store.delete)" — but the code only explicitly acquires theagent_storelock. Thellm_store.delete()call acquires its own lock internally. The description isn't wrong, but it could be clearer: "Holds the agent-profiles lock across the referrer check and delegates deletion to llm_store (which manages its own lock)." -
[
openhands-sdk/openhands/sdk/profiles/profile_refs.py, Line ~40] Minor Clarity: TheProfileReferenced.__init__usesjoined = ", ".join(self.referrers) or "<none>"— theor "<none>"branch is unreachable becauseself.referrersis set from the input, but the docstring comment says "routers surface it in a 409" so empty lists are expected. Consider just using the join result directly or clarifying the edge case.
[STYLE NOTES]
- The
# Required: ``AgentProfileStore.list()`` shadows the builtin...comment at the top ofagent_profile_store.pyis appropriate — it explains a deliberate design decision that would otherwise confuse readers. ✅
[TESTING GAPS]
- Tests are comprehensive: real code paths, concurrent access, atomic operations, cipher round-trips, FK cascade scenarios. No gaps found. ✅
[RISK ASSESSMENT]
- [Overall PR]
⚠️ Risk Assessment: 🟢 LOW
The PR introduces a new module with no external API changes, uses atomic file operations with proper locking, and the FK lifecycle correctly prevents dangling references through lock ordering. Tests provide strong coverage of edge cases including concurrent access patterns.
VERDICT:
✅ Worth merging: Solid implementation with only minor documentation improvements suggested.
KEY INSIGHT:
The lock ordering discipline (agent-profiles lock before llm-profiles) is the critical invariant that prevents TOCTOU races in the FK lifecycle — it's well-documented and correctly enforced throughout.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
AgentProfileStore and the LLM-profile FK lifecycle worked end-to-end through the public SDK APIs, with secrets protected at rest and referenced LLM deletes blocked.
Does this PR achieve its stated goal?
Yes. The PR set out to add a persistent AgentProfileStore plus foreign-key lifecycle so llm_profile_ref does not silently dangle when LLM profiles are deleted or renamed. I verified the base branch does not expose AgentProfileStore, then exercised the PR as an SDK user: saved/loaded an agent profile with encrypted MCP skill secrets, confirmed list_summaries() only returns the metadata projection, confirmed delete is blocked with ProfileReferenced, and confirmed rename_llm_profile() renames the LLM profile and cascades the agent profile ref.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed successfully with uv-managed dependencies installed. |
| CI Status | ✅ Visible functional checks passed (Run tests, pre-commit, docstrings, PR description); this QA job was still in progress and PR artifact cleanup checks were skipped. |
| Functional Verification | ✅ SDK-level user workflow passed on PR commit c92976923191d1c1028a845f14719e43d5297b97. |
Functional Verification
Test 1: AgentProfileStore availability before and after the PR
Step 1 — Establish baseline without the feature:
Checked out origin/agent-profile-model and ran:
uv run python /tmp/qa_agent_profile_baseline.pyObserved:
BASELINE: attempting to use AgentProfileStore from the public SDK profiles API
RESULT: AgentProfileStore is unavailable before this PR
ERROR: ImportError: cannot import name 'AgentProfileStore' from 'openhands.sdk.profiles'
This confirms the base branch has the AgentProfile model substrate but not the new public store API.
Step 2 — Apply the PR's changes:
Checked out c92976923191d1c1028a845f14719e43d5297b97.
Step 3 — Re-run with the PR in place:
Ran an SDK user workflow script that imported AgentProfileStore, delete_llm_profile, rename_llm_profile, and related public APIs. The import succeeded and the behavior script continued through all store operations.
Test 2: Store lifecycle, secret-at-rest behavior, summaries, and FK lifecycle
Step 1 — Baseline:
On the base branch the public import failed, so these store/FK operations could not be performed there.
Step 2 — Apply the PR's changes:
Checked out c92976923191d1c1028a845f14719e43d5297b97 and ran:
uv run python /tmp/qa_agent_profile_pr.pyObserved key output:
PR: exercising AgentProfileStore and FK lifecycle through SDK APIs
SAVED_FILE: writer.json
NO_CLEARTEXT_SECRET_AT_REST: True
ENCRYPTED_TOKEN_PRESENT: True
LOADED_PROFILE: name=writer ref=default revision=7
SUMMARY_PROJECTION: [{"agent_kind": "openhands", "id": "...", "llm_profile_ref": "default", "mcp_server_refs": ["fetch"], "name": "writer", "revision": 7}]
REFERRERS_BEFORE_DELETE: ['writer']
DELETE_BLOCKED: referrers=['writer']
LLM_PROFILES_AFTER_BLOCKED_DELETE: ['default.json']
RENAME_REWRITTEN_REFERRERS: ['writer']
LLM_PROFILES_AFTER_RENAME: ['renamed.json']
RELOADED_REF_AFTER_RENAME: renamed
AGENT_PROFILE_RENAMED: name=writer2 id_preserved=True
MAX_PROFILES_BLOCKED: Profile limit reached (3).
AGENT_PROFILES_AFTER_DELETE: ['second.json', 'third.json']
RESULT: PASS - AgentProfileStore and FK lifecycle behaved as described
This shows the PR delivers the claimed user-visible SDK behavior: profiles persist and load, MCP tool secrets are not stored as cleartext when a cipher is provided, summaries avoid loading skill data, referenced LLM profile deletion is prevented, LLM renames cascade into agent profiles, agent-profile rename preserves the stable id, max_profiles blocks new profiles over the cap, and deletion removes the profile from the store.
Issues Found
None.
This review was created by an AI agent (OpenHands) on behalf of the user.
|
Thanks for the review! Addressed in 795a32b:
Re-requesting review on the new SHA. |
|
✅ Review complete. This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here. |
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
Verified the new AgentProfile persistence and LLM-profile FK lifecycle through real SDK usage; the PR delivers the stated store + restrict/delete + cascade/rename behavior.
Does this PR achieve its stated goal?
Yes. On the base branch, AgentProfileStore is not available; on the PR commit, I successfully used the new SDK APIs to save/load/list/rename/delete OpenHands and ACP agent profiles, keep skill MCP secrets encrypted at rest, block deletion of a referenced LLM profile, cascade an LLM profile rename to the referencing agent profile, and preserve consistency under concurrent FK rename access.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed successfully; no tests/linters/type checkers were run. |
| CI Status | 🟡 Product checks were green when checked; qa-changes was still in progress. |
| Functional Verification | ✅ Real SDK scripts exercised the new store and FK lifecycle successfully. |
Functional Verification
Test 1: New AgentProfileStore capability exists only after this PR
Step 1 — Establish baseline without the PR:
Ran git checkout --detach origin/agent-profile-model && uv run python /tmp/qa_agent_profile_store_baseline.py:
ImportError: cannot import name 'AgentProfileStore' from 'openhands.sdk.profiles'
This shows the base branch does not provide the new store API.
Step 2 — Apply the PR changes:
Checked out 795a32b69f9d59f820dab091e0c157f15d3e5a6d.
Step 3 — Re-run with the PR behavior:
Ran uv run python /tmp/qa_agent_profile_store_pr.py, which created real temp-backed AgentProfileStore and LLMProfileStore instances, saved OpenHands + ACP profiles, encrypted a skill MCP secret, attempted a guarded delete, cascaded an LLM profile rename, renamed the agent profile, and deleted an unreferenced LLM profile:
{
"agent_profiles": ["codex-helper.json", "daily-driver-renamed.json"],
"blocked_delete_referrers": ["daily-driver"],
"cascade_rewritten": ["daily-driver"],
"final_agent_name": "daily-driver-renamed",
"final_llm_ref": "renamed-default",
"llm_profiles": ["renamed-default.json"],
"secret_cleartext_found_on_disk": false
}This confirms the advertised store operations work, referenced LLM deletion is blocked, cascade rename updates llm_profile_ref, and the MCP secret was not stored in cleartext.
Test 2: Concurrent FK rename access
Ran a user-style concurrency script with 30 agent profiles referencing one LLM profile, 6 concurrent readers, and one rename_llm_profile(...) call:
{
"errors": [],
"rewritten_count": 30,
"default_referrers_after": 0,
"renamed_referrers_after": 30,
"llm_profiles_after": ["renamed-default.json"],
"observations_recorded": 180
}This confirms the cross-store rename completed without reader errors and all 30 referrers ended on the renamed LLM profile.
Test 3: Store edge behavior
Ran a real SDK script for user-facing edge cases:
{
"invalid_name_result": "Invalid profile name",
"max_profiles_result": "Profile limit reached (1).",
"stored_profiles": ["one.json"]
}This confirms invalid names are rejected and max_profiles blocks creating a second profile over the configured cap.
Issues Found
None.
This QA review was created by an AI agent (OpenHands) on behalf of the user.
all-hands-bot
left a comment
There was a problem hiding this comment.
Code Review: AgentProfileStore + FK lifecycle
🟢 Good taste — Clean, well-scoped implementation. No fundamental issues.
Summary
This PR introduces AgentProfileStore (a standalone persistence layer for AgentProfile launch specs) and profile_refs (FK lifecycle helpers between LLM profiles and agent profiles). The design mirrors the existing LLMProfileStore pattern, which is a good choice — consistency beats novelty.
Files Reviewed
| File | LOC | Assessment |
|---|---|---|
agent_profile_store.py |
307 | 🟢 Well-structured |
profile_refs.py |
173 | 🟢 Concise, focused |
__init__.py |
18 | 🟢 Clean exports |
| Tests | 864 | 🟢 Comprehensive |
Architecture
The FK lifecycle (find_referrers, cascade_rename, delete_llm_profile, rename_llm_profile) correctly holds the agent-profiles lock first before touching the LLM profile store — this prevents deadlocks and TOCTOU races. The comment documenting lock acquisition order is valuable and should stay.
The surgical JSON rewrite in _rewrite_refs is elegant: it only touches llm_profile_ref, leaving encrypted mcp_tools and stable id untouched without needing a cipher. Good separation of concerns.
Minor Observations
[agent_profile_store.py, Line 75] The docstring mentions "secret-free at rest" — this is accurate given the masking serializer on Skill.mcp_tools. The deferred decryption note (#3717) is appropriate.
[profile_refs.py, Line 16-21] The ProfileReferenced exception with a referrers attribute is a clean pattern. Callers can inspect the list programmatically rather than parsing error messages.
Testing
The test suite covers:
- CRUD operations (save, load, rename, delete)
- Validation (invalid names, max_profiles limit)
- Concurrency (10-thread concurrent saves, mixed read/write)
- FK lifecycle (find_referrers, cascade_rename, guarded delete/rename)
- Edge cases (corrupt JSON, non-dict files, invalid filenames)
The concurrency tests are especially valuable given the file-based storage. Tests are real (no mocks), fast (temp dir), and deterministic.
[RISK ASSESSMENT]
⚠️ Risk Assessment: 🟢 LOW- New module additions with no impact on existing APIs
- Comprehensive test coverage including concurrency
- Follows established patterns from
LLMProfileStore
VERDICT:
✅ Worth merging — Solid implementation. The FK lifecycle closes a real gap (preventing dangling references on LLM profile deletion/rename), and the store design is clean and extensible.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation
AgentProfileStore mirrors LLMProfileStore (one JSON file per profile under
~/.openhands/agent-profiles, filename = name, stable uuid id inside the file):
save/load/delete/rename/list_summaries, filelock, atomic write, name-pattern
validation, max-profiles cap, and a cipher param. With a cipher, skills[].
mcp_tools env/headers are encrypted at rest (recoverable) rather than redacted;
every other field is a reference and dumps in the clear. list_summaries
projects {id, name, agent_kind, revision, llm_profile_ref, mcp_server_refs}
without instantiating secrets.
profile_refs.py keeps llm_profile_ref from dangling: find_referrers,
cascade_rename (surgical raw-JSON ref rewrite, leaves encrypted mcp_tools and
the id untouched), ProfileReferenced(referrers), and guarded cross-store
delete/rename. Lock order is agent-profiles then llm-profiles, held across
check+mutate to close the referrer TOCTOU. FK scope is llm_profile_ref only;
mcp_server_refs are resolve-time (#3717), not constrained here.
Skill.mcp_tools has a masking serializer but no symmetric decrypt validator, so
encrypted values load as ciphertext; decryption is deferred to the resolver
(#3717). Tests cover both variants round-tripping with/without a cipher,
no-cleartext-at-rest, restrict-on-delete, cascade-rename under concurrency, and
the list_summaries no-decrypt projection.
Stacked on #3757 (AgentProfile union); rebase onto main once that merges.
Closes #3716
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Address review nit: spell out that the guarded delete/rename hold the agent-profiles lock and delegate to the llm_store (which manages its own lock), preserving the agent-profiles-before-llm-profiles order. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
795a32b to
740fc7c
Compare
Python API breakage checks — ✅ PASSEDResult: ✅ PASSED |
REST API breakage checks (OpenAPI) — ✅ PASSEDResult: ✅ PASSED |
HUMAN:
The persistence store for agent profiles plus the foreign-key
lifecycle that keeps an
llm_profile_reffrom dangling when the underlying LLMprofile is renamed or deleted. Stacked on the model PR #3757.
AGENT:
Why
Reference composition (epic #3713) needs a real store for
AgentProfiles and aforeign-key lifecycle so
llm_profile_refnever silently dangles. This logicbelongs in the store/SDK layer (not routers) because it must hold the
AgentProfileStorefile lock while scanning + rewriting refs — a save-time FKenforced only at the HTTP layer would be a TOCTOU.
mcp_server_refsare keysinto the user's independently-mutable global
mcp_config, so they are checkedat resolve-time (#3717), not constrained here.
Summary
AgentProfileStore(openhands/sdk/profiles/agent_profile_store.py) —clones the
LLMProfileStoreshape under~/.openhands/agent-profiles: oneJSON file per profile, filename =
name, stableid(uuid) inside the file.save/load/delete/rename/list/list_summaries,filelock, atomictemp-file +
Path.replacewrite, name-pattern validation,max_profilescap,and a
cipherparam. With a cipher,skills[].mcp_toolsenv/headers areencrypted at rest (recoverable) rather than redacted; every other field is
a reference and dumps in the clear.
renamekeeps the in-filenamein syncvia a surgical raw-JSON edit (no cipher needed,
idpreserved).list_summariesprojects{id, name, agent_kind, revision, llm_profile_ref, mcp_server_refs}without instantiating secrets.openhands/sdk/profiles/profile_refs.py) —find_referrers(store, llm_profile_name),cascade_rename(store, old, new)(surgical ref rewrite that leaves encrypted
mcp_toolsand theiduntouched),
ProfileReferenced(referrers)(routers map → 409), plus guardedcross-store
delete_llm_profile/rename_llm_profile. Lock order isagent-profiles then llm-profiles, held across check+mutate to close the
referrer TOCTOU (documented in the module docstring).
Skill.mcp_toolshas a masking serializer(fix(skills): mask Skill.mcp_tools credentials at rest #3774) but no symmetric decrypt validator, so encrypted values load back as
ciphertext. The store guarantees no-cleartext-at-rest and threads the cipher
through
loadfor parity; decryption ofmcp_toolsis deferred to theresolver ([AgentProfile][sdk] resolve_agent_profile(): collision-checked composition + resource-specific secret channels #3717), which holds the cipher. Noted in
load's docstring and adedicated test. Adding a symmetric
Skill.mcp_toolsdecrypt validator is aclean follow-up to complete fix(skills): mask Skill.mcp_tools credentials at rest #3774's "decrypted on restore" intent — left out
here to keep this PR scoped to store + FK.
Issue Number
Closes #3716
How to Test
New unit tests mirror the
LLMProfileStoresuite (init / list / save / load /delete / rename /
list_summaries/max_profiles/ concurrency) and add theprofile-specific contracts: both union variants round-trip with and without a
cipher, no cleartext skill secret at rest, restrict-on-delete, cascade-rename
under concurrent access, and the
list_summariesno-decrypt projection.Acceptance criteria covered by tests:
test_save_with_cipher_encrypts_skill_secret/test_save_without_cipher_redacts_skill_secret— no cleartext skill secret at rest when a cipher is set.test_delete_llm_profile_blocked_when_referenced— deleting a referenced LLM profile raisesProfileReferencednaming the referrers.test_rename_llm_profile_renames_and_cascades+test_cascade_rename_atomic_under_concurrent_access— rename cascades to all referrers, atomic under concurrency.test_list_summaries_does_not_decrypt_secrets—list_summariesdoes not decrypt.test_find_referrers_matches_only_citing_openhands—mcp_server_refsare not FK-constrained at save-time (onlyllm_profile_refis scanned).Type
Notes
Stacked on #3757 (the
AgentProfilekind-discriminated union). The base ofthis PR is
agent-profile-model, so the diff shows only the store + FK work.Rebase onto
mainonce #3757 merges.Out of scope (deferred to their own epic issues): the resolver (#3717) and the
router (#3719). This PR is store + FK only.
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
eclipse-temurin:17-jdknikolaik/python-nodejs:python3.13-nodejs22-slimgolang:1.21-bookwormPull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64 docker pull ghcr.io/openhands/agent-server:740fc7c-pythonRun
All tags pushed for this build
About Multi-Architecture Support
740fc7c-python) is a multi-arch manifest supporting both amd64 and arm64740fc7c-python-amd64) are also available if needed