
Audit: Bloat check on PR #1479, corrections to #1480, and what actually matters #1490

@MervinPraison


Overview

Honest, evidence-first re-audit of recent self‑improving‑agent work with a strict "only ship what adds real user value" lens. Outcomes:

  1. One concrete piece of bloat to remove from current main (dead code shipped by PR #1479).
  2. Two gaps in the Round‑2 Gap Closure issue #1480 are invalid (the features already exist; I missed them).
  3. Four gaps should be dropped (cargo‑cult, low real‑world value).
  4. Four gaps should be kept but narrowed in scope to minimal, opt‑in impl.
  5. Two gaps are genuinely critical and should proceed as specified.

Net effect: praisonaiagents gets smaller, not bigger. Only features that make users safer, faster, or more confident land.


Method

  • Grep‑based evidence, not assumptions. Every claim below is tied to a line reference.
  • Lens: "Would a real user notice? Would they file a bug if this were missing? Does it prevent data loss / cost loss / embarrassment?"
  • If no → drop.

Part 1 — Bloat already in main (remove)

B1. PushClient and concrete transports — zero internal callers

Shipped by PR #1479 (commit 936f39ed):

$ grep -rn "PushClient\b" src/praisonai-agents src/praisonai --include="*.py" \
    | grep -v "push/\|test_push\|__init__"
# (no output)
  • praisonaiagents/push/client.py (388 LOC)
  • praisonaiagents/push/transports.py (214 LOC)
  • No file in either package imports or instantiates PushClient.
  • The SERVER side (WebSocketGateway, GatewaySession) is used — by praisonai/cli/features/gateway.py, cli/features/serve.py, cli/commands/gateway.py. That's legit.
  • The CLIENT side was shipped speculatively "for future SDK users" but is dead code today.

Decision: close this out via issue #1482 (already filed). Move PushClient + transports to praisonai wrapper or delete entirely if there is no concrete user ask. Every LOC that ships has to earn its place — 600+ LOC of unused consumer SDK code does not.

Acceptance test for #1482: grep -rn "PushClient" src/praisonai-agents/ returns zero hits.


Part 2 — #1480 gaps that I got wrong (close these items)

❌ G1 — "Add praisonai init wizard" → ALREADY EXISTS

$ cat src/praisonai/praisonai/cli/commands/setup.py
  • praisonai setup (+ alias praisonai setup wizard) is a 173‑line typer command that already:
    • Offers 5 providers (OpenAI, Anthropic, Google, Ollama, Custom)
    • Accepts --non-interactive --provider --api-key --model
    • Delegates to features/setup/handler.py
    • Is idempotent, safe to re‑run

Verdict: duplicate of existing setup command. Drop G1 entirely. Recommendation: just add one doc line to README pointing first‑time users to praisonai setup. No new CLI command, no new 150 LOC.

❌ G10 — "Structured dialectic user model" → PersonaStore is already enough

$ grep -n "class PersonaStore" src/praisonai-agents/praisonaiagents/memory/learn/stores.py
208:class PersonaStore(BaseStore):
  • PersonaStore already captures user facts through LearnConfig(persona=True).
  • Structured categorisation (preference / goal / expertise / constraint) adds marginal value for 95% of users. It's the kind of feature that looks impressive in a spec and is never queried in practice.
  • If categorisation is wanted, it costs one optional column in PersonaStore.observe() — no new store, no new protocol.

Verdict: drop G10. If anyone wants dimensions, open a focused 20‑LOC PR against PersonaStore.


Part 3 — #1480 gaps to drop (cargo‑cult from reference codebase)

❌ G6 — Surrogate/non‑ASCII sanitiser

The reference codebase scrubs surrogates because it supports a messaging transport where bytes round‑trip through shells with weird locales. PraisonAI users predominantly hit litellm, which already handles unicode. No bug has been filed. No user has asked for this. Adding 80 LOC to defend against a hypothetical is classic premature hardening.

Verdict: drop G6. If a surrogate crash is ever reported, add a 3‑line .encode("utf-16","surrogatepass").decode("utf-16","replace") at the one call site that needs it. Not a feature.
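For the record, the suggested one‑liner does work as a call‑site fix. A minimal sketch (function name illustrative; this would only land if a surrogate crash is actually reported):

```python
def scrub_surrogates(s: str) -> str:
    # Lone surrogates pass through the utf-16 encoder via "surrogatepass";
    # decoding with "replace" then turns each one into U+FFFD instead of raising.
    return s.encode("utf-16", "surrogatepass").decode("utf-16", "replace")
```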

❌ G7 — Shared iteration budget across parent + subagents

The one use case: "user sets max_iterations=20, agent spawns 3 subagents each with 20 → total 80 tool calls, user surprised". Real‑world hit rate: low. PraisonAI users rarely build deep subagent trees, and those who do can set max_iterations explicitly on the delegator.

Verdict: drop G7. Revisit only if a user reports runaway subagent cost.

❌ G11 — Credential pool / multi‑API‑key rotation

  • Most users: one key. They don't care.
  • Heavy users already have litellm provider‑level failover.
  • Adding a credential pool doubles the key‑management surface area for a niche.

Verdict: drop G11 from core. If someone really wants key rotation, they can supply a litellm router config; PraisonAI doesn't need to own this.

❌ G9 (originally thinking‑block handling) — scope too wide

The genuine user pain: with reasoning models, <think>…</think> leaks into user‑visible chat. That's a one‑line fix at the render step, not a new module.

Verdict: don't add thinking.py. Instead, in the existing display/render path, strip <think>…</think> when detected. ~15 LOC, inside the wrapper's output/ module. Keep as a minor enhancement ticket, not a gap.
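The strip described above is roughly this (the regex and function name are illustrative; the real hook point would be the wrapper's existing output formatter):

```python
import re

# Drop <think>...</think> reasoning blocks plus any trailing whitespace
# before the text reaches user-visible chat.
_THINK_RE = re.compile(r"<think>.*?</think>\s*", re.DOTALL)

def strip_think_blocks(text: str) -> str:
    return _THINK_RE.sub("", text)
```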


Part 4 — Gaps to keep, with narrowed scope

✅ G2 — Interrupt (reduce scope: wire the EXISTING protocol, don't invent)

$ grep -n "def interrupt" src/praisonai-agents/praisonaiagents/agent/protocols.py
406:    def interrupt(self) -> None:

The protocol stub exists but Agent has no concrete implementation:

$ grep -n "def interrupt" src/praisonai-agents/praisonaiagents/agent/agent.py \
                          src/praisonai-agents/praisonaiagents/agent/*mixin.py
# (no results)

Revised scope (not a new module):

  • Add Agent.interrupt() → sets an Event flag.
  • In execution_mixin._run_loop: one if self._interrupt_flag.is_set(): break check per iteration.
  • Propagate flag to child agents via delegator.
  • Total diff: ~50 LOC + 1 test, not 120. No new InterruptController class.
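The narrowed scope above can be sketched roughly as follows. The names _interrupt_flag and _run_loop mirror the issue text, but the class body and delegator hook here are illustrative, not the actual Agent implementation:

```python
import threading

class Agent:
    def __init__(self):
        self._interrupt_flag = threading.Event()
        self._children: list["Agent"] = []  # populated by the delegator

    def interrupt(self) -> None:
        """Cooperatively stop this agent and any spawned subagents."""
        self._interrupt_flag.set()
        for child in self._children:
            child.interrupt()

    def _run_loop(self, max_iterations: int = 20) -> int:
        completed = 0
        for _ in range(max_iterations):
            if self._interrupt_flag.is_set():
                break  # one check per iteration, before the next tool call
            # ... one reasoning / tool-call iteration would execute here ...
            completed += 1
        return completed
```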

✅ G3 — Large tool result storage (reduce scope: upgrade existing truncation, don't bolt on a new protocol)

PraisonAI already has truncation infrastructure I missed:

$ grep -rn "truncate\|max_output_size\|tool_outputs_truncated" src/praisonai-agents/praisonaiagents
tools/python_tools.py:313   # stdout head+tail truncation
tools/mentions.py:183,283   # file content truncation
context/store.py:200        # "Tag messages as truncated (non-destructive)"
context/instrumentation.py:53   # tool_outputs_truncated counter
context/aggregator.py:208   # "Merge and truncate to fit token budget"
context/models.py:199       # TRUNCATE = "truncate" mode

Revised scope:

  • Add one optional "retrieval escape hatch": when a tool output is truncated, store the full content to a session‑scoped file and expose load_truncated(ref) tool so the agent can fetch full content on demand.
  • Reuse context/store.py and context/aggregator.py. No new ToolResultStoreProtocol.
  • Total diff: ~150 LOC + 2 tests, all in praisonai wrapper (no core SDK change).

✅ G4 — Parallel tool path‑overlap guard

Keep exactly as filed. Concurrent writes to the same path is a real data‑loss class of bug. Low LOC (~100), high safety value.
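One plausible shape for the guard, assuming each tool call can report its write targets (the write_paths field and function names are hypothetical): before dispatching a batch concurrently, serialize any call whose target overlaps an already-claimed path, including parent/child directory overlap.

```python
from pathlib import Path

def overlapping(path_a: str, path_b: str) -> bool:
    # Same file, or one path is a directory containing the other.
    a, b = Path(path_a).resolve(), Path(path_b).resolve()
    return a == b or a in b.parents or b in a.parents

def partition_batch(calls: list[dict]) -> tuple[list[dict], list[dict]]:
    """Return (parallel_safe, must_serialize) based on write-path overlap."""
    safe, serial, claimed = [], [], []
    for call in calls:
        paths = call.get("write_paths", [])
        if any(overlapping(p, q) for p in paths for q in claimed):
            serial.append(call)  # run after the parallel batch completes
        else:
            safe.append(call)
            claimed.extend(paths)
    return safe, serial
```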

✅ G5 — Error classifier (reduce scope: only CONTEXT_LIMIT, drop the rest)

The only error category with a distinct automated recovery is context‑length overflow → auto‑compress → retry. AUTH doesn't need a classifier (one check at init). TRANSIENT is already handled by litellm retry. RATE_LIMIT is already handled.

Revised scope:

  • Add a single helper is_context_overflow(err) -> bool (~10 LOC).
  • Wire in llm/llm.py: on detection → trigger existing context/compressor.py → retry once.
  • No new ErrorCategory enum, no classifier module. Total: ~30 LOC + 1 test.
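The helper could look like this. The marker strings are assumptions (real provider messages vary, and litellm's typed context-window exception could be checked first where available):

```python
# Assumed substrings that signal a context-length overflow.
_OVERFLOW_MARKERS = ("context length", "maximum context", "context window", "too many tokens")

def is_context_overflow(err: Exception) -> bool:
    """True if the error looks like a context-length overflow."""
    msg = str(err).lower()
    return any(marker in msg for marker in _OVERFLOW_MARKERS)
```

On True, the caller in llm/llm.py would trigger the existing compressor and retry once; on False, the error propagates unchanged.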

✅ G8 — Session title auto‑gen

Keep as filed. Real UX win for anyone browsing past chats. ~100 LOC, lazy, failure‑silent. Worth it.

✅ G12 — OSV scan on skill install

Keep, gated. Only proceeds when Skills Hub (#1471 Phase 6) ships a remote skill fetcher. Before then, the vulnerability surface is zero. Don't build in advance.


Part 5 — What PraisonAI already does BETTER than the reference

These aren't gaps — they're strengths to retain / market:

| Capability | PraisonAI | External reference |
| --- | --- | --- |
| Protocol‑driven core | 60+ Protocol classes; lazy wrapper loading | Monolithic module |
| Multi‑provider LLM | litellm → 100+ models | Hand‑maintained adapters per provider |
| Pluggable memory/RAG | Chroma, Mem0, LangMem, AG‑Memory, Zep, MongoDB, Qdrant, PGVector | One memory provider |
| Hierarchical process w/ manager validation | process: hierarchical + manager_llm | Not available |
| Built‑in injection defense | praisonai.security.enable_security() (68 tests) | External only |
| Observability bridge | LangSmith, Langfuse, Opik, Braintrust, Datadog, MLflow, LangWatch | Logging only |
| AgentOS dashboard | praisonai dashboard | N/A |
| Hooks system | HookEvent bus with before/after agent/LLM/tool | Not formalized |
| TypeScript port | Full praisonai-ts | Python only |

Don't lose these by bloating the core with features nobody asked for.


Revised scorecard (what this audit actually ships)

| Item | Original #1480 estimate | Revised estimate | Decision |
| --- | --- | --- | --- |
| B1. Delete dead PushClient | | 600 LOC removed | Ship via #1482 |
| G1 init wizard | 150 LOC | 0 | Drop (exists) |
| G2 interrupt | 120 LOC | ~50 LOC | Keep, narrow scope |
| G3 large tool result | 350 LOC | ~150 LOC (wrapper only) | Keep, narrow scope |
| G4 path overlap | 100 LOC | 100 LOC | Keep as‑is |
| G5 error classifier | 250 LOC | ~30 LOC | Keep, narrow to context‑overflow |
| G6 surrogate sanitise | 80 LOC | 0 | Drop |
| G7 shared budget | 200 LOC | 0 | Drop |
| G8 title auto‑gen | 100 LOC | 100 LOC | Keep as‑is |
| G9 thinking blocks | 150 LOC | ~15 LOC render‑path strip | Keep, narrow scope |
| G10 user model | 300 LOC | 0 | Drop (PersonaStore suffices) |
| G11 credential pool | 250 LOC | 0 | Drop |
| G12 OSV scan | 150 LOC | 150 LOC (gated) | Keep, gated on Skills Hub |
| Total additions | ~2200 LOC | ~595 LOC | -73% |
| Net change to core SDK | +~1200 LOC | ≤150 LOC (only G4 + interrupt impl) | protocol‑first preserved |

Proposed execution order

  1. #1482 (move PushClient and concrete transports out of the core SDK into the praisonai wrapper) merges → 600 LOC of concrete impl moves to the wrapper or is deleted. Import time ≤30 ms.
  2. G2 interrupt → 50 LOC, all in core Agent (protocol stub already there).
  3. G4 path overlap → 100 LOC in tools/call_executor.py.
  4. G5 context‑overflow auto‑recovery → 30 LOC in llm/llm.py wired to existing compressor.
  5. G8 title auto‑gen → 100 LOC in praisonai wrapper, reuse existing session.set_title.
  6. G9 <think> strip → 15 LOC in wrapper's output formatter.
  7. G3 retrieval escape hatch for truncated results → 150 LOC wrapper side, reuse existing truncation plumbing.
  8. G12 OSV scan → only when Skills Hub lands.

Net new LOC in praisonaiagents (core SDK): ~150. Net new LOC in praisonai (wrapper): ~300. Removed from praisonaiagents: ~600 (PushClient move). Final: core SDK is smaller and sharper.


Principles adopted from this audit (for future work)

  1. Don't mirror a reference codebase feature‑for‑feature. Match when user value overlaps; diverge where we're better.
  2. Every new Protocol needs an in‑tree caller. No speculative SDK surfaces.
  3. Prefer extending existing plumbing over new modules. Truncation → retrieval, persona → categorised, not a new store.
  4. If no bug ticket exists, it's not a gap. Surrogate scrubbing, credential pools, etc.
  5. Wrapper is the default home for new concrete code. Core SDK only accepts protocols + tiny dataclass configs.
