Commit 9454eca

RBKunnela and claude committed
feat: add Veritas trust framework documentation and JS SDK types

- README: add Veritas Trust Layer section (#6) with trust scoring and verified retrieval code examples, update benchmark comparison table with Trust/Verification column, add "Verify" step to How It Works
- JS SDK: add v0.10.0 trust/verification types (TrustLevel, AgentTrustProfile, VerificationStatus, VerifiedMemory, VerifiedResults, TrustWeights, etc.)
- JS SDK: bump version 0.9.0 → 0.10.0 to match Python package
- Add Veritas trust layer architecture diagram (Excalidraw + PNG)
- gitignore: add docs/sales/ for internal pitch documents

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent d0debb3 commit 9454eca

7 files changed

Lines changed: 2116 additions & 12 deletions

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -102,6 +102,7 @@ docs/reports/
102102
docs/reviews/
103103
docs/frontend/
104104
docs/api/
105+
docs/sales/
105106
docs/ROADMAP_TO_FUNDING.md
106107
docs/VIDEO_TALKING_POINTS.md
107108
docs/GOOD_FIRST_ISSUES.md

README.md

Lines changed: 97 additions & 10 deletions
@@ -79,14 +79,14 @@ ALMA is benchmarked against [LongMemEval](https://xiaowu0162.github.io/long-mem-
7979

8080
![ALMA Benchmark Comparison](docs/diagrams/alma-benchmark-results.png)
8181

82-
| System | LongMemEval | API Keys | Memory Types | Feedback Loop |
83-
|--------|-------------|----------|--------------|---------------|
84-
| **ALMA** | **R@5=0.964** | None | 5 | Yes (v1.0) |
85-
| Mem0 | ~49% acc.* | GPT-4o | 2 | No |
86-
| Zep | 71.2% acc.* | GPT-4o | 1 | No |
87-
| Letta | Not published | GPT-4o | 2 | No |
88-
| Beads | Not published | None | N/A (tasks) | No |
89-
| RuVector | Not published | None | N/A (vectors) | Self-learning |
82+
| System | LongMemEval | API Keys | Memory Types | Trust/Verification | Feedback Loop |
83+
|--------|-------------|----------|--------------|-------------------|---------------|
84+
| **ALMA** | **R@5=0.964** | None | 5 | **Veritas (built-in)** | Yes (v1.0) |
85+
| Mem0 | ~49% acc.* | GPT-4o | 2 | No | No |
86+
| Zep | 71.2% acc.* | GPT-4o | 1 | No | No |
87+
| Letta | Not published | GPT-4o | 2 | No | No |
88+
| Beads | Not published | None | N/A (tasks) | No | No |
89+
| RuVector | Not published | None | N/A (vectors) | No | Self-learning |
9090

9191
*Accuracy (end-to-end with LLM) vs ALMA's Recall@5 (retrieval-only). Different metrics — not directly comparable.
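
To make the footnote concrete, here is a minimal sketch of a generic Recall@k computation. It is not ALMA's benchmark harness: the function name `recall_at_k` and the memory IDs are hypothetical, and LongMemEval's exact scoring rule may differ (for example, it may count a query as a hit when any relevant item appears in the top k). The point is only that this metric looks at what was retrieved, not at whether an LLM answered correctly.

```python
def recall_at_k(ranked_ids, relevant_ids, k=5):
    """Fraction of the relevant items that appear in the top-k retrieved results."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("relevant_ids must not be empty")
    top_k = set(ranked_ids[:k])
    return len(relevant & top_k) / len(relevant)


# Hypothetical retrieval result: "m2" is in the top 5, "m3" is ranked sixth.
ranked = ["m7", "m2", "m9", "m4", "m1", "m3"]
score = recall_at_k(ranked, ["m2", "m3"], k=5)
print(score)  # 0.5
```

An end-to-end accuracy figure additionally depends on the LLM that reads the retrieved context, which is why the two columns cannot be compared directly.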
9292

@@ -113,7 +113,9 @@ Full methodology: [BENCHMARK-REPORT.md](docs/benchmarks/BENCHMARK-REPORT.md)
113113

114114
![ALMA Retrieval Pipeline](docs/diagrams/alma-retrieval-pipeline.png)
115115

116-
**Retrieve:** Your agent asks ALMA for relevant memories. ALMA searches using FAISS vector similarity, scores results by relevance + recency + success rate + confidence, and returns the most useful context.
116+
**Retrieve:** Your agent asks ALMA for relevant memories. ALMA searches using FAISS vector similarity, scores results by relevance + recency + success rate + confidence, and returns the most useful context. With Veritas trust scoring enabled, memories from trusted agents rank higher automatically.
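
As an illustration of the scoring described above, the following sketch blends the four signals into one score and then mixes in a per-agent trust value. Everything here is an assumption made for illustration: the `Candidate` class, the weight values, the `trust_weight` blend, and the neutral default of 0.5 are hypothetical and are not taken from ALMA's implementation.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical retrieval candidate with signals normalised to [0, 1]."""
    text: str
    agent_id: str
    relevance: float
    recency: float
    success_rate: float
    confidence: float


# Illustrative weights only; the real weighting is not stated in this README.
SIGNAL_WEIGHTS = {"relevance": 0.4, "recency": 0.2, "success_rate": 0.2, "confidence": 0.2}


def base_score(candidate):
    """Weighted sum of relevance, recency, success rate, and confidence."""
    return sum(w * getattr(candidate, name) for name, w in SIGNAL_WEIGHTS.items())


def trust_adjusted_score(candidate, trust_by_agent, trust_weight=0.3):
    """Blend the base score with the authoring agent's trust (0.5 if unknown)."""
    trust = trust_by_agent.get(candidate.agent_id, 0.5)
    return (1.0 - trust_weight) * base_score(candidate) + trust_weight * trust


def rank(candidates, trust_by_agent):
    """Return candidates sorted from highest to lowest trust-adjusted score."""
    return sorted(candidates, key=lambda c: trust_adjusted_score(c, trust_by_agent), reverse=True)


trust = {"senior-dev": 0.9, "new-intern-bot": 0.2}
memories = [
    Candidate("lead is engaged", "new-intern-bot", 0.8, 0.7, 0.6, 0.7),
    Candidate("lead is disqualified", "senior-dev", 0.8, 0.7, 0.6, 0.7),
]
ranked = rank(memories, trust)
print([c.agent_id for c in ranked])  # ['senior-dev', 'new-intern-bot']
```

With identical retrieval signals, the only thing separating the two memories is the trust of the agent that wrote them, so the trusted agent's memory is ranked first.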
117+
118+
**Verify:** For high-stakes decisions, ALMA's verified retrieval cross-checks memories against each other. Contradictions are flagged before your agent acts on bad data.
117119

118120
**Learn:** After the task, ALMA records what happened — success or failure, what strategy was used, how long it took.
119121

@@ -180,6 +182,89 @@ ALMA is a library, not a service. Your database, your rules.
180182
| **Qdrant / Pinecone / Chroma** | Managed vector DB | Varies |
181183
| **Azure Cosmos DB** | Enterprise | Azure pricing |
182184
185+
### 6. Veritas Trust Layer — trust your agent's memories
186+
187+
![Veritas Trust Layer](docs/diagrams/alma-veritas-trust-layer.png)
188+
189+
When you run multiple agents, memories can conflict. Agent A says "lead is disqualified." Agent B says "lead is engaged." Which one does your agent trust?
190+
191+
ALMA includes the **Veritas trust framework** — built-in trust scoring and memory verification so your agents don't act on bad data.
192+
193+
**Trust Scoring** — Every agent builds a trust profile over time. Memories from trusted agents rank higher.
194+
195+
```python
196+
from alma.retrieval.trust_scoring import TrustAwareScorer, AgentTrustProfile
197+
198+
# Create trust-aware scorer
199+
scorer = TrustAwareScorer()
200+
201+
# Set trust profiles for your agents
202+
scorer.set_trust_profile(AgentTrustProfile(
203+
    agent_id="senior-dev",
204+
    sessions_completed=50,
205+
    total_actions=200,
206+
    total_violations=2,  # Very few mistakes
207+
    consecutive_clean_sessions=15,
208+
))
209+
210+
scorer.set_trust_profile(AgentTrustProfile(
211+
    agent_id="new-intern-bot",
212+
    sessions_completed=3,
213+
    total_actions=10,
214+
    total_violations=4,  # Lots of mistakes early on
215+
))
216+
217+
# Score memories — senior-dev's memories rank higher automatically
218+
scored = scorer.score_with_trust(memories, agent="senior-dev")
219+
```
220+
221+
Trust scores factor in 5 behavioral dimensions: verification-before-claim, loud-failure, honest-uncertainty, paper-trail, and diligent-execution. Trust decays over time if an agent goes inactive (30-day half-life), so stale agents don't get trusted blindly.
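
The 30-day half-life can be read as plain exponential decay. The sketch below shows that arithmetic only; the function name `decayed_trust` is hypothetical, and it assumes trust decays toward zero, which the text above does not specify (an implementation could equally decay toward a neutral baseline).

```python
def decayed_trust(trust, days_inactive, half_life_days=30.0):
    """Halve the trust value for every `half_life_days` of inactivity."""
    if days_inactive < 0:
        raise ValueError("days_inactive must be non-negative")
    return trust * 0.5 ** (days_inactive / half_life_days)


print(decayed_trust(0.8, 0))   # 0.8 (no inactivity, no decay)
print(decayed_trust(0.8, 30))  # about 0.4 after one half-life
print(decayed_trust(0.8, 60))  # about 0.2 after two half-lives
```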
222+
223+
**Verified Retrieval** — For high-stakes decisions, ALMA can verify memories before your agent uses them.
224+
225+
```python
226+
from alma.retrieval.verification import VerifiedRetriever, VerificationConfig
227+
228+
retriever = VerifiedRetriever(
229+
    retrieval_engine=alma.retrieval_engine,
230+
    llm_client=my_llm,  # Optional: works without an LLM too
231+
    config=VerificationConfig(
232+
        enabled=True,
233+
        default_method="cross_verify",  # Verify against other memories
234+
        confidence_threshold=0.7,
235+
    )
236+
)
237+
238+
results = retriever.retrieve_verified(
239+
    query="What's the status of lead #1234?",
240+
    agent="voice-agent",
241+
    project_id="my-project",
242+
)
243+
244+
# Only use memories you can trust
245+
for memory in results.verified:
246+
    print(f"Safe to use: {memory.memory}")
247+
248+
for memory in results.contradicted:
249+
    print(f"CONFLICT: {memory.memory} ({memory.verification.reason})")
250+
251+
# Quick summary
252+
print(results.summary())
253+
# {'verified': 3, 'uncertain': 1, 'contradicted': 1, 'unverifiable': 0,
254+
# 'usable_ratio': 0.8, 'verification_time_ms': 45}
255+
```
256+
257+
Every retrieved memory gets a status:
258+
259+
| Status | Meaning | Should your agent use it? |
260+
|--------|---------|--------------------------|
261+
| **VERIFIED** | Confirmed accurate against ground truth or other memories | Yes |
262+
| **UNCERTAIN** | No conflicting evidence, but unconfirmed | Yes, with caution |
263+
| **CONTRADICTED** | Conflicts with other memories detected | No — review needed |
264+
| **UNVERIFIABLE** | Can't be verified (no other sources) | Use your judgment |
265+
266+
This is critical for multi-agent systems. Without verification, your voice agent might call a lead that your email agent already disqualified — because both agents stored conflicting memories about the same person.
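
The status table can be turned into a simple gating policy. The sketch below is standalone and does not use ALMA's types: the `Status` enum, `USABLE` set, and `summarize` function are hypothetical names. The `usable_ratio` formula, (verified + uncertain) / total, is an assumption inferred from the sample summary above, where 3 verified and 1 uncertain out of 5 memories gives 0.8.

```python
from collections import Counter
from enum import Enum


class Status(Enum):
    """Stand-in for the four statuses in the table above (not ALMA's own enum)."""
    VERIFIED = "verified"
    UNCERTAIN = "uncertain"
    CONTRADICTED = "contradicted"
    UNVERIFIABLE = "unverifiable"


# Assumed policy: act on VERIFIED and UNCERTAIN, hold back the rest for review.
USABLE = {Status.VERIFIED, Status.UNCERTAIN}


def summarize(statuses):
    """Count memories per status and compute the share considered usable."""
    counts = Counter(statuses)
    total = len(statuses)
    summary = {status.value: counts[status] for status in Status}
    usable = sum(counts[status] for status in USABLE)
    summary["usable_ratio"] = usable / total if total else 0.0
    return summary


statuses = [Status.VERIFIED] * 3 + [Status.UNCERTAIN, Status.CONTRADICTED]
result = summarize(statuses)
print(result)
# {'verified': 3, 'uncertain': 1, 'contradicted': 1, 'unverifiable': 0, 'usable_ratio': 0.8}
```

In the lead example, the voice agent would check for CONTRADICTED memories about the lead before dialing and route any conflict to review instead of acting on it.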
267+
183268
---
184269

185270
## Install
@@ -313,11 +398,13 @@ Connect ALMA directly to Claude with 22 MCP tools:
313398
| Metric | Value |
314399
|--------|-------|
315400
| LongMemEval R@5 | **0.964** (#1 open-source) |
316-
| Tests passing | 2,121 |
401+
| Tests passing | 2,121+ |
317402
| Storage backends | 7 |
318403
| Graph backends | 4 |
319404
| MCP tools | 22 |
320405
| Memory types | 5 |
406+
| Trust scoring | Veritas framework (per-agent, 5 behavioral dimensions) |
407+
| Verified retrieval | 4-status verification (VERIFIED / CONTRADICTED / UNCERTAIN / UNVERIFIABLE) |
321408
| Chat formats ingested | 6 |
322409
| Monthly cost (local) | $0.00 |
323410
| API keys needed | None |
