Commit b560195

docs: hype LLM tracing in README
- Tagline updated to mention LLM observability
- New 'LLM Observability' section after Quick Start with real demo console output, per-field breakdown table, and four unique selling points (cost-per-call, PII exposure, injection detection, llm-dominates-request latency correlation)
- App Type Presets table: add 'llm' row and common combo examples
- Events Reference: add 'llm' event row; expand 'anomaly' to list all four LLM anomaly types including llm-dominates-request
- Production Safety Reference: add .withLLMTracing() row
- withLLMTracing section: fix console output format to match real output; add llm-dominates-request to anomaly events table
1 parent 2a6cd63 commit b560195

1 file changed

Lines changed: 101 additions & 35 deletions

README.md

@@ -1,6 +1,6 @@
11
# Argus
22

3-
> **Privacy-first APM and performance diagnostics for Node.js — zero sidecar, zero raw data exported.**
3+
> **Privacy-first APM for Node.js — runtime diagnostics, LLM observability, zero sidecar, zero raw data exported.**
44
55
[![CI](https://github.com/sharon77242/Argus/actions/workflows/ci.yml/badge.svg)](https://github.com/sharon77242/Argus/actions/workflows/ci.yml)
66
[![Sponsor](https://img.shields.io/badge/Sponsor-%E2%9D%A4-pink?logo=github)](https://github.com/sponsors/sharon77242)
@@ -21,11 +21,12 @@ Named after **Argus Panoptes**, the hundred-eyed watchman of Greek mythology. A
2121

2222
1. [Why This Exists](#why-this-exists)
2323
2. [Quick Start](#quick-start)
24-
3. [Privacy Guarantees](#privacy-guarantees)
25-
4. [Requirements](#requirements)
26-
5. [Build from Source](#build-from-source)
27-
6. [Demo App](#demo-app)
28-
7. [Profile API (recommended)](#profile-api-recommended)
24+
3. [LLM Observability](#llm-observability)
25+
4. [Privacy Guarantees](#privacy-guarantees)
26+
5. [Requirements](#requirements)
27+
6. [Build from Source](#build-from-source)
28+
7. [Demo App](#demo-app)
29+
8. [Profile API (recommended)](#profile-api-recommended)
2930
- [Environment Presets](#environment-presets)
3031
- [App Type Presets](#app-type-presets)
3132
- [Auto-Detection](#auto-detection)
@@ -39,16 +40,16 @@ Named after **Argus Panoptes**, the hundred-eyed watchman of Greek mythology. A
3940
- [Adaptive Sampler](#adaptive-sampler)
4041
- [Job Queue Tracing](#job-queue-tracing)
4142
- [Messaging Tracing](#messaging-tracing)
42-
9. [Instance Methods](#instance-methods)
43-
10. [Events Reference](#events-reference)
44-
11. [Environment Variables](#environment-variables)
45-
12. [Production Safety Reference](#production-safety-reference)
46-
13. [Architecture Layers](#architecture-layers)
47-
14. [Project Structure](#project-structure)
48-
15. [Low-Level API](#low-level-api)
49-
16. [Self-Host Your OTLP Endpoint](#self-host-your-otlp-endpoint)
50-
17. [Roadmap](#roadmap)
51-
18. [License](#license)
43+
10. [Instance Methods](#instance-methods)
44+
11. [Events Reference](#events-reference)
45+
12. [Environment Variables](#environment-variables)
46+
13. [Production Safety Reference](#production-safety-reference)
47+
14. [Architecture Layers](#architecture-layers)
48+
15. [Project Structure](#project-structure)
49+
16. [Low-Level API](#low-level-api)
50+
17. [Self-Host Your OTLP Endpoint](#self-host-your-otlp-endpoint)
51+
18. [Roadmap](#roadmap)
52+
19. [License](#license)
5253

5354
---
5455

@@ -60,6 +61,7 @@ Standard APM products either require heavy agents, compile steps, or sacrifice d
6061
- **AST-first privacy** — SQL/NoSQL query values are shredded at the AST layer before they ever touch a metric
6162
- **Entropy-checked logs** — Shannon entropy scanning strips JWT tokens, API keys, and any other high-entropy string from `console` payloads automatically
6263
- **Zero prototype pollution** — all DB interception goes through `node:diagnostics_channel`, the official Node.js observability primitive
64+
- **LLM-aware** — intercepts OpenAI and Anthropic SDK calls to surface cost, token usage, PII exposure, and prompt injection attempts with zero code changes
6365

6466
---
6567

@@ -85,6 +87,66 @@ const agent = await ArgusAgent.createProfile({
8587
8688
---
8789

90+
## LLM Observability
91+
92+
Add `appType: 'llm'` and Argus intercepts every OpenAI and Anthropic call — cost per request, token counts, PII exposure, and prompt injection attempts, all in a single console line with zero code changes:
93+
94+
```
95+
19:51:02.160 [LLM] openai/gpt-4o /api/chat 1240ms $0.0043 in:342 out:89 ⚠ PII: [EMAIL×1] — sanitized ⚠ INJECTION ATTEMPT
96+
```
97+
98+
**What each field means:**
99+
100+
| Field | Example | Description |
101+
|---|---|---|
102+
| Provider / model | `openai/gpt-4o` | SDK and model used |
103+
| Endpoint | `/api/chat` | HTTP route that triggered the call |
104+
| Latency | `1240ms` | Wall-clock time for the full LLM round-trip |
105+
| Cost | `$0.0043` | Calculated from token counts × per-model pricing |
106+
| Tokens | `in:342 out:89` | Prompt and completion tokens |
107+
| PII warning | `⚠ PII: [EMAIL×1] — sanitized` | Detected and redacted before telemetry export |
108+
| Injection warning | `⚠ INJECTION ATTEMPT` | Prompt injection pattern detected |
109+
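The Cost row describes the figure as token counts × per-model pricing. A minimal sketch of that arithmetic follows, assuming a hypothetical pricing table keyed by model name with placeholder per-million-token rates. `PRICING`, `ModelPricing`, and `estimateCostUsd` are illustrative names, not part of the Argus API, and the rates are not real provider prices.

```typescript
// Hypothetical pricing entry: USD per one million tokens.
interface ModelPricing {
  inputPerMTok: number;
  outputPerMTok: number;
}

// Placeholder rates for illustration only; real prices vary by provider and model.
const PRICING: Record<string, ModelPricing> = {
  "example/model-a": { inputPerMTok: 2, outputPerMTok: 8 },
};

// Returns the cost in USD, or undefined when the model has no pricing entry.
function estimateCostUsd(
  model: string,
  inputTokens: number,
  outputTokens: number,
): number | undefined {
  const price = PRICING[model];
  if (!price) return undefined; // unknown model: report no cost rather than guess
  return (
    (inputTokens * price.inputPerMTok + outputTokens * price.outputPerMTok) /
    1_000_000
  );
}

// 1,000,000 input tokens at $2/M plus 500,000 output tokens at $8/M
const cost = estimateCostUsd("example/model-a", 1_000_000, 500_000); // 6
```

Returning `undefined` for an unknown model is one possible design choice; it keeps a missing price from being mistaken for a free call.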
110+
**Four things no other Node.js APM shows you:**
111+
112+
1. **Your real LLM bill, per request.** Not an estimate — computed from the actual token counts the model reports. Cost spike detection fires automatically when a single call runs 10× over your rolling average.
113+
114+
2. **Your users' emails are in those prompts.** Argus redacts PII (emails, phone numbers, SSNs, card numbers, IPs) from the telemetry record before export. The raw prompt reaches the model unchanged — your observability data never sees it.
115+
116+
3. **Prompt injection attempts, logged before damage is done.** Six regex patterns covering `ignore previous instructions`, role-override, and data-exfil attempts. Wire one listener to your security log.
117+
118+
4. **When your LLM owns your latency budget.** The `llm-dominates-request` rule fires an `'anomaly'` event when LLM time exceeds 80% of the HTTP request duration — the exact signal you need to decide whether to cache, stream, or move the call off the hot path.
119+
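Points 2 and 3 in the list above can be sketched as a single scan over a copy of the prompt. This is an illustration under stated assumptions, not Argus's implementation: the two PII patterns and two injection patterns below are simplified stand-ins for the real rule set, and `scanPrompt` and `ScanResult` are hypothetical names.

```typescript
interface ScanResult {
  sanitized: string; // copy intended for telemetry; the input string is untouched
  piiCounts: Record<string, number>;
  injectionAttemptDetected: boolean;
}

// Simplified, illustrative patterns (assumptions, not the library's rules).
const PII_PATTERNS: Record<string, RegExp> = {
  EMAIL: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g,
  IPV4: /\b(?:\d{1,3}\.){3}\d{1,3}\b/g,
};

const INJECTION_PATTERNS: RegExp[] = [
  /ignore\s+(?:all\s+)?previous\s+instructions/i,
  /you\s+are\s+now\s+(?:a|an|the)\b/i,
];

function scanPrompt(prompt: string): ScanResult {
  const piiCounts: Record<string, number> = {};
  let sanitized = prompt;
  for (const [label, pattern] of Object.entries(PII_PATTERNS)) {
    // Replace each match with its label and count how many were found.
    sanitized = sanitized.replace(pattern, () => {
      piiCounts[label] = (piiCounts[label] ?? 0) + 1;
      return `[${label}]`;
    });
  }
  // Injection checks run against the original text, before redaction.
  const injectionAttemptDetected = INJECTION_PATTERNS.some((p) => p.test(prompt));
  return { sanitized, piiCounts, injectionAttemptDetected };
}
```

Because the function only returns a new string, the prompt passed to the model stays unchanged, which matches the behaviour described in point 2.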
120+
```typescript
121+
const agent = await ArgusAgent.createProfile({
122+
environment: "prod",
123+
appType: ["web", "llm"], // or just "llm"
124+
}).start();
125+
126+
// That's it. All OpenAI / Anthropic calls are traced from this point.
127+
128+
// Optional: react to anomalies
129+
agent.on("anomaly", (event) => {
130+
if (event.type === "llm-dominates-request") {
131+
// LLM took >80% of the HTTP request budget — consider caching or streaming
132+
}
133+
if (event.type === "llm-cost-spike") {
134+
// Single call cost spiked 10× — worth investigating
135+
}
136+
});
137+
138+
// Optional: react to security events
139+
agent.on("llm", (event) => {
140+
if (event.injectionAttemptDetected) {
141+
securityLog.warn("prompt injection attempt", { endpoint: event.endpoint, traceId: event.traceId });
142+
}
143+
});
144+
```
145+
146+
→ Full API reference: [`withLLMTracing(options?)`](#withllmtracingoptions)
147+
148+
---
149+
88150
## Privacy Guarantees
89151

90152
### What this agent collects
@@ -249,13 +311,15 @@ const agent = await ArgusAgent.createProfile({
249311

250312
### App Type Presets
251313

252-
| `appType` | Modules Enabled | Optimization Target |
253-
| ----------------------- | ------------------------------------------------------------------------------------------------------ | -------------------------------------------------------------------- |
254-
| `'web'` | HttpTracing, ResourceLeakMonitor, Auto-Patching | **Latency** — request/response & socket tracking |
255-
| `'db'` | QueryAnalysis, SlowQueryMonitor, ResourceLeakMonitor, Auto-Patching | **Data Access** — query patterns & connection safety |
256-
| `'worker'` | RuntimeMonitor (CPU/Mem), GcMonitor, ResourceLeakMonitor, Auto-Patching, **JobTracing, MessagingTracing** | **Throughput** — long-running safety, loop health & queue visibility |
257-
| `['web','db']` | Union of `web` + `db` | **Hybrid** — full HTTP + query coverage |
258-
| `['web','db','worker']` | All modules | **Full-Stack** — maximum observability |
314+
| `appType` | Modules Enabled | Optimization Target |
315+
| ------------------------------ | -------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------- |
316+
| `'web'` | HttpTracing, ResourceLeakMonitor, Auto-Patching | **Latency** — request/response & socket tracking |
317+
| `'db'` | QueryAnalysis, SlowQueryMonitor, ResourceLeakMonitor, Auto-Patching | **Data Access** — query patterns & connection safety |
318+
| `'worker'` | RuntimeMonitor (CPU/Mem), GcMonitor, ResourceLeakMonitor, Auto-Patching, **JobTracing, MessagingTracing** | **Throughput** — long-running safety, loop health & queue visibility |
319+
| `'llm'` | LLMTracing (OpenAI + Anthropic), HttpTracing | **AI** — cost, tokens, PII, injection, latency correlation |
320+
| `['web','db']` | Union of `web` + `db` | **Hybrid** — full HTTP + query coverage |
321+
| `['web','db','llm']` | Union of `web` + `db` + `llm` | **AI App** — full-stack + LLM observability |
322+
| `['web','db','worker','llm']` | All modules | **Full-Stack** — maximum observability |
259323

260324
Each `.with*()` call is **idempotent** — combining types never double-registers a module.
261325

@@ -326,12 +390,11 @@ const response = await openai.chat.completions.create({
326390
});
327391
```
328392

329-
**What you see in dev mode:**
393+
**What you see in dev mode** (real output from the demo app):
330394

331395
```
332-
[ARGUS] LLM openai/gpt-4o POST /api/chat 1,240ms $0.0043 in:342 out:89
333-
[ARGUS] ⚠ PII: [EMAIL×1] — sanitized before export
334-
[ARGUS] LLM anthropic/claude-3-5-sonnet POST /api/summarize 890ms $0.0012
396+
19:51:02.160 [LLM] openai/gpt-4o /api/chat 1240ms $0.0043 in:342 out:89 ⚠ PII: [EMAIL×1] — sanitized ⚠ INJECTION ATTEMPT
397+
19:51:05.302 [LLM] anthropic/claude-3-5-sonnet /api/summarize 890ms $0.0012 in:150 out:62
335398
```
336399

337400
**Options:**
@@ -343,10 +406,10 @@ const response = await openai.chat.completions.create({
343406

344407
**Events emitted:**
345408

346-
| Event | When |
347-
| ----------- | -------------------------------------------------------- |
348-
| `'llm'` | Every completed LLM call |
349-
| `'anomaly'` | `n-llm-calls`, `llm-cost-spike`, `context-window-growth` |
409+
| Event | When |
410+
| ----------- | ------------------------------------------------------------------------------------------------- |
411+
| `'llm'` | Every completed LLM call |
412+
| `'anomaly'` | `n-llm-calls` · `llm-cost-spike` · `context-window-growth` · `llm-dominates-request` |
350413
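Two of the anomaly rules in the table above have thresholds stated elsewhere in this README: `llm-cost-spike` (a single call at 10× the rolling average) and `llm-dominates-request` (LLM time above 80% of the HTTP request duration). The sketch below shows one way such checks could be written. The window size of 20, the class name `CostSpikeDetector`, and the function name `llmDominatesRequest` are assumptions for illustration, not Argus internals.

```typescript
// Tracks recent call costs and flags a call that exceeds multiplier × their mean.
class CostSpikeDetector {
  private readonly recent: number[] = [];
  private readonly windowSize: number;
  private readonly multiplier: number;

  constructor(windowSize = 20, multiplier = 10) {
    this.windowSize = windowSize; // assumed window length
    this.multiplier = multiplier; // 10× per the README text
  }

  observe(costUsd: number): boolean {
    const mean =
      this.recent.length > 0
        ? this.recent.reduce((a, b) => a + b, 0) / this.recent.length
        : undefined;
    // No history yet means nothing to compare against, so no spike.
    const spike =
      mean !== undefined && mean > 0 && costUsd > mean * this.multiplier;
    this.recent.push(costUsd);
    if (this.recent.length > this.windowSize) this.recent.shift();
    return spike;
  }
}

// True when the LLM call consumed more than `ratio` of the request duration.
function llmDominatesRequest(
  llmMs: number,
  requestMs: number,
  ratio = 0.8,
): boolean {
  return requestMs > 0 && llmMs / requestMs > ratio;
}
```

Comparing each call against the mean of earlier calls only, before adding it to the window, keeps a single expensive call from raising its own baseline.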

351414
**What is sanitized:**
352415

@@ -656,9 +719,10 @@ The agent is an `EventEmitter`. All events are emitted on the `ArgusAgent` insta
656719

657720
| Event | Payload | When |
658721
| ------------------- | -------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------- |
659-
| `'job'` | `JobEvent` | Job completed, failed, retried, or stalled (BullMQ, Bull, pg-boss, Agenda) |
660-
| `'message'` | `MessageEvent` | Message produced or consumed (KafkaJS, amqplib) |
661-
| `'anomaly'` | `ProfilerEvent` | Memory leak, event loop lag, CPU spike, cross-signal compound anomaly, or job/message rule violation |
722+
| `'job'` | `JobEvent` | Job completed, failed, retried, or stalled (BullMQ, Bull, pg-boss, Agenda) |
723+
| `'message'` | `MessageEvent` | Message produced or consumed (KafkaJS, amqplib) |
724+
| `'llm'` | `LLMEvent` | LLM call completed — provider, model, endpoint, durationMs, costUsd, tokens, piiDetected, injectionAttemptDetected, suggestions |
725+
| `'anomaly'` | `ProfilerEvent` | Memory leak, event loop lag, CPU spike, cross-signal anomaly, job/message rule violation, or LLM anomaly (`n-llm-calls`, `llm-cost-spike`, `context-window-growth`, `llm-dominates-request`) |
662726
| `'query'` | `{ sanitizedQuery, durationMs, driver?, traceId?, correlationId?, cacheHit?, suggestions? }` | DB query completed |
663727
| `'slow-query'` | `SlowQueryRecord` | Query exceeded the per-driver threshold |
664728
| `'transaction'` | `TransactionEvent` | BEGIN/COMMIT/ROLLBACK pattern completed |
@@ -680,8 +744,9 @@ The agent is an `EventEmitter`. All events are emitted on the `ArgusAgent` insta
680744

681745
```typescript
682746
agent.on("anomaly", (event) => {
683-
// runtime: 'memory-leak' | 'event-loop-lag' | 'cpu-spike'
747+
// runtime: 'memory-leak' | 'event-loop-lag' | 'cpu-spike'
684748
// cross-signal: 'correlated-slow-endpoint' | 'pool-starvation-by-slow-query' | 'n-plus-one-in-transaction'
749+
// llm: 'n-llm-calls' | 'llm-cost-spike' | 'context-window-growth' | 'llm-dominates-request'
685750
console.log(event.type);
686751
console.log(event.heapSnapshotPath); // only set when a snapshot write succeeded
687752
});
@@ -776,6 +841,7 @@ All thresholds can be overridden without code changes, making the agent CI/CD an
776841
| `.withCrashGuard()` | ✅ Yes | Very Low | Intercepts `uncaughtException`; emits event for `unhandledRejection` |
777842
| `.withResourceLeakMonitor(opts?)` | ✅ Yes | Low | Tracks OS handles; rate-limited by `alertCooldownMs` |
778843
| `.withGracefulShutdown(opts?)` | ✅ Yes | Very Low | Registers SIGTERM/SIGINT; awaits `agent.stop()` before `process.exit` |
844+
| `.withLLMTracing(opts?)` | ✅ Yes | Very Low | OpenAI + Anthropic call interception — cost, tokens, PII redaction, injection detection, anomaly rules |
779845
| `.withInstrumentation(opts?)` | ✅ Yes | Low | DB/IO tracing via `diagnostics_channel` (17 drivers) |
780846
| `.withHttpTracing()` | ✅ Yes | Low | HTTP request inspection & slow-request detection |
781847
| `.withLogTracing(opts?)` | ✅ Yes | Low | `console.*` override with entropy-scrubbed payloads |
