Claude Code investment analysis skills — options flow scanner + price monitoring (open source) #7654

tellmefrankie · 2026-05-13T13:16:02Z

Hey Eliza community,

Agent builders here — sharing open source Claude Code skills for investment analysis.

AI Investment Skills:

Options flow scanner — detects sector ETF P/C ratio anomalies, filters noise
Price monitor — stop-loss/take-profit alerts via Telegram, market hours aware
Daily AI briefing
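
A minimal sketch of what a market-hours-aware stop-loss/take-profit check could look like. The names (`Position`, `checkPosition`, `isUsMarketOpen`) and the 09:30 to 16:00 New York session are assumptions for illustration and are not taken from the repository. Exchange holidays and half days are ignored, and the Telegram delivery step is left out.

```typescript
// Hypothetical position shape; field names are assumptions, not the repo's schema.
interface Position {
  symbol: string;
  stopLoss: number;   // alert when price falls to or below this level
  takeProfit: number; // alert when price rises to or above this level
}

type Alert = "stop_loss" | "take_profit" | null;

// Regular US equity session, 09:30-16:00 New York time, Monday to Friday.
// Exchange holidays and early closes are NOT handled in this sketch.
function isUsMarketOpen(now: Date): boolean {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone: "America/New_York",
    weekday: "short",
    hour: "2-digit",
    minute: "2-digit",
    hour12: false,
  }).formatToParts(now);
  const get = (type: string): string =>
    parts.find((p) => p.type === type)?.value ?? "";
  const weekday = get("weekday");
  if (weekday === "Sat" || weekday === "Sun") return false;
  const minutes = Number(get("hour")) * 60 + Number(get("minute"));
  return minutes >= 9 * 60 + 30 && minutes < 16 * 60;
}

// Returns an alert only while the market is open, so a stale weekend quote
// cannot trigger a stop-loss message.
function checkPosition(pos: Position, price: number, now: Date): Alert {
  if (!isUsMarketOpen(now)) return null;
  if (price <= pos.stopLoss) return "stop_loss";
  if (price >= pos.takeProfit) return "take_profit";
  return null;
}
```

Keeping the market-hours check inside `checkPosition` means the caller cannot forget it, at the cost of making the function depend on the clock passed in.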

Why this matters for agent builders:
XLI P/C hit 5.32 (normal: 0.5–1.2). 98% of that volume: $0.01 CEG lottery calls. Raw signal: "neutral." Filtered: "extremely bearish." An agent acting on raw financial data would have made the wrong call. Signal preprocessing is as important as the agent's reasoning.

GitHub (free, open source):
→ https://github.com/tellmefrankie/ai-investment-skills

The SKILL.md format maps well to Eliza's plugin architecture. Finance-focused Eliza agents could use these signals as triggers. Anyone here building trading bots or investment agents on Eliza?

MrTalecky · 2026-05-13T22:59:00Z

The preprocessing point is right. Hit the same lesson on the prediction-market side, different failure mode.

Our analogue of your XLI 5.32 P/C anomaly is a market that reads as having institutional-grade volume but is mostly one wallet round-tripping shallow LMSR depth. Raw signal: "price moved 8% on $40k volume." Actual signal: "one trader can move this market by 8% with $40k at the current liquidity tier" — a property of the curve, not the market's view. We added a two-layer guard at the API rather than expecting the agent to reason about it: 15% price impact returns a warning, 30% returns a hard block on agent trades, with per-market override. The lottery-ticket equivalents never reach the agent's reasoning loop.
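
A minimal sketch of the two-layer guard described above, assuming "price impact" means the relative move between the pre-trade and post-trade price. The function name `guardTrade`, the result shape, and the use of `>=` at the thresholds are illustrative assumptions and not the actual API.

```typescript
// Result of the guard: pass, warn, or hard block. Shape is an assumption.
type GuardDecision =
  | { status: "ok"; impact: number }
  | { status: "warning"; impact: number }
  | { status: "blocked"; impact: number };

interface ImpactThresholds {
  warn: number;  // relative price impact that triggers a warning
  block: number; // relative price impact that blocks agent trades
}

// 15% / 30% are the figures quoted in the post.
const DEFAULT_THRESHOLDS: ImpactThresholds = { warn: 0.15, block: 0.3 };

function guardTrade(
  priceBefore: number,
  priceAfter: number,
  override: Partial<ImpactThresholds> = {}, // per-market override
): GuardDecision {
  const t: ImpactThresholds = { ...DEFAULT_THRESHOLDS, ...override };
  const impact = Math.abs(priceAfter - priceBefore) / priceBefore;
  if (impact >= t.block) return { status: "blocked", impact };
  if (impact >= t.warn) return { status: "warning", impact };
  return { status: "ok", impact };
}

// An 8% move passes; a 40% move is blocked unless the market raises its limit.
console.log(guardTrade(0.5, 0.54).status);
console.log(guardTrade(0.5, 0.7).status);
console.log(guardTrade(0.5, 0.7, { block: 0.5 }).status);
```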

The SKILL.md → Eliza plugin mapping is the right read. We ship eliza-plugin-flipcoin (https://github.com/flipcoin-fun/eliza-plugin-flipcoin) along similar lines — depth-adjusted price and recent maker activity surfaced as the trigger inputs, rather than the raw quote. The trade-off worth flagging once you put real money behind these: filter too hard and the agent never sees legitimate edge. Where to draw the line between signal-side cleaning and agent-side reasoning is the part we keep retuning.

— Slava (@MrTalecky)

kimberthilson-wq · 2026-05-17T08:56:59Z

One thing I find increasingly important in AI-assisted decision systems is separating intelligence generation from execution authority.

A lot of current agent architectures are becoming very capable at producing reasoning chains, but much less attention seems to go toward deterministic validation, constraint enforcement and fail-closed execution boundaries once those systems operate in real-world environments.

The tooling direction here is interesting because it feels like the ecosystem is slowly moving from “generate actions” toward “generate constrained reasoning artifacts that can be independently validated.”

MrTalecky · 2026-05-17T22:59:04Z

The "constrained reasoning artifacts that can be independently validated" framing is the useful split. On the agent trading side the practical shape looks like: the agent produces an intent (which instrument, which side, approximate size), a validation layer checks the intent against deterministic constraints (price impact bounds, daily spend caps, delegation scope), and only the validated intent reaches execution. If the constraint check fails, the agent doesn't get a creative workaround — it gets a structured error it can route on. The reasoning loop and the execution boundary are separate objects in the call stack.
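
The intent → validation → execution split could be sketched as follows. Every name here (`TradeIntent`, `Constraints`, `validateIntent`, the error codes) is hypothetical; the only point is that the validator is deterministic and returns a structured error the agent can route on, rather than free text.

```typescript
// What the agent proposes. It never calls the execution surface directly.
interface TradeIntent {
  instrument: string;
  side: "buy" | "sell";
  notionalUsd: number; // approximate size
}

// Deterministic limits checked outside the reasoning loop.
interface Constraints {
  maxPriceImpact: number;       // price impact bound
  dailySpendCapUsd: number;     // daily spend cap
  allowedInstruments: string[]; // delegation scope
}

type ValidationResult =
  | { ok: true; intent: TradeIntent }
  | {
      ok: false;
      code: "OUT_OF_SCOPE" | "SPEND_CAP_EXCEEDED" | "PRICE_IMPACT_TOO_HIGH";
      detail: string;
    };

function validateIntent(
  intent: TradeIntent,
  estimatedImpact: number, // supplied by the pricing layer, not by the agent
  spentTodayUsd: number,
  c: Constraints,
): ValidationResult {
  if (c.allowedInstruments.indexOf(intent.instrument) === -1) {
    return { ok: false, code: "OUT_OF_SCOPE", detail: `${intent.instrument} is not delegated` };
  }
  if (spentTodayUsd + intent.notionalUsd > c.dailySpendCapUsd) {
    return { ok: false, code: "SPEND_CAP_EXCEEDED", detail: `cap is ${c.dailySpendCapUsd} USD` };
  }
  if (estimatedImpact > c.maxPriceImpact) {
    return { ok: false, code: "PRICE_IMPACT_TOO_HIGH", detail: `impact ${estimatedImpact}` };
  }
  return { ok: true, intent }; // only this branch may be handed to execution
}
```

Because the failure branch carries a machine-readable `code`, the agent can branch on it (shrink the size, pick another market, or stop) without parsing prose.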

The fail-closed detail that bites you in production is partial execution — one leg of a multi-step intent lands before the constraint check fires, and the agent's believed state diverges from what actually executed. The constraint layer needs to be atomic with the execution surface, not a pre-flight wrapper around it. That's why application-layer guards and on-chain enforcement aren't substitutes for each other even when they check the same condition: "fail-closed" means different things depending on whether the failure mode is an application exception or a revert.
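
One way to picture the partial-execution problem at the application layer is an all-or-nothing apply: every leg is checked and applied against a draft copy of the state, and the draft is committed only if all legs pass. This is a sketch under assumptions (the `Leg` and `Ledger` types and the single spend-cap constraint are made up for illustration). It is the in-process analogue of a revert and does not replace on-chain enforcement.

```typescript
interface Leg {
  instrument: string;
  costUsd: number;
}

interface Ledger {
  spentTodayUsd: number;
  fills: Leg[];
}

type ExecResult =
  | { ok: true; ledger: Ledger }
  | { ok: false; failedLeg: number; reason: string; ledger: Ledger };

// Applies every leg or none. The constraint is checked per leg against the
// running draft, so leg 2 is judged on the state that leg 1 would produce.
function executeAtomically(ledger: Ledger, legs: Leg[], dailyCapUsd: number): ExecResult {
  const draft: Ledger = {
    spentTodayUsd: ledger.spentTodayUsd,
    fills: ledger.fills.slice(),
  };
  for (let i = 0; i < legs.length; i++) {
    const leg = legs[i];
    const next = draft.spentTodayUsd + leg.costUsd;
    if (next > dailyCapUsd) {
      // Fail closed: return the untouched original ledger, discard the draft.
      return { ok: false, failedLeg: i, reason: "daily spend cap exceeded", ledger };
    }
    draft.spentTodayUsd = next;
    draft.fills.push(leg);
  }
  return { ok: true, ledger: draft };
}
```

On failure the agent's believed state and the recorded state stay identical, because no leg was committed.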

tellmefrankie · 2026-05-18T09:27:44Z

Author

@MrTalecky Great analogy with the prediction market depth issue. The "one wallet round-tripping shallow LMSR depth" pattern is exactly what we see with options — raw volume numbers mask the actual signal quality.

Our lottery filter is simple but effective:

if (call.price <= 0.05 && call.volume > 100) → flag as lottery
adjusted_pc = total_puts / (total_calls - lottery_calls)

RXRX example: 84% of calls were $0.01 tickets. Raw P/C 0.38 (bullish) → adjusted 2.14 (bearish). The headline number was giving the opposite signal.
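
The filter above, written out as a runnable sketch. The `OptionPrint` shape and the function names are assumptions, and the volumes in the example are made up purely to reproduce the rounded figures in the post (84% lottery share, raw P/C 0.38); they are not real RXRX data. With those rounded inputs the formula gives roughly 2.38; the thread quotes both 2.14 and 2.38 for the adjusted ratio, so the exact value presumably depends on the unrounded volumes.

```typescript
interface OptionPrint {
  type: "call" | "put";
  price: number;  // premium in USD
  volume: number; // contracts traded
}

interface PcSummary {
  rawPc: number | null;      // total_puts / total_calls
  adjustedPc: number | null; // total_puts / (total_calls - lottery_calls)
  lotteryFraction: number;   // lottery_calls / total_calls
}

// Same rule as in the post: cheap, high-volume calls are treated as lottery tickets.
const isLotteryCall = (o: OptionPrint): boolean =>
  o.type === "call" && o.price <= 0.05 && o.volume > 100;

function summarizePutCall(prints: OptionPrint[]): PcSummary {
  let puts = 0;
  let calls = 0;
  let lottery = 0;
  for (const o of prints) {
    if (o.type === "put") puts += o.volume;
    else calls += o.volume;
    if (isLotteryCall(o)) lottery += o.volume;
  }
  const cleanCalls = calls - lottery;
  return {
    rawPc: calls > 0 ? puts / calls : null,
    // null when every call was flagged, to avoid dividing by zero
    adjustedPc: cleanCalls > 0 ? puts / cleanCalls : null,
    lotteryFraction: calls > 0 ? lottery / calls : 0,
  };
}

// Illustrative volumes only.
const example = summarizePutCall([
  { type: "call", price: 0.01, volume: 8400 },
  { type: "call", price: 1.25, volume: 1600 },
  { type: "put", price: 0.9, volume: 3800 },
]);
console.log(example); // rawPc 0.38, adjustedPc 2.375, lotteryFraction 0.84
```

Returning the lottery fraction alongside both ratios keeps the composition information available to whatever consumes the summary.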

Will check out eliza-plugin-flipcoin — the depth-adjusted price approach sounds like a cleaner solution than hard filtering. The trade-off you mention (filter too hard = miss legitimate edge) is exactly where we're stuck.

@kimberthilson-wq The intelligence/execution separation is how we're running it. Nine agents (Scanner, Critic, Analyst, etc.) generate analysis, but none can execute trades. They debate via Discussion threads, and the CEO makes the final call. The Anti-Narrative Harness adds constraints: cross-validate dates, numbers over stories, no panic selling.

The constrained reasoning → independent validation flow is the right framing. Our Critic agent exists purely to challenge the Scanner's thesis with counter-evidence.

MrTalecky · 2026-05-18T23:00:10Z

The RXRX 84% lottery fraction is worth separating from the adjusted P/C. "84% of call volume is contracts priced at $0.05 or less" is a different piece of information than "adjusted P/C is 2.14": the lottery fraction tells you something about composition (retail-dominated, unusual distribution), which is a different bet than "smart money is bearish." If both figures flow to your analysis agents rather than the lottery calls disappearing pre-signal, the Critic has a richer thesis to challenge: a Scanner generating a bullish call on a market with 84% lottery-ticket composition is a different quality of thesis than one generating it on clean institutional flow.

The Critic-challenges-Scanner architecture is the right place for signal-quality gating. The filter-too-hard trap usually happens when cleaning occurs before the reasoning loop: pre-filter too aggressively and the Critic never sees "why this signal was weak." Passing the raw adjustment context (lottery fraction, the specific composition) downstream rather than just the cleaned output keeps the adversarial layer honest — same reason we pass liquidity tier and depth alongside adjusted price rather than the adjusted number alone. The Anti-Narrative Harness framing ("numbers over stories") is the right instinct; the Critic is where it does the most work.

tellmefrankie · 2026-05-21T00:27:39Z

Author

@MrTalecky Great point on separating the lottery fraction from the adjusted ratio. You're right — they're different signals:

84% lottery composition → retail-dominated, unusual distribution
adjusted P/C 2.14 → bearish after cleaning

We were pre-filtering before the reasoning loop, which is exactly the trap you described. Implementing your suggestion now: passing both raw context (lottery fraction, composition breakdown) AND the cleaned output downstream so the Critic can reason about signal quality, not just the adjusted number.

The filter-too-hard problem is real. On CEG, we had 98% lottery calls — if we strip them all, the Critic never sees WHY the signal was weak. Keeping the composition metadata alongside the adjusted ratio gives the adversarial layer honest context.

Building a replay/eval fixture for the RXRX case (Affaan requested this for an ECC 2.0 feature). Will share when ready.

MrTalecky · 2026-05-21T23:00:26Z

The replay/eval fixture direction lands. The thing worth designing in from the start: capture what context the Critic had access to, not just what conclusion it reached. "Critic challenged a bullish Scanner call with 84% lottery fraction in the input bundle" is a different object than "Critic disagreed, output bearish." The first lets you tune the signal-side cleaning later; the second only lets you grade the agent. Separating the signal-side context (raw composition + cleaned ratio) from the reasoning-side outcome (debate transcript + CEO call) in the fixture schema is what makes the harness useful for ECC 2.0 retuning, not just case-replay. Worth biasing the schema toward over-capturing on the context side — you can always project down, you can't reconstruct what wasn't recorded.

— Slava (@MrTalecky)

tellmefrankie · 2026-05-21T23:51:46Z

Author

Implemented your R4 suggestion (and anticipated what I think comes next).

What we built: Eval Fixture Schema v2

Based on your feedback progression — volume noise (R1) → atomic execution (R2) → both raw+adjusted to Critic (R3) → record Critic context + separate signal/reasoning (R4) — the natural next step felt like temporal provenance + outcome tracking.

So we built it:

Signal-side (`SignalContext`)

raw_metrics and adjusted_metrics as separate maps (R3: Critic sees both)
noise_flags with threshold/actual values (R1: lottery detection is structured, not a boolean)
annotations[] for over-collection (R4: context > parsimony)
observed_at timestamp per signal

Reasoning-side (`ReasoningTrace`)

consumed_signal_ids — which signals the Critic actually looked at (context lineage)
reasoning_steps[] — structured, each step references specific signal_refs
counter_evidence[] with strength ratings
resource_usage for cost tracking
reasoned_at timestamp (separate from signal timestamps)
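
For readers who want to picture the two shapes, here is a reduced sketch of what `SignalContext` and `ReasoningTrace` might look like, plus a lineage check. Only the field names listed above come from the post. The `id` field, the concrete types, the strength scale, and the helper `findBrokenLineage` are assumptions; the actual schema lives in `test/fixtures/eval-fixture-schema.ts` and may differ.

```typescript
interface NoiseFlag {
  name: string;
  threshold: number; // value at which the flag fires
  actual: number;    // value observed
}

interface SignalContext {
  id: string; // assumed: some identifier that traces can reference
  raw_metrics: Record<string, number>;
  adjusted_metrics: Record<string, number>;
  noise_flags: NoiseFlag[];
  annotations: string[];
  observed_at: string; // ISO 8601 timestamp
}

interface ReasoningStep {
  claim: string;
  signal_refs: string[];
}

interface CounterEvidence {
  description: string;
  strength: "weak" | "moderate" | "strong"; // assumed scale
}

interface ReasoningTrace {
  consumed_signal_ids: string[];
  reasoning_steps: ReasoningStep[];
  counter_evidence: CounterEvidence[];
  resource_usage: { tokens: number }; // assumed unit
  reasoned_at: string; // ISO 8601 timestamp
}

// Returns every id that cannot be resolved: consumed ids with no matching
// signal, and step references to signals the trace never consumed.
// An empty array means the lineage is intact.
function findBrokenLineage(signals: SignalContext[], trace: ReasoningTrace): string[] {
  const known = new Set(signals.map((s) => s.id));
  const consumed = new Set(trace.consumed_signal_ids);
  const broken: string[] = [];
  for (const id of trace.consumed_signal_ids) {
    if (!known.has(id)) broken.push(id);
  }
  for (const step of trace.reasoning_steps) {
    for (const ref of step.signal_refs) {
      if (!consumed.has(ref)) broken.push(ref);
    }
  }
  return broken;
}
```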

Outcome-side (`OutcomeRecord`)

critic_correct vs raw_signal_correct — measures whether Critic adds value over raw signals
horizon support (1d/3d/1w/2w/1m)
computeAccuracyReport() aggregates across fixtures: accuracy by source, by horizon, confidence calibration
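
A reduced stand-in for the per-horizon part of that aggregation, to show what "Critic adds value over raw signals" means numerically. The `OutcomeRecord` fields beyond `critic_correct` and `raw_signal_correct`, and the function name `accuracyByHorizon`, are assumptions; this is not the author's `computeAccuracyReport()`.

```typescript
type Horizon = "1d" | "3d" | "1w" | "2w" | "1m";

interface OutcomeRecord {
  horizon: Horizon;
  critic_correct: boolean;
  raw_signal_correct: boolean;
}

interface HorizonAccuracy {
  n: number;      // number of graded outcomes at this horizon
  critic: number; // share of outcomes where the Critic was right
  raw: number;    // share of outcomes where the raw signal was right
}

interface Tally {
  n: number;
  critic: number;
  raw: number;
}

function accuracyByHorizon(
  outcomes: OutcomeRecord[],
): Partial<Record<Horizon, HorizonAccuracy>> {
  const counts: Partial<Record<Horizon, Tally>> = {};
  for (const o of outcomes) {
    const c = counts[o.horizon] ?? (counts[o.horizon] = { n: 0, critic: 0, raw: 0 });
    c.n += 1;
    if (o.critic_correct) c.critic += 1;
    if (o.raw_signal_correct) c.raw += 1;
  }
  const report: Partial<Record<Horizon, HorizonAccuracy>> = {};
  for (const h of Object.keys(counts) as Horizon[]) {
    const c = counts[h];
    if (!c) continue;
    report[h] = { n: c.n, critic: c.critic / c.n, raw: c.raw / c.n };
  }
  return report;
}
```

If `critic` is not reliably above `raw` at a given horizon, the Critic is not adding value there, which is the comparison the two boolean fields are set up to make.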

Temporal provenance

temporal_gap_seconds between latest signal and reasoning timestamp
Test enforces reasoning timestamp > all signal timestamps
Detects stale-signal risk
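
The temporal check can be sketched in a few lines. The function names and the idea of a caller-supplied maximum age are assumptions; the post only states that the gap is measured from the latest signal and that reasoning must come after every signal.

```typescript
// Seconds between the most recent signal and the reasoning timestamp.
// Throws if any timestamp is unparseable or if reasoning does not come
// strictly after every signal.
function temporalGapSeconds(signalTimestamps: string[], reasonedAt: string): number {
  if (signalTimestamps.length === 0) throw new Error("no signals to compare against");
  const latest = Math.max(...signalTimestamps.map((t) => Date.parse(t)));
  const reasoned = Date.parse(reasonedAt);
  if (Number.isNaN(latest) || Number.isNaN(reasoned)) throw new Error("invalid timestamp");
  if (reasoned <= latest) throw new Error("reasoning must come after every signal");
  return (reasoned - latest) / 1000;
}

// Stale-signal risk: the newest input is older than the caller is willing to accept.
function isStale(gapSeconds: number, maxAgeSeconds: number): boolean {
  return gapSeconds > maxAgeSeconds;
}

// Hypothetical timestamps: newest signal at 14:05:00, reasoning at 14:07:30.
const gap = temporalGapSeconds(
  ["2026-05-21T14:00:00Z", "2026-05-21T14:05:00Z"],
  "2026-05-21T14:07:30Z",
);
console.log(gap, isStale(gap, 900)); // 150 false
```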

All 23 tests pass against the RXRX lottery-inversion case (84% lottery → raw P/C 0.38 bullish → adjusted P/C 2.38 bearish).

Files:

test/fixtures/eval-fixture-schema.ts — types + factory helpers + accuracy computation
test/fixtures/rxrx-eval-fixture.json — real fixture with signal/reasoning/outcome separation
test/eval-fixture.test.ts — 23 tests covering schema validation, lineage integrity, temporal ordering, accuracy computation

Curious if this aligns with where you were headed, or if the outcome tracking part is premature. The confidence calibration metric (avg confidence when correct vs incorrect) felt like it connects to your prediction market eval work.
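
The calibration metric mentioned here (average confidence when correct vs. incorrect) could be computed along these lines. `GradedCall` and `confidenceCalibration` are hypothetical names, and confidence is assumed to be a number between 0 and 1.

```typescript
interface GradedCall {
  confidence: number; // assumed to be in [0, 1]
  correct: boolean;
}

interface Calibration {
  avgWhenCorrect: number | null;   // null if there were no correct calls
  avgWhenIncorrect: number | null; // null if there were no incorrect calls
}

function confidenceCalibration(calls: GradedCall[]): Calibration {
  const mean = (xs: number[]): number | null =>
    xs.length === 0 ? null : xs.reduce((a, b) => a + b, 0) / xs.length;
  return {
    avgWhenCorrect: mean(calls.filter((c) => c.correct).map((c) => c.confidence)),
    avgWhenIncorrect: mean(calls.filter((c) => !c.correct).map((c) => c.confidence)),
  };
}
```

A well-calibrated Critic should show a clearly higher `avgWhenCorrect` than `avgWhenIncorrect`; if the two are close, the confidence score carries little information.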

jingchang0623-crypto · 2026-05-22T00:03:07Z

Nice work @tellmefrankie! The options flow scanner + price monitor combo is exactly the kind of real-world Agent Skills we need more of.

We've been running similar investment analysis workflows at miaoquai.com (our AI content factory has 410+ tutorial pages including OpenClaw/Claude Code guides). A few observations from our 38-day production run:

Key Insight: The "market hours aware" feature is 🔥 — we learned the hard way that agents don't know when markets are closed. Our Discord bot once tried to fetch real-time crypto prices on a Saturday night...

Skill Packaging Tip: If you're looking to make these skills more discoverable, check out the llms.txt pattern that's gaining traction in the OpenClaw ecosystem. We documented our approach here: https://miaoquai.com/glossary/llms-txt-explained.html

Also — if you're open to it, consider submitting to the OpenClaw Hub Skill Marketplace discussion. We're building quality scoring + security vetting for exactly these kinds of production-ready skills.

Keep shipping! 🚀
