feat: update research script #163

Open
JohnGuilding wants to merge 1 commit into main from feat/reseearch-script-updates
Contributor

@JohnGuilding JohnGuilding commented Apr 21, 2026

What this PR does:
Restructure the research script into a multi-phase flow using Claude Code skills:

  1. Add the protocol to scripts/research-config.ts.
  2. Run skill: /research-sources <id> in Claude Code to write scripts/research-cache/<id>.json.
  3. Run script: pnpm run research <id> to write src/data/evaluations/<id>.json.
  4. Run skill: /review-evaluation <id> to check against the rules in scripts/research-prompts.ts.
  5. Prompt rules are automatically updated with review feedback.

The steps above are composed into a single skill:
/evaluate-protocol <id>

This reduces token usage significantly, since the API is used only for the citation feature.

  • Introduce the new flow: /research-sources discovers URLs and writes scripts/research-cache/{id}.json; scripts/research-protocol.ts reads the cache, fetches each URL once, and calls the Anthropic API per property to get citations.
  • Switch the API call to the tool-use API. Claude now writes the note as a plain text block and records the structured value via a separate tool call (record_evaluation). Separating the two outputs gives the Citations API a clean text channel to attach source quotes to. Rule changes such as the guidance on paraphrasing meant that the citation feature had stopped working; this change makes citations more robust.
  • Enable prompt caching on the document block (document blocks are required by the citation feature). Fetched pages are multiple KB each and were previously re-sent for every property. Caching the document means only the first call per URL pays the full input cost; because URLs are shared across properties (5-10 unique URLs per 30-property run), the remaining calls hit the cache.
  • Add many new rules in research-prompts.ts
  • Rewrite rules in research-prompts.ts: reframe Cryptographic verifiability, fix Number of secrets minimum, add Selective-disclosure carveout to CROSS_CHECK_RULES, tighten WRITING/VALUE_FORMAT/PROPERTY rules
  • Expand research-config.ts: add a context field to ProtocolConfig and add entries for Nullmask, Houdiniswap, Mirage, etc., with documentation and sourceUrls
  • Upgrade model to claude-opus-4-7
  • Reframe the Type of compliance taxonomy in schema.ts (remove KYT; add Programmatic policies / KYC/KYB / POI/ASP). This aligns our taxonomy with the Predicate and INCO report
  • Add a minimal README.md outlining the research flow
  • Add .gitignore (ignore *.json and other local sources)
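
The tool-use and prompt-caching bullets above can be sketched as a request builder. This is a minimal sketch under stated assumptions, not the code in scripts/research-protocol.ts: `FetchedPage`, `buildPropertyRequest`, the tool description, the `max_tokens` value, and the choice to put the cache breakpoint on the last document block are all hypothetical. Only the tool name `record_evaluation`, its `value` and `insufficient_data` fields, and the model name come from this PR. The block shapes (document blocks with `citations` and `cache_control`, tools with `input_schema`) follow the Anthropic Messages API.

```typescript
// Hypothetical shape of a fetched source page; not taken from the PR.
interface FetchedPage {
  url: string;
  text: string;
}

// Tool that carries the structured value, kept separate from the cited prose note.
// Only the field names `value` and `insufficient_data` appear in the PR diff.
const recordEvaluationTool = {
  name: "record_evaluation",
  description: "Record the structured value for the property being evaluated.",
  input_schema: {
    type: "object",
    properties: {
      value: { type: "string" },
      insufficient_data: { type: "boolean" },
    },
    required: ["insufficient_data"],
  },
};

// Build Messages API params for one property evaluation.
function buildPropertyRequest(pages: FetchedPage[], propertyPrompt: string) {
  const documents = pages.map((page, i) => ({
    type: "document",
    source: { type: "text", media_type: "text/plain", data: page.text },
    title: page.url,
    // Citations must be enabled on each document block that may be quoted.
    citations: { enabled: true },
    // Assumption: one cache breakpoint on the last document, so the whole
    // document prefix is cached and reused by later per-property calls.
    ...(i === pages.length - 1 ? { cache_control: { type: "ephemeral" } } : {}),
  }));

  return {
    model: "claude-opus-4-7",
    max_tokens: 1024, // assumed value
    tools: [recordEvaluationTool],
    messages: [
      {
        role: "user",
        // Documents first, property-specific prompt last, so the cached
        // prefix stays identical across properties.
        content: [...documents, { type: "text", text: propertyPrompt }],
      },
    ],
  };
}
```

Placing the property-specific prompt after the documents matters for this design: prompt caching matches on an identical prefix, so anything that varies per property has to come after the cached blocks.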

The skills were generated by Claude and are added in the next PR.

@JohnGuilding JohnGuilding self-assigned this Apr 21, 2026
Member

@NicoSerranoP NicoSerranoP left a comment

Looks good! My only priority question is: why are we reducing property definitions and, in some cases, deleting property options?

I can see some reasons in this PR: #165. My main concern was KYT because it is used for compliance:

  • KYC: know your user information (name, phone, nationality, etc.)
  • POI: your address is not in a sanctioned list
  • KYT: your address has not interacted with sanctioned/malicious addresses?

`cache: read=${usage.cache_read_input_tokens ?? 0} write=${usage.cache_creation_input_tokens ?? 0} miss=${usage.input_tokens}`,
);

if (process.env.DEBUG_RESEARCH) {
Member

Should we add this DEBUG_RESEARCH env variable to the example file? I know it is an internal tool, but that way we know what options we have (normal, debug, etc.).

let toolInput: { value?: string; insufficient_data: boolean } | null = null;

for (const block of response.content) {
if (block.type === "tool_use" && block.name === "record_evaluation") {
Member

What is the difference between "tool_use" and "record_evaluation"?
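
For context on the question above: in the Anthropic Messages API, `tool_use` is the type of a response content block, while `record_evaluation` is the name this PR gives to its own tool, so the check in the diff filters first by block kind and then by tool name. The sketch below shows how a response might be split into a cited note and a structured value. The local types, `parseEvaluation`, and `ParsedEvaluation` are hypothetical and only mirror the response fields used here; this is not the PR's implementation.

```typescript
// Minimal local types mirroring the parts of a Messages API response used here.
type Citation = { cited_text: string; document_title?: string | null };
type ContentBlock =
  | { type: "text"; text: string; citations?: Citation[] | null }
  | { type: "tool_use"; id: string; name: string; input: unknown };

// Hypothetical result shape; `toolInput` matches the type shown in the diff.
interface ParsedEvaluation {
  note: string;
  quotes: string[];
  toolInput: { value?: string; insufficient_data: boolean } | null;
}

function parseEvaluation(content: ContentBlock[]): ParsedEvaluation {
  let note = "";
  const quotes: string[] = [];
  let toolInput: ParsedEvaluation["toolInput"] = null;

  for (const block of content) {
    if (block.type === "text") {
      // Text blocks carry the prose note; citations attach to these blocks.
      note += block.text;
      for (const citation of block.citations ?? []) {
        quotes.push(citation.cited_text);
      }
    } else if (block.type === "tool_use" && block.name === "record_evaluation") {
      // The structured value arrives as the input of the named tool call.
      // A real implementation would validate this input before trusting it.
      toolInput = block.input as ParsedEvaluation["toolInput"];
    }
  }

  return { note, quotes, toolInput };
}
```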

{
name: "Asset privacy",
group: "Privacy",
description:
Member

Why are we deleting the N/A explainers around the different properties?

"Proof of innocence (POI) / ASP",
"Selective disclosure",
"KYC/KYB",
"KYT",
Member

Why do we remove KYT? 😊
