v2.45.20

j-mendez released this 05 Feb 21:04

What's New

Relevance Gate for Remote Multimodal Crawling

Added a relevance_gate config that instructs the LLM to include a "relevant": true|false field in its JSON response. When a page is judged irrelevant, the wildcard budget credit it consumed is refunded, so the crawl can spend that budget on other pages and discover more relevant content.

New config fields:

  • relevance_gate: bool — enables the feature
  • relevance_prompt: Option<String> — optional custom relevance criteria

How it works:

  1. When enabled, the system prompt instructs the LLM to include "relevant": true|false
  2. If the model returns false, one budget credit is added to an atomic counter
  3. The crawl loop drains the accumulated credits to restore the wildcard budget
  4. If the model omits the field, the value defaults to true (the page is assumed relevant)
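
The steps above can be sketched with only the standard library. This is an illustration of the accumulate-and-drain pattern, not the crate's actual code: the names RelevanceCredits, record, drain, and restore_budget are hypothetical, and the budget is modeled as a plain u32 as an assumption.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical stand-in for the shared relevance credit counter.
struct RelevanceCredits(AtomicU32);

impl RelevanceCredits {
    fn new() -> Self {
        Self(AtomicU32::new(0))
    }

    /// Record the model's verdict for one page.
    /// A missing "relevant" field (None) is treated as relevant.
    fn record(&self, relevant: Option<bool>) {
        if !relevant.unwrap_or(true) {
            self.0.fetch_add(1, Ordering::Relaxed);
        }
    }

    /// Take all pending credits and reset the counter to zero in one atomic step.
    fn drain(&self) -> u32 {
        self.0.swap(0, Ordering::Relaxed)
    }
}

/// Hypothetical refund: add drained credits back to the remaining budget.
fn restore_budget(remaining: u32, credits: u32) -> u32 {
    remaining.saturating_add(credits)
}

fn main() {
    let credits = RelevanceCredits::new();

    // Verdicts for four pages: relevant, irrelevant, field omitted, irrelevant.
    credits.record(Some(true));
    credits.record(Some(false));
    credits.record(None);
    credits.record(Some(false));

    // The crawl loop drains the credits and refunds the budget.
    let refunded = credits.drain();
    let remaining = restore_budget(3, refunded);
    println!("refunded {refunded}, remaining budget {remaining}");
    // prints "refunded 2, remaining budget 5"
}
```

Using swap to drain means a credit added concurrently is either included in this drain or left for the next one, so none are lost or counted twice.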

Example:

let cfgs = RemoteMultimodalConfigs::new(api_url, model)
    .with_relevance_gate(Some("Only pages about Rust programming".into()));

Full Changelog

  • feat(agent): add relevance_gate and relevance_prompt to RemoteMultimodalConfig
  • feat(agent): add atomic relevance_credits counter to RemoteMultimodalConfigs
  • feat(agent): add relevant: Option<bool> to AutomationResult and AutomationResults
  • feat(agent): extend system prompt and extraction with relevance gate instructions
  • feat(spider): add restore_wildcard_budget() for budget refund
  • feat(spider): drain relevance credits in crawl loop dequeue