Contents

The contents endpoint retrieves full page content, summaries, and highlights for a list of URLs. Results are returned from cache when available, with live crawling as a fallback.

Basic Usage

request = Exa::ContentsRequest.new( api_key: ENV[ 'EXA_API_KEY' ] )

urls = [ 'https://example.com/page1', 'https://example.com/page2' ]
response = request.submit( urls )

if response.success?
  response.result.each do | result |
    puts result.title
    puts result.text
  end
end

Options

Options can be passed as a hash or built using the DSL:

# using DSL
options = Exa::ContentsOptions.build do
  text { max_characters 2000 }
  highlights { num_sentences 3 }
end

# using hash
options = {
  text: { max_characters: 2000 },
  highlights: { num_sentences: 3 }
}

response = request.submit( urls, options )

Text Retrieval

options = Exa::ContentsOptions.build do
  text do
    max_characters 5000
    include_html_tags false
  end
end

| Option | Type | Description |
| --- | --- | --- |
| max_characters | integer | Maximum characters to retrieve |
| include_html_tags | boolean | Include HTML structure markers |

Highlights

options = Exa::ContentsOptions.build do
  highlights do
    num_sentences 3
    highlights_per_url 5
    query 'Focus on technical implementation'
  end
end

| Option | Type | Description |
| --- | --- | --- |
| num_sentences | integer | Sentences per snippet (min: 1) |
| highlights_per_url | integer | Snippets per result (min: 1) |
| query | string | Custom direction for LLM selection |
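Highlights and their scores come back as parallel arrays (see the highlights and highlight_scores accessors under Response), so snippets can be ranked with plain Ruby. A minimal sketch; the ranked_highlights helper is illustrative, not part of the gem:

```ruby
# Pair each highlight snippet with its score, highest-scoring first.
# highlights and highlight_scores are parallel arrays, so Array#zip aligns them.
def ranked_highlights( highlights, scores )
  highlights.zip( scores )
            .sort_by { | _snippet, score | -score }
end

pairs = ranked_highlights(
  [ 'Threads share memory.', 'Ractors isolate state.' ],
  [ 0.42, 0.87 ]
)
pairs.each { | snippet, score | puts format( '%.2f  %s', score, snippet ) }
```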

Summary

options = Exa::ContentsOptions.build do
  summary do
    query 'Summarize the key technical concepts'
  end
end

| Option | Type | Description |
| --- | --- | --- |
| query | string | Custom summarization directive |

Crawling Control

options = Exa::ContentsOptions.build do
  livecrawl :fallback
  livecrawl_timeout 15000
end

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| livecrawl | symbol | :fallback | :never, :fallback, or :always |
| livecrawl_timeout | integer | 10000 | Timeout in milliseconds |

Livecrawl Modes

| Mode | Description |
| --- | --- |
| :never | Only return cached content |
| :fallback | Use cache if available, crawl if not |
| :always | Always fetch fresh content |
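Since a mistyped mode symbol would otherwise surface only as a failed request, it can be cheap to guard the value before building options. A small validation sketch in plain Ruby; the helper is illustrative, not part of the gem:

```ruby
# Valid livecrawl modes, per the table above.
LIVECRAWL_MODES = %i[ never fallback always ].freeze

# Raise early on a typo'd mode rather than sending a bad request.
def validate_livecrawl_mode( mode )
  unless LIVECRAWL_MODES.include?( mode )
    raise ArgumentError,
          "livecrawl must be one of #{ LIVECRAWL_MODES.inspect }, got #{ mode.inspect }"
  end
  mode
end

validate_livecrawl_mode( :fallback )
```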

Subpages

options = Exa::ContentsOptions.build do
  subpages 5
  subpage_target 'documentation'
end

| Option | Type | Description |
| --- | --- | --- |
| subpages | integer | Number of subpages to crawl |
| subpage_target | string | Term for targeting specific subpages |

Response

When the request succeeds, response.result is a ContentsResult object.

Result Accessors

| Accessor | Type | Description |
| --- | --- | --- |
| request_id | string | Unique request identifier |
| results | array | Array of ContentsResultItem objects |

ContentsResultItem Accessors

| Accessor | Type | Description |
| --- | --- | --- |
| id | string | Unique result identifier |
| url | string | The URL that was fetched |
| title | string | Page title |
| author | string | Author name |
| text | string | Page text (if requested) |
| highlights | array | Highlight snippets (if requested) |
| highlight_scores | array | Scores for each highlight |
| summary | string | Summary text (if requested) |
| image | string | Image URL |
| favicon | string | Favicon URL |
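Because text, highlights, and summary are only populated when requested, display code should treat them as optional. A nil-safe preview sketch in plain Ruby; the preview helper is illustrative, not part of the gem:

```ruby
# Build a one-line preview from whichever content fields were requested.
# Any of text / summary may be nil when that option was not sent.
def preview( title, text: nil, summary: nil, width: 80 )
  body = summary || text || '(no content requested)'
  line = "#{ title }: #{ body }"
  line.length > width ? "#{ line[ 0, width - 3 ] }..." : line
end

puts preview( 'Array', text: 'Arrays are ordered, integer-indexed collections.' )
```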

Success Check

response = Exa.contents( urls, options )

if response.success?
  result = response.result
  puts "Retrieved #{ result.count } pages"
  result.each do | item |
    puts "#{ item.title }: #{ item.text[ 0, 100 ] }..."
  end
end

Examples

Basic Content Retrieval

Exa.api_key ENV[ 'EXA_API_KEY' ]

urls = [
  'https://ruby-doc.org/core/Array.html',
  'https://ruby-doc.org/core/Hash.html'
]

response = Exa.contents( urls )

if response.success?
  response.result.each do | result |
    puts result.title
    puts result.text[ 0, 500 ]
    puts
  end
end

Content with Highlights

options = Exa::ContentsOptions.build do
  highlights do
    num_sentences 2
    highlights_per_url 3
    query 'Key methods and usage examples'
  end
end

response = Exa.contents( urls, options )

if response.success?
  response.result.each do | result |
    puts "=" * 60
    puts result.title
    puts "-" * 60
    result.highlights&.each do | highlight |
      puts "- #{ highlight }"
    end
    puts
  end
end

Content with Summaries

options = Exa::ContentsOptions.build do
  summary do
    query 'Summarize the main functionality and common use cases'
  end
end

response = Exa.contents( urls, options )

if response.success?
  response.result.each do | result |
    puts result.title
    puts result.summary
    puts
  end
end

Fresh Content with Live Crawling

options = Exa::ContentsOptions.build do
  livecrawl :always
  livecrawl_timeout 20000
  text { max_characters 10000 }
end

response = Exa.contents( urls, options )

Combined with Search

A common pattern is to search first, then retrieve full content for specific results:

# First, search for relevant pages
search_response = Exa.search( 'Ruby concurrency patterns', {
  num_results: 5
} )

if search_response.success?
  # Extract URLs from search results
  urls = search_response.result.map( &:url )

  # Fetch full content for those URLs
  contents_options = Exa::ContentsOptions.build do
    text { max_characters 5000 }
    summary { query 'Summarize the concurrency approach' }
  end

  contents_response = Exa.contents( urls, contents_options )

  if contents_response.success?
    contents_response.result.each do | result |
      puts result.title
      puts result.summary
      puts
    end
  end
end
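
For large search results or long URL lists, it can help to deduplicate and batch the URLs before fetching contents. A plain-Ruby sketch; the batch size of 20 is an arbitrary illustration, not a documented API limit:

```ruby
# Split a large URL list into deduplicated batches before fetching contents.
# The batch size here is illustrative; check the API docs for real limits.
def url_batches( urls, size: 20 )
  urls.uniq.each_slice( size ).to_a
end

all_urls = Array.new( 45 ) { | i | "https://example.com/page#{ i }" }

url_batches( all_urls ).each do | batch |
  # response = Exa.contents( batch, options )
end
```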