TriviaAdvisor helps you find and track pub quiz nights and trivia events in your area. Think of it as a "Yelp for pub quizzes" - helping trivia enthusiasts discover new venues and keep track of their favorite quiz nights.
- GitHub Repository: holden/trivia_advisor
- Note: This project uses an underscore in its name (`trivia_advisor`), not a hyphen.
- Find trivia nights near you
- Track recurring events by venue
- Aggregates data from multiple trivia providers
- Map integration for easy venue discovery
- Mobile-friendly interface
lib/trivia_advisor/scraping/scrapers/[source_name]/
├── scraper.ex          # Main scraper module
└── venue_extractor.ex  # HTML extraction logic
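The venue_extractor module is typically a thin HTML-parsing layer. As a rough, hypothetical sketch (assuming Floki is available; the module name and CSS selectors are illustrative, not the real extractor):

# Hypothetical sketch of a venue extractor built on Floki; selectors are illustrative.
defmodule TriviaAdvisor.Scraping.Scrapers.ExampleSource.VenueExtractor do
  @doc "Extracts raw venue fields from a venue detail page."
  def extract(html) do
    {:ok, document} = Floki.parse_document(html)

    %{
      title: document |> Floki.find("h1.venue-title") |> Floki.text() |> String.trim(),
      address: document |> Floki.find(".venue-address") |> Floki.text() |> String.trim(),
      phone: document |> Floki.find(".venue-phone") |> Floki.text() |> String.trim(),
      website: document |> Floki.find("a.venue-website") |> Floki.attribute("href") |> List.first()
    }
  end
end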
# 1. Index job to fetch the venue list
def perform(%Oban.Job{id: job_id}) do
  start_time = DateTime.utc_now()

  # Source-specific fetching, e.g.:
  #   Question One: RSS feed pagination
  #   Inquizition: API endpoint
  venues = fetch_venues()

  # 2. Process venues by scheduling detail jobs
  total_venues = length(venues)
  processed_venues = schedule_detail_jobs(venues)

  # 3. Update job metadata
  JobMetadata.update_index_job(job_id, %{
    total_venues: total_venues,
    enqueued_jobs: processed_venues,
    metadata: %{
      started_at: start_time,
      completed_at: DateTime.utc_now()
    }
  })

  :ok
end
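schedule_detail_jobs/1 is not shown above; a minimal sketch of how it might enqueue one detail job per venue is below (the worker module name and args shape are assumptions):

# Hypothetical sketch: enqueue one detail job per venue and return the count enqueued.
defp schedule_detail_jobs(venues) do
  venues
  |> Enum.map(fn venue ->
    TriviaAdvisor.Scraping.Oban.ExampleSourceDetailJob.new(%{"venue_data" => venue})
  end)
  |> Oban.insert_all()
  |> length()
end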
# 1. Extract basic venue data
venue_data = %{
  name: extracted_data.title,
  address: extracted_data.address,
  phone: extracted_data.phone,
  website: extracted_data.website
}

# 2. Process through VenueStore
{:ok, venue} = VenueStore.process_venue(venue_data)
# This handles:
#   - Google Places API lookup
#   - Country/City creation
#   - Venue creation/update

# 3. Process the event
event_data = %{
  name: "#{source.name} at #{venue.name}",
  venue_id: venue.id,
  day_of_week: day,
  start_time: time,
  frequency: frequency,
  description: description,
  entry_fee_cents: parse_currency(fee_text)
}

# 4. Create/update the event
{:ok, event} = EventStore.process_event(venue, event_data, source.id)
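parse_currency/1 is a project helper not shown here; a hedged sketch of what such a helper might do, turning fee text like "£2.50" or "Free" into integer cents (currency symbols are ignored for simplicity):

# Hypothetical sketch: convert a fee string such as "£2.50" or "Free" into integer cents.
defp parse_currency(nil), do: 0

defp parse_currency(fee_text) do
  case Regex.run(~r/(\d+)(?:[.,](\d{1,2}))?/, fee_text) do
    [_, whole] ->
      String.to_integer(whole) * 100

    [_, whole, cents] ->
      String.to_integer(whole) * 100 + String.to_integer(String.pad_trailing(cents, 2, "0"))

    nil ->
      0
  end
end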
# 1. Top-level job error handling
def perform(%Oban.Job{id: job_id}) do
  try do
    # Main scraping logic
    :ok
  rescue
    e ->
      JobMetadata.update_error(job_id, Exception.format(:error, e, __STACKTRACE__))
      Logger.error("Scraper failed: #{Exception.message(e)}")
      {:error, e}
  end
end

# 2. Individual venue rescue
try do
  # Venue processing
rescue
  e ->
    Logger.error("Failed to process venue: #{inspect(e)}")
    nil # Skip this venue but continue with others
end
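In practice the per-venue rescue usually wraps each item of a map over the venue list, so one bad venue does not abort the batch. A minimal sketch (process_single_venue/1 is an assumed helper name):

# Hypothetical sketch: process each venue independently and drop the failures.
processed =
  venues
  |> Enum.map(fn venue ->
    try do
      process_single_venue(venue)
    rescue
      e ->
        Logger.error("Failed to process venue: #{inspect(e)}")
        nil
    end
  end)
  |> Enum.reject(&is_nil/1)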
# 1. Start of scrape
Logger.info("Starting #{source.name} scraper")

# 2. Venue count
Logger.info("Found #{venue_count} venues")

# 3. Individual venue processing
Logger.info("Processing venue: #{venue.name}")

# 4. VenueHelpers.log_venue_details for a consistent format
VenueHelpers.log_venue_details(%{
  raw_title: raw_title,
  title: clean_title,
  address: address,
  time_text: time_text,
  day_of_week: day_of_week,
  start_time: start_time,
  frequency: frequency,
  fee_text: fee_text,
  phone: phone,
  website: website,
  description: description,
  hero_image_url: hero_image_url,
  url: source_url
})
- Venue must have:
  - Valid name
  - Valid address
  - Day of week
  - Start time
- Event must have:
  - Valid venue_id
  - Valid day_of_week
  - Valid start_time
  - Valid frequency
- Country (find or create)
- City (find or create, linked to country)
- Venue (find or create, linked to city)
- Event (find or create, linked to venue)
- EventSource (find or create, linked to event and source)
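Each step in the chain above follows the same find-or-create shape. A minimal sketch for the City step, assuming an Ecto schema named TriviaAdvisor.Locations.City (the schema and field names are assumptions):

# Hypothetical sketch of the find-or-create pattern used at each step of the chain.
defp find_or_create_city(name, country_id) do
  case TriviaAdvisor.Repo.get_by(TriviaAdvisor.Locations.City, name: name, country_id: country_id) do
    nil ->
      %TriviaAdvisor.Locations.City{}
      |> Ecto.Changeset.change(name: name, country_id: country_id)
      |> TriviaAdvisor.Repo.insert()

    city ->
      {:ok, city}
  end
end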
lib/trivia_advisor/scraping/oban/[source_name]_index_job.ex # Lists venues and schedules detail jobs
lib/trivia_advisor/scraping/oban/[source_name]_detail_job.ex # Processes individual venues/events
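Both files define ordinary Oban workers. A skeleton might look like the following (the queue name and max_attempts are assumed values, not the project's actual configuration):

# Hypothetical worker skeleton; queue name and max_attempts are assumed values.
defmodule TriviaAdvisor.Scraping.Oban.ExampleSourceIndexJob do
  use Oban.Worker, queue: :scraper, max_attempts: 3

  require Logger

  @impl Oban.Worker
  def perform(%Oban.Job{id: job_id, args: args}) do
    Logger.info("Starting index job #{job_id} with args: #{inspect(args)}")
    # Fetch venues and schedule detail jobs here
    :ok
  end
end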
All scrapers should use the centralized JobMetadata module for updating job metadata:
# In detail jobs:
def perform(%Oban.Job{args: args, id: job_id}) do
  # `source` is resolved from the job args (lookup omitted here for brevity)
  # Process the venue and event
  result = process_venue(args["venue_data"], source)

  # Handle the result and update metadata
  handle_processing_result(result, job_id, source)
end

# Handle the processing result uniformly
defp handle_processing_result(result, job_id, source) do
  case result do
    {:ok, %{venue: venue, event: event}} ->
      # Update metadata with the JobMetadata helper
      metadata = %{
        "venue_name" => venue.name,
        "venue_id" => venue.id,
        "event_id" => event.id
        # Additional fields...
      }

      JobMetadata.update_detail_job(job_id, metadata, %{venue_id: venue.id, event_id: event.id})

      {:ok, %{venue_id: venue.id, event_id: event.id}}

    {:error, reason} ->
      # Update error metadata
      JobMetadata.update_error(job_id, reason)
      {:error, reason}
  end
end
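JobMetadata itself is not shown here; one plausible implementation, assuming metadata is written to the Oban job's meta column (the module name, internals, and use of the Oban.Job meta field are assumptions):

# Hypothetical sketch only: writes metadata to the oban_jobs `meta` column.
# Assumes an Oban version that exposes the `meta` field on Oban.Job.
defmodule TriviaAdvisor.Scraping.Helpers.JobMetadata do
  alias TriviaAdvisor.Repo

  def update_detail_job(job_id, metadata, _extra \\ %{}) do
    Oban.Job
    |> Repo.get!(job_id)
    |> Ecto.Changeset.change(meta: metadata)
    |> Repo.update!()
  end

  def update_error(job_id, reason) do
    update_detail_job(job_id, %{"error" => inspect(reason)})
  end
end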
For consistently handling venue/event images:
# Download and attach hero images for events
hero_image_url = venue_data["image_url"]

if hero_image_url && hero_image_url != "" do
  # Pass the force_refresh_images flag to control image refresh
  force_refresh_images = Process.get(:force_refresh_images, false)

  case ImageDownloader.download_event_hero_image(hero_image_url, force_refresh_images) do
    {:ok, upload} ->
      Logger.info("✅ Successfully downloaded hero image")
      Map.put(event_data, :hero_image, upload)

    {:error, reason} ->
      Logger.warning("⚠️ Failed to download hero image: #{inspect(reason)}")
      event_data
  end
else
  event_data
end
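What the force_refresh_images flag bypasses inside ImageDownloader can be pictured as a simple cache check. The sketch below is hypothetical; cached_path_for/1 and download_to/2 are assumed helper names, not the real implementation:

# Hypothetical sketch of a cache check that the force_refresh flag bypasses.
def download_event_hero_image(url, force_refresh \\ false) do
  path = cached_path_for(url)

  if File.exists?(path) and not force_refresh do
    {:ok, %{path: path, filename: Path.basename(path)}}
  else
    # Re-download and overwrite any stale cached copy
    download_to(url, path)
  end
end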
All scrapers support the force_refresh_images flag, which forces images to be re-downloaded instead of reusing cached copies:
- How It Works:
  - When enabled, existing images are deleted before downloading new ones
  - Bypasses image caching in ImageDownloader
  - Propagates through the entire process, from index job to detail job to EventStore
- Usage in Jobs:
# Through Oban job args
{:ok, _job} =
  Oban.insert(
    TriviaAdvisor.Scraping.Oban.PubquizIndexJob.new(%{
      "force_refresh_images" => true,
      "limit" => 5
    })
  )

# Through mix task flags
mix scraper.test_pubquiz_index --limit=3 --force-refresh-images
- Implementation (see the sketch after this list):
  - Index job passes the flag to detail jobs
  - Detail job sets Process.put(:force_refresh_images, true)
  - ImageDownloader checks the flag to force redownload
  - EventStore explicitly deletes existing images when the flag is true
- Supported Scrapers:
  - Question One
  - Quizmeisters
  - Geeks Who Drink
  - PubQuiz
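A condensed sketch of the propagation path described under Implementation above (module names such as PubquizDetailJob and the exact arg keys are assumptions based on the patterns shown earlier):

# Hypothetical sketch: the flag travels via job args, then a process flag, then the downloader.

# 1. Index job forwards the flag in each detail job's args
TriviaAdvisor.Scraping.Oban.PubquizDetailJob.new(%{
  "venue_data" => venue,
  "force_refresh_images" => force_refresh_images
})
|> Oban.insert()

# 2. Detail job exposes the flag to downstream helpers via the process dictionary
def perform(%Oban.Job{args: args}) do
  Process.put(:force_refresh_images, args["force_refresh_images"] || false)

  # 3. ImageDownloader and EventStore read it with
  #    Process.get(:force_refresh_images, false)
  :ok
end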
- NEVER make DB migrations without asking first
- Always follow the existing pattern for consistency
- Maintain comprehensive logging
- Handle errors gracefully
- Use the VenueHelpers module for common functionality
- NEVER write repetitive case statements that do the same thing with different data structures - see Scraping Best Practices for details
- NEVER hardcode Unsplash or other external image URLs directly in the code - use database or configuration
- Prefer database-backed data over static lists when possible
- Focus on optimizing queries rather than replacing with hardcoded data
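On the repetitive case statement rule: the intent is to normalise differing data shapes once and funnel them through a single code path. A hedged before/after sketch (normalize_keys/1 is an illustrative helper, not an existing module function):

# Avoid: duplicated bodies that differ only in how fields are pulled out
case payload do
  %{"name" => name, "address" => address} -> VenueStore.process_venue(%{name: name, address: address})
  %{name: name, address: address} -> VenueStore.process_venue(%{name: name, address: address})
end

# Prefer: normalise the shape once, then keep a single call site
payload
|> normalize_keys()
|> VenueStore.process_venue()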