Remove faceted search — bot magnet tying up server resources #1110

@rdhyee

Context

Eric flagged that faceted search pages are heavily crawled by bots, tying up server resources. Because facets combine freely, the URL space grows combinatorially, giving crawlers an effectively unbounded surface to explore.

Problem

Faceted search generates many URL permutations (subject × language × format × ...) that:

  • Bots crawl exhaustively, consuming server CPU and DB queries
  • robots.txt can't practically block all combinations
  • Each facet page triggers DB queries that are expensive at scale
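
To give a rough sense of the scale involved, here is a minimal sketch of how the URL count grows. The facet names and sizes are made-up placeholders, not measured values from this site, and the model assumes each facet is either unset or set to exactly one value.

```python
from math import prod

# Hypothetical facet sizes; real numbers would come from the catalogue.
facet_sizes = {"subject": 200, "language": 30, "format": 5}

def crawl_surface(sizes):
    """Count distinct facet URLs when each facet is unset or has one value."""
    # Each facet contributes (n + 1) choices: its n values plus "not selected".
    # Subtract 1 to exclude the single URL with no facet selected at all.
    return prod(n + 1 for n in sizes.values()) - 1

print(crawl_surface(facet_sizes))  # → 37385
```

If the URL scheme also lets facets appear in any order, or allows several values per facet, the real surface is larger than this estimate.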

Proposed approach

  • Identify which faceted search views/URLs exist
  • Evaluate whether any real users depend on them
  • Remove or simplify to reduce the crawl surface
  • Consider replacing with a simpler search that doesn't generate combinatorial URLs
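
The first two steps could start from the access logs. The sketch below is only an illustration: the `/facet/` prefix, the combined log format, and the user-agent keyword list are all assumptions that would need checking against the real deployment.

```python
import re
from collections import Counter

# Assumptions: facet URLs share a hypothetical "/facet/" prefix, and the
# access log uses the common "combined" format. Adjust both as needed.
FACET_PREFIX = "/facet/"
BOT_HINT = re.compile(r"bot|crawl|spider|slurp", re.IGNORECASE)
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def summarize_facet_traffic(lines):
    """Count facet-page requests, split by self-identified bots vs. the rest."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match or not match["path"].startswith(FACET_PREFIX):
            continue  # not a facet request, or not a parseable log line
        kind = "bot" if BOT_HINT.search(match["agent"]) else "other"
        counts[kind] += 1
    return counts
```

Usage would look like `summarize_facet_traffic(open("access.log"))`. Crawlers that present a browser user agent end up in the "other" bucket, so that figure is an upper bound on real users rather than a count of them.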
