feat: add explore() destination discovery (supersedes #20) by saraswatayu · Pull Request #28 · saraswatayu/swoop

saraswatayu · 2026-06-02T18:22:07Z

Summary

Adds explore() — swoop's fourth primitive, "where could I go from here?" — via the Google Flights GetExploreDestinations RPC. This reshapes #20 from first principles after live de-risking, and supersedes it.

The headline finding from a live trace: GetExploreDestinations returns no price (the map's prices come from a separate client-side path). So explore is a metadata/discovery primitive, and pricing composes via a bridge — mirroring how deals() pairs with price_deal(). explore answers the same question as deals() ("where can I go from X?") but with a different selection algorithm: inspiration/relevance (scenic + leisure destinations) vs. bargains. At JFK they overlap on only ~11 of ~24 destinations.

Public API

explore(origin, *, cabin, one_way, max_stops, passengers, transport) -> ExploreResult — one-way or roundtrip (one-way verified supported, unlike deals).
price_explore(destination) -> PriceResult | None — prices a chosen destination's cheapest itinerary through the existing search/price pipeline.
ExploreDestination / ExploreResult dataclasses; ExploreDestination.to_search_kwargs() as the composition seam. origin + query context are denormalized so the bridge is self-contained.
swoop explore JFK CLI with table/JSON/CSV/brief output, --one-way, and deals-style client-side filters (--destination, --exclude-destination, --region, --trip-length). Origin validated via the Click IATA type for the rich error.

Design vs. PR #20 (the delta)

Metadata-only (price call traced and found not reliably extractable) → price_explore bridge, no --prices flag (consistent with deals).
One-way support (PR hardcoded roundtrip).
destination_* field naming aligned with Deal; dropped always-null distance/parent_place_id.
Reuses rpc._post_with_retry; anti-bot detection; in-code response-schema map; max_stops validated in the public function.
Live canary added to tests/test_live_contract.py (the PR shipped mock-only tests).

Full rationale: docs/superpowers/specs/2026-06-02-explore-endpoint-design.md.

Test plan

make check (pyright + offline suite) — 1442 passed, 54 skipped, 0 type errors.
New tests/test_explore.py covers payload building (one-way + roundtrip), parsing against fixtures, to_search_kwargs(), price_explore wiring, and CLI for all four formats + the rich IATA error.
tests/test_api_surface.py updated for 2 new public types + 2 functions + __all__.
Frozen-surface, README, CHANGELOG, and CLAUDE.md (gotchas, architecture, file map) updated.
Live canary not yet run live — the probe endpoint soft-blocked (HTTP 400) after heavy use during development; runs in live-canary.yml / can be run once unthrottled.

Known follow-ups (documented, non-blocking)

place_id full set: the IATA-origin form returns a region-scoped subset (~24); a Google place_id origin + flag 4 returns the full worldwide set (~85). Shipping the IATA form per the spec's D8 fallback; place_id resolution is additive and won't change the public API.

Also included

chore: remove dead code and unused imports across codebase — a repo-wide pyright unused-symbol sweep (3 dead functions, dead imports, unused locals). Project gate stays clean; intentional cross-module re-exports left in place.

🤖 Generated with Claude Code

Design for swoop's fourth primitive (destination discovery), reshaping PR #20 from first principles after live de-risking: - metadata-only — browser-traced the Explore map's price call; prices are not reliably extractable from GetExploreDestinations, so prices come from composition via a price_explore() bridge - one-way + roundtrip (verified the RPC supports both) - place_id origin form returns the full ~85-destination set vs 24 for IATA - destination_* model aligned with Deal; drop always-null fields - live-canary gate (tests/test_live_contract.py) as the credibility bar - CLI surface matches the shared search/deals/price vocabulary Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

- Remove unused functions: _filter_by_flight_number (__init__), _stderr_console + _trip_lines + _trip_leg_line (cli/formatters) - Remove unused typing imports: Union (builders, decoder), datetime (cli/utils) - Drop/underscore unused loop variables (_selection, cli/commands) make typecheck clean; full offline suite (1422 tests) passes. Intentional cross-module re-exports (rpc.py _booking re-exports consumed via the rpc.* namespace by tests/scripts, _regions._airportsdata_available, utils.format_duration) are left in place since removing them breaks consumers. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…DE.md)

The =/+/-/@/tab/CR formula-injection guard was copy-pasted in three CSV formatters (price, deals, explore). Promote it to a module-level _csv_safe() / _DANGEROUS_PREFIXES so the security control has one definition; each formatter now aliases _s = _csv_safe. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The ~15-line MIN-MAX nights parse/validate block was a verbatim copy in deals_cmd and explore_cmd. Extract _parse_trip_length() so the flag's syntax, 0..365 bounds, and error wording live in one place. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…xplore Both bridges ran an identical 'search then price the cheapest itinerary' body. Hoist it into _price_cheapest() so the cheapest-selection contract has one definition. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Replace the repeated `list(xs[:limit]) if limit else list(xs)` idiom with `list(xs[:limit])` in all four explore formatters. A slice already honors limit=None (xs[:None] returns the whole list), so the conditional was redundant; as a bonus --limit 0 now yields zero rows instead of all. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

_build_explore_payload hardcoded the passenger slot to [1,0,0,0], so explore(passengers=...) was silently ignored upstream while destinations were still labeled query_adults=N — and price_explore then priced a party size the explore query never performed. Thread Passengers through to slot 6 ([adults, children, infants_in_seat, infants_on_lap]), mirroring rpc._build_request. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

_parse_destination read return_date from row [12] unconditionally, so the one-way 'no return date' invariant depended on the server omitting [12]. If Google ever returns it, price_explore would inject return_date and price a roundtrip for a one-way request. Thread one_way through the parser and null return_date for one-way queries. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

to_search_kwargs guarded a missing destination but not a missing departure_date, so it emitted date=None and price_explore crashed deep in validate_date with a confusing 'date must be a date string, got NoneType'. Guard departure_date too and document it; covered by a new test. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

--trip-length needs a return date, so combining it with --one-way made _nights() drop every destination, leaving the user with a misleading 'No destinations found' and exit 1. Reject the combination up front with a clear error and exit 2. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

… consent check _extract_inner only parsed the first [[ line and raised on any failure, so a reordered/empty leading chunk or a garbled first chunk could yield empty or missing destinations. Now scan every wrb.fr frame, skip unparseable ones, and prefer the chunk carrying destinations. Also test the consent host on the lowercased text so a mixed-case 'Consent.Google.com' is still detected as a block instead of a confusing parse error. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Row index [17] is a ground/drive time for nearby drivable suggestions (it correlates with the 'NNN mi' distance at [8] and is null for fly-to destinations), not a flight duration — the Explore RPC returns none. The field claimed flight duration and the CLI showed it under 'Duration', which is misleading. Rename the public field to drive_minutes, document it, and relabel the table column 'Drive'. Updates the frozen API-surface test. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The Explore RPC returns no flight duration; the field surfaces drive time to nearby drivable destinations (now drive_minutes). Fix the changelog description to match. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Filtering (destination whitelist/blacklist, region, trip_length) lived only in explore_cmd, so library callers of explore() got the full set with no way to narrow it — inconsistent with deals(). Add the same filter params to explore(), centralize the logic in a new _explore_filter module (mirroring _deals_filter), and have the CLI delegate to it. explore() now also rejects trip_length + one_way at the right altitude (the library), which the CLI surfaces. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

fetch_explore did a full page GET on every call just to re-scrape the stable bl/f.sid identifiers. Cache them per (proxy, impersonate, country) so repeated explore() calls skip the page fetch, and invalidate + refetch once on a parse failure so a genuinely stale session still recovers. A looped explore() now pays one page GET instead of one per call. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

price_explore prices one destination (search + price_selector, two round trips), so pricing a whole explore page meant 2N sequential calls. Add price_explore_all(destinations) which prices them in parallel via a thread pool (mirroring deals' multi-origin fetch), preserves input order, and yields None for unpriceable entries instead of raising — one bad entry never aborts the batch. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…hing Reflect the new public surface in CLAUDE.md (architecture tree, flows, file map, public-API line), README (filter + bulk-pricing examples), and CHANGELOG. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…rors The bl/f.sid cache only invalidated inside the _extract_inner try, so a stale session that produced a 4xx (raised by _post_with_retry before parsing) left the bad params cached permanently — every later call re-failed with no self-heal. Invalidate on any rejection (parse error OR HTTP error), retry once only when the failed attempt actually used cached params (no pointless double fetch on a cold-cache failure), and let rate-limit errors propagate untouched since the per-request backoff already ran.

…_all batch price_explore_all only caught ValueError, so a transport/parse failure on a single destination propagated out of pool.map() — which re-raises on the first erroring item and abandons every other already-computed price. Catch SwoopError per item, log it at WARNING, and return None for that entry so partial results survive (mirroring _fetch_deals_per_origin).

Three related defects in the bl/f.sid session handling, all in _explore.py: - Cache poisoning: _get_browser_params cached whatever _browser_params() returned — including an empty {} (both regexes miss on a markup change) or a partial dict (only bl, or only f.sid) — and the read guard treated {} as a hit (`if cached is not None`). A single bad page fetch then pinned a broken, session-less request for every later call, process-wide, until something raised. Now only complete params (both tokens) are cached, and the read guard is truthy, so an incomplete extraction can never poison the cache. - Cold-miss stampede + racy retry: the lock was dropped between the cache read and the page GET, so N concurrent callers each fetched the page; and the retry decision used a pre-snapshot (had_cached_params) that a concurrent populate could invalidate. Now a per-key lock makes the fetch single-flight, and the retry keys off whether the failing attempt actually used cached params (returned as `served_from_cache`), set at the moment of use. - Wasted retry on hard blocks: every parse/HTTP failure was treated as a stale session and retried with a fresh page. An HTML/consent interstitial is a hard block a fresh session can't clear, so it now raises _ExploreBlockedError (a private SwoopParseError subclass) and propagates without the pointless second round trip. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The batch-survival handler caught SwoopRateLimitError (a SwoopError subclass) the same as any per-item failure: it logged, returned None, and kept dispatching all workers — so the moment Google returned 429, swoop fired up to max_workers more concurrent multi-RPC searches at the limiter and handed the caller a quietly-incomplete list of mostly None with no signal to back off. Now the first 429 sets an event that short-circuits the not-yet-started tasks (stop adding load) and re-raises SwoopRateLimitError after the pool drains, so the caller knows to wait and retry. Non-rate-limit transport/parse failures still yield None per destination and preserve the rest of the batch (now via order-stable index assignment rather than pool.map's return order). The default max_workers drops 8 -> 4 to match deals's multi-origin cap, since each pricing call is itself several RPCs. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

ExploreDestination only persisted query_cabin and query_adults, so price_explore() silently re-priced against a default search: - max_stops was dropped — exploring with --nonstop (max_stops=0) and then pricing could pick a cheaper 1-2 stop connection, returning a price for a trip the user had constrained away. - children / infants influenced which destinations and dates Google suggested (they are wired into the discovery payload) but were not recorded, so pricing used a smaller, cheaper party than the one explored. Add query_children, query_infants_in_seat, query_infants_on_lap, and query_max_stops to ExploreDestination (frozen surface test updated), thread the full Passengers + max_stops through parse_explore_payload/_parse_destination, and emit them from to_search_kwargs (max_stops via `is not None` so nonstop survives; passengers only when the party is non-default). Also tidies destination_country to use the _opt_str helper its sibling fields already use. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

A negative TransportConfig.retries made `range(1 + retries)` empty, so the loop never ran, the function fell off the end and returned None, and callers that dereference `res.text` (e.g. fetch_explore) crashed with AttributeError instead of a clean error. The CLI --retries option has no lower bound, so `swoop explore JFK --retries -1` reached this. Clamp the loop to at least one iteration; negative retries now behave like zero retries (one attempt). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Two output-rendering bugs in the explore formatters: - All four formatters sliced `result.destinations[:limit]` unconditionally, so `swoop explore JFK -l 0` showed ZERO destinations while the sibling `swoop deals JFK -l 0` shows all. Match deals: `[:limit] if limit else all`. - The table and brief formatters gated the drive column on `if d.drive_minutes`, so a real drive_minutes == 0 ("you're basically there") rendered as the "—"/"" no-data placeholder. Use `is not None` so 0 renders as 0m. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

- One-way coverage: the existing tests only asserted return_date is None, which the parser hardcodes regardless of the wire — so a one-way response layout drift would go undetected. Add an offline parser test asserting the other wire-index fields (name [2], country [4], departure [11], IATA [15]) parse on the one-way path, and strengthen the live one-way canary to assert a real departure date + name/place id (not just the forced-None return date). - Cache isolation: add an autouse conftest fixture that clears the module-global _browser_params_cache and per-key locks around every test. Tests seed/populate that process-wide cache, which would otherwise leak into later tests (and live runs) and make behavior order-dependent. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Small, behavior-preserving cleanups from the review: - Drop the speculative `origin_flag` parameter from _build_explore_payload and inline the single-use _origin_block helper. The flag was never passed non-zero (the place_id + flag-4 worldwide form is an unimplemented future enhancement); inline it with a comment until that lands (YAGNI). - Remove the dead `return` statements after `ctx.exit()` in explore_cmd, which raises SystemExit — matching deals_cmd, which already omits them. To let the same cleanup apply in the shared _parse_trip_length (where the exit is mid-function before lo/hi are used), type its `ctx` parameter as click.Context so the type checker can see ctx.exit is NoReturn; the returns were otherwise load-bearing only to satisfy pyright with an untyped ctx. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The review flagged that _extract_inner might handle only the length-prefixed line framing (the fixtures' shape) and silently return empty on the flat-array framing _deals documents. It actually handles both — its per-line scan json.loads()es each `[[`-prefixed line and walks every entry for wrb.fr frames, so a single flat array of frames parses fine. Pin that with a regression test rather than refactor three subtly-different batchexecute parsers (explore reads [3,0], deals [3,9]) into one shared helper across shipped code. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Bump __version__ to 0.6.0 and finalize the CHANGELOG section: explore() + price_explore()/price_explore_all() + ExploreDestination/ExploreResult, with the session's hardening folded in (carried query context incl. max_stops, rate-limit propagation in price_explore_all, complete-params-only single-flight session cache). No MIGRATION entry — explore is purely additive, matching how the 0.5 deals feature was handled. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

saraswatayu and others added 30 commits June 2, 2026 12:05

docs: add explore() implementation plan

deea833

feat: add ExploreDestination/ExploreResult models and to_search_kwargs

5e8fa4c

test: add explore RPC response fixtures

f94f8df

feat: add explore RPC client (payload builder, parser, fetch_explore)

39ef5b3

feat: add public explore() and price_explore() bridge

9eed70c

feat: add swoop explore CLI command and formatters

9c21424

test: add explore live canary to live contract suite

fa11dbd

docs: document explore() and price_explore() (README, CHANGELOG, CLAU…

c388c5d

…DE.md)

saraswatayu and others added 6 commits June 2, 2026 16:00

saraswatayu merged commit 887929a into main Jun 3, 2026
5 checks passed

saraswatayu mentioned this pull request Jun 3, 2026

feat: add explore() endpoint and CLI command for Google Flights Explore #20

Closed

4 tasks

saraswatayu deleted the worktree-enchanted-napping-crescent branch June 3, 2026 12:17

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add explore() destination discovery (supersedes #20)#28

feat: add explore() destination discovery (supersedes #20)#28
saraswatayu merged 36 commits into
mainfrom
worktree-enchanted-napping-crescent

saraswatayu commented Jun 2, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

saraswatayu commented Jun 2, 2026

Summary

Public API

Design vs. PR #20 (the delta)

Test plan

Known follow-ups (documented, non-blocking)

Also included

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant