Is it possible to retrieve OTA/reseller prices (Expedia, Gotogate, eDreams, etc.) in addition to airline-direct fares? #168

@housine35

Question

The current SearchFlights.get_booking_options() method calls Google's GetBookingResults endpoint and returns only airline-direct fare cards (e.g. for an Air France flight, just the Air France entries with Basic/Main/Plus tiers). This is consistent with the existing fixture test, test_booking_options_live_fixture.py, which asserts exactly 3 AA fare cards for JFK→LAX.

However, the same endpoint, called from a real browser session on the Google Flights booking page, returns a response ~5x larger (74 KB vs 14 KB) containing the full list of resellers: Expedia, Gotogate, eDreams, Booking.com, Kiwi.com, Mytrip, Trip.com, BudgetAir, lastminute.com, wego, Flightnetwork, etc.

Is retrieving this complete reseller list something the library could support, or is it out of scope?

What I tested

For CDG↔HND round-trip (Air France nonstop, 2026-06-10 / 2026-06-24, EUR locale):

  • Via fli (SearchFlights.get_booking_options()): 1 vendor returned — Air France direct, 1482 EUR. Response body: 14 KB.
  • Captured from browser (HAR file, same itinerary, same session): 12 vendors returned by the same GetBookingResults endpoint. Response body: 74 KB.

Vendors returned to the browser:

#   Type    Vendor          Price (EUR)  Code
1   DIRECT  Air France      1482         AF
2   OTA     Flightnetwork   1489         ETRAVELI_FLIGHTNETWORK
3   OTA     Gotogate        1490         ETRAVELI_GOTOGATE
4   OTA     Mytrip          1490         ETRAVELI_MYTRIP
5   OTA     Booking.com     1492         BOOKING
6   OTA     BudgetAir       1495         TRAVIX
7   OTA     Expedia         1495         EXPEDIA
8   OTA     eDreams         1498         EDREAMS
9   OTA     lastminute.com  1500         LASTMINUTE
10  OTA     wego            1536         WEGO
11  OTA     Kiwi.com        1548         KIWI
12  OTA     Trip.com        1593         CTRIP
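
For anyone who wants to work with these rows directly, here is a minimal Python sketch that encodes the captured vendors and computes each OTA's price difference against the direct fare. VendorRow and premiums_over_direct are hypothetical names made up for this illustration; they are not fli's BookingOption model or any part of its API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VendorRow:
    """Hypothetical stand-in for one captured row; not fli's BookingOption."""
    vendor: str
    price_eur: int
    code: str
    is_direct: bool


# Values copied from the CDG-HND browser capture listed above.
ROWS = [
    VendorRow("Air France", 1482, "AF", True),
    VendorRow("Flightnetwork", 1489, "ETRAVELI_FLIGHTNETWORK", False),
    VendorRow("Gotogate", 1490, "ETRAVELI_GOTOGATE", False),
    VendorRow("Mytrip", 1490, "ETRAVELI_MYTRIP", False),
    VendorRow("Booking.com", 1492, "BOOKING", False),
    VendorRow("BudgetAir", 1495, "TRAVIX", False),
    VendorRow("Expedia", 1495, "EXPEDIA", False),
    VendorRow("eDreams", 1498, "EDREAMS", False),
    VendorRow("lastminute.com", 1500, "LASTMINUTE", False),
    VendorRow("wego", 1536, "WEGO", False),
    VendorRow("Kiwi.com", 1548, "KIWI", False),
    VendorRow("Trip.com", 1593, "CTRIP", False),
]


def premiums_over_direct(rows):
    """Map each OTA vendor to its price minus the cheapest airline-direct price."""
    baseline = min(r.price_eur for r in rows if r.is_direct)
    return {r.vendor: r.price_eur - baseline for r in rows if not r.is_direct}


premiums = premiums_over_direct(ROWS)
for vendor, delta in sorted(premiums.items(), key=lambda kv: kv[1]):
    print(f"{vendor:<16} +{delta} EUR")
```

In this particular capture every OTA is priced above the direct fare, from 7 EUR (Flightnetwork) up to 111 EUR (Trip.com).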

Notably, the existing parse_booking_chunk() parser already handles all 12 rows correctly when fed the captured 74 KB response — BookingOption.is_airline_direct resolves correctly, vendor_code / vendor_name / price / booking_url / google_click_url are all populated. So the parser is not the limitation.

Differences between the two requests

I compared the live-captured browser request and the request fli sends:

  1. Inner booking token, field 2: browser sends "AF279" (just airline+flight); build_booking_token() sends "AF279#1" (with #leg_index suffix). Tested both — no effect on the response.
  2. Inner booking token, field 14: browser sends a different integer than field 1 (e.g. 148137 vs 172631); the current code duplicates field 1 in field 14. Doesn't appear load-bearing either.
  3. Main filter struct uses Freebase MIDs: [[['/m/05qtj', 5]]] (Paris city) and [[['/m/07dfk', 5]]] (Tokyo city) instead of IATA airport codes [[['CDG', 0]]]. Tested with MIDs — no effect.
  4. outer[3]: browser sends 1, current code sends 0. Tested with 1 — no effect.
  5. URL query parameters sent by browser but not by fli: f.sid, bl, _reqid, soc-app=162, soc-platform=1, soc-device=1. Tested — no effect.
  6. Special HTTP headers sent by browser: X-Same-Domain: 1, x-goog-ext-259736195-jspb, Referer: <booking page URL>. Tested — no effect.
  7. Cookies: browser sends AEC, SOCS, NID, __Secure-STRP, DV, OTZ. Tested with full session warmup (homepage → /travel/flights → search → booking-page GET → API POST). No effect on the API response size.
  8. X-Goog-BatchExecute-Bgr header: a long base64-ish token (likely a BotGuard / browser-environment-attestation challenge response generated by Google's obfuscated JS). The current code does not send this. This appears to be the gatekeeper — when absent, Google returns the stripped 14 KB / airline-direct-only response; when present (real browser), it returns the full 74 KB with all OTAs.
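
The comparison in items 5 to 8 can be reproduced mechanically. The sketch below is one way to list what the browser request carries that the library request does not, using only the standard library. The request dictionaries, the example.invalid URL, the Content-Type header, and every value in them are placeholders assumed for illustration; only the parameter, header, and cookie names come from the list above.

```python
from urllib.parse import parse_qs, urlsplit


def query_keys(url):
    """Return the set of query-parameter names in a URL."""
    return set(parse_qs(urlsplit(url).query, keep_blank_values=True))


def browser_only_fields(browser, library):
    """Return names present in the browser request but absent from the library's."""
    def header_names(req):
        # HTTP header names are case-insensitive, so compare them lowercased.
        return {name.lower() for name in req.get("headers", {})}

    return {
        "query": sorted(query_keys(browser["url"]) - query_keys(library["url"])),
        "headers": sorted(header_names(browser) - header_names(library)),
        "cookies": sorted(set(browser.get("cookies", {})) - set(library.get("cookies", {}))),
    }


# Placeholder requests: names are taken from the list above, values are dummies.
browser_request = {
    "url": "https://example.invalid/rpc?f.sid=x&bl=x&_reqid=x"
           "&soc-app=162&soc-platform=1&soc-device=1",
    "headers": {
        "Content-Type": "application/x-www-form-urlencoded",  # assumed shared header
        "X-Same-Domain": "1",
        "x-goog-ext-259736195-jspb": "<redacted>",
        "Referer": "<booking page URL>",
        "X-Goog-BatchExecute-Bgr": "<redacted>",
    },
    "cookies": {name: "<redacted>" for name in
                ("AEC", "SOCS", "NID", "__Secure-STRP", "DV", "OTZ")},
}
library_request = {
    "url": "https://example.invalid/rpc",
    "headers": {"content-type": "application/x-www-form-urlencoded"},
    "cookies": {},
}

diff = browser_only_fields(browser_request, library_request)
for section, names in diff.items():
    print(section, names)
```

A diff like this only shows which fields differ. Deciding which of them matters still means adding them to the library request one at a time, as was done for each item in the list.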

I also tried replaying the exact captured request (same f.sid, same Bgr token, same cookies, same body) shortly after capture, and Google still returned the stripped 14 KB response. The Bgr token appears to be single-use and/or bound to the originating session/IP, so replaying a captured request is not a viable path.
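
To compare a captured response with a replayed one by size, the body sizes can be read straight out of the HAR file. This is a rough sketch that relies on the standard HAR 1.2 layout (log.entries[].request.url and response.content); the function name and the synthetic sample are invented for illustration, and matching on the substring "GetBookingResults" is an assumption about how the endpoint appears in the URL.

```python
import base64


def booking_response_sizes(har, marker="GetBookingResults"):
    """Return the response body size in bytes for each HAR entry whose URL contains marker."""
    sizes = []
    for entry in har.get("log", {}).get("entries", []):
        if marker not in entry["request"]["url"]:
            continue
        content = entry["response"].get("content", {})
        text = content.get("text")
        if text is None:
            # Body was not stored in the capture; fall back to the recorded size.
            sizes.append(content.get("size", -1))
        elif content.get("encoding") == "base64":
            sizes.append(len(base64.b64decode(text)))
        else:
            sizes.append(len(text.encode("utf-8")))
    return sizes


# Synthetic HAR fragment for demonstration only; not real captured data.
sample_har = {
    "log": {
        "entries": [
            {
                "request": {"url": "https://example.invalid/GetBookingResults?x=1"},
                "response": {"content": {"text": "x" * 120}},
            },
            {
                "request": {"url": "https://example.invalid/other"},
                "response": {"content": {"text": "ignored"}},
            },
        ]
    }
}

print(booking_response_sizes(sample_har))
```

With a real capture, the dictionary would come from json.load() on the exported HAR file, and the sizes can be set against the 14 KB and 74 KB figures reported above.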

Question for maintainers

Two angles I'd appreciate input on:

  1. Is this a known limitation? The current behavior (1 vendor for AF, ~3 fare cards for AA in the fixture) is consistent — GetBookingResults without BotGuard validation returns only airline-direct. The README and BookingOption model docstring describe the result as "a list of vendors (airline direct and OTAs)", which suggests the OTA case was intended to work at some point. Was it ever returning OTAs in practice, or has Google tightened the gate since the library was written?

  2. Is there an HTTP-only path you've explored? I tested URL params, headers, cookies, Referer, session warmup, exact replay — nothing without BotGuard returns OTAs. Two known paths to bypass it (Playwright-driven Chromium with network interception; running the BotGuard JS bundle in a JS runtime with a mocked browser env) both have significant downsides. Would either be welcome as an opt-in extension, or do you see another angle I'm missing?

Happy to share the redacted HAR file and the experimental scripts if useful.

Thanks for the library — get_booking_options() returning even the airline-direct fares programmatically is already extremely valuable.
