Commit 8fe062e

Merge pull request #25 from saraswatayu/docs/booking-ota-examples
Surface OTA seller diversity in booking results (fixture, tests, example, docs)
2 parents a36bf91 + 62ffc7c commit 8fe062e

9 files changed

Lines changed: 290 additions & 1 deletion

.claude/docs/booking-options-proto-notes.md

Lines changed: 27 additions & 0 deletions
@@ -61,6 +61,33 @@ Each parsed `BookingOption` dataclass includes:
 - `_brand_attribute_vector` (scalar-normalized diagnostic slice)
 - `_registry_version`
 
+## Seller list (airline-direct vs OTA)
+
+`option[1]` is a list of seller entries; `option[1][0]` is
+`[seller_code, seller_name, logo_code_or_None, is_airline_direct_bool]`. The
+parser takes the first entry per option (multi-seller `option[1]` is permitted
+by the wire format but unobserved — logged at DEBUG if it ever appears).
+
+Each booking option is one **booking channel**, not just a fare tier. Whether
+Google returns OTAs is route-dependent, not a swoop limitation:
+
+- Premium cabins and major US carriers (AA/DL/UA/B6) → airline-direct only.
+- International economy on foreign carriers → airline-direct + a wall of OTAs.
+
+OTA options carry `is_airline_direct = False`, a non-IATA `seller_code`, and
+typically **null brand fields** (no `brand_label`/`brand_code`) — they still
+carry cabin class at `option[21][6][0][0]`, so the parser keeps them via the
+`option[21]` fallback rather than dropping them.
+
+Seller codes observed live on SFO→MNL economy (Philippine Airlines), captured
+as fixture `booking_intl_economy_ota_response.txt` / contract case
+`booking_intl_economy_ota_v1`:
+
+`PR` (airline-direct), `EXPEDIA`, `FLIGHTHUB`, `CHEAPOAIR`, `JUSTFLY`,
+`ONETRAVEL`, `BYOJET`, `OVAGO`, `OOJO`, `WEGO`, `TRAVOMINT`, `SCHOLAR_TRIP`,
+`ADAMVACATIONS`, `BUSINESSCLASS`, `ARANGRANT`. Other captures also show
+`ETRAVELI_*` prefixed codes (e.g. `ETRAVELI_Gotogate`, `ETRAVELI_FALLBACK`).
+
 ## Basic economy signal
 
 Observed robust signal:
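
The seller-entry layout described in the notes above can be sketched as a small standalone function. This is only an illustration of reading `option[1][0]`, not swoop's actual parser: the `Seller` tuple, the `extract_first_seller` helper, and the sample seller names and logo value are all hypothetical.

```python
from __future__ import annotations

from typing import Any, NamedTuple


class Seller(NamedTuple):
    """Hypothetical container for the fields of one seller entry."""

    code: str
    name: str
    is_airline_direct: bool


def extract_first_seller(option: list[Any]) -> Seller | None:
    """Return the first seller entry of a raw option, or None if it is missing.

    Assumes the layout from the notes: option[1] is a list of entries shaped
    [seller_code, seller_name, logo_code_or_None, is_airline_direct_bool].
    """
    if len(option) < 2 or not isinstance(option[1], list) or not option[1]:
        return None
    entry = option[1][0]
    if not isinstance(entry, list) or len(entry) < 4:
        return None
    code, name, _logo, is_direct = entry[:4]
    return Seller(str(code), str(name), bool(is_direct))


# Made-up raw options: only index 1 (the seller list) is populated.
raw_direct = [None, [["PR", "Philippine Airlines", None, True]]]
raw_ota = [None, [["EXPEDIA", "Expedia", "EXPEDIA", False]]]

print(extract_first_seller(raw_direct))
print(extract_first_seller(raw_ota))
```

Returning `None` for a malformed entry keeps the sketch defensive; what the real parser does in that case is not shown in this diff.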

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -26,5 +26,8 @@ htmlcov/
 # Local tooling artifacts (gstack browse audits, etc.)
 .gstack/
 
+# Transient gstack/agent worktree checkouts (duplicate the repo tree)
+.claude/worktrees/
+
 # examples/price_drop_watcher.py cache (written to CWD)
 .swoop-watch-cache.json

README.md

Lines changed: 7 additions & 0 deletions
@@ -259,6 +259,13 @@ for opt in options:
     print(f" book at: {opt.booking_url}")
 ```
 
+Each `BookingOption` is a booking *channel*, not just a fare tier: the
+operating airline (`is_airline_direct`) plus any OTAs Google offers (Expedia,
+FlightHub, …), each with its own `seller_code` and `booking_url`. If every
+option shares one seller, the itinerary is airline-direct — see
+[`examples/booking_options.py`](examples/booking_options.py) for splitting
+channels and where OTAs show up.
+
 ### Deals discovery
 
 `deals()` is the third primitive: instead of "what flights from A to B?"

examples/README.md

Lines changed: 1 addition & 0 deletions
@@ -9,6 +9,7 @@ Each script uses only the public API from `swoop.__all__`. No CLI, no extra depe
 - **`price_drop_watcher.py`** — Watch a known flight for price drops on a schedule, cache the last-seen price to disk, and print when the price falls. Similar in spirit to what [Perch](https://perchtravel.com) uses in production to save users an average of $247 per trip.
 - **`multi_city_finder.py`** — Run an official multi-city / open-jaw search across arbitrary legs, show the top 5 trip options with per-leg details, and tune the beam search knobs (`max_results`, `beam_width`, `time_budget`).
 - **`deals_watcher.py`** — Discovery-style watcher: run a `swoop.deals()` query (with the full filter surface — region, budget, trip-length, depart-window, discount), diff against the prior run, and print new deals + price changes. Mirrors the single-watcher-per-cache-file pattern from `price_drop_watcher.py` but at the exploration layer.
+- **`booking_options.py`** — Show every booking channel for one itinerary: split airline-direct from online travel agencies (Expedia, FlightHub, CheapOair, …), print price/seller/brand per channel, and surface the cheapest `booking_url`. Defaults to an OTA-rich route (SFO→MNL economy) so the seller diversity is visible out of the box. Answers "why do all my options show the same seller?" — OTAs only appear on the itineraries Google offers them for.
 
 ## Notes
 

examples/booking_options.py

Lines changed: 126 additions & 0 deletions
@@ -0,0 +1,126 @@
+#!/usr/bin/env python3
+"""Inspect booking-option seller diversity with swoop.get_booking_results().
+
+Google Flights sells a single itinerary through more than one channel: the
+operating airline ("airline-direct") and, for many fares, a list of online
+travel agencies (OTAs) such as Expedia, FlightHub, or CheapOair.
+``get_booking_results()`` returns one ``BookingOption`` per channel, each with
+its own ``seller_code`` / ``seller_name`` and ``booking_url``.
+
+Whether OTAs show up is decided by Google, not swoop:
+
+* Premium cabins and most major US carriers (AA/DL/UA/B6 ...) are usually
+  airline-direct only — you will see one seller, the airline, across
+  several fare brands. That is expected, not a bug.
+* International economy on foreign carriers (PR, OZ, CI, BR, LO ...) often
+  returns a wall of OTAs alongside the airline. That is where seller
+  diversity shows up.
+
+So if every option you get back shares one ``seller_code``, try an
+international economy itinerary on a foreign carrier before concluding the
+flight has no OTAs.
+
+Usage:
+    # Default: SFO->MNL economy ~90 days out (an OTA-rich route)
+    python examples/booking_options.py
+
+    # Any route / cabin; --index selects which itinerary (0 = top result)
+    python examples/booking_options.py JFK DEL 2026-09-15 --cabin economy
+    python examples/booking_options.py JFK LAX 2026-06-15 --cabin first --index 2
+"""
+from __future__ import annotations
+
+import argparse
+import sys
+from datetime import date, timedelta
+
+import swoop
+
+
+def _default_date() -> str:
+    return (date.today() + timedelta(days=90)).isoformat()
+
+
+def main() -> int:
+    parser = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter)
+    parser.add_argument("origin", nargs="?", default="SFO", help="origin IATA code (default SFO)")
+    parser.add_argument("destination", nargs="?", default="MNL", help="destination IATA code (default MNL)")
+    parser.add_argument("date", nargs="?", default=None, help="YYYY-MM-DD (default ~90 days out)")
+    parser.add_argument(
+        "--cabin",
+        default="economy",
+        choices=["economy", "premium-economy", "business", "first"],
+        help="cabin class (default economy — where OTAs are most common)",
+    )
+    parser.add_argument("--index", type=int, default=0, help="which itinerary to price (0 = top result)")
+    args = parser.parse_args()
+
+    origin = args.origin.upper()
+    destination = args.destination.upper()
+    when = args.date or _default_date()
+
+    try:
+        result = swoop.search(origin, destination, when, cabin=args.cabin)
+    except swoop.SwoopError as exc:
+        print(f"search failed: {exc}", file=sys.stderr)
+        return 1
+
+    if not result.results:
+        print(f"No itineraries for {origin}->{destination} on {when}.", file=sys.stderr)
+        return 1
+
+    if not 0 <= args.index < len(result.results):
+        print(f"--index {args.index} out of range ({len(result.results)} results).", file=sys.stderr)
+        return 1
+
+    trip = result.results[args.index]
+    itinerary = trip.legs[0].itinerary
+
+    try:
+        options = swoop.get_booking_results(itinerary, cabin=args.cabin)
+    except swoop.SwoopError as exc:
+        print(f"booking lookup failed: {exc}", file=sys.stderr)
+        return 1
+
+    if not options:
+        print("No booking options returned for this itinerary.", file=sys.stderr)
+        return 1
+
+    direct = [opt for opt in options if opt.is_airline_direct]
+    otas = [opt for opt in options if not opt.is_airline_direct]
+
+    print(f"\n{origin} -> {destination} {when} ({args.cabin})")
+    print(f"Itinerary #{args.index}: {itinerary.airline_code} from ${trip.price}\n")
+
+    def _row(opt: swoop.BookingOption) -> str:
+        seller = opt.seller_name or opt.seller_code or "airline-direct"
+        brand = opt.brand_label or opt.fare_family or "—"
+        return f" ${opt.price:<6} {seller:<22} {brand}"
+
+    print(f"Airline-direct ({len(direct)}):")
+    for opt in direct or [None]:
+        print(" (none)" if opt is None else _row(opt))
+
+    print(f"\nOnline travel agencies ({len(otas)}):")
+    for opt in sorted(otas, key=lambda o: o.price) or [None]:
+        print(" (none)" if opt is None else _row(opt))
+
+    distinct = sorted({opt.seller_code for opt in options if opt.seller_code})
+    print(f"\n{len(options)} options across {len(distinct)} distinct sellers: {', '.join(distinct)}")
+    if len(distinct) <= 1:
+        print(
+            "Only one seller here — this itinerary is airline-direct. Try an "
+            "international economy route on a foreign carrier (e.g. SFO MNL) "
+            "to see the OTA list.",
+        )
+
+    cheapest = min(options, key=lambda o: o.price)
+    if cheapest.booking_url:
+        seller = cheapest.seller_name or cheapest.seller_code or "the airline"
+        print(f"\nCheapest (${cheapest.price}) is via {seller}:\n {cheapest.booking_url}")
+
+    return 0
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
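
The direct-versus-OTA split used in the example script can be tried without network access. The sketch below assumes a stand-in `Channel` dataclass carrying only three of the fields the script reads from `BookingOption`; `Channel`, `_cheapest`, and `cheapest_by_channel` are hypothetical names, and the sample prices are the first four entries of the `booking_intl_economy_ota_v1` contract case.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Channel:
    """Stand-in for the BookingOption fields used here (not swoop's class)."""

    seller_code: str
    price: int
    is_airline_direct: bool


def _cheapest(group: list[Channel]) -> Optional[Channel]:
    # min() raises on an empty sequence, so guard the empty case explicitly.
    return min(group, key=lambda c: c.price) if group else None


def cheapest_by_channel(
    options: list[Channel],
) -> tuple[Optional[Channel], Optional[Channel]]:
    """Return (cheapest airline-direct option, cheapest OTA option)."""
    direct = [c for c in options if c.is_airline_direct]
    otas = [c for c in options if not c.is_airline_direct]
    return _cheapest(direct), _cheapest(otas)


# First four (seller_code, price, is_airline_direct) triples from the fixture.
SAMPLE = [
    Channel("PR", 553, True),
    Channel("ADAMVACATIONS", 525, False),
    Channel("EXPEDIA", 553, False),
    Channel("FLIGHTHUB", 536, False),
]

direct, ota = cheapest_by_channel(SAMPLE)
if direct and ota:
    print(f"airline-direct {direct.seller_code}: ${direct.price}")
    print(f"cheapest OTA {ota.seller_code}: ${ota.price}")
    print(f"difference: ${direct.price - ota.price}")
```

Either side of the returned pair can be `None`, which matches the airline-direct-only case the script reports for premium cabins and major US carriers.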

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -53,7 +53,7 @@ include = ["swoop/", "README.md", "LICENSE"]
 [tool.pyright]
 pythonVersion = "3.10"
 typeCheckingMode = "basic"
-exclude = ["swoop/flights_pb2.py", "tests/"]
+exclude = ["swoop/flights_pb2.py", "tests/", ".claude/worktrees"]
 
 [tool.pytest.ini_options]
 testpaths = ["tests"]

tests/fixtures/contract_corpus_manifest.json

Lines changed: 28 additions & 0 deletions
@@ -508,6 +508,34 @@
         "seller_codes": ["B6", "B6", "B6", "B6"],
         "is_airline_direct": [true, true, true, true]
       }
+    },
+    {
+      "id": "booking_intl_economy_ota_v1",
+      "path": "corpus/booking_intl_economy_ota_response.txt",
+      "registry_version": "2026-06-02",
+      "expected": {
+        "prices": [553, 525, 553, 536, 536, 533, 542, 553, 536, 553, 553, 554, 553, 3067, 3098],
+        "brands": ["", "", "", "", "", "", "", "", "", "", "", "", "", "", ""],
+        "cabin_buckets": [
+          "economy", "economy", "economy", "economy", "economy", "economy",
+          "economy", "economy", "economy", "economy", "economy", "economy",
+          "economy", "business", "business"
+        ],
+        "fare_families": [
+          "unknown", "basic", "unknown", "unknown", "unknown", "basic",
+          "unknown", "unknown", "unknown", "unknown", "unknown", "unknown",
+          "unknown", "unknown", "unknown"
+        ],
+        "seller_codes": [
+          "PR", "ADAMVACATIONS", "EXPEDIA", "FLIGHTHUB", "TRAVOMINT",
+          "SCHOLAR_TRIP", "WEGO", "CHEAPOAIR", "JUSTFLY", "ONETRAVEL",
+          "OOJO", "BYOJET", "OVAGO", "BUSINESSCLASS", "ARANGRANT"
+        ],
+        "is_airline_direct": [
+          true, false, false, false, false, false, false, false,
+          false, false, false, false, false, false, false
+        ]
+      }
     }
   ]
 }
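
The `expected` block in this manifest stores one list per field, aligned by option index, so a length mismatch in any list would silently misalign a contract check. A minimal sketch of validating and zipping such parallel lists follows. `expected_rows` is a hypothetical helper, not part of the repository's test harness, and `EXCERPT` holds only options 0, 1, and 13 of the new case, trimmed to four fields.

```python
from __future__ import annotations

from typing import Any


def expected_rows(expected: dict[str, list[Any]]) -> list[dict[str, Any]]:
    """Turn parallel per-field lists into one dict per booking option.

    Raises ValueError when the lists are not all the same length.
    Row keys keep the manifest's (plural) field names.
    """
    lengths = {key: len(values) for key, values in expected.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"parallel lists differ in length: {lengths}")
    keys = list(expected)
    return [dict(zip(keys, row)) for row in zip(*expected.values())]


# Excerpt of booking_intl_economy_ota_v1: options 0, 1 and 13 only.
EXCERPT = {
    "prices": [553, 525, 3067],
    "cabin_buckets": ["economy", "economy", "business"],
    "seller_codes": ["PR", "ADAMVACATIONS", "BUSINESSCLASS"],
    "is_airline_direct": [True, False, False],
}

for row in expected_rows(EXCERPT):
    print(row)
```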

tests/fixtures/corpus/booking_intl_economy_ota_response.txt

Lines changed: 3 additions & 0 deletions
Large diffs are not rendered by default.

tests/test_booking.py

Lines changed: 94 additions & 0 deletions
@@ -0,0 +1,94 @@
+"""Corpus-backed regression tests for the OTA booking-option path.
+
+Issue #24 asked whether ``get_booking_results()`` can surface the full
+third-party seller list (Expedia, FlightHub, ...) that the Google Flights
+web UI shows, or whether swoop is limited to airline-direct fares.
+
+The answer is that the parser already surfaces every seller Google returns —
+OTAs included — but it only does so for itineraries Google chooses to offer
+them on (international economy on foreign carriers). The original booking
+corpus was entirely premium-cabin / major-carrier itineraries, all of which
+are airline-direct, so nothing in the test suite exercised the OTA path. The
+synthetic ``_extract_seller`` unit tests live in ``test_rpc.py``; this file
+locks in the behaviour against a *real* captured OTA-bearing response so a
+future reshape of ``opt[1]`` fails loudly.
+
+Fixture: ``corpus/booking_intl_economy_ota_response.txt`` — SFO->MNL economy
+on Philippine Airlines (PR), the exact carrier that motivated the null-brand
+OTA parsing fix in commit b9dfd7d. One airline-direct option (PR) plus a wall
+of OTAs.
+"""
+
+from __future__ import annotations
+
+from pathlib import Path
+
+from swoop._booking import _parse_booking_rpc_response
+
+FIXTURES = Path(__file__).parent / "fixtures"
+OTA_FIXTURE = FIXTURES / "corpus" / "booking_intl_economy_ota_response.txt"
+REGISTRY_VERSION = "2026-06-02"
+
+
+def _ota_options():
+    text = OTA_FIXTURE.read_text()
+    return _parse_booking_rpc_response(text, registry_version=REGISTRY_VERSION)
+
+
+def test_ota_fixture_surfaces_full_seller_list() -> None:
+    """The OTA list is surfaced — not collapsed to a single airline seller.
+
+    This is the direct answer to issue #24: when Google returns OTAs, swoop
+    returns one BookingOption per seller with a distinct seller_code.
+    """
+    options = _ota_options()
+    distinct = {opt.seller_code for opt in options}
+
+    # A wall of distinct sellers, not "all the same airline".
+    assert len(distinct) >= 10, f"expected a diverse OTA list, got {sorted(distinct)}"
+    # Recognizable OTAs the web UI shows must come through verbatim.
+    assert {"EXPEDIA", "FLIGHTHUB", "CHEAPOAIR"} <= distinct, sorted(distinct)
+
+
+def test_ota_fixture_distinguishes_airline_direct_from_otas() -> None:
+    """is_airline_direct cleanly splits the operating carrier from resellers."""
+    options = _ota_options()
+
+    direct = [opt for opt in options if opt.is_airline_direct]
+    otas = [opt for opt in options if not opt.is_airline_direct]
+
+    # Exactly the airline (PR) is flagged direct; everything else is a reseller.
+    assert [opt.seller_code for opt in direct] == ["PR"]
+    assert otas, "fixture should contain OTA (non-direct) options"
+    assert all(opt.seller_code for opt in otas), "every OTA must carry a seller_code"
+    # No OTA is mislabeled as the operating carrier.
+    assert "PR" not in {opt.seller_code for opt in otas}
+
+
+def test_ota_fixture_parses_null_brand_options_instead_of_dropping_them() -> None:
+    """OTA/codeshare options carry no brand text but must still be parsed.
+
+    Guards the option[21] null-brand fallback in _parse_booking_rpc_response.
+    Before commit b9dfd7d these options were silently dropped, so for routes
+    like this one (SFO->MNL via PR) ALL options vanished and callers saw zero
+    booking options.
+    """
+    options = _ota_options()
+
+    # This particular itinerary has empty brand text on every option.
+    assert options, "null-brand OTA options must not be dropped"
+    assert all(opt.brand_label == "" for opt in options)
+    # ...yet each still carries a usable seller_code and a known cabin bucket
+    # sourced from option[21][6][0][0] rather than brand text.
+    assert all(opt.seller_code for opt in options)
+    assert all(opt._cabin_bucket in {"economy", "business"} for opt in options)
+
+
+def test_ota_fixture_options_are_bookable() -> None:
+    """At least one OTA option exposes a booking_url so callers can route out."""
+    options = _ota_options()
+    ota_urls = [opt for opt in options if not opt.is_airline_direct and opt.booking_url]
+
+    assert ota_urls, "OTA options should carry google.com/travel/clk redirects"
+    for opt in ota_urls:
+        assert opt.booking_url.startswith("https://www.google.com/travel/clk/f?")
