feat(airlines): Add automated ICAO-IATA code generator script #4309
Aswinesag wants to merge 1 commit into
Conversation
Introduces scripts/generate-airline-codes.mjs to fetch from OpenFlights airlines.dat. Formats server/_shared/airline-codes.ts with BEGIN/END markers to allow automated JSON-safe injection while preserving the manual OVERRIDE map. Refreshes the current list to ensure baseline accuracy.
@Aswinesag is attempting to deploy a commit to the World Monitor Team on Vercel. A member of the Team first needs to authorize it.
Greptile Summary
This PR replaces the ~100-entry hand-curated ICAO→IATA map in
Confidence Score: 3/5
The public lookup functions are structurally intact and consumers are unaffected at runtime, but the committed data already contains corrupt entries, and the script will regenerate them on every quarterly refresh until the filter is fixed.
The filter gap is an active defect: non-alphanumeric garbage rows from the OpenFlights dataset (e.g. "--+", "...") are already in the committed file and will recur on each re-run. The public-facing icaoToIata / toIataCallsign functions won't surface these entries in practice (the parseCallsign regex requires [A-Z] prefixes), but the data is durably corrupted, and the script as written should not be run quarterly until the filter is fixed. The master-branch URL and unnecessary exports are secondary concerns.
Both changed files warrant attention: scripts/generate-airline-codes.mjs needs the alphanumeric filter fix before the next refresh, and server/_shared/airline-codes.ts should have the spurious export keywords removed from OVERRIDE and GENERATED.
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A([Developer runs\nnode scripts/generate-airline-codes.mjs]) --> B[fetch OpenFlights airlines.dat\nfrom GitHub master branch]
B --> C{response.ok?}
C -- No --> D[throw Error]
C -- Yes --> E[Split CSV into lines\nparseCSVLine each]
E --> F{active=Y AND\niata length=2 AND\nicao length=3?}
F -- Pass --> G[newAirlines Map\nicao → iata + name]
F -- Fail --> H[skip row]
G --> I[Sort ICAO keys alphabetically]
I --> J[Build TS block via JSON.stringify]
J --> K[Read airline-codes.ts]
K --> L{Find BEGIN/END markers?}
L -- No --> M[throw Error]
L -- Yes --> N[Extract oldBlock for diff count]
N --> O[Replace block + fs.writeFile]
O --> P([Print added/removed summary])
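The parse, filter, sort, and serialize steps in the flowchart can be sketched as follows. This is a minimal illustration, not the script's actual code: the body of parseCSVLine, the helper name buildGeneratedBlock, and the assumed column order (id, name, alias, IATA, ICAO, callsign, country, active) are assumptions. The filter mirrors the length-based check as drawn in the flowchart, plus the "\N" exclusion visible in the diff, so it does not yet include any alphanumeric validation.

```javascript
// Hypothetical minimal CSV line parser: handles double-quoted fields and "" escapes.
function parseCSVLine(line) {
  const fields = [];
  let current = '';
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') { current += '"'; i++; }
      else if (ch === '"') inQuotes = false;
      else current += ch;
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ',') {
      fields.push(current);
      current = '';
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
}

// Assumed column order: id, name, alias, iata, icao, callsign, country, active.
function buildGeneratedBlock(csvText) {
  const airlines = new Map();
  for (const rawLine of csvText.split('\n')) {
    const line = rawLine.trim();
    if (!line) continue;
    const [, name, , iata, icao, , , active] = parseCSVLine(line);
    const iataOk = iata && iata.length === 2 && iata !== '\\N';
    const icaoOk = icao && icao.length === 3 && icao !== '\\N';
    if (active === 'Y' && iataOk && icaoOk) airlines.set(icao, { iata, name });
  }
  // Sort ICAO keys alphabetically, then serialize with JSON.stringify.
  const sorted = {};
  for (const key of [...airlines.keys()].sort()) sorted[key] = airlines.get(key);
  return JSON.stringify(sorted, null, 2);
}
```

Serializing through JSON.stringify means airline names containing quotes or backslashes are escaped automatically, which is what makes the injected block safe to paste into a TypeScript file.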
Reviews (1): Last reviewed commit: "feat(airlines): add generator script and..."
if (
  active === 'Y' &&
  iata && iata.length === 2 && iata !== '\\N' && iata !== 'null' &&
  icao && icao.length === 3 && icao !== '\\N' && icao !== 'null'
) {
Missing alphanumeric validation on ICAO/IATA codes: invalid entries such as "--+" (iata: "-+") and "..." (iata: "..") have passed through the filter and are now committed in airline-codes.ts. ICAO airline designators are three uppercase letters (A-Z), not arbitrary 3-character strings, and standard IATA codes are two uppercase alphanumerics. Checking only length and the literal strings "\N" / "null" is insufficient to exclude test/corrupt rows from the OpenFlights dataset.
Suggested change:
- if (
-   active === 'Y' &&
-   iata && iata.length === 2 && iata !== '\\N' && iata !== 'null' &&
-   icao && icao.length === 3 && icao !== '\\N' && icao !== 'null'
- ) {
+ if (
+   active === 'Y' &&
+   iata && /^[A-Z0-9]{2}$/i.test(iata) &&
+   icao && /^[A-Z]{3}$/.test(icao)
+ ) {
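To make the effect of the stricter filter concrete, here is a small standalone sketch of the validation as a predicate. The function name isValidAirlineRow and the constant names are hypothetical, and unlike the suggested change above this version is case-sensitive for IATA codes as well, on the assumption that the dataset stores codes in uppercase.

```javascript
// Hypothetical helper names; the patterns follow the review suggestion.
const IATA_RE = /^[A-Z0-9]{2}$/; // two uppercase alphanumerics
const ICAO_RE = /^[A-Z]{3}$/;    // three uppercase letters

function isValidAirlineRow({ active, iata, icao }) {
  // `?? ''` keeps a missing field from being coerced to the string "undefined".
  return active === 'Y' && IATA_RE.test(iata ?? '') && ICAO_RE.test(icao ?? '');
}

// The corrupt rows cited in the review are rejected; real carriers still pass.
console.log(isValidAirlineRow({ active: 'Y', iata: 'U2', icao: 'EZY' }));
console.log(isValidAirlineRow({ active: 'Y', iata: '-+', icao: '--+' }));
```

Because the patterns are anchored, the separate checks against "\N" and "null" become redundant: neither string can match a two- or three-character alphanumeric pattern.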
// OpenFlights airlines.dat - Public Domain
// Pinned to master, consider using a specific commit hash for strict auditability
const SOURCE_URL = 'https://raw.githubusercontent.com/jpatokal/openflights/master/data/airlines.dat';
The source URL is pinned to the mutable master branch. Any push to jpatokal/openflights will silently change what this script produces, making quarterly refreshes non-reproducible and potentially introducing supply-chain drift without a visible code change. The script's own comment flags this. Pin to a specific commit SHA so the refresh is auditable and intentional.
Suggested change:
- const SOURCE_URL = 'https://raw.githubusercontent.com/jpatokal/openflights/master/data/airlines.dat';
+ // TODO: pin to a specific commit SHA before the next refresh, e.g.:
+ // https://raw.githubusercontent.com/jpatokal/openflights/<commit>/data/airlines.dat
+ const SOURCE_URL = 'https://raw.githubusercontent.com/jpatokal/openflights/master/data/airlines.dat';
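One way to enforce the pin, sketched under assumptions: build the URL from a commit SHA and refuse anything that is not a full 40-character hex hash, so a branch name such as master cannot slip back in. The helper name buildSourceUrl is hypothetical, and the SHA in the usage line is an all-zero placeholder, not a real openflights commit.

```javascript
// Hypothetical helper: only a full 40-character lowercase hex SHA is accepted.
function buildSourceUrl(commitSha) {
  if (!/^[0-9a-f]{40}$/.test(commitSha)) {
    throw new Error(`Expected a full 40-character commit SHA, got "${commitSha}"`);
  }
  return `https://raw.githubusercontent.com/jpatokal/openflights/${commitSha}/data/airlines.dat`;
}

// Placeholder only: replace with the commit actually reviewed for the refresh.
const PLACEHOLDER_SHA = '0'.repeat(40);
console.log(buildSourceUrl(PLACEHOLDER_SHA));
```

With this shape, each quarterly refresh becomes a visible one-line change to the pinned SHA that can be reviewed alongside the regenerated data.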
async function run() {
  console.log(`Fetching airline data from ${SOURCE_URL}...`);
  const response = await fetch(SOURCE_URL);
Per the repo's critical conventions in AGENTS.md, all server-side fetch calls must include a User-Agent header. While this is a developer utility script rather than an Edge Function, the same rule improves compatibility and avoids raw fetch calls being blocked or rate-limited by GitHub's CDN.
Suggested change:
- const response = await fetch(SOURCE_URL);
+ const response = await fetch(SOURCE_URL, {
+   headers: { 'User-Agent': 'worldmonitor/generate-airline-codes (+https://github.com/koala73/worldmonitor)' },
+ });
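A possible way to package the header and the response.ok check together is a small wrapper with an injectable fetch implementation, which also makes the behaviour testable offline. The function name fetchAirlinesText and the fetchImpl parameter are assumptions for illustration; the User-Agent string is the one from the suggestion above.

```javascript
const USER_AGENT = 'worldmonitor/generate-airline-codes (+https://github.com/koala73/worldmonitor)';

// Hypothetical wrapper: `fetchImpl` defaults to the global fetch (Node 18+)
// and can be swapped for a stub in tests.
async function fetchAirlinesText(url, fetchImpl = globalThis.fetch) {
  const response = await fetchImpl(url, { headers: { 'User-Agent': USER_AGENT } });
  if (!response.ok) {
    throw new Error(`Fetch failed: ${response.status} ${response.statusText}`);
  }
  return response.text();
}
```

Keeping the header in one constant means any later fetch added to the script picks up the same identification without repeating the string.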
export const OVERRIDE: Record<string, { iata: string; name: string }> = {
  // Example: 'XYZ': { iata: 'X2', name: 'Corrected Name' },
};

const GENERATED = new Map<string, { iata: string; name: string }>([
  ['AAL', { iata: 'AA', name: 'American Airlines' }],
  ['AAY', { iata: 'G4', name: 'Allegiant Air' }],
  ['ACA', { iata: 'AC', name: 'Air Canada' }],
  ['ADR', { iata: 'JP', name: 'Adria Airways' }],
  ['AFL', { iata: 'SU', name: 'Aeroflot' }],
  ['AFR', { iata: 'AF', name: 'Air France' }],
  ['AIC', { iata: 'AI', name: 'Air India' }],
  ['AMX', { iata: 'AM', name: 'Aeromexico' }],
  ['ANZ', { iata: 'NZ', name: 'Air New Zealand' }],
  ['ASA', { iata: 'AS', name: 'Alaska Airlines' }],
  ['ASH', { iata: 'YV', name: 'Mesa Airlines' }],
  ['AUA', { iata: 'OS', name: 'Austrian Airlines' }],
  ['AVA', { iata: 'AV', name: 'Avianca' }],
  ['AZA', { iata: 'AZ', name: 'ITA Airways' }],
  ['AZU', { iata: 'AD', name: 'Azul Brazilian Airlines' }],
  ['BAW', { iata: 'BA', name: 'British Airways' }],
  ['BBS', { iata: 'BG', name: 'Biman Bangladesh Airlines' }],
  ['BEL', { iata: 'SN', name: 'Brussels Airlines' }],
  ['BSK', { iata: 'B2', name: 'Belavia' }],
  ['CCA', { iata: 'CA', name: 'Air China' }],
  ['CES', { iata: 'MU', name: 'China Eastern Airlines' }],
  ['CHH', { iata: 'HU', name: 'Hainan Airlines' }],
  ['CPA', { iata: 'CX', name: 'Cathay Pacific' }],
  ['CSN', { iata: 'CZ', name: 'China Southern Airlines' }],
  ['CSO', { iata: 'OK', name: 'Czech Airlines' }],
  ['CTN', { iata: 'OU', name: 'Croatia Airlines' }],
  ['DAL', { iata: 'DL', name: 'Delta Air Lines' }],
  ['DLH', { iata: 'LH', name: 'Lufthansa' }],
  ['EIN', { iata: 'EI', name: 'Aer Lingus' }],
  ['ELY', { iata: 'LY', name: 'El Al' }],
  ['ETD', { iata: 'EY', name: 'Etihad Airways' }],
  ['ETH', { iata: 'ET', name: 'Ethiopian Airlines' }],
  ['EWG', { iata: 'EW', name: 'Eurowings' }],
  ['EZS', { iata: 'DS', name: 'easyJet Switzerland' }],
  ['EZY', { iata: 'U2', name: 'easyJet' }],
  ['FDB', { iata: 'FZ', name: 'flydubai' }],
  ['FFT', { iata: 'F9', name: 'Frontier Airlines' }],
  ['FIN', { iata: 'AY', name: 'Finnair' }],
  ['GFA', { iata: 'GF', name: 'Gulf Air' }],
  ['GLO', { iata: 'G3', name: 'Gol Transportes Aéreos' }],
  ['HAL', { iata: 'HA', name: 'Hawaiian Airlines' }],
  ['HLX', { iata: '5K', name: 'Hi Fly' }],
  ['IBE', { iata: 'IB', name: 'Iberia' }],
  ['IBS', { iata: 'I2', name: 'Iberia Express' }],
  ['IGO', { iata: '6E', name: 'IndiGo' }],
  ['IRM', { iata: 'IR', name: 'Iran Air' }],
  ['JAI', { iata: '9W', name: 'Jet Airways' }],
  ['JAT', { iata: 'JU', name: 'Air Serbia' }],
  ['JBU', { iata: 'B6', name: 'JetBlue' }],
  ['JST', { iata: 'JQ', name: 'Jetstar' }],
  ['KAL', { iata: 'KE', name: 'Korean Air' }],
  ['KLM', { iata: 'KL', name: 'KLM Royal Dutch Airlines' }],
  ['LOT', { iata: 'LO', name: 'LOT Polish Airlines' }],
  ['MAS', { iata: 'MH', name: 'Malaysia Airlines' }],
  ['MSR', { iata: 'MS', name: 'EgyptAir' }],
  ['NAX', { iata: 'DY', name: 'Norwegian Air Shuttle' }],
  ['NKS', { iata: 'NK', name: 'Spirit Airlines' }],
  ['OAL', { iata: 'OA', name: 'Olympic Air' }],
  ['PGA', { iata: 'NI', name: 'Portugália Airlines' }],
  ['PGT', { iata: 'PC', name: 'Pegasus Airlines' }],
  ['PKC', { iata: 'PK', name: 'Pakistan International Airlines' }],
  ['QFA', { iata: 'QF', name: 'Qantas' }],
  ['QTR', { iata: 'QR', name: 'Qatar Airways' }],
  ['RAM', { iata: 'AT', name: 'Royal Air Maroc' }],
  ['ROU', { iata: 'RO', name: 'TAROM' }],
  ['RYR', { iata: 'FR', name: 'Ryanair' }],
  ['SAS', { iata: 'SK', name: 'Scandinavian Airlines' }],
  ['SHY', { iata: 'ZY', name: 'Sky Airlines' }],
  ['SIA', { iata: 'SQ', name: 'Singapore Airlines' }],
  ['SVN', { iata: 'SV', name: 'Saudia' }],
  ['SWA', { iata: 'WN', name: 'Southwest Airlines' }],
  ['SWG', { iata: 'WG', name: 'Sunwing Airlines' }],
  ['WJA', { iata: 'WS', name: 'WestJet' }],
  ['SWR', { iata: 'LX', name: 'Swiss International Air Lines' }],
  ['SXB', { iata: 'S5', name: 'SpiceJet' }],
  ['TAM', { iata: 'LA', name: 'LATAM Airlines' }],
  ['TAP', { iata: 'TP', name: 'TAP Air Portugal' }],
  ['TGW', { iata: 'TR', name: 'Scoot' }],
  ['OMA', { iata: 'WY', name: 'Oman Air' }],
  ['THA', { iata: 'TG', name: 'Thai Airways' }],
  ['THY', { iata: 'TK', name: 'Turkish Airlines' }],
  ['TJK', { iata: '7J', name: 'Tajik Air' }],
  ['TOM', { iata: 'BY', name: 'TUI Airways' }],
  ['TRA', { iata: 'HV', name: 'Transavia' }],
  ['TSC', { iata: 'TS', name: 'Air Transat' }],
  ['TUN', { iata: 'TU', name: 'Tunisair' }],
  ['UAE', { iata: 'EK', name: 'Emirates' }],
  ['UAL', { iata: 'UA', name: 'United Airlines' }],
  ['UZB', { iata: 'HY', name: 'Uzbekistan Airways' }],
  ['VIR', { iata: 'VS', name: 'Virgin Atlantic' }],
  ['VLG', { iata: 'VY', name: 'Vueling' }],
  ['VOI', { iata: 'Y4', name: 'Volaris' }],
  ['VOZ', { iata: 'VA', name: 'Virgin Australia' }],
  ['WIF', { iata: 'WF', name: 'Widerøe' }],
  ['WRF', { iata: 'RB', name: 'Syrian Arab Airlines' }],
  ['WZZ', { iata: 'W6', name: 'Wizz Air' }],
]);
// --- BEGIN GENERATED AIRLINES ---
export const GENERATED: Record<string, { iata: string; name: string }> = {
Unnecessary export of internal tables
OVERRIDE and GENERATED are now export const, but neither was exported before and neither is part of the module's intended public API (parseCallsign, icaoToIata, toIataCallsign). Exporting them lets callers bypass the merged AIRLINES map, which is the only structure that correctly applies overrides on top of the generated baseline. If a consumer calls GENERATED["AFL"] directly instead of icaoToIata("AFL"), they will silently miss any active override. These constants should remain unexported.
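The risk described here can be illustrated with a small sketch of an override-aware lookup. The table contents below are hypothetical (the 'XYZ' entries are invented for illustration), and the body of icaoToIata is an assumed shape rather than the module's actual implementation; in the real module only the lookup functions would be exported.

```javascript
// Internal tables: in a real module these would NOT be exported.
const OVERRIDE = {
  XYZ: { iata: 'X2', name: 'Corrected Name' }, // hypothetical manual correction
};
const GENERATED = {
  AFL: { iata: 'SU', name: 'Aeroflot' },
  XYZ: { iata: 'X1', name: 'Stale Name' },     // hypothetical stale upstream row
};

// Merged view: spreading OVERRIDE last lets manual entries win over generated ones.
const AIRLINES = new Map(Object.entries({ ...GENERATED, ...OVERRIDE }));

// Assumed shape of the public lookup: the only supported way to read the data.
function icaoToIata(icao) {
  return AIRLINES.get(icao.toUpperCase())?.iata;
}

console.log(icaoToIata('XYZ'));   // goes through the merged map
console.log(GENERATED.XYZ.iata);  // bypasses the override
```

A caller that reads GENERATED directly gets the stale value, which is the failure mode the review comment warns about.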
let added = 0;
let removed = 0;

for (const key of newKeys) if (!oldKeys.has(key)) added++;
Diff-count regex excludes non-alphanumeric keys
The regex /"([A-Z0-9]{3})"/g used to extract old keys for the added/removed summary will not match entries whose ICAO key contains non-alphanumeric characters (e.g. "--+", "..."). If the alphanumeric filter fix is applied, those entries will disappear from the new output but the old-key extraction will also miss them, so the removed counter will under-report on the very next run. A broader pattern like /"([^"]{3})"/g would produce accurate counts.
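A sketch of a key extraction that counts such entries, assuming the generated block is the JSON.stringify output of a Record keyed by ICAO code. The helper names extractKeys and diffKeys are hypothetical. This variant goes one step further than the broader pattern mentioned above by requiring a colon after the quoted string, so that 3-character values (for example an airline whose name is three letters long) are not mistaken for keys.

```javascript
// Hypothetical helper: collect 3-character property keys from a JSON-formatted block.
// The trailing `\s*:` restricts matches to keys, not string values.
function extractKeys(block) {
  const keys = new Set();
  for (const match of block.matchAll(/"([^"]{3})"\s*:/g)) keys.add(match[1]);
  return keys;
}

// Hypothetical helper: count keys present in only one of the two blocks.
function diffKeys(oldBlock, newBlock) {
  const oldKeys = extractKeys(oldBlock);
  const newKeys = extractKeys(newBlock);
  let added = 0;
  let removed = 0;
  for (const key of newKeys) if (!oldKeys.has(key)) added++;
  for (const key of oldKeys) if (!newKeys.has(key)) removed++;
  return { added, removed };
}
```

Parsing the block with JSON.parse and using Object.keys would be stricter still, provided the text between the markers is valid JSON once the surrounding TypeScript declaration is stripped.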
Resolves #1995
Description
This PR addresses the manual curation bottleneck for the airline ICAO-to-IATA mapping by introducing an automated generator script that pulls directly from the OpenFlights airlines.dat public domain dataset.
Key Changes:
- scripts/generate-airline-codes.mjs: A Node.js script that fetches, parses, and safely filters the OpenFlights CSV for active airlines with valid 2-char IATA and 3-char ICAO codes.
- server/_shared/airline-codes.ts: Implemented // --- BEGIN GENERATED AIRLINES --- markers to allow automated JSON-safe injection.
- The OVERRIDE table is completely untouched during quarterly refreshes.
Testing / Validation
- Ran node scripts/generate-airline-codes.mjs locally.
- Entries in the OVERRIDE table survive the re-generation process.
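The marker-based injection described above can be sketched as a pure string operation. This is an illustration under assumptions: the helper name replaceGeneratedBlock is hypothetical, and the END marker text is assumed by analogy with the BEGIN marker, since only the BEGIN line appears in the diff.

```javascript
const BEGIN_MARKER = '// --- BEGIN GENERATED AIRLINES ---';
const END_MARKER = '// --- END GENERATED AIRLINES ---'; // assumed wording

// Hypothetical helper: replace everything between the markers, leaving the
// rest of the file (including the manual OVERRIDE table) byte-for-byte intact.
function replaceGeneratedBlock(source, newBlock) {
  const start = source.indexOf(BEGIN_MARKER);
  const end = source.indexOf(END_MARKER);
  if (start === -1 || end === -1 || end < start) {
    throw new Error('BEGIN/END markers not found in airline-codes.ts');
  }
  return (
    source.slice(0, start + BEGIN_MARKER.length) +
    '\n' + newBlock + '\n' +
    source.slice(end)
  );
}
```

Slicing around the marker positions, rather than calling String.prototype.replace with the new block as the replacement string, avoids special replacement patterns such as `$&` being interpreted if they ever appear in an airline name.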