feat(airlines): Add automated ICAO-IATA code generator script #4309

Open
Aswinesag wants to merge 1 commit into koala73:main from Aswinesag:main

Conversation

@Aswinesag

Resolves #1995

Description

This PR addresses the manual curation bottleneck for the airline ICAO-to-IATA mapping by introducing an automated generator script that pulls directly from the OpenFlights airlines.dat public domain dataset.

Key Changes:

  • Added scripts/generate-airline-codes.mjs: A Node.js script that fetches, parses, and safely filters the OpenFlights CSV for active airlines with valid 2-char IATA and 3-char ICAO codes.
  • Formatted server/_shared/airline-codes.ts: Implemented // --- BEGIN GENERATED AIRLINES --- markers to allow automated JSON-safe injection.
  • Preserved Overrides: The script specifically targets the generated block, ensuring the manual OVERRIDE table is completely untouched during quarterly refreshes.
  • Refreshed Baseline: Ran the script to update the current list to the latest OpenFlights data.
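
The marker-based injection described above could look roughly like the following. This is a minimal sketch, not the code from the PR: the END marker text, the function names `replaceGeneratedBlock` and `buildBlock`, and the exact shape of the emitted block are assumptions made for illustration.

```javascript
// Assumed marker strings. The PR names the BEGIN marker; the END marker text is a guess.
const BEGIN_MARKER = '// --- BEGIN GENERATED AIRLINES ---';
const END_MARKER = '// --- END GENERATED AIRLINES ---';

// Hypothetical helper: swap the text between the markers and leave
// everything outside them (including the OVERRIDE table) untouched.
function replaceGeneratedBlock(fileText, newBlock) {
  const start = fileText.indexOf(BEGIN_MARKER);
  const end = fileText.indexOf(END_MARKER);
  if (start === -1 || end === -1 || end < start) {
    throw new Error('BEGIN/END markers not found in target file');
  }
  const before = fileText.slice(0, start + BEGIN_MARKER.length);
  const after = fileText.slice(end);
  return `${before}\n${newBlock}\n${after}`;
}

// Hypothetical helper: emit one entry per line, sorted by ICAO key.
// JSON.stringify escapes quotes and backslashes inside airline names.
function buildBlock(entries) {
  const lines = [...entries]
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .map(([icao, { iata, name }]) => {
      const key = JSON.stringify(icao);
      return `  ${key}: { iata: ${JSON.stringify(iata)}, name: ${JSON.stringify(name)} },`;
    });
  return [
    'const GENERATED: Record<string, { iata: string; name: string }> = {',
    ...lines,
    '};',
  ].join('\n');
}
```

Because only the slice between the two markers is rebuilt, anything a maintainer writes outside them survives a refresh, which is the property the "Preserved Overrides" bullet relies on.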

Testing / Validation

  • Executed node scripts/generate-airline-codes.mjs locally.
  • Verified the script correctly calculates the diff (added/removed entries) vs. the previous run.
  • Confirmed that TypeScript syntax remains valid and strings with quotes/commas are properly escaped.
  • Verified that manual entries in the OVERRIDE table survive the re-generation process.

Introduces scripts/generate-airline-codes.mjs to fetch from OpenFlights airlines.dat. Formats server/_shared/airline-codes.ts with BEGIN/END markers to allow automated JSON-safe injection while preserving the manual OVERRIDE map. Refreshes the current list to ensure baseline accuracy.
@vercel

vercel Bot commented Jun 13, 2026

@Aswinesag is attempting to deploy a commit to the World Monitor Team on Vercel.

A member of the Team first needs to authorize it.

@github-actions github-actions Bot added the trust:caution Brin: contributor trust score caution label Jun 13, 2026
@greptile-apps

greptile-apps Bot commented Jun 13, 2026

Contributor

Greptile Summary

This PR replaces the ~100-entry hand-curated ICAO→IATA map in server/_shared/airline-codes.ts with an ~840-entry dataset generated from the public OpenFlights airlines.dat file, and introduces scripts/generate-airline-codes.mjs to automate future quarterly refreshes.

  • The generator script fetches, CSV-parses, filters, and JSON-encodes airline entries into a delimited block in airline-codes.ts, preserving a manual OVERRIDE table outside that block.
  • The filter only validates code length and two literal sentinel strings (\N, null), allowing invalid entries like "--+" and "..." to pass through and land in the committed file; additionally, OVERRIDE and GENERATED are now exported even though neither belongs in the module's public API.
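
For readers unfamiliar with the parsing step, here is a minimal sketch of a quoted-CSV line parser. The PR's actual `parseCSVLine` is not shown on this page, so this version is an assumption: it handles double-quoted fields, commas inside quotes, and doubled quotes as an escape, and nothing else.

```javascript
// Illustrative parser only; the real parseCSVLine in the PR may differ.
function parseCSVLine(line) {
  const fields = [];
  let current = '';
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        current += '"'; // doubled quote inside a quoted field
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        current += ch;
      }
    } else if (ch === '"') {
      inQuotes = true; // opening quote
    } else if (ch === ',') {
      fields.push(current); // field separator outside quotes
      current = '';
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
}
```

A naive `line.split(',')` would break on airline names that contain commas, which is why a small state machine like this is the usual approach.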

Confidence Score: 3/5

The public lookup functions are structurally intact and consumers are unaffected at runtime, but the committed data already contains corrupt entries and the script will regenerate them on every quarterly refresh until the filter is fixed.

The filter gap is an active defect — non-alphanumeric garbage rows from the OpenFlights dataset (e.g. "--+", "...") are already in the committed file and will recur on each re-run. The public-facing icaoToIata / toIataCallsign functions won't surface these entries in practice (the parseCallsign regex requires [A-Z] prefixes), but the data is durably corrupted and the script as written is the wrong tool to run quarterly without the fix. The master-branch URL and unnecessary exports are secondary concerns.

Both changed files warrant attention: scripts/generate-airline-codes.mjs needs the alphanumeric filter fix before the next refresh, and server/_shared/airline-codes.ts should have the spurious export keywords removed from OVERRIDE and GENERATED.

Important Files Changed

| Filename | Overview |
| --- | --- |
| scripts/generate-airline-codes.mjs | New generator script that fetches and parses the OpenFlights CSV. It has a data-quality bug where non-alphanumeric ICAO/IATA codes (e.g. "--+", "...") pass through the filter, plus a mutable master-branch URL and a missing User-Agent header. |
| server/_shared/airline-codes.ts | Replaced the ~100-entry hand-curated Map with a ~840-entry generated Record. The internal OVERRIDE and GENERATED constants are now unnecessarily exported, and several invalid ICAO entries (e.g. "--+", "...") are present in the committed generated block. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A([Developer runs\nnode scripts/generate-airline-codes.mjs]) --> B[fetch OpenFlights airlines.dat\nfrom GitHub master branch]
    B --> C{response.ok?}
    C -- No --> D[throw Error]
    C -- Yes --> E[Split CSV into lines\nparseCSVLine each]
    E --> F{active=Y AND\niata length=2 AND\nicao length=3?}
    F -- Pass --> G[newAirlines Map\nicao → iata + name]
    F -- Fail --> H[skip row]
    G --> I[Sort ICAO keys alphabetically]
    I --> J[Build TS block via JSON.stringify]
    J --> K[Read airline-codes.ts]
    K --> L{Find BEGIN/END markers?}
    L -- No --> M[throw Error]
    L -- Yes --> N[Extract oldBlock for diff count]
    N --> O[Replace block + fs.writeFile]
    O --> P([Print added/removed summary])

Reviews (1): Last reviewed commit: "feat(airlines): add generator script and..."

Comment on lines +63 to +67
if (
active === 'Y' &&
iata && iata.length === 2 && iata !== '\\N' && iata !== 'null' &&
icao && icao.length === 3 && icao !== '\\N' && icao !== 'null'
) {

P1 Missing alphanumeric validation on ICAO/IATA codes — invalid entries such as "--+" (iata: "-+") and "..." (iata: "..") have passed through the filter and are now committed in airline-codes.ts. ICAO prefixes must be 2–4 uppercase letters (A-Z), not arbitrary 3-character strings, and standard IATA codes are 2 uppercase alphanumerics. Checking only length and the literal strings "\N" / "null" is insufficient to exclude test/corrupt rows from the OpenFlights dataset.

Suggested change
- if (
-   active === 'Y' &&
-   iata && iata.length === 2 && iata !== '\\N' && iata !== 'null' &&
-   icao && icao.length === 3 && icao !== '\\N' && icao !== 'null'
- ) {
+ if (
+   active === 'Y' &&
+   iata && /^[A-Z0-9]{2}$/i.test(iata) &&
+   icao && /^[A-Z]{3}$/.test(icao)
+ ) {
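
To make the effect of the stricter filter concrete, the check can be pulled into a small predicate and run against sample rows. This is a sketch under assumptions: the name `isUsableAirline` and the row shape are hypothetical, and unlike the suggestion above it drops the `i` flag so that lowercase IATA codes are rejected too.

```javascript
// Assumed patterns: IATA = 2 uppercase alphanumerics, ICAO = 3 uppercase letters.
const IATA_RE = /^[A-Z0-9]{2}$/;
const ICAO_RE = /^[A-Z]{3}$/;

// Hypothetical predicate; the PR inlines this logic in an if statement.
function isUsableAirline({ iata, icao, active }) {
  return active === 'Y' && IATA_RE.test(iata ?? '') && ICAO_RE.test(icao ?? '');
}

// Sample rows, including the corrupt ones called out in the review.
const sampleRows = [
  { iata: 'AA', icao: 'AAL', active: 'Y' },  // valid
  { iata: '6E', icao: 'IGO', active: 'Y' },  // valid, digit in IATA code
  { iata: '-+', icao: '--+', active: 'Y' },  // corrupt row
  { iata: '..', icao: '...', active: 'Y' },  // corrupt row
  { iata: '\\N', icao: 'AAL', active: 'Y' }, // OpenFlights null sentinel
  { iata: 'BA', icao: 'BAW', active: 'N' },  // inactive
];

const kept = sampleRows.filter(isUsableAirline).map((row) => row.icao);
```

The regexes also make the explicit `'\\N'` and `'null'` comparisons redundant, since neither string can match a two-character alphanumeric or three-letter pattern.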


// OpenFlights airlines.dat - Public Domain
// Pinned to master, consider using a specific commit hash for strict auditability
const SOURCE_URL = 'https://raw.githubusercontent.com/jpatokal/openflights/master/data/airlines.dat';

P2 The source URL is pinned to the mutable master branch. Any push to jpatokal/openflights will silently change what this script produces, making quarterly refreshes non-reproducible and potentially introducing supply-chain drift without a visible code change. The script's own comment flags this. Pin to a specific commit SHA so the refresh is auditable and intentional.

Suggested change
- const SOURCE_URL = 'https://raw.githubusercontent.com/jpatokal/openflights/master/data/airlines.dat';
+ // TODO: pin to a specific commit SHA before the next refresh, e.g.:
+ // https://raw.githubusercontent.com/jpatokal/openflights/<commit>/data/airlines.dat
+ const SOURCE_URL = 'https://raw.githubusercontent.com/jpatokal/openflights/master/data/airlines.dat';
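
One way to act on the TODO is to build the URL from an explicit commit SHA and refuse anything that looks like a branch name. The sketch below is an assumption, not part of the PR: `buildSourceUrl` is a hypothetical helper, and no real OpenFlights commit SHA is supplied here.

```javascript
// Hypothetical helper: only accept a full 40-character hex commit SHA,
// so a mutable ref such as "master" cannot slip in by accident.
function buildSourceUrl(commitSha) {
  if (!/^[0-9a-f]{40}$/.test(commitSha)) {
    throw new Error(`Expected a 40-character commit SHA, got "${commitSha}"`);
  }
  return `https://raw.githubusercontent.com/jpatokal/openflights/${commitSha}/data/airlines.dat`;
}
```

With this shape, each quarterly refresh becomes a visible one-line change to the pinned SHA rather than a silent change in upstream data.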



async function run() {
console.log(`Fetching airline data from ${SOURCE_URL}...`);
const response = await fetch(SOURCE_URL);

P2 Per the repo's critical conventions in AGENTS.md, all server-side fetch calls must include a User-Agent header. While this is a developer utility script rather than an Edge Function, the same rule improves compatibility and avoids raw fetch calls being blocked or rate-limited by GitHub's CDN.

Suggested change
- const response = await fetch(SOURCE_URL);
+ const response = await fetch(SOURCE_URL, {
+   headers: { 'User-Agent': 'worldmonitor/generate-airline-codes (+https://github.com/koala73/worldmonitor)' },
+ });


Comment on lines +5 to +10
export const OVERRIDE: Record<string, { iata: string; name: string }> = {
// Example: 'XYZ': { iata: 'X2', name: 'Corrected Name' },
};

const GENERATED = new Map<string, { iata: string; name: string }>([
['AAL', { iata: 'AA', name: 'American Airlines' }],
['AAY', { iata: 'G4', name: 'Allegiant Air' }],
['ACA', { iata: 'AC', name: 'Air Canada' }],
['ADR', { iata: 'JP', name: 'Adria Airways' }],
['AFL', { iata: 'SU', name: 'Aeroflot' }],
['AFR', { iata: 'AF', name: 'Air France' }],
['AIC', { iata: 'AI', name: 'Air India' }],
['AMX', { iata: 'AM', name: 'Aeromexico' }],
['ANZ', { iata: 'NZ', name: 'Air New Zealand' }],
['ASA', { iata: 'AS', name: 'Alaska Airlines' }],
['ASH', { iata: 'YV', name: 'Mesa Airlines' }],
['AUA', { iata: 'OS', name: 'Austrian Airlines' }],
['AVA', { iata: 'AV', name: 'Avianca' }],
['AZA', { iata: 'AZ', name: 'ITA Airways' }],
['AZU', { iata: 'AD', name: 'Azul Brazilian Airlines' }],
['BAW', { iata: 'BA', name: 'British Airways' }],
['BBS', { iata: 'BG', name: 'Biman Bangladesh Airlines' }],
['BEL', { iata: 'SN', name: 'Brussels Airlines' }],
['BSK', { iata: 'B2', name: 'Belavia' }],
['CCA', { iata: 'CA', name: 'Air China' }],
['CES', { iata: 'MU', name: 'China Eastern Airlines' }],
['CHH', { iata: 'HU', name: 'Hainan Airlines' }],
['CPA', { iata: 'CX', name: 'Cathay Pacific' }],
['CSN', { iata: 'CZ', name: 'China Southern Airlines' }],
['CSO', { iata: 'OK', name: 'Czech Airlines' }],
['CTN', { iata: 'OU', name: 'Croatia Airlines' }],
['DAL', { iata: 'DL', name: 'Delta Air Lines' }],
['DLH', { iata: 'LH', name: 'Lufthansa' }],
['EIN', { iata: 'EI', name: 'Aer Lingus' }],
['ELY', { iata: 'LY', name: 'El Al' }],
['ETD', { iata: 'EY', name: 'Etihad Airways' }],
['ETH', { iata: 'ET', name: 'Ethiopian Airlines' }],
['EWG', { iata: 'EW', name: 'Eurowings' }],
['EZS', { iata: 'DS', name: 'easyJet Switzerland' }],
['EZY', { iata: 'U2', name: 'easyJet' }],
['FDB', { iata: 'FZ', name: 'flydubai' }],
['FFT', { iata: 'F9', name: 'Frontier Airlines' }],
['FIN', { iata: 'AY', name: 'Finnair' }],
['GFA', { iata: 'GF', name: 'Gulf Air' }],
['GLO', { iata: 'G3', name: 'Gol Transportes Aéreos' }],
['HAL', { iata: 'HA', name: 'Hawaiian Airlines' }],
['HLX', { iata: '5K', name: 'Hi Fly' }],
['IBE', { iata: 'IB', name: 'Iberia' }],
['IBS', { iata: 'I2', name: 'Iberia Express' }],
['IGO', { iata: '6E', name: 'IndiGo' }],
['IRM', { iata: 'IR', name: 'Iran Air' }],
['JAI', { iata: '9W', name: 'Jet Airways' }],
['JAT', { iata: 'JU', name: 'Air Serbia' }],
['JBU', { iata: 'B6', name: 'JetBlue' }],
['JST', { iata: 'JQ', name: 'Jetstar' }],
['KAL', { iata: 'KE', name: 'Korean Air' }],
['KLM', { iata: 'KL', name: 'KLM Royal Dutch Airlines' }],
['LOT', { iata: 'LO', name: 'LOT Polish Airlines' }],
['MAS', { iata: 'MH', name: 'Malaysia Airlines' }],
['MSR', { iata: 'MS', name: 'EgyptAir' }],
['NAX', { iata: 'DY', name: 'Norwegian Air Shuttle' }],
['NKS', { iata: 'NK', name: 'Spirit Airlines' }],
['OAL', { iata: 'OA', name: 'Olympic Air' }],
['PGA', { iata: 'NI', name: 'Portugália Airlines' }],
['PGT', { iata: 'PC', name: 'Pegasus Airlines' }],
['PKC', { iata: 'PK', name: 'Pakistan International Airlines' }],
['QFA', { iata: 'QF', name: 'Qantas' }],
['QTR', { iata: 'QR', name: 'Qatar Airways' }],
['RAM', { iata: 'AT', name: 'Royal Air Maroc' }],
['ROU', { iata: 'RO', name: 'TAROM' }],
['RYR', { iata: 'FR', name: 'Ryanair' }],
['SAS', { iata: 'SK', name: 'Scandinavian Airlines' }],
['SHY', { iata: 'ZY', name: 'Sky Airlines' }],
['SIA', { iata: 'SQ', name: 'Singapore Airlines' }],
['SVN', { iata: 'SV', name: 'Saudia' }],
['SWA', { iata: 'WN', name: 'Southwest Airlines' }],
['SWG', { iata: 'WG', name: 'Sunwing Airlines' }],
['WJA', { iata: 'WS', name: 'WestJet' }],
['SWR', { iata: 'LX', name: 'Swiss International Air Lines' }],
['SXB', { iata: 'S5', name: 'SpiceJet' }],
['TAM', { iata: 'LA', name: 'LATAM Airlines' }],
['TAP', { iata: 'TP', name: 'TAP Air Portugal' }],
['TGW', { iata: 'TR', name: 'Scoot' }],
['OMA', { iata: 'WY', name: 'Oman Air' }],
['THA', { iata: 'TG', name: 'Thai Airways' }],
['THY', { iata: 'TK', name: 'Turkish Airlines' }],
['TJK', { iata: '7J', name: 'Tajik Air' }],
['TOM', { iata: 'BY', name: 'TUI Airways' }],
['TRA', { iata: 'HV', name: 'Transavia' }],
['TSC', { iata: 'TS', name: 'Air Transat' }],
['TUN', { iata: 'TU', name: 'Tunisair' }],
['UAE', { iata: 'EK', name: 'Emirates' }],
['UAL', { iata: 'UA', name: 'United Airlines' }],
['UZB', { iata: 'HY', name: 'Uzbekistan Airways' }],
['VIR', { iata: 'VS', name: 'Virgin Atlantic' }],
['VLG', { iata: 'VY', name: 'Vueling' }],
['VOI', { iata: 'Y4', name: 'Volaris' }],
['VOZ', { iata: 'VA', name: 'Virgin Australia' }],
['WIF', { iata: 'WF', name: 'Widerøe' }],
['WRF', { iata: 'RB', name: 'Syrian Arab Airlines' }],
['WZZ', { iata: 'W6', name: 'Wizz Air' }],
]);
// --- BEGIN GENERATED AIRLINES ---
export const GENERATED: Record<string, { iata: string; name: string }> = {

P2 Unnecessary export of internal tables

OVERRIDE and GENERATED are now export const, but neither was exported before and neither is part of the module's intended public API (parseCallsign, icaoToIata, toIataCallsign). Exporting them lets callers bypass the merged AIRLINES map, which is the only structure that correctly applies overrides on top of the generated baseline. If a consumer calls GENERATED["AFL"] directly instead of icaoToIata("AFL"), they will silently miss any active override. These constants should remain unexported.
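
The override-bypass risk can be illustrated with a small stand-in for the module. Everything here is assumed for illustration: the table contents reuse the placeholder `XYZ` entry from the file's own comment, and the real `icaoToIata` signature and return type are not shown in this PR, so the version below simply returns the merged entry or `undefined`.

```javascript
// Stand-in tables; in the real module these would stay unexported.
const GENERATED = {
  AFL: { iata: 'SU', name: 'Aeroflot' },
  XYZ: { iata: 'X1', name: 'Stale Name' }, // hypothetical stale upstream row
};
const OVERRIDE = {
  XYZ: { iata: 'X2', name: 'Corrected Name' }, // hypothetical manual fix
};

// Overrides are spread last so they win over generated entries.
const AIRLINES = new Map(Object.entries({ ...GENERATED, ...OVERRIDE }));

// Assumed shape of the public lookup: case-insensitive ICAO key in, entry out.
function icaoToIata(icao) {
  return AIRLINES.get(icao.toUpperCase());
}

const viaPublicApi = icaoToIata('XYZ').iata; // goes through the merged map
const viaRawTable = GENERATED.XYZ.iata;      // bypasses the override
```

Here the public lookup yields the overridden code while the raw table still yields the stale one, which is the silent divergence the comment warns about.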

let added = 0;
let removed = 0;

for (const key of newKeys) if (!oldKeys.has(key)) added++;

P2 Diff-count regex excludes non-alphanumeric keys

The regex /"([A-Z0-9]{3})"/g used to extract old keys for the added/removed summary will not match entries whose ICAO key contains non-alphanumeric characters (e.g. "--+", "..."). If the alphanumeric filter fix is applied, those entries will disappear from the new output but the old-key extraction will also miss them, so the removed counter will under-report on the very next run. A broader pattern like /"([^"]{3})"/g would produce accurate counts.
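
A sketch of the diff count with a broader key pattern follows. It assumes the generated block has one entry per line with a JSON-quoted key followed by a colon; the function names are hypothetical. Anchoring the pattern to the start of a line is a deliberate deviation from the suggested `/"([^"]{3})"/g`, which could also match a three-character airline name.

```javascript
// Hypothetical helper: collect quoted 3-character keys that start an entry line.
function extractKeys(block) {
  const keys = new Set();
  for (const match of block.matchAll(/^\s*"([^"]{3})"\s*:/gm)) {
    keys.add(match[1]);
  }
  return keys;
}

// Hypothetical helper: count keys present on one side only.
function countDiff(oldKeys, newKeys) {
  let added = 0;
  let removed = 0;
  for (const key of newKeys) if (!oldKeys.has(key)) added++;
  for (const key of oldKeys) if (!newKeys.has(key)) removed++;
  return { added, removed };
}

// Made-up sample blocks, including the corrupt keys from the review.
const oldBlock = [
  '  "--+": { iata: "-+", name: "Test" },',
  '  "...": { iata: "..", name: "KLM" },',
  '  "AAL": { iata: "AA", name: "American Airlines" },',
].join('\n');
const newBlock = [
  '  "AAL": { iata: "AA", name: "American Airlines" },',
  '  "BAW": { iata: "BA", name: "British Airways" },',
].join('\n');

const summary = countDiff(extractKeys(oldBlock), extractKeys(newBlock));
```

With these sample blocks the two corrupt keys are counted as removed, whereas an `[A-Z0-9]{3}` pattern would never have seen them in the old block.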

Labels

trust:caution Brin: contributor trust score caution

Projects

None yet

Development

Successfully merging this pull request may close these issues.

chore(military): add generator script to refresh ICAO→IATA airline lookup table

1 participant