Skip to content

Commit 8a34058

Browse files
committed
fix(feeds): address Greptile P1 + P2 on validate-rss-feeds.mjs
P1: FAIL message claimed "4 allowlist mirrors" but the codebase enforces 5 (scripts/shared/rss-allowed-domains.json was missing from the list). A developer following the message would skip that mirror and hit a puzzling secondary failure from the `test:data` scripts/shared parity check. Listed all 5 mirrors. P2: The isConfigDrift predicate was prefix-matching against literal copies of the error message strings thrown by assertCiAllowed and fetchFeed. A future reword at either throwing site (e.g. dropping the "(--ci)" annotation, rewording the redirect error) would silently reclassify config drift as third-party rot, demoting a hard fail to a soft warning. Centralised the sentinel strings as a frozen CONFIG_DRIFT_REASONS object that both the throwing sites and the classifier consume — rename a reason and BOTH consumers move in lockstep. INVALID_URL is also now properly classified as config drift (was previously falling through to soft-fail despite being an actor- fixable bug in feeds.ts). Tested: - End-to-end run: 512 OK / 9 stale / 7 dead / 13 empty, EXIT=0 - Classifier unit test: all 8 representative cases correct (4 config-drift reasons → true, 4 third-party reasons → false)
1 parent cbb80e1 commit 8a34058

1 file changed

Lines changed: 23 additions & 11 deletions

File tree

scripts/validate-rss-feeds.mjs

Lines changed: 23 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -14,6 +14,18 @@ const FETCH_TIMEOUT = 15_000;
1414
const CONCURRENCY = 10;
1515
const STALE_DAYS = 30;
1616

17+
// Sentinel error-message prefixes for the SSRF/config guardrails. Centralised so
18+
// the throwing sites (assertCiAllowed, fetchFeed) and the isConfigDrift
19+
// classifier can never drift apart — rename a reason, BOTH consumers update in
20+
// lockstep. Without this, an innocuous reword (e.g. dropping `(--ci)`) would
21+
// silently reclassify hard failures as soft warnings.
22+
const CONFIG_DRIFT_REASONS = Object.freeze({
23+
INVALID_URL: 'Invalid URL',
24+
NON_HTTPS: 'Non-https scheme rejected in --ci mode:',
25+
HOST_NOT_ALLOWED: 'Host not in allowlist (--ci):',
26+
TOO_MANY_REDIRECTS: 'Too many redirects',
27+
});
28+
1729
// --ci flag hardens the validator for trusted-context CI runs (push-to-main
1830
// + schedule workflow). NOT enabled in PR CI — PR CI never runs this script
1931
// because PR contributors can rewrite feeds.ts to make GitHub runners hit
@@ -95,13 +107,13 @@ function assertCiAllowed(rawUrl) {
95107
try {
96108
parsed = new URL(rawUrl);
97109
} catch {
98-
throw new Error('Invalid URL');
110+
throw new Error(CONFIG_DRIFT_REASONS.INVALID_URL);
99111
}
100112
if (parsed.protocol !== 'https:') {
101-
throw new Error(`Non-https scheme rejected in --ci mode: ${parsed.protocol}`);
113+
throw new Error(`${CONFIG_DRIFT_REASONS.NON_HTTPS} ${parsed.protocol}`);
102114
}
103115
if (!isAllowedDomain(parsed.hostname)) {
104-
throw new Error(`Host not in allowlist (--ci): ${parsed.hostname}`);
116+
throw new Error(`${CONFIG_DRIFT_REASONS.HOST_NOT_ALLOWED} ${parsed.hostname}`);
105117
}
106118
return parsed;
107119
}
@@ -144,7 +156,7 @@ async function fetchFeed(url) {
144156
// the headers handshake is what we wanted bounded per hop.
145157
return await resp.text();
146158
}
147-
throw new Error('Too many redirects');
159+
throw new Error(CONFIG_DRIFT_REASONS.TOO_MANY_REDIRECTS);
148160
}
149161
const controller = new AbortController();
150162
const timer = setTimeout(() => controller.abort(), FETCH_TIMEOUT);
@@ -300,26 +312,26 @@ async function main() {
300312

301313
// Exit policy:
302314
// HARD-FAIL on config/SSRF-guard drift — these are bugs the maintainer can fix.
303-
// ("Host not in allowlist", "Non-https scheme rejected", "Too many redirects")
315+
// Reasons enumerated in CONFIG_DRIFT_REASONS (top of file). Both the throwing
316+
// sites and this classifier consume the same constants so a future reword
317+
// can't silently demote a hard fail to a warning.
304318
// SOFT-FAIL (exit 0 with warning) on third-party state — third-party 4xx/timeouts,
305319
// STALE feeds, EMPTY feeds. These rot naturally; failing the build on them
306320
// produces 100% CI noise and the prior workflow proved no one acts on it.
307321
// Promoting third-party failures to hard-fail requires a registry-cleanup PR
308322
// first; revisit once the long tail is groomed.
309323
const isConfigDrift = (r) =>
310-
typeof r.detail === 'string' && (
311-
r.detail.startsWith('Host not in allowlist') ||
312-
r.detail.startsWith('Non-https scheme rejected') ||
313-
r.detail === 'Too many redirects'
314-
);
324+
typeof r.detail === 'string' &&
325+
Object.values(CONFIG_DRIFT_REASONS).some(prefix => r.detail.startsWith(prefix));
315326
const configDrift = dead.filter(isConfigDrift);
316327
const thirdPartyDead = dead.filter(r => !isConfigDrift(r));
317328

318329
if (configDrift.length) {
319330
console.error(
320331
`\nFAIL: ${configDrift.length} feed(s) violate the CI guardrails ` +
321-
`(allowlist drift or plaintext URL). Fix src/config/feeds.ts and/or the 4 ` +
332+
`(allowlist drift or plaintext URL). Fix src/config/feeds.ts and/or the 5 ` +
322333
`allowlist mirrors (shared/rss-allowed-domains.json, .cjs, ` +
334+
`scripts/shared/rss-allowed-domains.json, ` +
323335
`api/_rss-allowed-domains.js, vite.config.ts:RSS_PROXY_ALLOWED_DOMAINS).`
324336
);
325337
process.exit(1);

0 commit comments

Comments
 (0)