feat(scan): add optional always_allow tier to location_filter#652
feat(scan): add optional always_allow tier to location_filter#652jrojomartinez wants to merge 1 commit into
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughExports buildLocationFilter and adds an always_allow tier with higher precedence than block; documents precedence and matching rules; adds tests for normalization/case-insensitivity and guards main() so it runs only when executed directly. ChangesLocation filter always_allow tier
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related issues
Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
e3e061d to
3490e02
Compare
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@scan.mjs`:
- Around line 129-133: The buildLocationFilter function currently assumes
locationFilter.always_allow, .allow, and .block are arrays of strings and will
throw on bad YAML or nulls; normalize each by coercing to an array (e.g., if
value is a string wrap it in an array, if null/undefined use []), filter out
non-string entries, then map remaining items to lowercase before assigning to
alwaysAllow, allow, and block so the function no longer crashes on malformed
config data.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 86b2ea42-33bb-4a17-b0a8-594a9f6c4232
📒 Files selected for processing (3)
scan.mjstemplates/portals.example.ymltest-all.mjs
3490e02 to
77feb8f
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (1)
scan.mjs (1)
129-133:⚠️ Potential issue | 🟠 Major | ⚡ Quick winNormalize
location_filterinputs defensively to avoid runtime crashes.Line 131–133 assumes arrays of strings; malformed YAML like scalar values, null entries, or mixed types will throw at runtime. Line 137 also assumes
locationis always a string. This can abort a scan instead of degrading gracefully.Proposed fix
export function buildLocationFilter(locationFilter) { if (!locationFilter) return () => true; - const alwaysAllow = (locationFilter.always_allow || []).map(k => k.toLowerCase()); - const allow = (locationFilter.allow || []).map(k => k.toLowerCase()); - const block = (locationFilter.block || []).map(k => k.toLowerCase()); + const normalizeKeywords = (value) => { + const list = Array.isArray(value) ? value : (typeof value === 'string' ? [value] : []); + return list + .filter((k) => typeof k === 'string') + .map((k) => k.toLowerCase().trim()) + .filter(Boolean); + }; + + const alwaysAllow = normalizeKeywords(locationFilter.always_allow); + const allow = normalizeKeywords(locationFilter.allow); + const block = normalizeKeywords(locationFilter.block); return (location) => { - if (!location) return true; - const lower = location.toLowerCase(); + if (typeof location !== 'string' || location.trim() === '') return true; + const lower = location.toLowerCase(); if (alwaysAllow.length > 0 && alwaysAllow.some(k => lower.includes(k))) return true; if (block.length > 0 && block.some(k => lower.includes(k))) return false; if (allow.length === 0) return true; return allow.some(k => lower.includes(k)); }; }As per coding guidelines,
**/*.mjs: “Ensure scripts handle missing data/ directories gracefully.”Also applies to: 136-138
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scan.mjs` around lines 129 - 133, The buildLocationFilter function assumes locationFilter.always_allow/allow/block are arrays of strings and that the passed-in location is a string; to fix, defensively normalize those fields by coercing missing fields to empty arrays, filtering out non-string items, trimming and lowercasing each entry (use the symbols alwaysAllow, allow, block within buildLocationFilter to locate the arrays), and when building the returned predicate validate the incoming location (in the predicate closure) is a string before calling toLowerCase — either coerce it to a safe string or return false for non-strings so malformed YAML or nulls do not throw at runtime.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@scan.mjs`:
- Around line 129-133: The buildLocationFilter function assumes
locationFilter.always_allow/allow/block are arrays of strings and that the
passed-in location is a string; to fix, defensively normalize those fields by
coercing missing fields to empty arrays, filtering out non-string items,
trimming and lowercasing each entry (use the symbols alwaysAllow, allow, block
within buildLocationFilter to locate the arrays), and when building the returned
predicate validate the incoming location (in the predicate closure) is a string
before calling toLowerCase — either coerce it to a safe string or return false
for non-strings so malformed YAML or nulls do not throw at runtime.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: e2e61344-0200-49ca-a8f4-e8c79a40c4a7
📒 Files selected for processing (3)
scan.mjstemplates/portals.example.ymltest-all.mjs
77feb8f to
e2e04a3
Compare
|
Thanks — fixed in the latest force-push. Added a |
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
scan.mjs (1)
129-136:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winTrim normalized keywords to avoid silent non-matches.
normalizeKeywordListlowercases but does not trim, so entries like" belgium "from YAML will never match and silently break filtering behavior.Proposed fix
function normalizeKeywordList(value) { if (value == null) return []; const arr = Array.isArray(value) ? value : [value]; - return arr.filter(k => typeof k === 'string').map(k => k.toLowerCase()); + return arr + .filter(k => typeof k === 'string') + .map(k => k.toLowerCase().trim()) + .filter(Boolean); }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scan.mjs` around lines 129 - 136, normalizeKeywordList currently lowercases entries but doesn't trim them, so values like " belgium " won't match; update normalizeKeywordList to first coerce value to an array (as it already does), then map each entry to a trimmed string, filter out non-strings and empty results (after trimming), and finally lowercase the survivors; reference the function name normalizeKeywordList and ensure trimming happens before filtering empties and lowercasing so whitespace-only entries are dropped.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@templates/portals.example.yml`:
- Around line 43-46: Update the commented example under the always_allow setting
so it matches the surrounding description that targets US-based locations:
replace the placeholder entries currently showing "Belgium" and "Brussels" with
appropriate US examples (e.g., "United States", "New York" or another home
region/city), keeping the comment style and the key name always_allow intact so
the sample and surrounding text are consistent.
---
Duplicate comments:
In `@scan.mjs`:
- Around line 129-136: normalizeKeywordList currently lowercases entries but
doesn't trim them, so values like " belgium " won't match; update
normalizeKeywordList to first coerce value to an array (as it already does),
then map each entry to a trimmed string, filter out non-strings and empty
results (after trimming), and finally lowercase the survivors; reference the
function name normalizeKeywordList and ensure trimming happens before filtering
empties and lowercasing so whitespace-only entries are dropped.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7022bbf5-0583-48ff-9807-4931a347d2c9
📒 Files selected for processing (3)
scan.mjstemplates/portals.example.ymltest-all.mjs
e2e04a3 to
025c314
Compare
|
Thanks — fixed in the latest force-push. The commented |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@scan.mjs`:
- Around line 132-136: normalizeKeywordList currently preserves empty or
whitespace-only strings which, when later used in matching (e.g.,
lower.includes(k)), makes an empty k always match and can bypass block/allow
rules; update normalizeKeywordList to trim each string and filter out those that
are === '' (or only whitespace) so it returns only non-empty lowercase keywords,
and apply the same trimming+empty-filtering where keywords from always_allow /
block / allow are normalized before matching to ensure no empty entries can
cause a pass-all match.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 176bec03-f5ca-4a2b-a5af-58cd739795b7
📒 Files selected for processing (3)
scan.mjstemplates/portals.example.ymltest-all.mjs
Pre-emptive hardening following the same defensive pattern CodeRabbit flagged on PR santifer#652. All changes are within the providers shipped in this PR; no scan.mjs / framework changes. - All three providers: `careers_url` is now type-checked before .match() so a non-string YAML value (number, object, array) returns null from detect() rather than throwing. - smartrecruiters: ref-rewrite uses an anchored regex (`/^https:\/\/api\.smartrecruiters\.com\/v1\/companies\//`) so the replacement only fires at the URL prefix. The fallback URL path (when both j.ref AND j.id are missing) now returns an empty string instead of synthesising a URL containing the literal "undefined" — the empty string is the contract-allowed default for url per _types.js > Job. Magic 100 in the postings limit is now a named SR_PAGE_SIZE constant. - workable: parseWorkableMarkdown now extracts URLs via a line-level regex `/\[View\]\(([^)]+)\)/` rather than a column-position match, so a title containing a stray `|` doesn't shift cols[7] and silently drop the URL. Rows that still don't resolve a URL are skipped (no empty-URL entries leak into the dedup tracker). - test-all.mjs: 6 new assertions covering the defensive paths (non-string careers_url across all 3 providers, the SR no-ref/no-id fallback, the Workable stray-pipe survival, and a real Workable fetch() rejection test against an unresolvable careers_url). Refs santifer#651
025c314 to
75a1595
Compare
|
Thanks — fixed in the latest force-push. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@scan.mjs`:
- Around line 150-153: The predicate currently calls location.toLowerCase()
without ensuring location is a string; update the returned function (the
(location) => { ... } predicate that uses alwaysAllow) to guard non-string
values before calling toLowerCase by checking typeof location === 'string' and
if it's not a string, return true (or otherwise handle gracefully) so truthy
non-string payloads won't throw; then use const lower = location.toLowerCase()
for the subsequent alwaysAllow.some(...) check.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 4358df07-24bd-479f-bc5c-50a7ceab2b32
📒 Files selected for processing (3)
scan.mjstemplates/portals.example.ymltest-all.mjs
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
Adds an optional always_allow list checked BEFORE block. A location matching always_allow passes regardless of block. Fully backward compatible: a config without always_allow: behaves exactly as today. Motivation: the current filter checks block first and absolutely, so a multi-location posting like "Remote, Belgium or France" is dropped the moment "france" is in block — even though Belgium is an acceptable location in the same string. always_allow is the home-region escape hatch. Worked example with always_allow: ["belgium"], block: ["france"]: - "Remote, Belgium" pass (unchanged) - "Remote, Belgium or France" PASS (was REJECT) - "Remote, France" reject (unchanged) Also: - Adds `export` to buildLocationFilter + gates main() behind an import.meta.url check so the function is importable from tests without running scan.mjs as a script. - Adds test-all.mjs §11 covering the 6 boundary cases (home-region match, always_allow beats block, block still rejects when no always_allow hit, empty location, case-insensitivity, backward compatibility when always_allow is omitted). - templates/portals.example.yml documents the commented always_allow: example with an ordering note. Refs santifer#650
75a1595 to
e0d2a0c
Compare
|
Thanks — fixed in the latest force-push. The returned filter closure now guards |
Rebased onto upstream
main(1.8.0) —buildLocationFilteris unchanged in 1.8.0, so the always_allow tier applies cleanly. Marked ready for review.Summary
Adds an optional
always_allowlist tolocation_filter, checked beforeblock. A location matchingalways_allowpasses regardless ofblock. Fully backward-compatible.Motivation
The current filter checks
blockfirst and absolutely. A multi-location posting like "Remote, Belgium or France" is dropped whenfranceis inblock, even though Belgium is an acceptable option.Worked example
Config:
always_allow: ["belgium"],block: ["france"]Remote, BelgiumRemote, Belgium or FranceRemote, FranceChanges
scan.mjs:buildLocationFilterreadsalways_allow, checks it beforeblock(~2 lines + doc-comment refresh). Also addsexport+main()guard so the function is unit-testable.templates/portals.example.yml: commentedalways_allow:example.test-all.mjs: new §11 with 6 unit tests covering home-region pass, always_allow-beats-block, block-still-rejects, empty location, case-insensitivity, and backward compatibility.Test plan
node test-all.mjs --quickpasses locally (all sections + the 6 new always_allow cases)node scan.mjs --dry-runstill runs correctly as a CLI (main() guard preserves the script-mode behaviour)Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Tests
Behavior