chore(release): v0.9.0 — mandatory --patterns + heuristic detectors + Chromium fix + lint cleanup #45
Merged
Three changes close session-debt from v0.8.1 / v0.8.2.

CI install single source of truth
---------------------------------
scripts/install-ci-deps.sh defines the install line once. ci-local.sh and
both ci.yml jobs invoke it, removing the duplication that caused v0.8.1's
push regression where ci-local.sh's install profile drifted from the matrix.

release.py polls for in-flight CI
---------------------------------
check_ci_passed_on_head polls every 20s for up to 10min with progress
feedback instead of failing on the first read.

Release-discipline audit (A + B + E)
------------------------------------
Three reinforcing checks for the AI knowing-not-applying flaw observed
across v0.8.1 -> v0.8.3 (three releases for what should have been one
because Claude made decisions at "should I push?" that violated rules
just written down):

A. scan_for_anti_patterns greps recent git log for anti-pattern
   signatures. BLOCKER findings abort unless --acknowledged "<reason>"
   is supplied.
B. print_audit_checklist prints a diff-grounded checklist on every
   invocation. Visibility-to-Ken is the actual gate; questions are
   rubber-stampable in isolation but harder to ignore when bundled with
   commit/file context.
E. require_signoff blocks tag-push on the developer typing
   "RELEASE OK X.Y.Z" exactly. No --yes flag: the bypass would defeat
   the purpose. The unfakeable component.

Pure helpers (scan_log_for_anti_patterns, check_signoff_phrase,
expected_signoff_phrase) extracted for unit-test isolation; 28 tests in
tests/test_scripts/test_release_audit.py cover regex behaviour and
exact-match logic.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
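A minimal sketch of the pure helpers named above, to illustrate the exact-match sign-off and the log scan. The function signatures, the two anti-pattern signatures, and the finding format are assumptions for illustration; the real definitions live in release.py and are not shown in this PR.

```python
import re

# Hypothetical signatures: the real anti-pattern list is not shown in this PR.
ANTI_PATTERNS = {
    "force-push": re.compile(r"\bforce[- ]push", re.IGNORECASE),
    "skipped-hooks": re.compile(r"--no-verify"),
}


def expected_signoff_phrase(version):
    """Return the phrase the developer must type for this version."""
    return f"RELEASE OK {version}"


def check_signoff_phrase(typed, version):
    """Exact match only: case, spacing, and version must all agree.

    Only the line terminator from input() or a pipe is stripped.
    """
    return typed.rstrip("\r\n") == expected_signoff_phrase(version)


def scan_log_for_anti_patterns(log_text):
    """Return (signature_name, log_line) pairs for every matching line."""
    findings = []
    for line in log_text.splitlines():
        for name, pattern in ANTI_PATTERNS.items():
            if pattern.search(line):
                findings.append((name, line.strip()))
    return findings
```

Keeping these functions free of git and stdin access is what makes them testable in isolation: the caller passes in the log text and the typed phrase.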
CLAUDE.md gains Architecture principle #7 (confidence boundary between
deterministic and heuristic redaction layers), sourced from existing spec
language. The principle was previously stated in SANITIZATION_SPEC
invariant #11 and ARCHITECTURE's "Confidence boundary" paragraph but not
surfaced at the entry-point file, so a contributor reading CLAUDE.md as the
primary briefing would miss the load-bearing contract that governs which
layer redacts.

Release Flow section moved to new docs/RELEASE.md so CLAUDE.md
(218 -> 162 lines) is dominated by principles rather than reference
content. Principles go from ~52% to ~72% of the file.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ial #47)

Three changes to network_device.json:
- New serial_number detector with Netgear C7000v2 format
  ([0-9][A-Z]{2}[0-9]{4}[A-Z0-9]{6}, surfaced by #49) plus a broader
  uppercase-alphanumeric backstop for future vendor variants.
- wifi_ssid detector extended with a default-SSID prefix whitelist
  (SPSETUP, MOTO, ATTwifi, XFINITY, HOMEHUB).
- Detector order changed so keyword-based device_name runs before
  shape-based wifi_ssid, preventing NETGEAR-C7000 from being
  miscategorized.

New regression test file tests/test_sanitization/test_pii_regressions.py
+ fixture keys cases on issue numbers so future user reports add a
fixture row rather than a new test file.

Closes #49. Partial #47: SN, default SSID, and default password all flag
correctly through heuristics + UI now. WPS PIN coverage is tracked
separately as a regex-layer concern: pure-digit values hit the universal
safe pattern by design, and disambiguating a WPS PIN from a packet
counter requires the adjacent "WPS PIN" / "PIN Code" label, which is the
regex layer's job per CLAUDE.md principle #7.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
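A rough sketch of how the two-tier serial_number detection described above could behave. The Netgear regex is the one quoted in the commit message; the backstop regex (10 to 20 uppercase alphanumerics with at least one letter and one digit), the function name, and the return labels are assumptions, since network_device.json itself is not shown here.

```python
import re
from typing import Optional

# Format quoted in the commit message.
NETGEAR_C7000V2_SERIAL = re.compile(r"[0-9][A-Z]{2}[0-9]{4}[A-Z0-9]{6}")

# Assumed backstop: uppercase alphanumeric, 10-20 chars, mixed letters/digits.
GENERIC_SERIAL = re.compile(
    r"(?=[A-Z0-9]*[A-Z])(?=[A-Z0-9]*[0-9])[A-Z0-9]{10,20}"
)


def classify_serial(value: str) -> Optional[str]:
    """Return a label for serial-shaped values, or None if nothing matches."""
    if NETGEAR_C7000V2_SERIAL.fullmatch(value):
        return "netgear_c7000v2"
    if GENERIC_SERIAL.fullmatch(value):
        return "generic_uppercase_alnum"
    return None


print(classify_serial("4AB1234C5D6E7"))  # made-up value in the Netgear shape
print(classify_serial("7TH4582JK9QP"))   # 12 chars: only the backstop applies
print(classify_serial("1234567890"))     # pure digits: not flagged here
```

The pure-digit case returning None mirrors the point made about WPS PINs: without an adjacent label, a run of digits cannot be told apart from a counter.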
The work originally queued as v0.8.3 (CI install SSOT, release.py
polling, release-discipline audit gates) folds into v0.9.0 along with
the doc surfacing and heuristic detector additions on this branch. Per
the branch refocus, the CHANGELOG section retitles 0.8.3 -> 0.9.0 rather
than adding a parallel 0.9.0 section above.

Version bumped in pyproject.toml and src/har_capture/__init__.py.
Comparison link updated to compare against v0.8.2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CLAUDE.md rule 14: large test data lives in tests/fixtures/*.json.
Inline `har_data = {...}` blobs were scattered across CLI test files,
some 30+ lines each.
Moved out of test files into per-module fixtures:
- tests/test_cli/test_sanitize.py: valid_har, large_har (structural
template; 500 KB padding stays in Python because the size is the
behavioural point), already_redacted_har (template; 15-placeholder
string stays in Python), har_with_flagged_fields.
- tests/test_cli/test_validate.py: clean_har, har_with_secrets,
har_with_warnings, directory_clean_har, directory_dirty_har, and
the custom_secret pattern previously inlined in
test_validate_with_custom_patterns.
- tests/test_cli/test_patterns.py: the four --show test pattern
files (show_external_full_domain, show_minimal_only_description,
show_no_description, show_safe_pattern_without_comment).
- tests/test_cli/test_interactive.py: the sanitized_har_file fixture
used by the three apply_reviewed_redactions tests.
- tests/test_validation/test_secrets.py: validate_har_gzipped's HAR
(appended to the existing tests/fixtures/test_secrets.json).
Kept inline per CLAUDE.md rule 14's behavioural-tests carve-out:
- Intentionally-malformed strings used to exercise error paths.
- One-line behavioural dicts in test_apply_redactions.py /
test_appears_sanitized.py / test_salt_preservation.py where the
data IS the test scenario (specific patterns, specific structures
feeding specific assertions).
- The parametrized URL test and dynamic-base64 test in test_secrets.py
where the dynamic content is the behavioural point.
Full suite: 1983 passed, 18 deselected.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
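To illustrate the split described in this commit (structural template in JSON, size-dependent padding in Python), here is a small sketch. The helper names, the one-object-per-case fixture layout, and the HAR path being padded are assumptions; the actual fixture files are not shown in this PR.

```python
import copy
import json
from pathlib import Path


def load_case(fixture_path: Path, case: str) -> dict:
    """Load one named case from a JSON fixture file (assumed layout)."""
    with fixture_path.open(encoding="utf-8") as handle:
        return json.load(handle)[case]


def with_padded_body(template: dict, size_bytes: int) -> dict:
    """Return a copy of the template whose first response body is padded.

    The padding stays in Python because the size is the behaviour under
    test; the JSON file only carries the structure.
    """
    har = copy.deepcopy(template)
    har["log"]["entries"][0]["response"]["content"]["text"] = "x" * size_bytes
    return har
```

A test would then call something like `with_padded_body(load_case(path, "large_har"), 500 * 1024)` instead of carrying a 30-line inline dict.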
…JSON \b trap

Three v0.9.0 changes, all driven by the cable_modem_monitor privacy
promise:

1. Mandatory --patterns (BREAKING)
   `get`, `sanitize`, `validate` now require --patterns. `base` is a
   reserved sentinel for universal-PII-only; named domains
   (`network-device`) or custom JSON paths are the alternatives. Missing
   --patterns prints a domain listing to stderr and exits 2.
   `validate`'s --patterns shape changed from single Path to repeatable
   list. `validation/secrets.py` widened to accept str|dict|None so
   multi-pattern merges work end-to-end.
   Closes the structural cause behind the #47 / #49 leaks: contributors
   running bare `har-capture get` without loading network-device
   silently shipped device PII to CMM issue threads.

2. WPS-PIN labeled-regex coverage (completes #47)
   pii.json gains a `wps_pin` pattern; html.py Pass 2d redacts 8-digit
   values whose label is `WPS PIN`, `PIN Code`, `Pairing PIN`, or
   `Default PIN`. Pure-digit values can't be flagged heuristically; the
   adjacent label is what makes 100%-confidence deterministic redaction
   achievable per CLAUDE.md principle #7.

3. JSON-escape-trap warning at pattern load (#51)
   `_load_custom_patterns` now scans regex strings for ASCII backspace
   and form-feed, logging a warning that identifies the offending key
   path and the corrected JSON escape. Doesn't reject the pattern - just
   makes the silent-no-op case loud.

Mechanical:
- tests/test_cli/* invocations gained `--patterns base`
- test_patterns_resolver.py + fixture covers the new CLI helper
- README/CLI_REFERENCE/USE_CASES/CUSTOM_PATTERNS examples updated
- CHANGELOG: BREAKING entry + three Added bullets under [0.9.0]

Full suite: 1993 passed, 18 deselected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
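The JSON escape trap in item 3 is easy to miss: inside a JSON string, `\b` is the backspace escape and `\f` is form feed, so a regex written as `"\bsecret\b"` loads as a pattern that looks for literal control characters and never matches. The sketch below shows one way such a scan could work. The function name, the key-path format, and the pattern-file layout are assumptions, not the project's `_load_custom_patterns`.

```python
import json

# Control characters that a mis-escaped regex leaves behind, mapped to the
# JSON escape the author almost certainly meant to write.
_TRAPS = {"\x08": r"\\b", "\x0c": r"\\f"}


def find_escape_traps(node, path=""):
    """Walk loaded JSON and return (key_path, suggested_escape) findings."""
    findings = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            findings += find_escape_traps(value, child)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            findings += find_escape_traps(value, f"{path}[{index}]")
    elif isinstance(node, str):
        for char, suggestion in _TRAPS.items():
            if char in node:
                findings.append((path, suggestion))
    return findings


# Hypothetical pattern file: the author meant a word boundary.
broken = json.loads(r'{"patterns": {"custom_secret": {"regex": "\bsecret\b"}}}')
fixed = json.loads(r'{"patterns": {"custom_secret": {"regex": "\\bsecret\\b"}}}')

print(find_escape_traps(broken))
print(find_escape_traps(fixed))
```

Warning rather than rejecting matches the behaviour described above: a pattern that truly needs a backspace character stays loadable, but the silent no-op is no longer silent.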
…y binary path (#50)

`check_browser_installed` previously resolved Chromium's binary at a
hardcoded Linux-only relative path (`chrome-linux64/chrome`), which
never matched on Windows or macOS. The function returned False without
consulting the dry-run fallback - re-prompting users to "install" a
browser already on disk.

Refactor to the platform-agnostic install marker: resolve and check the
`<browser>-<revision>/` directory itself. Per-platform binary-layout
drift between Playwright versions can no longer break detection.

- Removed `_BROWSER_EXECUTABLES` per-platform mapping
- Renamed `_get_browser_executable` -> `_get_browser_install_dir`
- `check_browser_installed`: `is_dir()` + `any(iterdir())`, dry-run
  fallback unchanged
- Tests updated; added empty-dir-falls-through-to-dry-run coverage

Full suite: 1995 passed, 18 deselected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
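A minimal sketch of the directory-marker check described above. The two-argument signature and the injected fallback callable are assumptions made so the example is self-contained; the project's `check_browser_installed` resolves the install directory and the dry-run fallback itself.

```python
from pathlib import Path
from typing import Callable


def check_browser_installed(
    install_dir: Path, dry_run_fallback: Callable[[], bool]
) -> bool:
    """Treat a non-empty `<browser>-<revision>/` directory as installed.

    Nothing here depends on the per-platform binary layout inside the
    directory; a missing or empty directory defers to the fallback.
    """
    if install_dir.is_dir() and any(install_dir.iterdir()):
        return True
    return dry_run_fallback()
```

With this shape, a directory holding only `chrome-win64/chrome.exe` counts as installed, which is the Windows case the old Linux-only path lookup missed.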
…regression tests

Markdown lint cleanup across 17 .md files (no shortcuts - fixed the
content, did not silence via .markdownlint.json):

- MD024 (67 -> 0): version-qualified ### subsections in CHANGELOG
  (`### Added in 0.9.0`, etc.); command-qualified in CLI_REFERENCE
  (`### Examples (get)`, etc.); two duplicate `Problem` headings in
  CAPTURE_SPEC scoped.
- MD029 (4 -> 0): CLAUDE.md principles rewritten as bullet list with
  bold-prefixed numbers - global 1-19 numbering preserved, per-section
  visual grouping preserved, `principle #N` references still work.
- MD033 (9 -> 0): README.md three <details>/<summary> blocks converted
  to ### sections (Quick Start: Windows / macOS-Linux / Existing HAR).
- MD036 (8 -> 0): USE_CASES.md **Capture**/**Sanitization**/etc.
  promoted to ### headings; three single-line **Note**/**Example**
  paragraphs promoted or de-emphasized to plain prose.
- MD040 (43 -> 0): every bare opening fence content-classified and
  tagged (bash/python/json/text) by a state-machine pass that preserves
  open/close pairing.

Plus two explicit regression tests surfaced during pre-release
verification:

- #47 SN portion: added `generic_uppercase_alnum_serial`
  (`7TH4582JK9QP`) to tests/fixtures/test_pii_regressions.json so the
  serial_number heuristic backstop has a directly-cited fixture row.
- #50 Windows-layout: new test_deps.py case constructs an install dir
  containing only chrome-win64/chrome.exe (no Linux binary) and asserts
  check_browser_installed returns True. Would have failed against the
  pre-0c11040 Linux-only path lookup.

Tests + ruff: 1997 passed, all ruff checks passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
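The MD040 pass is described only in outline, so the following is a guess at its general shape: a two-state machine (inside or outside a fence) that tags bare opening fences and never touches closing ones. The classification rules and both function names are assumptions for illustration.

```python
FENCE = "`" * 3  # built this way so the sketch can sit inside a fenced block


def classify(body):
    """Guess a language tag from the first non-blank line (assumed rules)."""
    first = next((line.strip() for line in body if line.strip()), "")
    if first.startswith(("{", "[")):
        return "json"
    if first.startswith(("import ", "from ", "def ", "class ")):
        return "python"
    if first.startswith(("$ ", "har-capture ", "pip ")):
        return "bash"
    return "text"


def tag_bare_fences(lines):
    """Tag untagged opening fences; closing fences pass through unchanged."""
    tagged = []
    inside = False
    for index, line in enumerate(lines):
        if line.lstrip().startswith(FENCE):
            if inside:
                inside = False  # closing fence: leave it bare
            else:
                inside = True
                if line.strip() == FENCE:
                    # Collect the block body up to the matching close.
                    body = []
                    for following in lines[index + 1:]:
                        if following.lstrip().startswith(FENCE):
                            break
                        body.append(following)
                    line = line.rstrip() + classify(body)
        tagged.append(line)
    return tagged
```

Tracking the inside/outside state is what preserves open/close pairing: a naive search-and-replace on bare fences would also tag every closing fence.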
…ting (#46)

Related to #46
- stop relying on Playwright's default auto-dismiss behavior for
  interactive headed runs
- watch for browser dialogs, surface them to the user, and record the
  resolved outcome in _solentlabs
- add opened_at timestamps so repeated dialogs are captured as distinct
  events for HAR analysis
- add test coverage for dialog capture and run the full test suite to
  check for regressions
- document dialog behavior and backfill the missing popup coverage in
  the capture docs
…nups

Three follow-ups on top of ccpk1's dialog work (b12881a) to make it
defensible per CLAUDE.md principles:

1. Polling loop -> page.expose_function (principle #10: no shortcuts).
   The original implementation maintained a window-scoped outcome queue
   and polled it from a Python dialog handler with
   `while True: ...; time.sleep(0.1)`. That's a workaround for not using
   Playwright's first-class JS->Python bridge. Replaced with a two-event
   model: `page.on("dialog")` creates the open record; the exposed
   `__harCaptureDialogResolved` binding (called by the JS init script
   after the user clicks) updates it with the action. No polling, no
   deadlock surface. Match-by-(type, message) so nested or concurrent
   dialogs can't mis-correlate.

2. sys.stderr.write -> _LOGGER.info (principle #11: quality gates).
   Both call sites converted to match the module's 26 existing _LOGGER
   calls. `sys` import removed.

3. Revert ~250 lines of unrelated fixture reformatting.
   The original PR reformatted multiple unrelated test_browser.json
   sections from compact one-line JSON to multi-line. Restored the
   project's compact convention; kept only the 7 substantive has_dialogs
   field additions + the new with_dialogs case.

Cumulative diff vs. main now 311+/18- (was 544+/77-).

Full suite: 2002 passed, 18 deselected.

Co-Authored-By: ccpk1 <64691424+ccpk1@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
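The two-event model in item 1 can be sketched without a browser by isolating the correlation logic. The class, method names, and record fields below are hypothetical; in the real code the first method would be driven by `page.on("dialog")` and the second by the `__harCaptureDialogResolved` binding registered through `page.expose_function`, neither of which is wired up here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DialogRecord:
    type: str
    message: str
    opened_at: float
    action: Optional[str] = None  # filled in once the user responds


class DialogLog:
    """Correlates 'opened' and 'resolved' events by (type, message)."""

    def __init__(self) -> None:
        self.records: List[DialogRecord] = []

    def on_opened(self, dialog_type: str, message: str, opened_at: float) -> None:
        # Every open gets its own record, so repeated dialogs stay distinct.
        self.records.append(DialogRecord(dialog_type, message, opened_at))

    def on_resolved(self, dialog_type: str, message: str, action: str) -> bool:
        # Resolve the oldest still-open record with the same (type, message).
        for record in self.records:
            if record.action is None and (record.type, record.message) == (
                dialog_type,
                message,
            ):
                record.action = action
                return True
        return False  # no matching open dialog
```

Because both events are pushed to Python as they happen, nothing needs to poll, and matching on (type, message) keeps an alert that resolves first from being attributed to a confirm that is still open.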
Promote the dialog support entry into the 0.9.0 ### Added section now
that PR #52 ships in 0.9.0 (not deferred to a later release). Expanded
the entry to cite #46 and to note the page.expose_function architectural
choice for future readers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
feat(capture): surface browser dialogs for user-driven capture (closes #46)
ccpk1 pushed a commit to ccpk1/har-capture that referenced this pull request on May 16, 2026
Summary
Breaking: `--patterns` is now required on `get`, `sanitize`, and `validate`. Closes the structural cause behind contributors silently shipping device PII to `cable_modem_monitor` issue threads when they didn't know to load the network-device domain.

Seven commits, end-to-end test coverage for every issue addressed (TDD red→green where applicable, regression guards where the engine already caught it):

- `fd3c81b` docs: `docs/RELEASE.md`
- `e545cca` feat(patterns)
- `5f7edb1` test
- `0887118` chore(release)
- `9bf112c` feat(cli)!: `--patterns` (BREAKING) + WPS-PIN labeled-regex (completes #47) + load-time warn on JSON `\b` trap (#51)
- `0c11040` fix(capture)
- `f437709` chore

Issues addressed

- `test_pii_regressions.py`.
- `serial_number` heuristic detector with exact `[0-9][A-Z]{2}[0-9]{4}[A-Z0-9]{6}` regex from the issue; cited fixture row.
- `\b` silent no-op: load-time warning identifies the offending key path and the `\\b` correction; cited test reproduces the exact failure mode.

BREAKING CHANGE

`--patterns` required on every sanitization-running subcommand. `base` is a reserved sentinel for core-universal-PII-only. Missing `--patterns` prints a domain listing to stderr and exits 2. `validate`'s `--patterns` shape changed from single Path to repeatable list to match `get` and `sanitize`. Library API (`sanitize_har_file()`, `sanitize_har()`, `validate_har()`) unchanged.

Migration

Run `har-capture patterns` for the full list of choices.

Test plan
🤖 Generated with Claude Code