Skip to content

Add Claude Code GitHub Workflow#3

Merged
tazarov merged 2 commits into
mainfrom
add-claude-github-actions-1768907652868
Jan 20, 2026
Merged

Add Claude Code GitHub Workflow#3
tazarov merged 2 commits into
mainfrom
add-claude-github-actions-1768907652868

Conversation

@tazarov

@tazarov tazarov commented Jan 20, 2026

Copy link
Copy Markdown
Contributor

🤖 Installing Claude Code GitHub App

This PR adds a GitHub Actions workflow that enables Claude Code integration in our repository.

What is Claude Code?

Claude Code is an AI coding agent that can help with:

  • Bug fixes and improvements
  • Documentation updates
  • Implementing new features
  • Code reviews and suggestions
  • Writing tests
  • And more!

How it works

Once this PR is merged, we'll be able to interact with Claude by mentioning @claude in a pull request or issue comment.
Once the workflow is triggered, Claude will analyze the comment and surrounding context, and execute on the request in a GitHub action.

Important Notes

  • This workflow won't take effect until this PR is merged
  • @claude mentions won't work until after the merge is complete
  • The workflow runs automatically whenever Claude is mentioned in PR or issue comments
  • Claude gets access to the entire PR or issue context including files, diffs, and previous comments

Security

  • Our Anthropic API key is securely stored as a GitHub Actions secret
  • Only users with write access to the repository can trigger the workflow
  • All Claude runs are stored in the GitHub Actions run history
  • Claude's default tools are limited to reading/writing files and interacting with our repo by creating comments, branches, and commits.
  • We can add more allowed tools by adding them to the workflow file like:
allowed_tools: Bash(npm install),Bash(npm run build),Bash(npm run lint),Bash(npm run test)

There's more information in the Claude Code action repo.

After merging this PR, let's try mentioning @claude in a comment on any PR to get started!

@tazarov tazarov merged commit a07d915 into main Jan 20, 2026
1 check passed
@tazarov tazarov deleted the add-claude-github-actions-1768907652868 branch January 20, 2026 17:37
tazarov added a commit that referenced this pull request Apr 16, 2026
Only writeAdaptiveStringIndexes was using sortedPathIDs; every other
section writer iterated maps in non-deterministic Go order. Two
Encode() calls on the same in-memory index could therefore produce
different bytes, breaking content-addressable caching and making
encoded-byte test assertions fragile.

Fixed:
- writeStringIndexes, writeStringLengthIndexes, writeNumericIndexes,
  writeNullIndexes, writeTrigramIndexes, writeHyperLogLogs now use
  sortedPathIDs at the outer level
- writeTrigramIndexes now sorts its inner Trigrams map by trigram
- writeConfig now sorts transformerSpecs by path

Decode is order-tolerant so this is wire-compatible. Addresses PR #19
review item #3.
tazarov added a commit that referenced this pull request Apr 16, 2026
* test(08-01): add failing test for adaptive hot-term promotion

- Locks adaptive config defaults for high-cardinality paths
- Expects adaptive path metadata and promoted exact terms
- Verifies hot terms outrank long-tail values by RG coverage

* feat(08-01): add adaptive finalize-time string promotion

- Add adaptive string path mode, config knobs, and summary metadata
- Build promoted exact terms and deterministic bucket fallbacks at finalize time
- Cover hot-term promotion with a focused regression test

* test(08-01): add failing tests for adaptive query fallback

- Lock bucket-backed positive lookups to no-false-negative supersets
- Require non-promoted adaptive negatives to stay conservative
- Preserve exact negative inversion for promoted hot terms

* feat(08-01): route adaptive string queries through exact and buckets

- Add shared adaptive lookup with explicit exact-vs-lossy results
- Use bucket-backed supersets for adaptive EQ and IN tail terms
- Keep adaptive NE and NIN conservative unless the match was exact

* test(08-01): update threshold property for three-mode strings

- Exercise exact, adaptive-hybrid, and bloom-only high-cardinality outcomes
- Assert threshold breaches may resolve to adaptive or bloom-only modes
- Keep positive EQ lookups false-negative-free across all three cases

* docs(08-01): complete adaptive-high-cardinality-indexing plan

Tasks completed: 3/3
- Add adaptive-hybrid path structures and finalize-time promotion
- Route string membership queries through exact promotion and bucket fallback safely
- Update property coverage for the three-mode threshold contract

SUMMARY: .planning/phases/08-adaptive-high-cardinality-indexing/08-01-SUMMARY.md

* test(08-02): add failing tests for adaptive serialization

- Covers adaptive config round-trip expectations
- Covers adaptive path metadata and bitmap round-trip
- Locks oversized adaptive bucket section rejection

* feat(08-02): persist adaptive serialization metadata

- Bump the wire format to version 5 for adaptive sections
- Serialize adaptive config knobs and per-path adaptive indexes explicitly
- Reject malformed adaptive path, term, and bucket counts on decode

* fix(08-03): restore benchmark baseline from stray 08-02 tests

- remove incomplete adaptive serialization RED tests from serialize_security_test.go
- realign verification to the requested 424aceb content boundary

* test(08-03): add skewed adaptive benchmark fixture

- add deterministic skewed head-tail fixture generation for $.user_id
- define explicit exact, bloom-only, and adaptive-hybrid benchmark configs
- add BenchmarkAdaptiveHighCardinality with mode= and shape= naming

* test(08-02): add failing tests for CLI adaptive info output

- Locks exact, bloom-only, and adaptive-hybrid mode labels
- Verifies adaptive summary counters in formatter output
- Uses local-only index fixtures without S3 dependencies

* feat(08-02): expose adaptive mode in CLI info output

- Extract index info rendering to a reusable writer helper
- Report exact, bloom-only, and adaptive-hybrid modes per path
- Include compact adaptive promoted, bucket, threshold, and cap counters

* docs(08-02): document adaptive high-cardinality behavior

- Describe exact, adaptive-hybrid, and bloom-only string path modes
- Document adaptive config defaults and tuning knobs by name
- Replace bloom-only-only language in comparisons and examples

* fix(08-03): remove stray 08-02 cli red tests

- drop unfinished adaptive CLI info tests from cmd/gin-index/main_test.go
- keep 08-03 verification scoped to the requested 424aceb baseline

* test(08-03): report adaptive pruning and size metrics

- report candidate_rgs and encoded_bytes for hot and tail probes
- assert adaptive hot probe prunes better than bloom-only before timing runs
- reuse shared skewed fixtures across exact, bloom-only, and adaptive modes

* docs(08-02): complete adaptive serialization and CLI info plan

- Tasks completed: 3/3
- SUMMARY: .planning/phases/08-adaptive-high-cardinality-indexing/08-02-SUMMARY.md
- Shared state artifacts intentionally left untouched for orchestrator ownership

* docs(08-03): complete adaptive benchmark evidence plan

- record HCARD-05 benchmark metrics and verification evidence
- document blocker fixes required to restore the requested 424aceb baseline
- leave STATE.md and ROADMAP.md untouched per execution request

* fix(08-02): restore adaptive serialization and cli tests after wave overlap

* fix(08): close review regressions in builder, cli, and decode

* fix(08): harden decode limits and align CLI/docs

* docs(phase-08): complete phase execution

* docs(phase-08): evolve PROJECT.md after phase completion

* docs(phase-08): add security threat verification

* fix: harden adaptive path mode invariants

* fix: address post-fix review hardening

* fix: tighten round-3 adaptive hardening

* docs(phase-08): clean up comments and add godoc for adaptive surface

- expand Version constant comment with migration note and v4/v5/v6 history
- dedupe PathEntry.AdaptivePromotedTerms/AdaptiveBucketCount comments into one
  grouped godoc that documents the "derived, never persisted" contract
- justify the three serialize.go max-constants (maxDecodedIndexSize,
  maxHeaderRowGroups, maxHeaderDocs, maxAdaptivePaths) with the reasoning
  future maintainers need to tune them
- add field godoc on AdaptiveStringIndex explaining the sorted-terms +
  lossy-bucket contract
- document WithAdaptive* option semantics including the disable-via-zero
  conventions on PromotedTermCap and BucketCount
- rewrite the *WithIO shim comment to match the actual asymmetric pattern
  (only build/extract have wrappers; query/info are *WithIO-only)
- simplify adaptivePathSummary: the fallback to PathEntry counters is
  unreachable through both Decode and Finalize, so treat missing section as
  invariant-violation-return-zero instead of silently reading stale data
- add cmd/gin-index/ errcheck exclusion in .golangci.yml; the CLI uses
  fmt.Fprintf to stdout/stderr for user-facing output where ignoring write
  errors is standard practice

* feat(phase-08): add PathMode.IsValid and decode-side enforcement

PathMode was validated only structurally (through validatePathReferences)
after the whole directory was read. Any byte >= 3 would flow through as
an "unknown" mode until the downstream switch caught it, which risked
diverging behavior if future code paths forgot the guard.

- PathMode.IsValid() reports whether the value is one of the declared
  constants
- readPathDirectory now rejects unknown mode bytes immediately with
  ErrInvalidFormat, matching the fail-closed posture of the other
  per-field bounds checks
- cover the new helper with a table test and the decoder with a
  targeted corruption test that flips a single mode byte in a valid
  encoded payload to 99

* feat(phase-08): export SetAdaptiveInvariantLogger for library consumers

Adaptive invariant violations (path flagged PathModeAdaptiveHybrid with no
matching section) were always logged to log.Default() through a package-
private variable. Host processes couldn't redirect into structured logging
pipelines without poking internals, and concurrent writes to the variable
weren't safe.

- SetAdaptiveInvariantLogger swaps the logger under sync.RWMutex; nil
  silences violations entirely for embedders that prefer not to have any
  stderr chatter
- currentAdaptiveInvariantLogger reads the current logger with the read
  lock so Evaluate paths don't serialize on logger access
- migrate the existing TestAdaptiveInvariantViolationLogs to the exported
  API and add TestSetAdaptiveInvariantLoggerNilSilences covering the nil
  case

* feat(phase-08): poison builder after partial merge failure

mergeStagedPaths mutates pathData, bloom, presentRGs, and nullRGs path-by-path
in a sorted loop. A mid-loop failure leaves earlier paths merged for the
current document while later ones are untouched - subsequent AddDocument
calls would compound the corruption.

validateStagedPaths' preview catches every naturally-occurring trigger today
(mixed numeric promotion), so this is defensive: if a future code path
lands a merge-time failure that bypasses the preview, the builder now
flags poisonErr and refuses further AddDocument calls with a wrapped error
pointing at the original failure. Finalize's contract is unchanged - callers
who honor AddDocument's error never reach a corrupt Finalize regardless.

* test(phase-08): cover remaining validate arm and non-EQ adaptive probes

- add the "adaptive path missing adaptive section" row to
  TestValidatePathReferencesRejectsModeMismatches. All 6 arms of the mode
  switch at gin.go:587-608 now have direct unit coverage instead of the
  previous 5-arm subset + 1 transitive EncodeWithLevel assertion.
- expand BenchmarkAdaptiveHighCardinality beyond EQ: add NE, IN, NIN, and
  Contains probes alongside the existing hot/tail EQ probes. Each probe
  ReportMetric's candidate_rgs and encoded_bytes per mode so regressions in
  adaptive pruning show up as metric shifts in bench diffs. The hot-EQ
  soundness guard that asserts adaptive < bloom-only is preserved
  unchanged; no new strict assertions are layered over the new probes
  because different operators have legitimately different pruning
  semantics across modes.

* test(phase-08): add cross-mode soundness property

For the same fixture, EQ results must widen monotonically:
classic ⊆ adaptive ⊆ bloom-only. A violation means an adaptive-mode index
is strictly more aggressive than the exact index - i.e. it drops row
groups that really do match, which is unsound and silently loses data.

The existing per-mode no-false-negatives property catches missed ground-
truth matches against the fixture, but not the case where adaptive's
bucket layer under-reports compared to its own exact-index peer. 1000
generated cases over the existing cardinality threshold generator.

* refactor(cli): drop outer var err error shadows in *WithIO helpers

Each *WithIO helper declared a function-scope var err error that only
worked because one "idx, err = ..." assignment per function happened to
be in a branch without a block-local err declaration. Future edits that
added or removed err := inside inner blocks could silently read or write
the wrong err variable.

Restructure the 4 helpers so every err value lives in the smallest block
that uses it: the two branches that previously needed the outer err now
use "loaded, err := ...; idx = loaded" instead of "idx, err = ...". The
remaining idx, err = ... sites are inside blocks with their own err :=
and continue to work unchanged.

* style(benchmark): apply gci formatting to long probe func literals

* test: fix lint violations in test fixtures

* perf(builder): in-place merge for adaptive tail bucket bitmaps

Adds RGSet.UnionWith for in-place merges and uses it in
buildAdaptiveStringIndex to avoid one bitmap clone per non-promoted
term. On high-cardinality paths with many tail terms (the workload
adaptive mode targets) this removes O(tail) allocations from the
hot builder path. Addresses PR #19 review item #2.

* fix(serialize): produce deterministic byte output for all index sections

Only writeAdaptiveStringIndexes was using sortedPathIDs; every other
section writer iterated maps in non-deterministic Go order. Two
Encode() calls on the same in-memory index could therefore produce
different bytes, breaking content-addressable caching and making
encoded-byte test assertions fragile.

Fixed:
- writeStringIndexes, writeStringLengthIndexes, writeNumericIndexes,
  writeNullIndexes, writeTrigramIndexes, writeHyperLogLogs now use
  sortedPathIDs at the outer level
- writeTrigramIndexes now sorts its inner Trigrams map by trigram
- writeConfig now sorts transformerSpecs by path

Decode is order-tolerant so this is wire-compatible. Addresses PR #19
review item #3.

* refactor(query): tighten adaptive lookup paths and silence default logger

Three small adaptive-mode cleanups from PR #19 review:

- lookupAdaptiveStringMatch: drop the unreachable bucketID bounds
  check and nil bucket guard. NewAdaptiveStringIndex (used by both
  build and decode paths) already enforces a power-of-two bucket
  count and non-nil bucket bitmaps, and adaptiveBucketIndex masks
  with bucketCount-1, so neither guard can fire. Replace with an
  invariant comment. Addresses item #1.
- evaluateNIN: when a non-string element forces non-adaptive
  fallback, call evaluateINNonAdaptive(values) directly with the
  already-extracted slice instead of routing through evaluateIN,
  which would re-enter the adaptive path before falling back.
  Addresses item #7.
- adaptiveInvariantLogger: default to nil instead of log.Default().
  Per Go library convention, libraries should not write to stderr
  unless the consumer opts in. Existing callers can install
  log.Default() explicitly. Addresses item #6.

* docs(gin): clarify v5 history and AdaptiveBucketCount=0 disable sentinel

- Version comment now notes v5 was never released; the in-tree v5
  shape was iterated before the wire format was finalised in v6
  and v5 payloads are always rejected on decode. Addresses item #4.
- WithAdaptiveBucketCount godoc now explains the asymmetry between
  the option (rejects 0) and the validate() path (accepts 0 as the
  disable sentinel for struct-literal callers). validate() also
  carries an inline comment explaining why 0 is permitted there.
  Addresses item #5.
tazarov added a commit that referenced this pull request Apr 21, 2026
* docs(13): capture phase context

* docs(state): record phase 13 context session

* docs(13): add research and validation strategy

* docs(13): add pattern map

* docs(13): create phase 13 parser seam extraction plans

* docs(13): add three phase 13 parser-seam-extraction plans

Plan 01 (wave 1): additive parser surface — Parser interface,
parserSink (package-private, 6 methods per D-01), stdlibParser (zero-
field struct, Name() == "stdlib"), WithParser option, three new
builder fields. Existing test suite remains green — seam not wired.

Plan 02 (wave 2, depends on 01): wire AddDocument through
b.parser.Parse; default stdlibParser and cache b.parserName at
NewBuilder with empty-name rejection (D-06); delete parseAndStage-
Document / stageStreamValue / decodeTransformedValue from builder.go.
Classifier (stageJSONNumberLiteral etc.) preserved per Pitfall #1.

Plan 03 (wave 3, depends on 02): parity harness — authored goldens
(7 fixtures per D-05 dim #1/#4), gopter determinism property (dim #2),
12-operator Evaluate matrix (dim #3). parity_goldens_test.go gated by
//go:build regenerate_goldens. Merge-gate benchmark delta captured in
13-03-BENCH.md. D-02 ROADMAP deviation (ParserSink + stdlibParser
ship package-private) flagged in SUMMARY.

* docs(13): address plan-checker issues (iteration 1)

* docs(13): cross-AI review for phase 13 (gemini + codex)

* docs(13): revise plans per cross-AI review feedback

* refactor(parser): add parser seam surface

* docs(13): record plan 01 completion

* refactor(parser): wire AddDocument through parser seam

* docs(13): record plan 02 completion

* test(parser): add phase 13 parity harness

* docs(13): record plan 03 benchmark review hold

* docs(phase-13): add/update security threat verification

* docs(phase-13): accept benchmark risk

* docs(phase-13): reconcile accepted benchmark exception

* docs(phase-13): refresh validation strategy

* Enforce parser sink contract consistently

* Keep parser state opaque

* docs(13): ship phase 13 - PR #28

* docs(parser): clarify seam contract and fix numeric present-marking

Close a latent contract asymmetry in the Phase 13 parser seam:
stageNumericObservation now marks its path present, so all Stage*
sink methods are uniformly self-marking. Parsers only need to call
MarkPresent explicitly for object and array roots.

Behavior-preserving for stdlibParser -- parity goldens unchanged.

Addresses review items 1 and 2 on PR #28.
@claude claude Bot mentioned this pull request Apr 23, 2026
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant