[draft] cmd/tap: use internal/group for managing the various goroutine components of a tap instance#1295
Draft
davecheney wants to merge 4114 commits into
Draft
[draft] cmd/tap: use internal/group for managing the various goroutine components of a tap instance#1295davecheney wants to merge 4114 commits into
davecheney wants to merge 4114 commits into
Conversation
This is at a point where the new lexgen code roughly reproduces the
current ("legacy") codegen behavior. This means it can replace the
previous `lex/*.go` implementation. This version is somewhat more
complete and correct about the lexicon language in general, not just the
features we have used in the `com.atproto.*` and `app.bsky.*` lexicons
to date.
Things I might try to hit before merging:
- [ ] add some serialization tests (`./testing/`): records with
strongRef, recordWithMedia, unions, etc
- [x] resolve some more XXX and TODO
For later follow-up:
- [ ] API server handlers (for query and procedure), and ability to
disable client method output
- [ ] alternative CBOR library support (eg, annotations for `go-dasl`)
- [ ] ability to configure more package mappings (not just the indigo
pages)
- [ ] string, boolean, and integer query params as pointers
(configurable?)
- [ ] more correctness around `$type` serialization
- [ ] validation methods (for limits, etc)
- [ ] option to break out large query param lists as a struct
…sing under MIT or Apache-2.0
This is an effort to get some of the lexgen refactor merged as a package. I want to start using this code from `glot`, but am not ready to run and merge updates to the existing / "legacy" code in `api/`.
As indicated in the comment, adding an actual foreign key relationship would probably help with this. But doing it as a query tweak is a simpler change to deploy (no schema change). I have tested this in prod and it helped a bunch with perf. I have not tested even locally with an actual call to listRepos, should do that before merging+deploying. closes: #1236
This was a clever struct wrapped, but that was weird and inconsistent with all the other types. This updates it to be a simple string alias type. This is (slightly) breaking, and will require updates in downstream packages. Particularly those doing OAuth.
Fix `PasswordAuth#Refresh` using the wrong method.
This check was overly aggressive. There are a lot of situations where an account needs to authenticate in "deactivated" state. I'm not even sure if a log line is necessary here, but figure this is a change in SDK behavior so it should at least be a bit vocal. This also fixes a bug where a pointer to the status string was printed, instead of the status string itself.
The majority of deliverEvent was managing the case where a did worker was active. As the reference to these workers are already maintained in an xsync.Map, we can outsource the paperwork of looking up or creating the worker to Map's LoadOrCompute method. LoadOrCompute does hold a write lock over the Map during the compute function, which is a potenial bottleneck, althought the lock time is O(1) for the allocation of the &DIDWorker so I am hoping for now.
Fixes a few issues in Tap. The most important is that we were not updating prev data & rev when we didn't need to process any ops from a commit. This would cause commits to appear as disjunctions in commit history. This also rearranges some logic in `ProcessCommit` which would cause us to mark a repo "desynced" if an event came in while resyncing rather than adding that event to the resync buffer. For events that we add to the resync buffer, we hold off checking that they play neatly on the current state of the repo (through prevData) until we actually go to play them back.
Settin modules-download-mode: readonly causes the linter to complain, rather than silently letting the tooling fix things up behind the scenes. This is a good belt and suspenders check to ensure whats in the commit is what is built at that commit.
Rather than mentioning the Go version in several places, actions/setup-go can source the correct version from the go.mod file, removing one place where they can get out of sync. I wasn't sure if the go-version-file syntax is available with setup-go v4, so I went ahead and bumped up the checkout and setup actions to their latest versions. I use these versions on several projects as they are what dependabot will upgrade you too by default, so feel pretty confident this is a safe change.
Prev data can be nil in the case of legacy (non sync1.1) commits, causing a nil pointer error.
…1286) Rather than mentioning the Go version in several places, actions/setup-go can source the correct version from the go.mod file, removing one place where they can get out of sync. I wasn't sure if the go-version-file syntax is available with setup-go v4, so I went ahead and bumped up the checkout and setup actions to their latest versions. I use these versions on several projects as they are what dependabot will upgrade you too by default, so feel pretty confident this is a safe change.
Setting modules-download-mode: readonly causes the linter to complain, rather than silently letting the tooling fix things up behind the scenes. This is a good belt and suspenders check to ensure whats in the commit is what is built at that commit.
Rather than making runTap responsible for wrapping the context.Context passsed to it in a signal handler, push that responsibility to the caller as SIGTERM'ness is really a property of the process wrapped around tap, rather than runTap itself managing its own lifecycle. The logic is now runTap is pass a parent context, when that context is done, everyone shuts down. The why it was cancelled is not something runTap has to manage.
The majority of `deliverEvent` was managing the case where a did worker was active. As the reference to these workers are already maintained in an xsync.Map, we can outsource the paperwork of looking up or creating the worker to Map's LoadOrCompute method. LoadOrCompute does hold a write lock over the Map during the compute function, which is a potential bottleneck, althought the lock time is O(1) for the allocation of the &DIDWorker so I am hoping this is ok for now.
Rather than making runTap responsible for wrapping the context.Context passsed to it in a signal handler, push that responsibility to the caller as SIGTERM'ness is really a property of the process wrapped around tap, rather than runTap itself managing its own lifecycle. The logic is now runTap is pass a parent context, when that context is done, everyone shuts down. The why it was cancelled is not something runTap has to manage.
runResyncWorker takes a context for lifecycle management, however if that context is cancelled -- err != nil -- then the rsync worker will sleep for a second, hit WaitForReady, which returns immediately because the context is cancelled, then claimResyncJob will also return immediately with context.Cancelled, err != nil, so the worker sleeps for a second and tries again, etc. Fix the loop by explicitly checking for the context.Cancelled case and exiting the worker. No test yet, mia culpa.
Naming is hard, group.G, group.Group, both aren't great. I also considered work.Group, but none really light my fire. As a saving grace, this is internal to indigo so we can iterate on the name later.
jcalabro
reviewed
Jan 27, 2026
jcalabro
left a comment
Member
There was a problem hiding this comment.
I like! Can we just import github.com/pkg/group rather than copy+pasting though?
Contributor
Author
This was just sample code I wrote for a talk, I'd rather not adopt another open source project, so for the moment, copying it into an internal package means nobody has to commit to is as something others can use directly. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This is a worked example of switching from using
goto a group like structure to manage the lifecycle of the various workers inside a tap instance.Because of the way some components, kick off their own child goroutines, this required stacking groups -- not something I'd done before -- but worked surprisingly well. I think the size of the change could be reduced by a preparatory PR which moves more of the worker creation into tap.Run.