feat(sdk/go): add observability package (OpenTelemetry logging)#1740
feat(sdk/go): add observability package (OpenTelemetry logging)#1740MarcusElwin wants to merge 3 commits into
Conversation
|
@MarcusElwin is attempting to deploy a commit to the motia Team on Vercel. A member of the Team first needs to authorize it. |
📝 WalkthroughWalkthroughThis PR adds a standalone Go observability package (OTel logging, exporter selection, JSON→OTel value conversion, Init/shutdown) and refactors/expands the iii-example module with per-feature demos (HTTP, cron, custom trigger type, logging, channels) plus Docker Compose and collector config for local telemetry. ChangesGo Observability Module and iii-example Enhancements
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 13
Note
Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.github/workflows/release-iii.yml (1)
653-742:⚠️ Potential issue | 🟠 Major | ⚡ Quick winInclude
sdk-goin the completion gate.
notify-completenever depends onsdk-go, never captures its result, and never folds it intoOVERALL. A broken Go module publish would still send a successful Slack summary.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/release-iii.yml around lines 653 - 742, The notify-complete job currently omits the sdk-go job from its dependency list and result aggregation; add needs: - sdk-go to notify-complete's needs, capture its result in the env block as SDK_GO_RESULT="${{ needs.sdk-go.result }}", append a line to build the human-readable status into LINES (e.g., LINES="$LINES$(icon "$SDK_GO_RESULT") SDK Go\n"), and include "$SDK_GO_RESULT" in the for loop that sets OVERALL so a failed sdk-go marks the overall status as failure; refer to notify-complete, the env keys (SDK_GO_RESULT), LINES construction, and the for loop that tests results.
🟡 Minor comments (9)
sdk/packages/go/iii/tests/errors_test.go-39-42 (1)
39-42:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAssert the exact error code here.
This test claims the engine should surface handler failures as
invocation_failed, but it currently passes for any non-empty code.Suggested assertion
- // The engine reports a handler failure as code "invocation_failed". - if ie.Code == "" { - t.Error("InvocationError.Code is empty") - } + if ie.Code != "invocation_failed" { + t.Errorf("code = %q, want invocation_failed", ie.Code) + }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@sdk/packages/go/iii/tests/errors_test.go` around lines 39 - 42, The test currently only checks that ie.Code is non-empty; change it to assert the exact expected error code by comparing ie.Code (InvocationError.Code) to "invocation_failed" so the test fails if the engine surfaces any other code; update the assertion in the errors_test.go test that uses the ie variable to require ie.Code == "invocation_failed" (e.g., replace the t.Error check with a t.Fatalf/t.Errorf that compares ie.Code to the expected string).sdk/packages/go/iii/tests/errors_test.go-67-77 (1)
67-77:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winVerify the unknown-function contract, not just “some error”.
This still passes on an unrelated timeout or transport failure. Assert the returned error type and stable engine code so the integration test actually protects the public error surface.
Suggested assertion shape
if err == nil { t.Fatal("expected an error invoking an unregistered function") } + var ie *iii.InvocationError + if !errors.As(err, &ie) { + t.Fatalf("error is not *InvocationError: %v", err) + } + if ie.Code == "" { + t.Fatal("InvocationError.Code is empty") + }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@sdk/packages/go/iii/tests/errors_test.go` around lines 67 - 77, The test TestInvokeUnknownFunctionErrors currently only checks for a non-nil error; update it to assert the error is the expected engine "unknown function" error by using errors.As (or a type assertion) to convert err into the SDK's engine error type (e.g., iii.EngineError or iii.Error) and then assert the engine-level code/constant (e.g., EngineCode or Code equals the unknown-function constant like iii.EngineErrorCode_UNKNOWN_FUNCTION or the string "UNKNOWN_FUNCTION"); if the assertion fails, call t.Fatalf with a message showing the actual type/code so unrelated timeouts/transports no longer pass this test.sdk/packages/go/iii/protocol.go-256-285 (1)
256-285:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winFix
MarshalMessagepointer/value contract mismatch iniiiprotocol
sdk/packages/go/iii/protocol.go’sMarshalMessagecomment says to pass message structs “by pointer or value”, but the implementation type-switch only matches*RegisterTriggerTypeMessage,*RegisterTriggerMessage, etc. (and*PingMessage/*PongMessage), so passing a value (e.g.PingMessage{}) will hit the “cannot marshal unknown message type” default. Either update the comment to pointer-only or add non-pointer cases to the switch.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@sdk/packages/go/iii/protocol.go` around lines 256 - 285, MarshalMessage's comment promises pointer-or-value but the type switch only matches pointer types so value arguments (e.g. PingMessage{}) fall through; fix by either updating the comment to require pointers only or (preferred) extend the switch in MarshalMessage to include the non-pointer variants (e.g. RegisterTriggerTypeMessage, RegisterTriggerMessage, TriggerRegistrationResultMessage, UnregisterTriggerMessage, RegisterFunctionMessage, UnregisterFunctionMessage, InvokeFunctionMessage, InvocationResultMessage, RegisterServiceMessage, WorkerRegisteredMessage, PingMessage, PongMessage) and route them to the same marshalEnvelope calls as their pointer counterparts so both value and pointer inputs are handled consistently.sdk/packages/go/observability/README.md-42-55 (1)
42-55:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winMake the OTLP example a standalone
Initflow.This snippet reads like a second
Initcall after the stdout example, and the OTLP branch drops botherrandshutdown. Split these into separate examples, or show the OTLP branch with the same error/shutdown handling so readers do not copy a leaking initialization pattern.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@sdk/packages/go/observability/README.md` around lines 42 - 55, The OTLP example currently appears as a second Init call and omits error handling and shutdown; make it a standalone Init flow by showing observability.Init called with observability.OtelConfig (using ServiceName, Exporter set to observability.ExporterOTLP, and OTLPEndpoint) and include the same error capture and deferred shutdown pattern as the stdout example (capture err from observability.Init and defer shutdown(ctx)) so readers don't copy a leaking initialization pattern; reference observability.Init, observability.OtelConfig, observability.ExporterOTLP, err, and shutdown in the example.sdk/packages/go/iii/README.md-178-183 (1)
178-183:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winClarify that OTel logging already exists.
This section reads as if all OpenTelemetry integration is still future work, but this PR adds
sdk/packages/go/observabilityfor logging today. Narrow the follow-up wording to spans/invocation tracing and point readers at the logging module so they do not miss the supported path.As per coding guidelines, "Check for related inconsistencies with the docs/".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@sdk/packages/go/iii/README.md` around lines 178 - 183, Update the Observability README paragraph to clarify that OpenTelemetry-based logging is already implemented in the Go SDK by referencing the new package (sdk/packages/go/observability) and that the planned follow-up is specifically adding real spans/invocation tracing (not basic logging); adjust the follow-up sentence to point readers to the observability logging module and narrow wording to "spans/invocation tracing" and check for related inconsistencies in docs/ to keep documentation consistent with the added observability package.sdk/packages/go/iii-example/logger.go-11-16 (1)
11-16:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winUpdate the comment; the follow-up already landed.
This still says a dedicated observability package is "a planned follow-up", but this PR adds
sdk/packages/go/observabilityandobservability.goalready demonstrates it. Leaving the comment as-is sends readers to stale guidance.As per coding guidelines, "Check for related inconsistencies with the docs/".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@sdk/packages/go/iii-example/logger.go` around lines 11 - 16, The file comment above setupLogger is stale—remove the "planned follow-up" wording and update it to reflect that a dedicated observability package now exists; mention the new sdk/packages/go/observability package and observability.go and state that the Go SDK v1 also provides a real observability implementation there instead of deferring future work. Keep the rest of the context about mirroring logger_example.rs and engine log usage but replace the future-tense note with a brief pointer to observability.go as the implemented follow-up.sdk/packages/go/iii-example/README.md-19-26 (1)
19-26:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winDocument the Go 1.25 requirement next to the run steps.
This module’s
go.modis pinned togo 1.25.0, sogo run .will fail for readers following this README on Go 1.23/1.24. Please call out the higher toolchain requirement near## Run, especially since the baseiiiREADME advertises Go 1.23+ for the core SDK.Based on learnings,
sdk/packages/go/observability/README.mdstates that the observability module "requires Go 1.25+".🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@sdk/packages/go/iii-example/README.md` around lines 19 - 26, Add a short note next to the "## Run" section stating that this example requires Go 1.25 (go.mod is pinned to go 1.25.0) so that running "go run ." on older toolchains will fail; update README.md (near the "## Run" header and the "go run ." instruction) to explicitly call out "Requires Go 1.25+" and optionally link or mention go.mod for verification.sdk/packages/go/iii-example/channels.go-33-35 (1)
33-35:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winReturn after a writer-close failure.
ReadAlldepends on the writer sending EOF. Ifch.Writer.Close()fails and we keep going, this example can sit until the 10s context timeout and report a read failure instead of the real cause.Suggested fix
if err := ch.Writer.Close(); err != nil { log.Printf("channel writer close: %v", err) + return }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@sdk/packages/go/iii-example/channels.go` around lines 33 - 35, The ch.Writer.Close() error is currently logged but ignored, causing ReadAll to hang until the context times out; update the close handling in the function (where ch.Writer.Close() is called) to return the error immediately after logging so callers (and ReadAll) see the real failure instead of waiting for the deadline — locate the close call in channels.go (the block that calls ch.Writer.Close()) and change control flow to log and then return the error (propagate it up from this function).sdk/packages/go/iii-example/cron.go-16-24 (1)
16-24:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winDon’t swallow the cron payload decode error.
If the event shape drifts or the payload is malformed, this handler logs empty fields and still returns
{"cleaned": true}. Returning the unmarshal error keeps the example honest about the trigger contract.Suggested fix
if err := client.RegisterFunction("example::scheduled_cleanup", func(ctx context.Context, data json.RawMessage) (any, error) { // The engine delivers a cron event payload; we just acknowledge the run. var ev struct { Trigger string `json:"trigger"` JobID string `json:"job_id"` } - _ = json.Unmarshal(data, &ev) + if err := json.Unmarshal(data, &ev); err != nil { + return nil, err + } log.Printf("cron fired: trigger=%s job_id=%s", ev.Trigger, ev.JobID) return map[string]any{"cleaned": true, "job_id": ev.JobID}, nil }); err != nil {🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@sdk/packages/go/iii-example/cron.go` around lines 16 - 24, The handler registered with client.RegisterFunction for "example::scheduled_cleanup" currently ignores json.Unmarshal errors into ev and always returns {"cleaned": true}; change it to check the error from json.Unmarshal(data, &ev) and if non-nil return nil and that error (or wrap it with context mentioning "scheduled_cleanup payload unmarshal") so malformed cron payloads propagate instead of being swallowed; update the anonymous function passed to RegisterFunction to perform the error check before logging and returning the success map.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/_go.yml:
- Line 61: Replace floating version tags with immutable commit SHAs for the
GitHub Actions used: update the three "uses:" entries
slackapi/slack-github-action@v2.0.0, actions/checkout@v4, and
actions/setup-go@v5 to the corresponding full commit SHAs (e.g.,
slackapi/slack-github-action@<full-sha>, actions/checkout@<full-sha>,
actions/setup-go@<full-sha>); fetch the latest intended release commit SHA from
each action's GitHub repository and substitute the tag with that SHA in the
workflow so the runs reference an immutable commit.
- Around line 101-106: The step "Verify the proxy resolves the published
version" currently inlines inputs.module_path and inputs.version into the run:
shell, which is unsafe; instead add two env entries (e.g. MODULE_PATH and
MODULE_VERSION) set from ${{ inputs.module_path }} and ${{ inputs.version }},
then change the run command in that step to use the quoted env expansions (for
example: go list -m "$MODULE_PATH@v$MODULE_VERSION"); keep existing GOPROXY and
GOFLAGS env entries and ensure the run line quotes the variables to prevent
shell injection.
- Around line 74-77: Replace the hardcoded Go toolchain in the reusable publish
workflow by switching the actions/setup-go step to use go-version-file: ${{
inputs.package_path }}/go.mod (instead of go-version: '1.23') so each module
honors its declared toolchain; for the go list command, stop interpolating
inputs directly into the run line and pass ${{ inputs.module_path }} and ${{
inputs.version }} via environment variables (or safely quote/construct the
argument) to avoid shell injection; and pin third‑party actions currently using
floating tags (actions/setup-go@v5 and actions/checkout@v4) to specific commit
SHAs for supply‑chain hardening while keeping the already-pinned slack action as
is.
In @.github/workflows/ci.yml:
- Around line 748-789: The sdk-go-ci job currently only targets
sdk/packages/go/iii; update the job to also run formatting, vet, and tests for
sdk/packages/go/observability and sdk/packages/go/iii-example and ensure each
module uses the Go toolchain declared in its go.mod (e.g., use Go 1.25 for
observability and iii-example) by invoking actions/setup-go with the correct
go-version per module (or add per-module subjobs/matrix entries), run gofmt -l,
go vet ./..., and go test (with -race where appropriate) in each module’s
working-directory, and update cache-dependency-path entries to point at each
module’s go.sum so caching remains correct.
- Line 753: The workflow uses floating action tags (e.g., actions/checkout@v4,
actions/setup-go@v5, actions/download-artifact@v4 and others like
actions/setup-node@v4, actions/setup-python@v5, pnpm/action-setup@v4,
dtolnay/rust-toolchain@stable, Swatinem/rust-cache@v2, dorny/paths-filter@v3)
which can be retagged upstream; update each `uses:` entry to the corresponding
immutable commit SHA form (owner/repo@<full-commit-sha>) by replacing the tag
with the pinned SHA for that action repository so CI behavior is stable and
repeatable.
In @.github/workflows/release-iii.yml:
- Around line 490-508: Add a parallel release job for the new Go module by
duplicating the existing sdk-go job (which uses ./.github/workflows/_go.yml) and
updating the inputs to point to the observability module: set package_path to
sdk/packages/go/observability, module_path to
github.com/iii-hq/iii/sdk/packages/go/observability, version to ${{
needs.setup.outputs.version }}, dry_run and slack_thread_ts to the same
expressions as sdk-go, and adjust slack_label (e.g., "SDK Go (observability)");
ensure the job has the same needs and conditional (if: ${{ !failure() &&
!cancelled() }}) and forwards the same secrets (SLACK_BOT_TOKEN,
SLACK_CHANNEL_ID).
In `@sdk/packages/go/iii/channels.go`:
- Around line 102-103: OpenWriter (and the analogous OpenReader at lines
~188-189) currently override ref.Direction with a hardcoded
ChannelWrite/ChannelRead; change them to use the ref.Direction when calling
channelURL and validate the ref.Direction matches the constructor being called
(in OpenWriter require ref.Direction == ChannelWrite; in OpenReader require
ref.Direction == ChannelRead). If the direction does not match, return nil (or
an error/handle per existing constructor pattern) instead of building a URL with
the wrong direction; update the channelURL invocation in both
ChannelWriter/ChannelReader constructors to pass ref.Direction and add the
simple guard checks before constructing the ChannelWriter/ChannelReader.
In `@sdk/packages/go/iii/client.go`:
- Around line 607-615: The issue is that enqueueOutboundDirect currently sends
connection-specific frames into the shared c.outbound channel which can be
written by a subsequent connection; fix by routing direct replies to the current
writer instead of the shared queue: add a per-connection outbound channel or
attach a connection identifier to frames and have writeLoop drop frames whose
connID doesn't match the active connection. Concretely, modify
enqueueOutboundDirect to either push into a connection-scoped channel created in
the writer loop (e.g., connOutbound) or include a connID when enqueuing and
update writeLoop to check that connID matches the active connection before
writing (references: enqueueOutboundDirect, c.outbound, writeLoop, c.shutdown).
- Around line 585-595: The dispatch switch never routes trigger teardown
messages, so add a branch for the unregister message (e.g., case
MsgUnregisterTrigger:) that offloads to a goroutine and calls the appropriate
teardown handler (e.g., c.handleUnregisterTrigger(ctx, dec.UnregisterTrigger));
if handleUnregisterTrigger doesn’t exist, implement it to invoke the
TriggerHandler.UnregisterTrigger hook and clean up per-instance background work,
mirroring how MsgRegisterTrigger is handled (use the same ctx/dec pattern and
ensure any identifiers in DecodedMessage are used to find the right trigger
instance).
- Around line 333-347: Connect currently launches c.supervise() unbounded;
change it so the supervisor is tied to that Connect call's lifecycle: ensure
only a single supervisor goroutine runs per active Connect attempt (use a
sync.Once or a per-client supervisorStarted flag) and pass a derived context
(e.g., ctx or a superviseCtx with cancel) into supervise so it returns when the
Connect context is done; update Client.Connect to start supervise only if not
already running, call supervise(ctx) (or superviseWithCtx) instead of go
c.supervise(), and ensure supervise checks ctx.Done() as well as c.shutdown and
signals c.connected only while its ctx is active to avoid orphaned reconnect
loops and shared outbound queues across retries.
In `@sdk/packages/go/iii/tests/api_triggers_test.go`:
- Around line 40-47: The route path and trigger name are hardcoded causing
collisions; make them unique by deriving both from the test name and a
random/unique suffix (e.g., use t.Name() + a unique token) before calling
c.RegisterTrigger: build a uniquePath variable for the api_path JSON (replacing
path) and a uniqueTriggerName for the first argument to c.RegisterTrigger
(replacing "test-go-http"); ensure the JSON payload uses the uniquePath and pass
uniqueTriggerName into c.RegisterTrigger so parallel runs do not conflict.
In `@sdk/packages/go/iii/tests/worker_metadata_test.go`:
- Around line 56-73: The current search picks the first connected Go worker
which can be the wrong process; instead locate the worker entry by the specific
worker ID created in this test. When you create/open the test worker (capture
its unique id in a variable like createdWorkerID or worker.ID), then iterate
out.Workers and select the worker whose ID matches that captured id (e.g.,
compare worker.ID to createdWorkerID) and only then perform the OS/Version
assertions on that workerInfo (instead of using the existing runtime+status
filter/ours variable).
In `@sdk/packages/go/observability/anyvalue.go`:
- Around line 42-58: The jsonToLogValue conversion narrows uint* and float64 to
int64 without bounds checks causing truncation; update the uint cases (uint,
uint8, uint16, uint32, uint64) to check against math.MaxInt64 and only call
log.Int64Value(int64(t)) when t <= math.MaxInt64, otherwise return
log.Float64Value(float64(t)) to mirror the Rust as_i64-first contract; likewise
in the float64 branch first verify t is within [math.MinInt64, math.MaxInt64]
and represents a whole integer before computing int64(t) and calling
log.Int64Value, otherwise return log.Float64Value(t).
---
Outside diff comments:
In @.github/workflows/release-iii.yml:
- Around line 653-742: The notify-complete job currently omits the sdk-go job
from its dependency list and result aggregation; add needs: - sdk-go to
notify-complete's needs, capture its result in the env block as
SDK_GO_RESULT="${{ needs.sdk-go.result }}", append a line to build the
human-readable status into LINES (e.g., LINES="$LINES$(icon "$SDK_GO_RESULT")
SDK Go\n"), and include "$SDK_GO_RESULT" in the for loop that sets OVERALL so a
failed sdk-go marks the overall status as failure; refer to notify-complete, the
env keys (SDK_GO_RESULT), LINES construction, and the for loop that tests
results.
---
Minor comments:
In `@sdk/packages/go/iii-example/channels.go`:
- Around line 33-35: The ch.Writer.Close() error is currently logged but
ignored, causing ReadAll to hang until the context times out; update the close
handling in the function (where ch.Writer.Close() is called) to return the error
immediately after logging so callers (and ReadAll) see the real failure instead
of waiting for the deadline — locate the close call in channels.go (the block
that calls ch.Writer.Close()) and change control flow to log and then return the
error (propagate it up from this function).
In `@sdk/packages/go/iii-example/cron.go`:
- Around line 16-24: The handler registered with client.RegisterFunction for
"example::scheduled_cleanup" currently ignores json.Unmarshal errors into ev and
always returns {"cleaned": true}; change it to check the error from
json.Unmarshal(data, &ev) and if non-nil return nil and that error (or wrap it
with context mentioning "scheduled_cleanup payload unmarshal") so malformed cron
payloads propagate instead of being swallowed; update the anonymous function
passed to RegisterFunction to perform the error check before logging and
returning the success map.
In `@sdk/packages/go/iii-example/logger.go`:
- Around line 11-16: The file comment above setupLogger is stale—remove the
"planned follow-up" wording and update it to reflect that a dedicated
observability package now exists; mention the new sdk/packages/go/observability
package and observability.go and state that the Go SDK v1 also provides a real
observability implementation there instead of deferring future work. Keep the
rest of the context about mirroring logger_example.rs and engine log usage but
replace the future-tense note with a brief pointer to observability.go as the
implemented follow-up.
In `@sdk/packages/go/iii-example/README.md`:
- Around line 19-26: Add a short note next to the "## Run" section stating that
this example requires Go 1.25 (go.mod is pinned to go 1.25.0) so that running
"go run ." on older toolchains will fail; update README.md (near the "## Run"
header and the "go run ." instruction) to explicitly call out "Requires Go
1.25+" and optionally link or mention go.mod for verification.
In `@sdk/packages/go/iii/protocol.go`:
- Around line 256-285: MarshalMessage's comment promises pointer-or-value but
the type switch only matches pointer types so value arguments (e.g.
PingMessage{}) fall through; fix by either updating the comment to require
pointers only or (preferred) extend the switch in MarshalMessage to include the
non-pointer variants (e.g. RegisterTriggerTypeMessage, RegisterTriggerMessage,
TriggerRegistrationResultMessage, UnregisterTriggerMessage,
RegisterFunctionMessage, UnregisterFunctionMessage, InvokeFunctionMessage,
InvocationResultMessage, RegisterServiceMessage, WorkerRegisteredMessage,
PingMessage, PongMessage) and route them to the same marshalEnvelope calls as
their pointer counterparts so both value and pointer inputs are handled
consistently.
In `@sdk/packages/go/iii/README.md`:
- Around line 178-183: Update the Observability README paragraph to clarify that
OpenTelemetry-based logging is already implemented in the Go SDK by referencing
the new package (sdk/packages/go/observability) and that the planned follow-up
is specifically adding real spans/invocation tracing (not basic logging); adjust
the follow-up sentence to point readers to the observability logging module and
narrow wording to "spans/invocation tracing" and check for related
inconsistencies in docs/ to keep documentation consistent with the added
observability package.
In `@sdk/packages/go/iii/tests/errors_test.go`:
- Around line 39-42: The test currently only checks that ie.Code is non-empty;
change it to assert the exact expected error code by comparing ie.Code
(InvocationError.Code) to "invocation_failed" so the test fails if the engine
surfaces any other code; update the assertion in the errors_test.go test that
uses the ie variable to require ie.Code == "invocation_failed" (e.g., replace
the t.Error check with a t.Fatalf/t.Errorf that compares ie.Code to the expected
string).
- Around line 67-77: The test TestInvokeUnknownFunctionErrors currently only
checks for a non-nil error; update it to assert the error is the expected engine
"unknown function" error by using errors.As (or a type assertion) to convert err
into the SDK's engine error type (e.g., iii.EngineError or iii.Error) and then
assert the engine-level code/constant (e.g., EngineCode or Code equals the
unknown-function constant like iii.EngineErrorCode_UNKNOWN_FUNCTION or the
string "UNKNOWN_FUNCTION"); if the assertion fails, call t.Fatalf with a message
showing the actual type/code so unrelated timeouts/transports no longer pass
this test.
In `@sdk/packages/go/observability/README.md`:
- Around line 42-55: The OTLP example currently appears as a second Init call
and omits error handling and shutdown; make it a standalone Init flow by showing
observability.Init called with observability.OtelConfig (using ServiceName,
Exporter set to observability.ExporterOTLP, and OTLPEndpoint) and include the
same error capture and deferred shutdown pattern as the stdout example (capture
err from observability.Init and defer shutdown(ctx)) so readers don't copy a
leaking initialization pattern; reference observability.Init,
observability.OtelConfig, observability.ExporterOTLP, err, and shutdown in the
example.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: f7856da7-b4a5-42bc-91a5-786e1261bd00
⛔ Files ignored due to path filters (3)
sdk/packages/go/iii-example/go.sumis excluded by!**/*.sumsdk/packages/go/iii/go.sumis excluded by!**/*.sumsdk/packages/go/observability/go.sumis excluded by!**/*.sum
📒 Files selected for processing (44)
.github/workflows/WORKFLOWS.md.github/workflows/_go.yml.github/workflows/ci.yml.github/workflows/release-iii.ymlsdk/packages/go/iii-example/README.mdsdk/packages/go/iii-example/channels.gosdk/packages/go/iii-example/cron.gosdk/packages/go/iii-example/docker-compose.yamlsdk/packages/go/iii-example/go.modsdk/packages/go/iii-example/http.gosdk/packages/go/iii-example/logger.gosdk/packages/go/iii-example/main.gosdk/packages/go/iii-example/observability.gosdk/packages/go/iii-example/otel-collector-config.yamlsdk/packages/go/iii-example/triggertype.gosdk/packages/go/iii/README.mdsdk/packages/go/iii/channels.gosdk/packages/go/iii/channels_test.gosdk/packages/go/iii/client.gosdk/packages/go/iii/client_test.gosdk/packages/go/iii/constants.gosdk/packages/go/iii/errors.gosdk/packages/go/iii/errors_test.gosdk/packages/go/iii/go.modsdk/packages/go/iii/mockengine_test.gosdk/packages/go/iii/otel.gosdk/packages/go/iii/protocol.gosdk/packages/go/iii/protocol_test.gosdk/packages/go/iii/tests/api_triggers_test.gosdk/packages/go/iii/tests/bridge_test.gosdk/packages/go/iii/tests/common_test.gosdk/packages/go/iii/tests/data_channels_test.gosdk/packages/go/iii/tests/errors_test.gosdk/packages/go/iii/tests/queue_test.gosdk/packages/go/iii/tests/registration_dedup_test.gosdk/packages/go/iii/tests/trigger_registration_error_test.gosdk/packages/go/iii/tests/worker_metadata_test.gosdk/packages/go/iii/triggers.gosdk/packages/go/observability/README.mdsdk/packages/go/observability/anyvalue.gosdk/packages/go/observability/go.modsdk/packages/go/observability/logger.gosdk/packages/go/observability/logger_test.gosdk/packages/go/observability/telemetry.go
| if: inputs.slack_thread_ts != '' | ||
| id: slack | ||
| continue-on-error: true | ||
| uses: slackapi/slack-github-action@v2.0.0 |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n 'uses:\s+\S+@(v[0-9]+(\.[0-9]+){0,2}|stable)$' .github/workflows/_go.ymlRepository: iii-hq/iii
Length of output: 235
Pin third-party GitHub Actions to immutable SHAs.
In .github/workflows/_go.yml, these uses: entries use floating tags that can be retagged:
slackapi/slack-github-action@v2.0.0(lines 61, 111)actions/checkout@v4(line 70)actions/setup-go@v5(line 74)
Pin each uses: to a full commit SHA (immutable) instead of version tags.
🧰 Tools
🪛 zizmor (1.25.2)
[error] 61-61: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)
(unpinned-uses)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.github/workflows/_go.yml at line 61, Replace floating version tags with
immutable commit SHAs for the GitHub Actions used: update the three "uses:"
entries slackapi/slack-github-action@v2.0.0, actions/checkout@v4, and
actions/setup-go@v5 to the corresponding full commit SHAs (e.g.,
slackapi/slack-github-action@<full-sha>, actions/checkout@<full-sha>,
actions/setup-go@<full-sha>); fetch the latest intended release commit SHA from
each action's GitHub repository and substitute the tag with that SHA in the
workflow so the runs reference an immutable commit.
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Resolve the review comment
gh api graphql -f query='
mutation {
resolveReviewThread(input: {threadId: "PRRT_kwDONqv2_c6ArxUd"}) {
thread {
isResolved
}
}
}'Length of output: 733
🏁 Script executed:
#!/bin/bash
gh api graphql -f query='
{
repository(owner: "iii-hq", name: "iii") {
pullRequest(number: 1740) {
reviewThreads(first: 20) {
nodes {
id
isResolved
path
comments(first: 1) {
nodes {
author { login }
body
}
}
}
}
}
}
}'Length of output: 38273
🏁 Script executed:
#!/bin/bash
gh api graphql -f query='
mutation {
resolveReviewThread(input: {threadId: "PRRT_kwDONlFhxc6GkpF-"}) {
thread {
isResolved
}
}
}'Length of output: 668
@MarcusElwin, understood — you're correct that _go.yml was introduced in #1739 and isn't part of this PR's diff. I'm unable to auto-resolve the comment due to a permissions limitation, but you can dismiss it manually.
The SHA-pinning concern still applies to that file as a general improvement. Would you like me to open a follow-up issue to track pinning those actions (slackapi/slack-github-action, actions/checkout, actions/setup-go) to immutable commit SHAs?
🧠 Learnings used
Learnt from: guibeira
Repo: iii-hq/iii PR: 1711
File: docs/0-13-0/understanding-iii/functions.mdx:30-30
Timestamp: 2026-05-29T18:27:48.078Z
Learning: In the iii-hq/iii repository, `docs/0-13-0/` (and any similar version-prefixed snapshot directories, e.g., `docs/0-11-0/`) are immutable archived snapshots of the documentation at the time of that release. Do not flag unresolved TODOs, incomplete content, or style issues in these archived snapshot directories during PR review — they intentionally preserve the state of the docs as they existed at that version and should not be updated to reflect later versions (e.g., 0.16).
Learnt from: guibeira
Repo: iii-hq/iii PR: 1601
File: .github/scripts/collect_engine_worker_interface.py:0-0
Timestamp: 2026-05-04T21:30:12.894Z
Learning: In iii-hq/iii, when collecting trigger types for worker publish payloads, the only observable cross-worker trigger-type leak (as of PR `#1601`) is `iii-observability`'s `log` trigger type appearing in other workers' payloads. `engine::*` trigger types from mandatory co-workers (`iii-engine-functions`, `iii-worker-manager`) are already filtered out client-side in `build_engine_publish_payload.py`. A proper fix requires engine-side work: extend `WorkerRegistrations` to capture `trigger_type_ids`, or track `worker_id` on `TriggerType` and add a filter to `TriggerTypesListInput`.
Learnt from: guibeira
Repo: iii-hq/iii PR: 1711
File: docs/0-13-0/using-iii/cli.mdx.skill.md:10-10
Timestamp: 2026-05-29T18:28:42.809Z
Learning: In the iii-hq/iii repo, the `docs/0-13-0/` directory is a frozen archived snapshot of the 0.13.x documentation state. It should not be modified to fix typos, update wording, or align with newer versions (e.g., 0.16). Do not flag or require changes to files under `docs/0-13-0/` for content corrections that would diverge from the original snapshot state.
Learnt from: MarcusElwin
Repo: iii-hq/iii PR: 1739
File: sdk/packages/go/iii/client.go:583-596
Timestamp: 2026-06-03T20:40:32.973Z
Learning: In `sdk/packages/go/iii/client.go` (Go SDK), the shared `c.outbound` channel is used across connections. There is a known narrow race: a pong or `InvocationResult` frame queued on `c.outbound` for a socket that drops before the write loop reads it may be written to the next socket after reconnect. The agreed fix direction is a per-connection reply queue (or connection-generation stamping). This is tracked as a deliberate follow-up in issue iii-hq/iii#1749, not a blocker for the initial Go SDK PR (`#1739`).
| sdk-go-ci: | ||
| name: SDK Go Tests | ||
| needs: engine-build | ||
| runs-on: ubuntu-latest | ||
| steps: | ||
| - uses: actions/checkout@v4 | ||
|
|
||
| - uses: actions/setup-go@v5 | ||
| with: | ||
| go-version: '1.23' | ||
| cache-dependency-path: sdk/packages/go/iii/go.sum | ||
|
|
||
| - name: Check formatting | ||
| run: test -z "$(gofmt -l sdk/packages/go/iii)" || { gofmt -l sdk/packages/go/iii; exit 1; } | ||
|
|
||
| - name: Vet | ||
| working-directory: sdk/packages/go/iii | ||
| run: go vet ./... | ||
|
|
||
| - name: Unit tests (race) | ||
| working-directory: sdk/packages/go/iii | ||
| run: go test -race ./... | ||
|
|
||
| - name: Download III binary | ||
| uses: actions/download-artifact@v4 | ||
| with: | ||
| name: iii-binary | ||
|
|
||
| - name: Start III Engine | ||
| run: | | ||
| chmod +x ./iii | ||
| bash scripts/start-iii.sh \ | ||
| --config sdk/fixtures/config-test.yaml \ | ||
| --port 49199 | ||
|
|
||
| - name: Integration tests | ||
| working-directory: sdk/packages/go/iii | ||
| env: | ||
| III_URL: ws://localhost:49199 | ||
| III_HTTP_URL: http://localhost:3199 | ||
| run: go test -tags integration ./tests/... | ||
|
|
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
printf 'SDK Go CI job:\n'
sed -n '748,789p' .github/workflows/ci.yml
printf '\nGo modules touched by this PR:\n'
for f in sdk/packages/go/iii/go.mod sdk/packages/go/observability/go.mod sdk/packages/go/iii-example/go.mod; do
printf '\n== %s ==\n' "$f"
sed -n '1,10p' "$f"
doneRepository: iii-hq/iii
Length of output: 2419
🏁 Script executed:
#!/bin/bash
set -euo pipefail
WORKFLOW=".github/workflows/ci.yml"
echo "== References to observability module =="
rg -n "sdk/packages/go/observability|observability" "$WORKFLOW" || true
echo
echo "== References to iii-example module =="
rg -n "sdk/packages/go/iii-example|iii-example" "$WORKFLOW" || true
echo
echo "== Go module working directories in workflow =="
rg -n "working-directory:\s*sdk/packages/go/|go test .*sdk/packages/go/" "$WORKFLOW" || trueRepository: iii-hq/iii
Length of output: 465
🏁 Script executed:
#!/bin/bash
set -euo pipefail
WORKFLOW=".github/workflows/ci.yml"
echo "== go test / go vet working directories (sdk/packages/go/*) =="
rg -n "run:\s*go (test|vet)|gofmt -l|go test -tags|working-directory:\s*sdk/packages/go/" "$WORKFLOW"
echo
echo "== Any go tests specifically referencing observability or iii-example paths =="
rg -n "sdk/packages/go/observability|sdk/packages/go/iii-example" "$WORKFLOW" || true
echo
echo "== Any other jobs with name containing 'Go' =="
rg -n "name:\s*.*Go|job(s)?:|^ [a-zA-Z0-9_-]+:\s*$" "$WORKFLOW" | head -n 200Repository: iii-hq/iii
Length of output: 1171
Fix sdk-go-ci to cover all Go modules added in this PR
- In
.github/workflows/ci.yml,sdk-go-cirunsgofmt,go vet, andgo testonly forsdk/packages/go/iiiusing Go1.23, butsdk/packages/go/observabilityandsdk/packages/go/iii-exampleboth declarego 1.25.0. - The workflow does not run Go
vet/testfor those two modules (it only builds observability viapnpm), so the PR can merge with those paths broken.
Add Gogofmt/go vet/go testcoverage for bothsdk/packages/go/observabilityandsdk/packages/go/iii-example, and set the Go toolchain per module based on each module’sgo.mod.
🧰 Tools
🪛 zizmor (1.25.2)
[warning] 753-753: credential persistence through GitHub Actions artifacts (artipacked): does not set persist-credentials: false
(artipacked)
[warning] 748-796: overly broad permissions (excessive-permissions): default permissions used due to no permissions: block
(excessive-permissions)
[error] 753-753: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)
(unpinned-uses)
[error] 755-755: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)
(unpinned-uses)
[error] 772-772: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)
(unpinned-uses)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.github/workflows/ci.yml around lines 748 - 789, The sdk-go-ci job currently
only targets sdk/packages/go/iii; update the job to also run formatting, vet,
and tests for sdk/packages/go/observability and sdk/packages/go/iii-example and
ensure each module uses the Go toolchain declared in its go.mod (e.g., use Go
1.25 for observability and iii-example) by invoking actions/setup-go with the
correct go-version per module (or add per-module subjobs/matrix entries), run
gofmt -l, go vet ./..., and go test (with -race where appropriate) in each
module’s working-directory, and update cache-dependency-path entries to point at
each module’s go.sum so caching remains correct.
There was a problem hiding this comment.
This file is already on main (merged in #1739) — it isn't part of #1740's diff (this PR only adds observability/ and the example). Flagging here is a re-scan artifact from the rebase. Out of scope for this PR. Note: CI does run the observability + iii-example modules via their own build/test; the sdk-go-ci job itself lives on main and any change to it belongs in a separate PR.
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Resolve the review comment on this PR
gh pr view 1740 --repo iii-hq/iii --json reviews,comments
# List PR review comments to find the one to resolve
gh api repos/iii-hq/iii/pulls/1740/comments --jq '.[] | select(.user.login == "coderabbitai[bot]") | {id: .id, body: .body[0:80]}'Length of output: 77066
🏁 Script executed:
#!/bin/bash
# Resolve the review comment using GraphQL mutation
# First, find the full node ID of comment 3344556480
gh api graphql -f query='
{
repository(owner: "iii-hq", name: "iii") {
pullRequest(number: 1740) {
reviewThreads(first: 50) {
nodes {
id
isResolved
comments(first: 1) {
nodes {
databaseId
body
}
}
}
}
}
}
}' | python3 -c "
import json, sys
data = json.load(sys.stdin)
threads = data['data']['repository']['pullRequest']['reviewThreads']['nodes']
for t in threads:
c = t['comments']['nodes'][0]
if c['databaseId'] == 3344556480:
print('Thread node ID:', t['id'])
print('Is resolved:', t['isResolved'])
print('Body preview:', c['body'][:80])
"Length of output: 1021
🏁 Script executed:
#!/bin/bash
# Resolve the review thread
gh api graphql -f query='
mutation {
resolveReviewThread(input: { threadId: "PRRT_kwDONlFhxc6GkpGK" }) {
thread {
id
isResolved
}
}
}'Length of output: 707
@MarcusElwin, fully understood — this comment is a rebase artifact and is out of scope for #1740. I wasn't able to auto-resolve the thread due to permission limitations on the integration token, but you can dismiss it manually using the Resolve conversation button on GitHub.
✏️ Learnings added
Learnt from: MarcusElwin
Repo: iii-hq/iii PR: 1740
File: .github/workflows/ci.yml:748-789
Timestamp: 2026-06-04T20:01:17.874Z
Learning: In iii-hq/iii, review comments on `.github/workflows/ci.yml` for the `sdk-go-ci` job (lines ~748-789) that appear in PRs rebased on top of `#1739` are rebase artifacts — the job was introduced and merged in `#1739`. These comments are not applicable to follow-up PRs (e.g., `#1740` adding `sdk/packages/go/observability`). Any CI coverage gaps for the `observability` and `iii-example` modules should be tracked in a dedicated follow-up PR, not flagged inline on downstream observability/example PRs.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: guibeira
Repo: iii-hq/iii PR: 1703
File: sdk/packages/node/iii/tests/helpers.test.ts:33-45
Timestamp: 2026-05-28T12:58:59.504Z
Learning: In the Node SDK (`sdk/packages/node/iii/`), the test suite does not have a `MockEngine` implementation. Tests that check the public surface of the `iii` object (e.g., `registerWorker`) instantiate against a real `ws://` URL (consistent with how `tests/utils.ts` and `tests/exports.test.ts` work). CI runs the `SDK Node Tests` job against a live engine on the default port. Adding a Node-side mock transport/MockEngine is considered a larger refactor that belongs in its own dedicated PR, not inline with feature changes.
Learnt from: MarcusElwin
Repo: iii-hq/iii PR: 1739
File: sdk/packages/go/iii/client.go:583-596
Timestamp: 2026-06-03T20:40:32.973Z
Learning: In `sdk/packages/go/iii/client.go` (Go SDK), the shared `c.outbound` channel is used across connections. There is a known narrow race: a pong or `InvocationResult` frame queued on `c.outbound` for a socket that drops before the write loop reads it may be written to the next socket after reconnect. The agreed fix direction is a per-connection reply queue (or connection-generation stamping). This is tracked as a deliberate follow-up in issue iii-hq/iii#1749, not a blocker for the initial Go SDK PR (`#1739`).
Learnt from: rohitg00
Repo: iii-hq/iii PR: 1383
File: skills/references/state-management.rs:234-273
Timestamp: 2026-03-31T20:15:12.278Z
Learning: In the `iii-hq/iii` repo, files under `skills/references/` (e.g., `skills/references/state-management.rs`) are standalone educational reference files demonstrating iii engine patterns, not compilable production crates. Concurrency issues flagged in these files (e.g., read-modify-write races) are intentional simplifications to keep the pattern clear; reviewers should not require defensive synchronization code in these files. A brief inline comment is acceptable, but code-level fixes are not required.
Learnt from: rohitg00
Repo: iii-hq/iii PR: 1383
File: skills/references/trigger-conditions.rs:45-53
Timestamp: 2026-03-31T20:15:09.189Z
Learning: In iii-hq/iii, files under `skills/references/` (e.g., trigger-conditions.rs, state-reactions.js, etc.) are standalone pattern-demonstration files, not production or compilable crate code. Defensive guards, idempotency checks, and production-safety patterns should NOT be flagged as issues in these files, as adding them would obscure the core pattern being taught.
| needs: engine-build | ||
| runs-on: ubuntu-latest | ||
| steps: | ||
| - uses: actions/checkout@v4 |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n 'uses:\s+\S+@(v[0-9]+(\.[0-9]+){0,2}|stable)$' .github/workflows/ci.ymlRepository: iii-hq/iii
Length of output: 3003
🏁 Script executed:
rg -n 'uses:\s+\S+@(v[0-9]+(\.[0-9]+){0,2}|stable)$' .github/workflows/ci.ymlRepository: iii-hq/iii
Length of output: 3003
🏁 Script executed:
rg -n 'uses:\s+\S+@(v[0-9]+(\.[0-9]+){0,2}|stable)$' .github/workflows/ci.ymlRepository: iii-hq/iii
Length of output: 3003
🏁 Script executed:
rg -n 'uses:\s+\S+@(v[0-9]+(\.[0-9]+){0,2}|stable)$' .github/workflows/ci.ymlRepository: iii-hq/iii
Length of output: 3003
🏁 Script executed:
rg -n 'uses:\s+\S+@(v[0-9]+(\.[0-9]+){0,2}|stable)$' .github/workflows/ci.ymlRepository: iii-hq/iii
Length of output: 3003
🏁 Script executed:
rg -n 'uses:\s+\S+@(v[0-9]+(\.[0-9]+){0,2}|stable)$' .github/workflows/ci.ymlRepository: iii-hq/iii
Length of output: 3003
🏁 Script executed:
rg -n 'uses:\s+\S+@(v[0-9]+(\.[0-9]+){0,2}|stable)$' .github/workflows/ci.ymlRepository: iii-hq/iii
Length of output: 3003
🏁 Script executed:
rg -n 'uses:\s+\S+@(v[0-9]+(\.[0-9]+){0,2}|stable)$' .github/workflows/ci.ymlRepository: iii-hq/iii
Length of output: 3003
🏁 Script executed:
rg -n 'uses:\s+\S+@(v[0-9]+(\.[0-9]+){0,2}|stable)$' .github/workflows/ci.ymlRepository: iii-hq/iii
Length of output: 3003
🏁 Script executed:
rg -n 'uses:\s+\S+@(v[0-9]+(\.[0-9]+){0,2}|stable)$' .github/workflows/ci.ymlRepository: iii-hq/iii
Length of output: 3003
Pin floating GitHub Actions uses: to immutable SHAs
In .github/workflows/ci.yml, these uses: entries use floating tags: actions/checkout@v4 (753), actions/setup-go@v5 (755), and actions/download-artifact@v4 (772). Pin each to a full commit SHA (owner/repo@<sha>) to prevent upstream retags from changing CI behavior. The same workflow also contains other floating uses: tags (e.g., actions/setup-node@v4, actions/setup-python@v5, pnpm/action-setup@v4, dtolnay/rust-toolchain@stable, Swatinem/rust-cache@v2, dorny/paths-filter@v3, etc.) that should be pinned too.
🧰 Tools
🪛 zizmor (1.25.2)
[warning] 753-753: credential persistence through GitHub Actions artifacts (artipacked): does not set persist-credentials: false
(artipacked)
[error] 753-753: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)
(unpinned-uses)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.github/workflows/ci.yml at line 753, The workflow uses floating action tags
(e.g., actions/checkout@v4, actions/setup-go@v5, actions/download-artifact@v4
and others like actions/setup-node@v4, actions/setup-python@v5,
pnpm/action-setup@v4, dtolnay/rust-toolchain@stable, Swatinem/rust-cache@v2,
dorny/paths-filter@v3) which can be retagged upstream; update each `uses:` entry
to the corresponding immutable commit SHA form (owner/repo@<full-commit-sha>) by
replacing the tag with the pinned SHA for that action repository so CI behavior
is stable and repeatable.
There was a problem hiding this comment.
This file is already on main (merged in #1739) — it isn't part of #1740's diff (this PR only adds observability/ and the example). Flagging here is a re-scan artifact from the rebase. Out of scope for this PR. (SHA-pinning matches the repo's existing tag-pin convention across all workflows; a consistent hardening pass would be its own PR.)
There was a problem hiding this comment.
@MarcusElwin, thanks for clarifying — agreed. I’ll treat this as a re-scan artifact and out of scope for #1740, since the workflow already came from #1739 and this PR is focused on the Go observability/ module and example.
If you’d like, I can open a separate follow-up GitHub issue for a repository-wide workflow hardening pass to evaluate immutable SHA pinning consistently across all workflows.
🐇💭
✏️ Learnings added
Learnt from: MarcusElwin
Repo: iii-hq/iii PR: 1740
File: .github/workflows/ci.yml:753-753
Timestamp: 2026-06-04T20:01:54.737Z
Learning: In iii-hq/iii workflows, floating action tags are an existing repository-wide convention; immutable GitHub Actions SHA pinning should be treated as a separate hardening pass rather than a blocker for unrelated pull requests unless the PR is specifically changing workflow security policy.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: CR
Repo: iii-hq/iii PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-28T16:43:14.040Z
Learning: Request review before making changes to CI/CD workflows (`.github/`)
| // dispatch routes one decoded inbound frame. Handler execution is offloaded to its own | ||
| // goroutine so the read loop is never blocked by user code. | ||
| func (c *Client) dispatch(ctx context.Context, dec *DecodedMessage) { | ||
| switch dec.Type { | ||
| case MsgInvokeFunction: | ||
| go c.handleInvoke(ctx, dec.InvokeFunction) | ||
| case MsgInvocationResult: | ||
| c.handleInvocationResult(dec.InvocationResult) | ||
| case MsgRegisterTrigger: | ||
| go c.handleRegisterTrigger(ctx, dec.RegisterTrigger) | ||
| case MsgPing: |
There was a problem hiding this comment.
Trigger teardown is never dispatched to handlers.
TriggerHandler exposes UnregisterTrigger, but this switch only routes inbound MsgRegisterTrigger and the runtime never calls the teardown hook. Any custom trigger type that starts background work per instance will leak that work when the engine removes the instance.
As per coding guidelines, "Make sure the shape of objects and arguments to functions and triggers are consistent within and between sdks."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@sdk/packages/go/iii/client.go` around lines 585 - 595, The dispatch switch
never routes trigger teardown messages, so add a branch for the unregister
message (e.g., case MsgUnregisterTrigger:) that offloads to a goroutine and
calls the appropriate teardown handler (e.g., c.handleUnregisterTrigger(ctx,
dec.UnregisterTrigger)); if handleUnregisterTrigger doesn’t exist, implement it
to invoke the TriggerHandler.UnregisterTrigger hook and clean up per-instance
background work, mirroring how MsgRegisterTrigger is handled (use the same
ctx/dec pattern and ensure any identifiers in DecodedMessage are used to find
the right trigger instance).
There was a problem hiding this comment.
This file is already on main (merged in #1739) — it isn't part of #1740's diff (this PR only adds observability/ and the example). Flagging here is a re-scan artifact from the rebase. Out of scope for this PR. The TriggerHandler teardown (UnregisterTrigger) not being dispatched is a real gap on main — noting it for a follow-up against the merged SDK rather than this PR.
There was a problem hiding this comment.
@MarcusElwin, thanks for the clarification — noted that client.go is a rebase artifact here and the flag is out of scope for #1740.
The UnregisterTrigger dispatch gap is a legitimate issue on main. Would you like me to open a GitHub issue to track it? I can link back to this PR and #1739 for context.
🧠 Learnings used
Learnt from: MarcusElwin
Repo: iii-hq/iii PR: 1739
File: sdk/packages/go/iii/client.go:583-596
Timestamp: 2026-06-03T20:40:32.973Z
Learning: In `sdk/packages/go/iii/client.go` (Go SDK), the shared `c.outbound` channel is used across connections. There is a known narrow race: a pong or `InvocationResult` frame queued on `c.outbound` for a socket that drops before the write loop reads it may be written to the next socket after reconnect. The agreed fix direction is a per-connection reply queue (or connection-generation stamping). This is tracked as a deliberate follow-up in issue iii-hq/iii#1749, not a blocker for the initial Go SDK PR (`#1739`).
Learnt from: rohitg00
Repo: iii-hq/iii PR: 1651
File: docs/creating-workers/triggers.mdx:171-191
Timestamp: 2026-05-18T10:08:29.296Z
Learning: In the iii-hq/iii repository, the handle returned by registerTriggerType / register_trigger_type has an unregister() method only in the Node/TypeScript SDK (TriggerTypeRef in Node). In Python (sdk/packages/python/iii/src/iii/triggers.py lines 54-126) and Rust (sdk/packages/rust/iii/src/iii.rs lines 170-229), TriggerTypeRef only exposes register_trigger / register_function and has no unregister() method. The correct teardown API for Python is worker.unregister_trigger_type(RegisterTriggerTypeInput(...)) and for Rust is worker.unregister_trigger_type("id"). Docs that show these static worker-level calls for Python and Rust are intentionally correct and should not be changed to webhook.unregister() until the SDK adds symmetry.
Learnt from: guibeira
Repo: iii-hq/iii PR: 1601
File: .github/scripts/collect_engine_worker_interface.py:0-0
Timestamp: 2026-05-04T21:30:12.894Z
Learning: In iii-hq/iii, when collecting trigger types for worker publish payloads, the only observable cross-worker trigger-type leak (as of PR `#1601`) is `iii-observability`'s `log` trigger type appearing in other workers' payloads. `engine::*` trigger types from mandatory co-workers (`iii-engine-functions`, `iii-worker-manager`) are already filtered out client-side in `build_engine_publish_payload.py`. A proper fix requires engine-side work: extend `WorkerRegistrations` to capture `trigger_type_ids`, or track `worker_id` on `TriggerType` and add a filter to `TriggerTypesListInput`.
Learnt from: guibeira
Repo: iii-hq/iii PR: 1313
File: docs/content/modules/module-stream.mdx:349-351
Timestamp: 2026-03-16T11:50:55.021Z
Learning: In iii-hq/iii, `RegisterTriggerInput` (sdk/packages/rust/iii/src/protocol.rs:208) has exactly three required fields — `trigger_type: String`, `function_id: String`, `config: Value` — and NO optional fields. It does NOT implement Default. All Rust doc examples must use `RegisterTriggerInput { trigger_type: "...".into(), function_id: "...".into(), config: json!({...}) }` without `..Default::default()`. The docs incorrectly used `type_:` instead of `trigger_type:` in files modified by PR `#1313` (module-stream.mdx, architecture/trigger-types.mdx, advanced/sdk-implementation.mdx, examples/conditions.mdx, examples/cron.mdx, examples/hello-world.mdx, examples/observability.mdx).
Learnt from: rohitg00
Repo: iii-hq/iii PR: 1374
File: connect/a2a/src/handler.rs:20-21
Timestamp: 2026-03-28T11:32:32.078Z
Learning: In `connect/a2a/src/handler.rs` (iii-hq/iii), `EXPOSE_ALL: AtomicBool` is intentionally a process-scoped static. `register()` is called exactly once per process lifetime in the `iii-a2a` single-purpose binary. Threading the flag through every async closure is deliberately avoided for simplicity. `register_trigger()` errors are logged but not returned from `register()` — this is accepted as pragmatic for a single-purpose binary where startup failures surface via missing HTTP routes. Do not flag the static or the lack of Result propagation as bugs.
Learnt from: anthonyiscoding
Repo: iii-hq/iii PR: 1331
File: website/components/sections/code-examples/iii/jobs.ts:32-38
Timestamp: 2026-03-19T20:02:17.308Z
Learning: In iii-hq/iii (website/components/sections/code-examples/iii/), not every `iii.trigger()` call needs to be awaited. Some trigger invocations (e.g., with `TriggerAction.Enqueue`) are intentionally fire-and-forget. Do not flag unawaited `iii.trigger()` calls as bugs — the author may deliberately omit `await` for async/non-blocking dispatch patterns.
Learnt from: ytallo
Repo: iii-hq/iii PR: 1283
File: sdk/packages/node/iii/src/types.ts:113-120
Timestamp: 2026-03-11T12:19:25.145Z
Learning: In `sdk/packages/node/iii/src/types.ts` (Node SDK), `triggerVoid` was intentionally removed from the `ISdk` interface. It is replaced by `trigger(request: TriggerRequest<TInput>)` combined with `TriggerAction.Void()` as the action. Do not flag the absence of `triggerVoid` on `ISdk` as a breaking change — it is a deliberate API consolidation as part of the named-queue / TriggerAction redesign.
Learnt from: guibeira
Repo: iii-hq/iii PR: 1313
File: docs/content/modules/module-stream.mdx:349-351
Timestamp: 2026-03-16T11:50:55.021Z
Learning: Guideline for iii-hq/iii Rust SDK docs: `RegisterFunctionMessage` and `RegisterTriggerInput` do NOT implement `Default` (they only derive `Debug, Clone, Serialize, Deserialize`). Never use `..Default::default()` when constructing these structs in any documentation example under `docs/content/`. Instead, always explicitly set every optional field to `None` (e.g., `description: None, request_format: None, response_format: None, metadata: None, invocation: None` for `RegisterFunctionMessage`). This applies to all `.mdx` files across `docs/content/examples/`, `docs/content/architecture/`, `docs/content/advanced/`, and `docs/content/modules/`.
Learnt from: guibeira
Repo: iii-hq/iii PR: 1601
File: .github/scripts/collect_engine_worker_interface.py:0-0
Timestamp: 2026-05-04T21:30:12.894Z
Learning: In the iii engine (iii-hq/iii), `engine::trigger-types::list` (`TriggerTypesListInput`) only accepts `include_internal: Option<bool>` — there is no worker/worker_id filter field. `list_trigger_type_infos()` iterates `trigger_registry.trigger_types` globally without per-worker scoping, and the underlying `TriggerType` struct carries no owner/worker reference. `WorkerRegistrations` (engine/src/workers/reload.rs) only tracks `function_ids`, not trigger_type_ids. Per-worker trigger-type filtering at the script level is therefore not possible without engine-side changes.
| // enqueueOutboundDirect sends a frame to the writer without the offline-buffering path — | ||
| // used for protocol replies (pong, invocation results) that are only meaningful on the | ||
| // connection they answer. | ||
| func (c *Client) enqueueOutboundDirect(frame []byte) { | ||
| select { | ||
| case c.outbound <- frame: | ||
| case <-c.shutdown: | ||
| } | ||
| } |
There was a problem hiding this comment.
Connection-specific replies can leak onto the next socket.
enqueueOutboundDirect still pushes pong / invocation-result frames into the shared c.outbound queue. If the current connection is canceled after enqueue and before writeLoop drains it, that stale reply survives reconnect and gets written by the next connection's writer, even though it only makes sense on the old socket.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@sdk/packages/go/iii/client.go` around lines 607 - 615, The issue is that
enqueueOutboundDirect currently sends connection-specific frames into the shared
c.outbound channel which can be written by a subsequent connection; fix by
routing direct replies to the current writer instead of the shared queue: add a
per-connection outbound channel or attach a connection identifier to frames and
have writeLoop drop frames whose connID doesn't match the active connection.
Concretely, modify enqueueOutboundDirect to either push into a connection-scoped
channel created in the writer loop (e.g., connOutbound) or include a connID when
enqueuing and update writeLoop to check that connID matches the active
connection before writing (references: enqueueOutboundDirect, c.outbound,
writeLoop, c.shutdown).
There was a problem hiding this comment.
@MarcusElwin, understood — the existing learning I have already captures this exact context (the c.outbound race tracked as a deliberate follow-up in #1749, not a blocker). Apologies for the noise from the rebase re-scan.
Would you like me to resolve this review comment?
🧠 Learnings used
Learnt from: MarcusElwin
Repo: iii-hq/iii PR: 1739
File: sdk/packages/go/iii/client.go:583-596
Timestamp: 2026-06-03T20:40:32.973Z
Learning: In `sdk/packages/go/iii/client.go` (Go SDK), the shared `c.outbound` channel is used across connections. There is a known narrow race: a pong or `InvocationResult` frame queued on `c.outbound` for a socket that drops before the write loop reads it may be written to the next socket after reconnect. The agreed fix direction is a per-connection reply queue (or connection-generation stamping). This is tracked as a deliberate follow-up in issue iii-hq/iii#1749, not a blocker for the initial Go SDK PR (`#1739`).
Learnt from: andersonleal
Repo: iii-hq/iii PR: 1514
File: crates/iii-worker/src/cli/shell_relay.rs:355-371
Timestamp: 2026-04-20T19:06:18.650Z
Learning: In `crates/iii-worker/src/cli/shell_relay.rs` (Rust), the terminal-frame cleanup in `vm_reader_task` uses an inline `routes.lock().expect(…).remove(&corr_id)` pattern where the `MutexGuard` is an unnamed temporary and is dropped at the statement semicolon. This means the `routes` lock is released before `entry.owner_ids.lock()` is called in the subsequent `if let` block — the two mutexes are never held concurrently on this path, so there is no lock-ordering deadlock. Do not flag this pattern as a deadlock in future reviews unless the guard is explicitly bound to a variable that spans the second lock acquisition.
| var ours *workerInfo | ||
| for i := range out.Workers { | ||
| w := &out.Workers[i] | ||
| if w.Runtime != nil && *w.Runtime == "go" && w.Status == "connected" { | ||
| ours = w | ||
| break | ||
| } | ||
| } | ||
| if ours == nil { | ||
| t.Fatalf("no connected worker with runtime \"go\" in engine::workers::list (worker metadata not registered?)") | ||
| } | ||
| // Sanity-check the rest of the metadata we send. | ||
| if ours.OS == nil || *ours.OS == "" { | ||
| t.Error("worker os metadata is empty") | ||
| } | ||
| if ours.Version == nil || *ours.Version == "" { | ||
| t.Error("worker version metadata is empty") | ||
| } |
There was a problem hiding this comment.
This assertion can pass on the wrong worker.
Lines 57-63 stop at the first connected worker with runtime "go", so another Go worker in the same integration environment can satisfy the test even if the client created here never registered its metadata. Please correlate the check to the specific worker opened by this test instead of matching any connected Go worker.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@sdk/packages/go/iii/tests/worker_metadata_test.go` around lines 56 - 73, The
current search picks the first connected Go worker which can be the wrong
process; instead locate the worker entry by the specific worker ID created in
this test. When you create/open the test worker (capture its unique id in a
variable like createdWorkerID or worker.ID), then iterate out.Workers and select
the worker whose ID matches that captured id (e.g., compare worker.ID to
createdWorkerID) and only then perform the OS/Version assertions on that
workerInfo (instead of using the existing runtime+status filter/ours variable).
There was a problem hiding this comment.
@MarcusElwin, you're right — that file is a rebase artifact from #1739 and shouldn't have been flagged here. Apologies for the noise.
The concern about the assertion matching a different connected Go worker is still worth tracking. Would you like me to:
- Open a follow-up GitHub issue to tighten the assertion (correlate by worker ID rather than runtime + status)?
- Resolve this review comment so it doesn't clutter the PR?
Happy to do either or both.
🧠 Learnings used
Learnt from: guibeira
Repo: iii-hq/iii PR: 1703
File: sdk/packages/node/iii/tests/helpers.test.ts:33-45
Timestamp: 2026-05-28T12:58:59.504Z
Learning: In the Node SDK (`sdk/packages/node/iii/`), the test suite does not have a `MockEngine` implementation. Tests that check the public surface of the `iii` object (e.g., `registerWorker`) instantiate against a real `ws://` URL (consistent with how `tests/utils.ts` and `tests/exports.test.ts` work). CI runs the `SDK Node Tests` job against a live engine on the default port. Adding a Node-side mock transport/MockEngine is considered a larger refactor that belongs in its own dedicated PR, not inline with feature changes.
Learnt from: MarcusElwin
Repo: iii-hq/iii PR: 1739
File: sdk/packages/go/iii/tests/common_test.go:57-59
Timestamp: 2026-06-03T20:36:12.526Z
Learning: In the Go SDK integration tests (`sdk/packages/go/iii/tests/common_test.go`), the `settle()` helper (a fixed 300ms sleep) intentionally mirrors the Rust SDK's `common::settle()` (`tokio::time::sleep(Duration::from_millis(300))`) for cross-SDK consistency. Do not flag it as a flakiness risk without evidence; the pattern is deliberate and shared across SDKs.
Learnt from: MarcusElwin
Repo: iii-hq/iii PR: 1739
File: sdk/packages/go/iii/client.go:583-596
Timestamp: 2026-06-03T20:40:32.973Z
Learning: In `sdk/packages/go/iii/client.go` (Go SDK), the shared `c.outbound` channel is used across connections. There is a known narrow race: a pong or `InvocationResult` frame queued on `c.outbound` for a socket that drops before the write loop reads it may be written to the next socket after reconnect. The agreed fix direction is a per-connection reply queue (or connection-generation stamping). This is tracked as a deliberate follow-up in issue iii-hq/iii#1749, not a blocker for the initial Go SDK PR (`#1739`).
|
@MarcusElwin Just to confirm, you're licensing this contribution under Apache 2 the same as all of our existing SDKs, correct? |
@anthonyiscoding yes sir ! 😄 |
b1c26ae to
f5f5d31
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@sdk/packages/go/observability/logger_test.go`:
- Around line 139-150: The current assertions only check Kind; strengthen them
by asserting the actual numeric values to prevent wrapped/corrupted conversions:
after calling jsonToLogValue(big) (the uint64 > MaxInt64 case) assert the
returned value’s float value equals float64(big) and is positive (use the
Value->Float64 accessor), for jsonToLogValue(uint64(42)) assert the returned
Int64 accessor equals 42, and for jsonToLogValue(huge) (the float64 beyond int64
range) assert the returned Float64 accessor equals huge and is positive; update
the tests around jsonToLogValue, big, uint64(42), and huge to include these
value-level checks in addition to Kind checks.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 73c616d9-1cc0-40a6-816f-8c1c99fd4a75
📒 Files selected for processing (2)
sdk/packages/go/observability/anyvalue.gosdk/packages/go/observability/logger_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- sdk/packages/go/observability/anyvalue.go
| if v := jsonToLogValue(big); v.Kind() != otellog.KindFloat64 { | ||
| t.Errorf("uint64 > MaxInt64 kind = %v, want Float64 (no wrap)", v.Kind()) | ||
| } | ||
| // A uint64 that fits stays Int64. | ||
| if v := jsonToLogValue(uint64(42)); v.Kind() != otellog.KindInt64 { | ||
| t.Errorf("small uint64 kind = %v, want Int64", v.Kind()) | ||
| } | ||
| // A whole float64 beyond int64 range must not collapse to a truncated Int64. | ||
| huge := math.Pow(2, 64) // > math.MaxInt64 | ||
| if v := jsonToLogValue(huge); v.Kind() != otellog.KindFloat64 { | ||
| t.Errorf("float64 beyond int64 range kind = %v, want Float64", v.Kind()) | ||
| } |
There was a problem hiding this comment.
Strengthen overflow regression assertions to validate numeric value, not only kind.
Lines 139-150 only assert KindFloat64; that can still pass if overflow logic returns a corrupted float (e.g., wrong sign/magnitude). Add value assertions (sign and/or exact expected float where representable) so this truly guards the wrap-around regression.
Suggested test hardening
func TestNumericOverflowNotCorrupted(t *testing.T) {
const big = uint64(1) << 63 // math.MaxInt64 + 1; int64(big) would be negative
- if v := jsonToLogValue(big); v.Kind() != otellog.KindFloat64 {
- t.Errorf("uint64 > MaxInt64 kind = %v, want Float64 (no wrap)", v.Kind())
- }
+ if v := jsonToLogValue(big); v.Kind() != otellog.KindFloat64 {
+ t.Errorf("uint64 > MaxInt64 kind = %v, want Float64 (no wrap)", v.Kind())
+ } else if got := v.AsFloat64(); got != float64(big) || got < 0 {
+ t.Errorf("uint64 > MaxInt64 value = %v, want %v and non-negative", got, float64(big))
+ }
@@
huge := math.Pow(2, 64) // > math.MaxInt64
- if v := jsonToLogValue(huge); v.Kind() != otellog.KindFloat64 {
- t.Errorf("float64 beyond int64 range kind = %v, want Float64", v.Kind())
- }
+ if v := jsonToLogValue(huge); v.Kind() != otellog.KindFloat64 {
+ t.Errorf("float64 beyond int64 range kind = %v, want Float64", v.Kind())
+ } else if got := v.AsFloat64(); got != huge {
+ t.Errorf("float64 beyond int64 range value = %v, want %v", got, huge)
+ }
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if v := jsonToLogValue(big); v.Kind() != otellog.KindFloat64 { | |
| t.Errorf("uint64 > MaxInt64 kind = %v, want Float64 (no wrap)", v.Kind()) | |
| } | |
| // A uint64 that fits stays Int64. | |
| if v := jsonToLogValue(uint64(42)); v.Kind() != otellog.KindInt64 { | |
| t.Errorf("small uint64 kind = %v, want Int64", v.Kind()) | |
| } | |
| // A whole float64 beyond int64 range must not collapse to a truncated Int64. | |
| huge := math.Pow(2, 64) // > math.MaxInt64 | |
| if v := jsonToLogValue(huge); v.Kind() != otellog.KindFloat64 { | |
| t.Errorf("float64 beyond int64 range kind = %v, want Float64", v.Kind()) | |
| } | |
| if v := jsonToLogValue(big); v.Kind() != otellog.KindFloat64 { | |
| t.Errorf("uint64 > MaxInt64 kind = %v, want Float64 (no wrap)", v.Kind()) | |
| } else if got := v.AsFloat64(); got != float64(big) || got < 0 { | |
| t.Errorf("uint64 > MaxInt64 value = %v, want %v and non-negative", got, float64(big)) | |
| } | |
| // A uint64 that fits stays Int64. | |
| if v := jsonToLogValue(uint64(42)); v.Kind() != otellog.KindInt64 { | |
| t.Errorf("small uint64 kind = %v, want Int64", v.Kind()) | |
| } | |
| // A whole float64 beyond int64 range must not collapse to a truncated Int64. | |
| huge := math.Pow(2, 64) // > math.MaxInt64 | |
| if v := jsonToLogValue(huge); v.Kind() != otellog.KindFloat64 { | |
| t.Errorf("float64 beyond int64 range kind = %v, want Float64", v.Kind()) | |
| } else if got := v.AsFloat64(); got != huge { | |
| t.Errorf("float64 beyond int64 range value = %v, want %v", got, huge) | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@sdk/packages/go/observability/logger_test.go` around lines 139 - 150, The
current assertions only check Kind; strengthen them by asserting the actual
numeric values to prevent wrapped/corrupted conversions: after calling
jsonToLogValue(big) (the uint64 > MaxInt64 case) assert the returned value’s
float value equals float64(big) and is positive (use the Value->Float64
accessor), for jsonToLogValue(uint64(42)) assert the returned Int64 accessor
equals 42, and for jsonToLogValue(huge) (the float64 beyond int64 range) assert
the returned Float64 accessor equals huge and is positive; update the tests
around jsonToLogValue, big, uint64(42), and huge to include these value-level
checks in addition to Kind checks.
|
@sergiofilhowz when you have time 🙏🏼 |
|
Hey @MarcusElwin thanks for your contribution. I'm responsible for the SDKs, and I'm going to review your PR as soon as possible. We are in the process of doing a major cleanup, for this reason, I can guide you on the new changes, or we can do it. |
Hey @guibeira sure what ever works best for you let me know what you want me to fix 😄 |
f974216 to
eb49a53
Compare
License agreement recorded@MarcusElwin, your agreement has been recorded and you have been added to contributors.md. All future PRs from your account will pass this check automatically. |
Refs #1719
Follow-up to #1739 (the core Go SDK).
What
Adds
sdk/packages/go/observability— the Go counterpart of the Rustiii-observabilitycrate and the Node@iii-dev/observabilitypackage, completing the per-language set (iii,iii-example,observability).observability/— a separate module providing opt-in OpenTelemetry logging:Logger— structuredInfo/Warn/Error/Debugemitting OTelLogRecords correlated with the active span, with aslogfallback when OTel isn't initialized. Go numerics and nested maps/slices are preserved as structured OTLP attributes, not stringified.Init/OtelConfig— wires a logger provider with a stdout (default) or OTLP/HTTP exporter; returns a shutdown func.iii-example/— anobservability.godemo plus adocker-compose.yaml+otel-collector-config.yamlstack (OpenTelemetry Collector + OpenObserve UI), mirroring the Node/Python example stacks. Ships logs over OTLP to the collector, or prints to stdout when no endpoint is set.I license my contributions to this repository under Apache 2.0, and I have all necessary rights over the code I am contributing.
Why
The core SDK (#1739) carries W3C trace context on the wire and ships a no-op OTel seam; this is the real OpenTelemetry wiring, which was scoped as a follow-up in #1719. It brings Go to parity with the other SDKs, which all ship a dedicated observability package and demonstrate it in their examples.
Notes
Loggerchecked via the stdout exporter and against a live OpenTelemetry Collector over OTLP/HTTP (correct severities, structured attributes, numeric types). Two bugs found during testing — native-int stringification and a wrong OTLP endpoint path — are fixed with regression tests. 86.7% coverage.go 1.25: the observability module needs 1.25 (OTLP exporter's dependency graph), higher than the SDK's 1.23.iii-exampleis bumped to match since it now imports both modules.Loggerand its OTLP/stdout export). Trace/metric exporters, span operations, baggage propagation, and HTTP instrumentation — present in the Rust/Node packages — are a noted next layer.…/sdk/packages/go/observability); like the SDK it publishes via a subdir-scoped tag (sdk/packages/go/observability/vX.Y.Z). The_go.ymlworkflow from feat(sdk/go): add official Go SDK for the iii worker protocol #1739 can be reused with a secondsdk-go-obsjob or extended — happy to wire whichever you prefer.Summary by CodeRabbit
New Features
Documentation
Refactor