
[receiver/awsfirehose] Add support for encoding extensions #37262

Open · wants to merge 11 commits into main from firehose-encoding-extension

Conversation

@axw axw (Contributor) commented Jan 16, 2025

Description

The internal unmarshallers now implement plog.Unmarshaler and pmetric.Unmarshaler. This will enable extracting them later as encoding extensions; for now they remain embedded within the receiver. The existing record_type config has been deprecated and replaced by encoding, which can reference either the internal unmarshallers or encoding extensions.

As a result of the interface change, the unmarshallers now unmarshal a single record at a time, which means we cannot merge resources/metrics as we go, but only after each record. This is achieved by using existing common code: internal/exp/metrics for merging metrics, and internal/pdatautil for merging logs.
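For illustration only, a rough sketch of the per-record flow described above (the function and the `merge` parameter are illustrative stand-ins; in the PR the merging is done by the internal/exp/metrics and internal/pdatautil helpers):

```go
package main

import (
	"go.opentelemetry.io/collector/pdata/pmetric"
)

// consumeRecords sketches the per-record flow: each Firehose record is
// unmarshaled on its own, then merged into the batch built so far. The merge
// parameter stands in for the internal/exp/metrics helper used in the PR.
func consumeRecords(
	unmarshaler pmetric.Unmarshaler,
	merge func(dest, src pmetric.Metrics) pmetric.Metrics,
	records [][]byte,
) (pmetric.Metrics, error) {
	merged := pmetric.NewMetrics()
	for _, record := range records {
		metrics, err := unmarshaler.UnmarshalMetrics(record)
		if err != nil {
			return pmetric.Metrics{}, err
		}
		merged = merge(merged, metrics)
	}
	return merged, nil
}
```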

Due to using the above merging functions, this PR also fixes a bug in the cwmetrics unmarshaller where the unit of a metric was not considered part of its identity, causing two metrics that differed only by unit to be merged.

Link to tracking issue

Fixes #37113

Testing

This should be a non-functional change, so existing unit tests are relied on to catch regressions. Tests have been added for the new extension functionality.

Documentation

Updated README.

@github-actions github-actions bot requested a review from Aneurysm9 January 16, 2025 08:25
@axw axw force-pushed the firehose-encoding-extension branch 3 times, most recently from 7e6a082 to fbf5919 on January 16, 2025 08:47
The internal unmarshalers now implement plog.Unmarshaler
and pmetric.Unmarshaler. This will enable extracting them
later as encoding extensions; for now they remain embedded
within the receiver.

As a result of the interface change, the unmarshalers now
unmarshal a single record at a time, which means we cannot
merge resources/metrics as we go, but only after each record.

This also fixes a bug in the cwmetrics unmarshaller where
the unit of a metric was not considered part of its identity,
and so two metrics that differed only by unit would be merged.
@axw axw force-pushed the firehose-encoding-extension branch from fbf5919 to abc720f on January 16, 2025 08:59
@axw axw marked this pull request as ready for review January 16, 2025 11:25
@axw axw requested a review from a team as a code owner January 16, 2025 11:25
@axw axw requested a review from andrzej-stencel January 16, 2025 11:25
@atoulme atoulme (Contributor) left a comment

LGTM

@@ -19,7 +19,7 @@ import (

func TestLoadConfig(t *testing.T) {
	for _, configType := range []string{
		"cwmetrics", "cwlogs", "otlp_v1", "invalid",
Member

Testing for an invalid record type or encoding is different from testing that both an encoding and record type have been provided. Both tests should remain.

Contributor Author

Because we're now supporting extensions, the record type is only known to be valid/invalid at the time we call Start. There's a test case in there for invalid encoding/record type. See the WithUnknownEncoding test cases for TestLogsReceiver_Start and TestMetricsReceiver_Start.

receiver/awsfirehosereceiver/config_test.go (outdated, resolved)
// If a record type is specified, it must be valid.
// An empty string is acceptable, however, because it will use a telemetry-type-specific default.
if c.RecordType != "" {
	return validateRecordType(c.RecordType)
Member

I don't think validation of the record type or encoding can be deferred. This has to fail fast to alert the user to their configuration error rather than allowing the collector to start and then failing to process received data.

@axw axw (Contributor Author) Jan 17, 2025

The collector will still fail fast. e.g.

$ cat local/config.yaml 
receivers:
  awsfirehose:
    record_type: invalid

exporters:
  debug: {}

service:
  pipelines:
    logs:
      receivers: [awsfirehose]
      processors: []
      exporters: [debug]

$ ./bin/otelcontribcol_linux_amd64 --config local/config.yaml
2025-01-17T10:51:28.527+0800    info    [email protected]/service.go:164   Setting up own telemetry...
2025-01-17T10:51:28.527+0800    info    telemetry/metrics.go:70 Serving metrics {"address": "localhost:8888", "metrics level": "Normal"}
2025-01-17T10:51:28.527+0800    info    builders/builders.go:26 Development component. May change in the future.        {"kind": "exporter", "data_type": "logs", "name": "debug"}
2025-01-17T10:51:28.527+0800    warn    [email protected]/config.go:48       record_type is deprecated, and will be removed in a future version. Use encoding instead.       {"kind": "receiver", "name": "awsfirehose", "data_type": "logs"}
2025-01-17T10:51:28.530+0800    info    [email protected]/service.go:230   Starting otelcontribcol...      {"Version": "0.117.0-dev", "NumCPU": 16}
2025-01-17T10:51:28.530+0800    info    extensions/extensions.go:39     Starting extensions...
2025-01-17T10:51:28.530+0800    error   graph/graph.go:426      Failed to start component       {"error": "unknown encoding extension \"invalid\"", "type": "Receiver", "id": "awsfirehose"}
2025-01-17T10:51:28.530+0800    info    [email protected]/service.go:295   Starting shutdown...
2025-01-17T10:51:28.530+0800    info    extensions/extensions.go:66     Stopping extensions...
2025-01-17T10:51:28.530+0800    info    [email protected]/service.go:309   Shutdown complete.
Error: cannot start pipelines: unknown encoding extension "invalid"
2025/01/17 10:51:28 collector server run finished with error: cannot start pipelines: unknown encoding extension "invalid"

It's doing a bit more work than before it gets to the error, but AFAIK it's not possible to access extensions earlier than the Start method.
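For reference, the Start-time lookup being described follows the usual collector pattern; a minimal sketch (the helper name is hypothetical, not the receiver's actual code):

```go
package main

import (
	"fmt"

	"go.opentelemetry.io/collector/component"
	"go.opentelemetry.io/collector/pdata/plog"
)

// findLogsUnmarshaler is a hypothetical helper showing why validation happens
// in Start: extensions are only reachable via host.GetExtensions(), which is
// passed to Start, so an unknown encoding can only be reported at that point.
func findLogsUnmarshaler(host component.Host, id component.ID) (plog.Unmarshaler, error) {
	ext, ok := host.GetExtensions()[id]
	if !ok {
		return nil, fmt.Errorf("unknown encoding extension %q", id)
	}
	unmarshaler, ok := ext.(plog.Unmarshaler)
	if !ok {
		return nil, fmt.Errorf("extension %q is not a logs unmarshaler", id)
	}
	return unmarshaler, nil
}
```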

if logs.ResourceLogs().Len() == 0 {
	return logs, errInvalidRecords
}
pdatautil.GroupByResourceLogs(logs.ResourceLogs())
Member

Is there a benchmark to compare the effect of allocating a new resource log entry for each log record and combining them later? If not, please add one. This feels like something that could be a significant regression.

Contributor Author

There wasn't, so I've added one. This way of grouping is indeed much more expensive than what was there before; I've reverted to grouping by the known CloudWatch attributes.
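For context, a benchmark along the lines requested might look roughly like this (the helper functions are hypothetical stubs, not the benchmark actually added in the PR):

```go
package main

import "testing"

// Hypothetical stand-ins for test helpers: build a Firehose-style payload and
// run it through the logs consumer (unmarshal + group + consume).
func makeTestRecords(resources, records, logsPerRecord int) [][]byte { return nil }
func consumeLogRecords(records [][]byte) (int, error)               { return 0, nil }

// BenchmarkLogsConsumer builds a representative payload once, then measures
// the full unmarshal-and-group path per iteration, reporting allocations so
// grouping strategies can be compared with benchstat.
func BenchmarkLogsConsumer(b *testing.B) {
	records := makeTestRecords(10, 100, 10)
	b.ReportAllocs()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		if _, err := consumeLogRecords(records); err != nil {
			b.Fatal(err)
		}
	}
}
```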

}

return md, nil
metrics = expmetrics.Merge(pmetric.NewMetrics(), metrics)
Member

Again would like a benchmark. This feels like a lot of duplicated work for inputs of any appreciable size.

Contributor Author

Done, and as with cwlogs I've reverted to grouping by the known CloudWatch attributes rather than using expmetrics.Merge here. Note that there's still merging at the consumer level, since at that level we don't know anything about the data, so it has to merge in a more generic and expensive way.
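Roughly, "grouping by the known CloudWatch attributes" means keying resources on a small fixed identity instead of comparing full attribute sets; a sketch with hypothetical field names:

```go
package main

import "go.opentelemetry.io/collector/pdata/pmetric"

// resourceKey is a hypothetical fixed identity for a CloudWatch resource;
// grouping on a small comparable key like this is cheaper than the generic
// attribute-by-attribute merging done at the consumer level.
type resourceKey struct {
	accountID string
	region    string
	namespace string
}

// resourceFor returns the ResourceMetrics for key, creating it on first use
// so metrics from later records land under the same resource.
func resourceFor(byKey map[resourceKey]pmetric.ResourceMetrics, dest pmetric.Metrics, key resourceKey) pmetric.ResourceMetrics {
	rm, ok := byKey[key]
	if !ok {
		rm = dest.ResourceMetrics().AppendEmpty()
		byKey[key] = rm
	}
	return rm
}
```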

@@ -45,18 +57,16 @@ func TestUnmarshal(t *testing.T) {
"WithSomeInvalidRecords": {
filename: "some_invalid_records",
wantResourceCount: 5,
wantMetricCount: 35,
wantMetricCount: 36,
Member

Why? Was there a latent defect or is this papering over a regression?

Contributor Author

Yeah there was a defect, from the description:

Due to using the above merging functions, this PR also fixes a bug in the cwmetrics unmarshaller where the unit of a metric was not considered part of its identity, causing two metrics that differed only by unit to be merged.

Contributor Author

Specifically, the MetadataNoToken metric in some_invalid_records has different units: in most cases it's "None", in one case it's "Count".
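In other words, the unit has to be part of the metric's identity before datapoints are merged; conceptually something like this (hypothetical key type, not the actual unmarshaller code):

```go
package main

// cwMetricKey is a hypothetical identity key illustrating the fix: two
// CloudWatch metrics may only be merged if they agree on unit as well.
// Previously unit was not part of the key, so MetadataNoToken with unit
// "None" and unit "Count" collapsed into a single metric.
type cwMetricKey struct {
	namespace  string
	metricName string
	unit       string // the field the fix effectively adds to the identity
	// dimensions omitted for brevity
}
```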

receiver/awsfirehosereceiver/logs_receiver.go (outdated, resolved)
receiver/awsfirehosereceiver/metrics_receiver.go (outdated, resolved)
axw added 8 commits January 17, 2025 10:55
Adds a test that covers merging various combinations
of resource/scope identities.
pdatautil.GroupByResourceLogs is much slower,
at least partially because it's generic. Revert
to a simpler grouping by known resource attributes.
- Use json-iterator for decoding JSON
- Use klauspost/compress for decompressing gzip
- Pool gzip readers
- Remove pointer type from cwMetricValue to avoid allocation
- Don't read the whole request body into memory
- Implement more efficient metrics merging
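As an illustration of the gzip reader pooling mentioned in the commits above, a minimal sketch using klauspost/compress (not the receiver's exact code):

```go
package main

import (
	"io"
	"sync"

	"github.com/klauspost/compress/gzip"
)

// gzipPool reuses gzip readers across records instead of allocating a new
// one each time; Reset rebinds a pooled reader to a new compressed source.
var gzipPool = sync.Pool{New: func() any { return new(gzip.Reader) }}

func decompress(src io.Reader) ([]byte, error) {
	zr := gzipPool.Get().(*gzip.Reader)
	defer gzipPool.Put(zr)
	if err := zr.Reset(src); err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}
```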
@axw axw (Contributor Author) commented Jan 17, 2025

@Aneurysm9 as mentioned in thread replies, I have reverted some of the grouping changes, but grouping across records is still going to be more expensive, and I think that's unavoidable given the signature of the unmarshallers.

To counteract that, I've made various optimisations unrelated to the main changes, so unmarshalling is now faster, albeit using more memory for metrics (and significantly less for logs). I may have gotten a little carried away; let me know if you'd like me to pull these into separate PRs.

goos: linux
goarch: amd64
pkg: github.com/open-telemetry/opentelemetry-collector-contrib/receiver/awsfirehosereceiver
cpu: AMD Ryzen 7 PRO 5850U with Radeon Graphics     
                                                              │ /tmp/old.txt  │            /tmp/new.txt             │
                                                              │    sec/op     │    sec/op     vs base               │
LogsConsumer_cwlogs/10resources_10records_1logs-16              153.98µ ±  8%   80.79µ ± 19%  -47.53% (p=0.002 n=6)
LogsConsumer_cwlogs/10resources_10records_10logs-16              466.3µ ±  3%   275.5µ ±  3%  -40.92% (p=0.002 n=6)
LogsConsumer_cwlogs/10resources_100records_1logs-16             1445.6µ ±  5%   813.0µ ±  3%  -43.76% (p=0.002 n=6)
LogsConsumer_cwlogs/10resources_100records_10logs-16             4.293m ± 20%   2.690m ±  6%  -37.34% (p=0.002 n=6)
MetricsConsumer_cwmetrics/10resources_10records_1metrics-16      91.58µ ± 17%   62.62µ ±  4%  -31.62% (p=0.002 n=6)
MetricsConsumer_cwmetrics/10resources_10records_10metrics-16     790.2µ ± 17%   428.1µ ± 11%  -45.82% (p=0.002 n=6)
MetricsConsumer_cwmetrics/10resources_100records_1metrics-16     667.1µ ± 13%   567.8µ ± 20%  -14.88% (p=0.041 n=6)
MetricsConsumer_cwmetrics/10resources_100records_10metrics-16    6.135m ± 14%   4.258m ±  8%  -30.60% (p=0.002 n=6)
geomean                                                          776.4µ         486.8µ        -37.29%

                                                              │ /tmp/old.txt  │             /tmp/new.txt             │
                                                              │     B/op      │     B/op      vs base                │
LogsConsumer_cwlogs/10resources_10records_1logs-16              417.99Ki ± 0%   54.74Ki ± 0%   -86.90% (p=0.002 n=6)
LogsConsumer_cwlogs/10resources_10records_10logs-16              512.8Ki ± 0%   146.9Ki ± 2%   -71.34% (p=0.002 n=6)
LogsConsumer_cwlogs/10resources_100records_1logs-16             4153.6Ki ± 0%   549.6Ki ± 0%   -86.77% (p=0.002 n=6)
LogsConsumer_cwlogs/10resources_100records_10logs-16             4.954Mi ± 0%   1.427Mi ± 2%   -71.20% (p=0.002 n=6)
MetricsConsumer_cwmetrics/10resources_10records_1metrics-16      22.88Ki ± 7%   80.91Ki ± 9%  +253.57% (p=0.002 n=6)
MetricsConsumer_cwmetrics/10resources_10records_10metrics-16     162.8Ki ± 2%   275.8Ki ± 2%   +69.43% (p=0.002 n=6)
MetricsConsumer_cwmetrics/10resources_100records_1metrics-16     128.5Ki ± 0%   677.8Ki ± 0%  +427.44% (p=0.002 n=6)
MetricsConsumer_cwmetrics/10resources_100records_10metrics-16    1.137Mi ± 0%   2.435Mi ± 1%  +114.19% (p=0.002 n=6)
geomean                                                          473.3Ki        353.5Ki        -25.32%

                                                              │ /tmp/old.txt │            /tmp/new.txt            │
                                                              │  allocs/op   │  allocs/op   vs base               │
LogsConsumer_cwlogs/10resources_10records_1logs-16                331.0 ± 5%    407.0 ± 0%  +22.96% (p=0.002 n=6)
LogsConsumer_cwlogs/10resources_10records_10logs-16              1.955k ± 0%   2.715k ± 2%  +38.85% (p=0.002 n=6)
LogsConsumer_cwlogs/10resources_100records_1logs-16              2.655k ± 0%   3.904k ± 0%  +47.04% (p=0.002 n=6)
LogsConsumer_cwlogs/10resources_100records_10logs-16             18.19k ± 0%   26.24k ± 2%  +44.25% (p=0.002 n=6)
MetricsConsumer_cwmetrics/10resources_10records_1metrics-16       493.0 ± 6%    618.0 ± 4%  +25.35% (p=0.002 n=6)
MetricsConsumer_cwmetrics/10resources_10records_10metrics-16     3.743k ± 2%   4.761k ± 2%  +27.21% (p=0.002 n=6)
MetricsConsumer_cwmetrics/10resources_100records_1metrics-16     3.018k ± 0%   5.638k ± 0%  +86.80% (p=0.002 n=6)
MetricsConsumer_cwmetrics/10resources_100records_10metrics-16    27.73k ± 0%   46.15k ± 0%  +66.41% (p=0.002 n=6)
geomean                                                          2.887k        4.142k       +43.47%

@axw axw (Contributor Author) commented Jan 20, 2025

> @Aneurysm9 as mentioned in thread replies, I have reverted some of the grouping changes, but grouping across records is still going to be more expensive, and I think that's unavoidable given the signature of the unmarshallers.

Another option: we can keep the merging in the unmarshallers, which would operate on the Firehose record level, and drop the merging across records. That would be a change in behaviour, but would avoid the overhead.

@axw axw (Contributor Author) commented Jan 21, 2025

> Another option: we can keep the merging in the unmarshallers, which would operate on the Firehose record level, and drop the merging across records. That would be a change in behaviour, but would avoid the overhead.

I've done this in #37361

Development

Successfully merging this pull request may close these issues.

[receiver/awsfirehose] Implement encoding extension framework
4 participants