[receiver/azure_functions] Initial implementation to receive logs from Azure Functions triggered by Event Hub #47050

Open
tetianakravchenko wants to merge 17 commits into open-telemetry:main from tetianakravchenko:azurefunctions-eventhub

Conversation

@tetianakravchenko
Contributor

Description

Follow-up to #46584.
This is the initial implementation of the new azure_functions receiver.
Focus of this PR:

  • adds support for receiving logs from Azure Functions triggered by Event Hub
  • generalizes the receiver so it can support trigger types beyond Event Hub in the future

Link to tracking issue

Fixes #43507

Testing

Corresponding unit tests were added.

Documentation

The README was updated to reflect the changes in this PR.

…rt event_hub trigger, only logs signal

Signed-off-by: Tetiana Kravchenko <tetiana.kravchenko@elastic.co>

mux := http.NewServeMux()
// TODO: Refactor so Start() collects path+profile specs from each trigger (logs and, later, metrics) and registers them in one loop; adding a trigger or signal should not duplicate registration logic here.
if r.cfg.EventHub != nil && len(r.cfg.EventHub.Logs) > 0 {
Contributor

Should the server start regardless of this configuration? If EventHub is nil or there are no EventHub.Logs configured, there will be no route configured.

Contributor Author

The server will start if a correct http_config configuration is provided, but yes: if EventHub is nil or there are no EventHub.Logs configured, there will be no invoke routes.
Routes and their corresponding profiles are created from the config (event_hub.logs only, for now); there are no default triggers.

Contributor

Should we flag this early in the config Validate()? There's no point in moving forward with the startup sequence if we have zero bindings.

Validate() should require at least one trigger.

Contributor Author

@tetianakravchenko tetianakravchenko Apr 20, 2026

@zmoog my initial idea was to keep it this way for now and, once more triggers are added, add validation that at least one trigger with at least one binding is configured. That would avoid making the Event Hub configuration required now.
I've addressed this in 38fff2e, please have another look.

Contributor

LGTM!
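
A generic "at least one binding" check of the kind discussed in this thread might look like the sketch below. It is not the code from 38fff2e: the `Config` here is a trimmed-down stand-in, and the `Name` and `Encoding` fields are inferred from the YAML examples in this PR rather than copied from the source.

```go
package main

import (
	"errors"
	"fmt"
)

// EncodingConfig maps one binding name to an encoding extension.
type EncodingConfig struct {
	Name     string
	Encoding string
}

// EventHubTriggerConfig holds the per-signal bindings of the Event Hub trigger.
type EventHubTriggerConfig struct {
	Logs []EncodingConfig
}

// Config is a simplified stand-in for the receiver configuration.
type Config struct {
	EventHub *EventHubTriggerConfig
}

// Validate rejects a configuration that has zero bindings across all triggers,
// without making any single trigger type mandatory.
func (cfg *Config) Validate() error {
	bindings := 0
	if cfg.EventHub != nil {
		bindings += len(cfg.EventHub.Logs)
	}
	// Future trigger types would add their binding counts here.
	if bindings == 0 {
		return errors.New("at least one configured trigger with at least one binding is required")
	}
	return nil
}

func main() {
	fmt.Println((&Config{}).Validate())

	ok := &Config{EventHub: &EventHubTriggerConfig{
		Logs: []EncodingConfig{{Name: "logs", Encoding: "azure_encoding"}},
	}}
	fmt.Println(ok.Validate())
}
```

Counting bindings instead of checking `EventHub != nil` keeps the rule stable when more trigger types are added.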

	p.protocol.Failure(w, fmt.Errorf("read body: %w", err), nil)
	return
}
defer r.Body.Close()
Contributor

This should be deferred before body, err := io.ReadAll(r.Body), so the body is also closed when io.ReadAll returns early with an error.
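
The ordering the reviewer asks for can be illustrated with a small, self-contained sketch. The `readBody` helper and the `failingBody` test double are hypothetical names introduced here; they are not part of the PR.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// failingBody is a test double whose Read always fails; it records Close calls.
type failingBody struct{ closed bool }

func (b *failingBody) Read([]byte) (int, error) { return 0, errors.New("boom") }
func (b *failingBody) Close() error             { b.closed = true; return nil }

// readBody registers Close before reading, so the body is closed on every
// return path, including a failed ReadAll.
func readBody(body io.ReadCloser) ([]byte, error) {
	defer body.Close()
	data, err := io.ReadAll(body)
	if err != nil {
		return nil, fmt.Errorf("read body: %w", err)
	}
	return data, nil
}

func main() {
	b := &failingBody{}
	_, err := readBody(b)
	fmt.Println(err, b.closed)
}
```

Note that in a net/http server handler the server also closes the request body once the handler returns, so in that setting the reordering mainly keeps the code consistent and safe to reuse elsewhere.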

			res.Attributes().PutStr(k, v)
		}
	}
}
Contributor

Are import cycles the only reason for a new package here? Naming it handler is a bit confusing when there is also a handler.go file.

In my view this just includes types. I don't know if there are future plans for this package to include more functionality.

Contributor Author

Yes, that was the main driver. It is also a place for shared behavior used by consumers (not much for now).
I agree the overlap with handler.go is confusing. We can rename it to something like internal/invoke, wdyt?

> In my view this just includes types. I don't know if there are future plans for this package to include more functionality.

I was trying to keep the implementation generic and easy to extend to trigger types other than Event Hub, as discussed with @zmoog.

Contributor

> Yes, that was the main driver. It is also a place for shared behavior used by consumers (not much for now).

Why not internal/common?
@zmoog do you agree? I am not blocking on this.

Contributor Author

Renamed to common to avoid the naming confusion.

Contributor Author

internal/common fails with the error "avoid meaningless package names", so I renamed it to trigger, as it contains common types for all triggers.

Contributor

I like trigger! 👍

Contributor

@MichaelKatsoulis MichaelKatsoulis left a comment

I just left some nits

Signed-off-by: Tetiana Kravchenko <tetiana.kravchenko@elastic.co>
@github-actions github-actions Bot requested a review from constanca-m April 2, 2026 10:45
Signed-off-by: Tetiana Kravchenko <tetiana.kravchenko@elastic.co>
… error: avoid meaningless package names

Signed-off-by: Tetiana Kravchenko <tetiana.kravchenko@elastic.co>
@tetianakravchenko
Contributor Author

@jmacd could you please have a look at this PR?

Contributor

@zmoog zmoog left a comment

@tetianakravchenko, nice work! I've noted a few minor things we could refine, but I'm happy to ship it in its current state, if needed.


Comment on lines +49 to +51
if merged.LogRecordCount() == 0 {
	return errors.New("no logs to consume")
}
Contributor

Suggested change
-if merged.LogRecordCount() == 0 {
-	return errors.New("no logs to consume")
-}
+if merged.LogRecordCount() == 0 {
+	// Decision: Log events that result in zero records are treated
+	// as anomalies and rejected as permanent errors.
+	return errors.New("no logs to consume")
+}

Contributor

Since this logic is a bit subtle, I'm calling it out explicitly so that reviewers and code-owners can push back if this doesn't align with our standards.

Comment on lines +46 to +49
w.Header().Set("Content-Type", "application/json")
if _, err := w.Write(data); err != nil {
	http.Error(w, fmt.Sprintf("write response: %v", err), http.StatusInternalServerError)
}
Contributor

Suggested change
-w.Header().Set("Content-Type", "application/json")
-if _, err := w.Write(data); err != nil {
-	http.Error(w, fmt.Sprintf("write response: %v", err), http.StatusInternalServerError)
-}
+w.Header().Set("Content-Type", "application/json")
+_, _ = w.Write(data) // headers are flushed on Write; nothing useful we can do on error
}

// profile binds a method name (binding/path) to a Protocol and Consumer.
// The generic HTTP handler uses it to: parse request with protocol, then consume with consumer.
type profile struct {
	method string
Contributor

I think we should consider calling this binding instead of method.

My current understanding is that the Function Host sets Data.<name> in the HTTP request using the binding name from <function-name>/function.json:

{
  "bindings": [
    {
      "type": "eventHubTrigger",
      "name": "logs", // <———— 👀
      "direction": "in",
      "eventHubName": "logs",
      "connection": "EventHubConnectionString",
      "cardinality": "many",
      "consumerGroup": "ecf",
      "dataType": "binary"
    }
  ]
}
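
To make the binding-name lookup described above concrete, here is a minimal sketch of pulling `Data.<name>` out of an invoke request. It assumes the outer payload has top-level `Data` and `Metadata` objects, as in the Azure Functions custom handler request format; the `invokeRequest` type, the `bindingData` helper, and the sample payload are illustrative assumptions, and the real per-event encoding is not shown in this thread.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// invokeRequest mirrors the assumed outer shape of an invoke payload:
// trigger data keyed by binding name, plus trigger metadata.
type invokeRequest struct {
	Data     map[string]json.RawMessage `json:"Data"`
	Metadata map[string]json.RawMessage `json:"Metadata"`
}

// bindingData returns the raw payload stored under the given binding name.
func bindingData(body []byte, binding string) (json.RawMessage, error) {
	var req invokeRequest
	if err := json.Unmarshal(body, &req); err != nil {
		return nil, fmt.Errorf("decode invoke request: %w", err)
	}
	raw, ok := req.Data[binding]
	if !ok {
		return nil, fmt.Errorf("binding %q not found in request data", binding)
	}
	return raw, nil
}

func main() {
	// Made-up payload for a binding named "logs", matching function.json above.
	body := []byte(`{"Data":{"logs":["ZXZlbnQ="]},"Metadata":{}}`)
	raw, err := bindingData(body, "logs")
	fmt.Println(string(raw), err)
}
```

Because the lookup key is the binding name from function.json, naming the struct field `binding` rather than `method` would describe what it actually holds.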

azure_functions:
  # HTTP server configuration
-  http:
+  http_config:
Contributor

@zmoog zmoog Apr 14, 2026

Since this PR contains the very first azure functions receiver working implementation, I don't think we need to handle a formal breaking change.

IMO no actions needed here.

@zmoog
Contributor

zmoog commented Apr 14, 2026

The Azure Functions host is the only client here, but the request size depends on the trigger — for Event Hub it's (max message size × batch count × base64 overhead).

Consider adding a max_request_body_size field to the top-level config with a generous
default (e.g. 100 MB) and wrapping the read:

    body, err := io.ReadAll(io.LimitReader(r.Body, maxRequestBodySize))

100 MB covers the worst realistic Event Hub scenario (20 MB premium messages ×
a few dozen batch size × 1.33 base64 overhead) while protecting against unbounded reads.

Future triggers should evaluate their own payload characteristics and document whether
the default is appropriate or needs to be adjusted.

And the config change would look like:

type Config struct {
      HTTP     *confighttp.ServerConfig `mapstructure:"http_config"`
      Auth     component.ID             `mapstructure:"auth"`
      EventHub *EventHubTriggerConfig   `mapstructure:"event_hub"`

      // MaxRequestBodySize is the maximum allowed size of an incoming invoke request body.
      // Defaults to 100 MB, which covers premium Event Hub tiers (20 MB messages)
      // with large batch sizes and base64 encoding overhead.
      // Future trigger types should verify this default fits their payload profile.
      MaxRequestBodySize int64 `mapstructure:"max_request_body_size"`
}

With the default set in the factory:

func createDefaultConfig() component.Config {
      return &Config{
              MaxRequestBodySize: 100 * 1024 * 1024, // 100 MB
      }
}

This keeps it simple — one field, one LimitReader, and a comment trail for whoever adds the next trigger type.

@tetianakravchenko
Contributor Author

tetianakravchenko commented Apr 20, 2026

@zmoog thank you for the review!

> The Azure Functions host is the only client here, but the request size depends on the trigger: for Event Hub it's (max message size × batch count × base64 overhead).
>
> Consider adding a max_request_body_size field to the top-level config with a generous
> default (e.g. 100 MB) and wrapping the read:
>
>     body, err := io.ReadAll(io.LimitReader(r.Body, maxRequestBodySize))
>
> 100 MB covers the worst realistic Event Hub scenario (20 MB premium messages ×
> a few dozen batch size × 1.33 base64 overhead) while protecting against unbounded reads.

In this case we might end up processing a truncated payload without noticing. Additionally, confighttp.ServerConfig already has a max request body setting (https://pkg.go.dev/go.opentelemetry.io/collector/config/confighttp#ServerConfig):

    // MaxRequestBodySize sets the maximum request body size in bytes. Default: 20MiB.
    MaxRequestBodySize int64 `mapstructure:"max_request_body_size,omitempty"`

I think it is better to avoid two competing limits, wdyt?

Signed-off-by: Tetiana Kravchenko <tetiana.kravchenko@elastic.co>
@tetianakravchenko
Contributor Author

@constanca-m @jmacd could you please have a look at this PR?

Contributor

@constanca-m constanca-m left a comment

Thanks! Looks very good. I am relying a bit on @zmoog's review here, even though I have read the whole code now. Just two questions, but LGTM.


// Logs defines configuration for log records received from Azure Functions.
Logs EncodingConfig `mapstructure:"logs"`
HTTP *confighttp.ServerConfig `mapstructure:"http_config"`
Contributor

http_config name is a bit awkward. Usually it is just squashed or called http. Is there a reason why you changed it?

Contributor

For context, here's what the current config.yaml looks like:

azure_functions:
  http_config:
    endpoint: test:123
  auth: azureauth
  event_hub:
    logs:  # signal (logs, metrics, traces)
      - name: logs # event hub name
        encoding: azure_encoding
      - name: raw_logs # event hub name
        encoding: beats_encoding

In the above example, event_hub is the trigger type, logs is the signal type, and each entry pairs an event hub name with the encoding extension to use.

Since this is an Azure Functions receiver, in the future we'll possibly have more trigger types, like timer or http triggers. So, if we keep the trigger type at the root level we can't use http.

The alternative is to make the trigger explicit, like this:

azure_functions:
  http:
    endpoint: test:123
  auth: azureauth
  triggers:
    event_hub:
      logs:  # signal (logs, metrics, traces)
        - name: logs # event hub name
          encoding: azure_encoding
        - name: raw_logs # event hub name
          encoding: beats_encoding

@constanca-m, how does this look?

Contributor

Adding triggers feels a bit redundant now, but having all the trigger types in one bucket seems tidier. I’m going to make this change and revert to the idiomatic config option. Sorry for the back-and-forth, @tetianakravchenko!

Contributor Author

@zmoog I've changed it in c6862fe

errs = append(errs, errors.New("at least one configured trigger with at least one binding is required"))
}

if cfg.EventHub != nil {
Contributor

Is it possible for event hub to be nil? I thought it was mandatory for the receiver to work properly, but maybe I am wrong

Contributor

Since the event hub is the only trigger now, we can simply assume it's required. WDYT?

Contributor Author

For now only the event hub trigger is supported, but I tried to keep it generic. So instead of making event hub required, there is a check above: "at least one configured trigger with at least one binding is required".

Contributor

This works for me.

@zmoog
Contributor

zmoog commented Apr 21, 2026

> In this case we might end up processing a truncated payload without noticing. Additionally, confighttp.ServerConfig already has a max request body setting (https://pkg.go.dev/go.opentelemetry.io/collector/config/confighttp#ServerConfig):
>
>     // MaxRequestBodySize sets the maximum request body size in bytes. Default: 20MiB.
>     MaxRequestBodySize int64 `mapstructure:"max_request_body_size,omitempty"`
>
> I think it is better to avoid two competing limits, wdyt?

LGTM.

@tetianakravchenko
Contributor Author

Hi @jmacd, all required codeowner approvals are in - is there anything else needed from my side before this is ready to merge?

…ger types in one bucket

Signed-off-by: Tetiana Kravchenko <tetiana.kravchenko@elastic.co>
Development

Successfully merging this pull request may close these issues.

New component: azure functions receiver

5 participants