The purpose and use-cases of the new component
Goal
Create a configurable, maintainable and reusable Encoding Extension for all telemetry data types available via Azure Diagnostic Settings export
Description
Azure Monitor currently supports exporting Logs, Metrics and Traces via Azure Diagnostic Settings to several destinations that can be consumed by the OpenTelemetry Collector.
Sources of telemetry data exposed via Azure Diagnostic Settings with corresponding OTEL data type
Supported Azure Diagnostic Settings destination with OpenTelemetry Collector receiver components
- Azure Event Hubs - can be consumed by azureeventhubreceiver (AMQP-based) or kafkareceiver (Kafka-protocol based)
- Azure Storage - can be consumed by azureblobreceiver, but this component does not support Azure-specific data unmarshaling
Current state of support of Azure-specific telemetry data types
Exported Azure telemetry data is plain JSON with a specific structure for each data type (and each log category), so in principle it can be consumed as-is; in that case, however, no normalization or alignment with OTEL SemConv will happen.
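For illustration, exported log records generally arrive in a `records` envelope like the sketch below (the category, resource ID and properties shown here are illustrative, not a verbatim Azure payload):

```json
{
  "records": [
    {
      "time": "2023-05-01T10:30:00Z",
      "resourceId": "/SUBSCRIPTIONS/.../PROVIDERS/MICROSOFT.WEB/SITES/MYAPP",
      "category": "AppServiceHTTPLogs",
      "properties": {
        "httpMethod": "GET",
        "httpStatusCode": 200
      }
    }
  ]
}
```

The `properties` payload differs per category, which is why per-category translation code is needed to map these fields onto OTEL attributes.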
There are already packages and code that help translate Azure-specific telemetry data into the correct OTEL data type with proper Resource and Record Attributes:
- pkg/translator/azure - is used for Logs and Traces, but Logs attribute names are not translated to the corresponding OTEL SemConv naming (for example, httpMethod is not translated to http.request.method)
- pkg/translator/azurelogs - is used for Logs only and translates Logs attribute names into OTEL SemConv (currently a work in progress)
- azureeventhubreceiver - has a built-in translator for Metrics, while using pkg/translator/azure and pkg/translator/azurelogs for Traces and Logs
- kafkareceiver - uses pkg/translator/azure for Logs, but lacks any support for translating Azure Traces or Metrics
- azureblobreceiver - has no Azure-specific telemetry data support at all, only plain JSON
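To illustrate the kind of attribute-name translation discussed above, here is a minimal sketch of mapping Azure field names to SemConv names (the mapping table and function name are illustrative, not the actual pkg/translator/azurelogs implementation):

```go
package main

import "fmt"

// semConvNames maps Azure log attribute names to OTEL SemConv names.
// This table is an illustrative subset, not the real mapping used by
// pkg/translator/azurelogs.
var semConvNames = map[string]string{
	"httpMethod":     "http.request.method",
	"httpStatusCode": "http.response.status_code",
	"userAgent":      "user_agent.original",
}

// translateAttrs renames known Azure attributes to their SemConv
// equivalents, passing unknown names through unchanged.
func translateAttrs(attrs map[string]any) map[string]any {
	out := make(map[string]any, len(attrs))
	for k, v := range attrs {
		if semConv, ok := semConvNames[k]; ok {
			k = semConv
		}
		out[k] = v
	}
	return out
}

func main() {
	attrs := translateAttrs(map[string]any{"httpMethod": "GET", "clientIp": "1.2.3.4"})
	fmt.Println(attrs["http.request.method"], attrs["clientIp"])
}
```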
Problem statement
From the description above, I can identify the following problems with consuming Azure-specific telemetry data:
- Not all OpenTelemetry receivers that could be used support Azure-specific telemetry data: azureeventhubreceiver supports all signals (but not all with proper SemConv translation), kafkareceiver supports only Logs (also without proper SemConv translation), and azureblobreceiver has no support at all
- Azure-specific translation code is scattered across multiple components (pkg/translator/azure, pkg/translator/azurelogs and azureeventhubreceiver), has duplicate functionality (Logs, for example), sometimes lacks tests, and takes different approaches to translating data into OTEL format
- Azure-specific translation code is hard to configure; for example, support for multiple time formats had to be copy-pasted into multiple components to be enabled (see here and here), while kafkareceiver was missed and doesn't have this functionality at all
- Some portions of code are duplicated; for example, the asTimestamp function that parses timestamps has 3 copies: here, here and here
- Last but not least, adding support for Azure-specific telemetry data (for example, to azureblobreceiver) has become quite a hard task because of all the points described above
Proposed solution
From my perspective, the most appropriate solution to mitigate the issues above is to create an Azure Encoding Extension that will:
- Consolidate all relevant code in a single component instead of 3 different ones, greatly increasing its maintainability
- Define a single approach to translating Azure-specific telemetry data into OTEL telemetry data. This already happens for Logs in pkg/translator/azurelogs, but I believe it should apply to the other data types as well: the resulting OTEL telemetry data should conform to SemConv as much as possible
- Greatly decrease the number of breaking changes required in other involved components. The new component can be developed independently, and only after it stabilizes introduce a single breaking change for migration
- Make Azure-specific translation code more flexible by providing additional configuration options that don't impact the receivers currently using this code
- Allow using Azure-specific translation code in any component that has (or will gain) support for encoding extensions, which covers more specific use-cases for advanced users
Proposed implementation plan
Important: Most of the code for the proposed Azure Encoding Extension is already present in this repository, just spread across multiple components. So this is not development from scratch; rather, it is a careful refactoring of existing code.
- Introduce a skeleton for the new Azure Encoding Extension, then discuss and align on the proposed configuration options
- Implement Traces and Logs translation as a thin wrapper over the existing Unmarshalers in pkg/translator/azure and pkg/translator/azurelogs. Unfortunately, the Metrics unmarshaler in azureeventhubreceiver is not exported and can't be used this way; see the next step
- Copy (to maintain compatibility) the Metrics Unmarshaler from azureeventhubreceiver to the new Extension, add missing unit tests, and validate that the produced result is aligned with OTEL SemConv
- Copy (to maintain compatibility) the Traces Unmarshaler from pkg/translator/azure to the new Extension, add missing unit tests, and validate that the produced result is aligned with OTEL SemConv
- Copy (to maintain compatibility) the Logs Unmarshaler from pkg/translator/azurelogs to the new Extension
- Add the missing Azure Resource Logs translator code for the specific categories of Azure Logs that are not translated yet (this is the current WIP in pkg/translator/azurelogs)
- Release the component; at this point it will be available as an option for at least the kafkareceiver component
- Implement optional support for Encoding Extensions in receiver/azureeventhubreceiver to provide the ability to use the new encoding extension (still no breaking changes, by the way)
- Deprecate logs.encoding=azure_resource_logs in kafkareceiver and format=azure in azureeventhubreceiver in favor of azureencodingextension
- After the deprecation period, remove pkg/translator/azure, pkg/translator/azurelogs and the respective unmarshaling code from azureeventhubreceiver (this is the only breaking change required)
Example configuration for the component
```yaml
extensions:
  azure_encoding:
    logs:
      time_formats: ["01/02/2006 15:04:05", "2006-01-02T15:04:05Z"]
      # other logs-specific settings
    metrics:
      time_formats: ["01/02/2006 15:04:05", "2006-01-02T15:04:05Z"]
      # other metrics-specific settings
    traces:
      time_formats: ["01/02/2006 15:04:05", "2006-01-02T15:04:05Z"]
      # other traces-specific settings

receivers:
  kafka/azure:
    encoding: azure_encoding
```
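The configuration above could be backed by a Go config struct along these lines (a sketch; the struct and field names are assumptions, not an agreed-upon design):

```go
package main

import "fmt"

// SignalConfig holds per-signal translation settings.
type SignalConfig struct {
	// TimeFormats lists accepted timestamp layouts, tried in order.
	TimeFormats []string `mapstructure:"time_formats"`
}

// Config is the top-level configuration of the hypothetical
// azure_encoding extension, with one section per signal so that
// logs, metrics and traces can be tuned independently.
type Config struct {
	Logs    SignalConfig `mapstructure:"logs"`
	Metrics SignalConfig `mapstructure:"metrics"`
	Traces  SignalConfig `mapstructure:"traces"`
}

func main() {
	cfg := Config{
		Logs: SignalConfig{TimeFormats: []string{"01/02/2006 15:04:05", "2006-01-02T15:04:05Z"}},
	}
	fmt.Println(len(cfg.Logs.TimeFormats))
}
```

The `mapstructure` tags follow the convention the Collector uses to unmarshal component configuration from YAML.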
Telemetry data types supported
traces, metrics and logs
Code Owner(s)
@constanca-m, @zmoog (eventually, pending membership), @Fiery-Fenix (eventually, pending membership), others welcome
Sponsor (optional)
@axw
Additional context
There are existing issues that will be at least partially addressed by this one:
- #39969 - proposal for merging pkg/translator/azure and pkg/translator/azurelogs
I would like to contribute some time to help create this Azure Encoding Extension; as a starting point for discussion, I have also created an initial PR #41708
Pinging current maintainers of pkg/translator/azure, pkg/translator/azurelogs and azureeventhubreceiver as your code is planned to be used for this extension: @atoulme, @cparkins, @MikeGoldsmith, @constanca-m
As for Sponsorship for this new extension:
Taking into account that the code is already in place and will simply be refactored into a new component, I would love to see one of the maintainers of the existing code (pkg/translator/azure, pkg/translator/azurelogs and azureeventhubreceiver) as a Sponsor, if one is required of course.
I'm open to your thoughts.
Tip
React with 👍 to help prioritize this issue. Please use comments to provide useful context, avoiding +1 or me too, to help us triage it. Learn more here.