[extension/azure_encoding] Add support for Administrative, Alert, Autoscale, Policy, Security, ServiceHealth, and ResourceHealth log categories.#45699
Conversation
|
@Fiery-Fenix, I’d love to get your eyes on this as well, whenever you have a chance. |
# Conflicts: # extension/encoding/azureencodingextension/internal/unmarshaler/logs/README.md # extension/encoding/azureencodingextension/internal/unmarshaler/logs/category.go # extension/encoding/azureencodingextension/internal/unmarshaler/logs/helpers.go
Use the tranlator/azurelogs test document
Use the translator/azurelogs test document
Use the translator/azurelogs test document
Use the translator/azurelogs test document
Use the translator/azurelogs test document
95b92f4 to
4028542
Compare
|
Also I think it's better to place all test data (input json files and expected yaml files) into single folder that correlates with name of code file that is actually parsing them, like it done for other |
I agree! It definitely simplifies finding the relevant test data during debugging. Sorry for missing the obvious—my bad! |
|
@Fiery-Fenix, I believe we need to reconcile the data model for Analysis of 36 JSON test files revealed:
Files WITH Identity
Files WITHOUT IdentityThe following categories have no identity field in their test data:
Distinct Identity PatternsPattern 1: Activity Log Identity (Azure AD / Entra ID)Used by: Administrative, Policy, Alert, Autoscale, ServiceHealth, Recommendation This is the standard Azure Activity Log identity structure based on Azure AD/Entra ID tokens. {
"authorization": {
"scope": "/subscriptions/.../providers/...",
"action": "microsoft.insights/diagnosticSettings/write",
"evidence": {
"role": "Owner",
"roleAssignmentScope": "/subscriptions/...",
"roleAssignmentId": "...",
"roleDefinitionId": "...",
"principalId": "...",
"principalType": "User"
}
},
"claims": {
"iss": "https://sts.windows.net/.../",
"aud": "https://management.core.windows.net/",
"exp": "1744717084",
"nbf": "1744711621",
"iat": "1744711621",
"http://schemas.microsoft.com/identity/claims/scope": "user_impersonation",
"http://schemas.xmlsoap.org/ws/2005/05/identity/claims/emailaddress": "user@example.com",
"http://schemas.xmlsoap.org/ws/2005/05/identity/claims/nameidentifier": "...",
"appid": "...",
"idtyp": "user"
}
}Key characteristics:
Variants:
Pattern 2: Storage Identity (SAS/Delegation)Used by: Storage logs (StorageRead, StorageWrite, StorageDelete) This structure is specific to Azure Storage and represents SAS token or delegation-based access. {
"type": "DelegationSAS",
"tokenHash": "system-delegation(...),SasSignature(...)",
"authorization": [
{
"action": "Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read",
"roleAssignmentId": "...",
"roleDefinitionId": "...",
"principals": [
{
"id": "...",
"type": "User"
}
],
"denyAssignmentId": "",
"type": "RBAC",
"result": "Denied",
"reason": "NoPolicy"
}
],
"requester": {
"objectId": "...",
"tenantId": "..."
}
}Key characteristics:
Pattern 3: Generic/Unknown (KeyVault-like)Used by: This appears to be a simplified or different identity structure. {
"claim": {
"oid": "607964b6-41a5-4e24-a5db-db7aab3b9b34"
}
}Key characteristics:
|
|
The current PR is based on the
So, for the - key: azure.identity.authorization.action
value:
stringValue: microsoft.insights/diagnosticSettings/write
- key: azure.identity.authorization.evidence.role
value:
stringValue: Owner
- key: azure.identity.authorization.evidence.role.assignment.scope
value:
stringValue: /subscriptions/11111111-1111-1111-1111-111111111111Instead, the storage categories map identity as follows: - key: azure.identity
value:
kvlistValue:
values:
- key: authorization
value:
arrayValue:
values:
- kvlistValue:
values:
- key: principals
value:
arrayValue:
values:
- kvlistValue:
values:
- key: type
value:
stringValue: User
- key: id
value:
stringValue: 7890abcd-ef12-3456-7890-abcd12345678The structural difference is that |
|
@Fiery-Fenix @constanca-m, how should we move forward? Should we continue deviating from the extension by:
In that case, how should we handle the
|
@zmoog How about to make similar trick as we made for general Category parsing? It should bring required flexibility in Azure Identity parsing.
I think by using this approach we can adopt Azure Identity parsing for any possible JSON structure if it will be required in future. Also by using this approach we'll avoid additional JSON parsing for |
It's a sound approach. But before moving to the implementation, I would love to sort out the general problem. We learned that activity and storage logs both have and
Structure is clearly different, but what about the semantics? Are these semantically the same, similar, or fundamentally different? These seems semantically different concepts that happen to share the field name identity: Activity Log identity = Caller Identity Assertion
Storage Log identity = Authorization Decision Audit
The Storage identity is closer to an access decision log than a pure identity claim. Based on differences in semantics, should we separate the concepts into different attribute families? For example (just an example, I'm not suggesting to make these changes): As a user, I would find confusing getting logs events with the same |
|
Or, perhaps simply renaming I can live with that, but schema inconsistency across categories still gives me some pause. Could having the same field with different structures and semantics cause issues for downstream backends? |
|
I propose this semantic conventions:
The other fields don't have conventions that fit them, that I could find. So I think we could use the ones you mentioned. For the storage logs, since authorization is an array, I agree it makes sense to use the ones you mentioned. |
It seems azure.identity is missing from some categories, and there are significant structural and semantic differences between them.
|
@Fiery-Fenix, I think your proposal (moving azure.identity mappings to categories) makes sense. I don't think we have a good sampling of how the categories implement azure.identity, at least not yet. So, we can deal with the current status in categories and maybe re-evaluate this decision as we add more log categories over time. So I'm making these changes and addressing other suggestions from the review comments:
|
Remove trailing blank line in category_identity.go and fix struct field alignment in category_storage.go Co-authored-by: Cursor <cursoragent@cursor.com>
|
@constanca-m, on scope → user.roles: scope seems to refer to OAuth2 permissions, which doesn't always quite fit here semantics-wise. In sample logs I often see How strongly do you feel about pushing this through? |
|
@Fiery-Fenix, could you take another look when you have a moment? |
constanca-m
left a comment
There was a problem hiding this comment.
Sorry I am being such a slow reviewer
Just dropping the empty field denyAssignmentId
constanca-m
left a comment
There was a problem hiding this comment.
I am a bit worried of using camel case for attributes, but if this behavior was already present, we can handle it in a different PR
I see, but it was already part of the storage mapping. |
|
@atoulme, I think we're ready for your final review. Could you take a look when you have a moment? 🙇 |
|
Thank you for your contribution @zmoog! 🎉 We would like to hear from you about your experience contributing to OpenTelemetry by taking a few minutes to fill out this survey. If you are getting started contributing, you can also join the CNCF Slack channel #opentelemetry-new-contributors to ask for guidance and get help. |
…oscale, Policy, Security, ServiceHealth, and ResourceHealth log categories. (open-telemetry#45699) <!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> #### Description This PR brings over the remaining Activity log categories from `translator/azurelogs`, aligning this component with the changes made in open-telemetry#44871. New log categories: - Administrative - Alert - Autoscale - Policy - Security - ServiceHealth - ResourceHealth <!-- Issue number (e.g. open-telemetry#1234) or full URL to issue, if applicable. #### Link to tracking issue Fixes n/a --> <!--Describe what testing was performed and which tests were added.--> #### Testing We added test cases for each category, leveraging the same log events found in translator/azurelogs for consistency. <!--Describe the documentation added.--> #### Documentation <!--Please delete paragraphs that you did not use before submitting.--> Added a section covering Identity (including its major subfields) along with dedicated sections for each new log category. --------- Co-authored-by: Cursor <cursoragent@cursor.com>
…oscale, Policy, Security, ServiceHealth, and ResourceHealth log categories. (open-telemetry#45699) <!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> #### Description This PR brings over the remaining Activity log categories from `translator/azurelogs`, aligning this component with the changes made in open-telemetry#44871. New log categories: - Administrative - Alert - Autoscale - Policy - Security - ServiceHealth - ResourceHealth <!-- Issue number (e.g. open-telemetry#1234) or full URL to issue, if applicable. #### Link to tracking issue Fixes n/a --> <!--Describe what testing was performed and which tests were added.--> #### Testing We added test cases for each category, leveraging the same log events found in translator/azurelogs for consistency. <!--Describe the documentation added.--> #### Documentation <!--Please delete paragraphs that you did not use before submitting.--> Added a section covering Identity (including its major subfields) along with dedicated sections for each new log category. --------- Co-authored-by: Cursor <cursoragent@cursor.com>
Description
This PR brings over the remaining Activity log categories from
translator/azurelogs, aligning this component with the changes made in #44871.New log categories:
Testing
We added test cases for each category, leveraging the same log events found in translator/azurelogs for consistency.
Documentation
Added a section covering Identity (including its major subfields) along with dedicated sections for each new log category.