@kaiyan-sheng
Contributor

@kaiyan-sheng kaiyan-sheng commented Oct 22, 2025

Proposed commit message

This PR adds RBAC authorization (OAuth2) to the Azure Event Hub input plugin for Elastic Agent to meet security requirements. Previously, the input supported only shared access key authentication (via a connection string).

The implementation adds a new config parameter called auth_type that lets users specify the authentication method:
When auth_type is set to connection_string or left blank, connection_string is required. When auth_type is set to client_secret, OAuth2 is used.

Note: Users are expected to use the same auth type for both the event hub and the storage account.

OAuth2 specific Configuration Parameters (auth_type=client_secret)

When using OAuth2 authentication, the following parameters apply (all are required unless marked optional):

  • eventhub_namespace: Fully qualified namespace (e.g., namespace.servicebus.windows.net)
  • tenant_id: Azure AD tenant ID
  • client_id: Azure AD application (client) ID
  • client_secret: Azure AD application client secret
  • authority_host: Azure AD authority host (optional; defaults to the Azure Public Cloud host, https://login.microsoftonline.com)
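The validation rules described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the inputConfig struct, its field names, and validateAuth are hypothetical stand-ins, and only the connection_string error text is taken from the Filebeat output shown later in this description.

```go
package main

import (
	"errors"
	"fmt"
)

// Auth type values as named in the PR description.
const (
	authTypeConnectionString = "connection_string"
	authTypeClientSecret     = "client_secret"
)

// inputConfig is a hypothetical, trimmed-down stand-in for the input's config struct.
type inputConfig struct {
	AuthType          string
	ConnectionString  string
	EventHubNamespace string
	TenantID          string
	ClientID          string
	ClientSecret      string
}

// validateAuth checks that the fields required by the selected auth type are set.
// An empty auth_type is treated as connection_string for backwards compatibility.
func validateAuth(c inputConfig) error {
	switch c.AuthType {
	case "", authTypeConnectionString:
		if c.ConnectionString == "" {
			return errors.New("connection_string is required when auth_type is empty or set to connection_string")
		}
	case authTypeClientSecret:
		required := []struct{ name, value string }{
			{"eventhub_namespace", c.EventHubNamespace},
			{"tenant_id", c.TenantID},
			{"client_id", c.ClientID},
			{"client_secret", c.ClientSecret},
		}
		for _, f := range required {
			if f.value == "" {
				// Error wording here is an assumption, not taken from the PR.
				return fmt.Errorf("%s is required when auth_type is client_secret", f.name)
			}
		}
	default:
		return fmt.Errorf("unknown auth_type %q", c.AuthType)
	}
	return nil
}

func main() {
	// OAuth2 fields are set but auth_type is not, so the connection_string rule applies.
	err := validateAuth(inputConfig{TenantID: "t", ClientID: "c", ClientSecret: "s"})
	fmt.Println(err)
}
```

Treating the empty value and connection_string as the same case keeps existing configurations working without changes.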

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in ./changelog/fragments using the changelog tool.

Disruptive User Impact

N/A

How to test this PR locally

Setup on the Azure side
  1. Set up environment variables for creating the event hub
export RESOURCE_GROUP="kaiyan-resource-group"
export LOCATION="eastus"
export EVENTHUB_NAMESPACE="kaiyan-filebeat-test-ns"
export EVENTHUB_NAME="kaiyan-test-logs"
export STORAGE_ACCOUNT="kaiyanfbstorage"
export STORAGE_CONTAINER="kaiyan-fb-container"
export APP_NAME="filebeat-eventhub-app"
  1. Create resource group, eventhub namespace, eventhub
az group create --name $RESOURCE_GROUP --location $LOCATION
az eventhubs namespace create \
    --resource-group $RESOURCE_GROUP \
    --name $EVENTHUB_NAMESPACE \
    --location $LOCATION \
    --sku Standard
az eventhubs eventhub create \
    --resource-group $RESOURCE_GROUP \
    --namespace-name $EVENTHUB_NAMESPACE \
    --name $EVENTHUB_NAME \
    --partition-count 4
  1. Create storage account, storage container
az storage account create \
    --resource-group $RESOURCE_GROUP \
    --name $STORAGE_ACCOUNT \
    --location $LOCATION \
    --sku Standard_LRS
az storage container create --name $STORAGE_CONTAINER --account-name $STORAGE_ACCOUNT
  1. Create Azure AD application, service principal
APP_ID=$(az ad app create \
    --display-name $APP_NAME \
    --query appId --output tsv)
az ad sp create --id $APP_ID
  1. Assign eventhub role
EVENTHUB_RESOURCE_ID=$(az eventhubs namespace show \
    --resource-group $RESOURCE_GROUP \
    --name $EVENTHUB_NAMESPACE \
    --query id --output tsv)
az role assignment create \
    --assignee $APP_ID \
    --role "Azure Event Hubs Data Receiver" \
    --scope $EVENTHUB_RESOURCE_ID
  1. Get storage account connection string and client secret
STORAGE_CONNECTION_STRING=$(az storage account show-connection-string \
    --resource-group $RESOURCE_GROUP \
    --name $STORAGE_ACCOUNT \
    --query connectionString --output tsv)
CLIENT_SECRET=$(az ad app credential reset \
    --id $APP_ID \
    --years 1 \
    --query password --output tsv)

OR
Instead of getting the storage account connection string, assign a storage account role:

STORAGE_RESOURCE_ID=$(az storage account show \
    --resource-group $RESOURCE_GROUP \
    --name $STORAGE_ACCOUNT \
    --query id --output tsv)

az role assignment create \
    --assignee $APP_ID \
    --role "Storage Blob Data Contributor" \
    --scope $STORAGE_RESOURCE_ID
  1. Create an Elastic Cloud deployment and get the credentials for testing Filebeat
cloud.id: test-filebeat:foo
cloud.auth: elastic:bar
  1. Build and run Filebeat locally
mage update; mage build; ./filebeat -e
  1. Get tenant ID:
az account show --query tenantId --output tsv

When no connection_string is specified and no auth_type is specified:

filebeat.inputs:
  - type: azure-eventhub
    eventhub: "kaiyan-test-logs"
    consumer_group: "$Default"
    eventhub_namespace: "kaiyan-filebeat-test-ns"
    tenant_id: "<redacted>"
    client_id: "<redacted>"
    client_secret: "<redacted>"
    authority_host: "https://login.microsoftonline.com"
    storage_account: "kaiyanfbstorage"
    storage_account_connection_string: "<redacted>"
    storage_account_container: "kaiyan-fb-container"
    processor_version: "v2"

Filebeat fails at startup with this error:

Exiting: Failed to start crawler: starting input failed: error while initializing input: reading azure-eventhub input config: connection_string is required when auth_type is empty or set to connection_string accessing 'filebeat.inputs.0' (source:'filebeat.yml')

Testing backwards compatibility:

TBD

Testing with OAuth2 for both the event hub and the storage account:

filebeat.inputs:
  - type: azure-eventhub
    eventhub: "kaiyan-test-logs"
    consumer_group: "$Default"
    eventhub_namespace: "kaiyan-filebeat-test-ns.servicebus.windows.net"
    tenant_id: "<your-tenant-id>" 
    client_id: "<your-app-id>"
    client_secret: "<your-secret>"
    authority_host: "https://login.microsoftonline.com"
    storage_account: "kaiyanfbstorage"
    storage_account_container: "kaiyan-fb-container"
    processor_version: "v2"
    auth_type: "client_secret"
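With auth_type set to client_secret there is no storage account connection string, so the container endpoint has to be derived from storage_account and storage_account_container. A minimal sketch of that derivation, assuming the Azure Public Cloud blob endpoint suffix (blob.core.windows.net); the containerURL helper is hypothetical and other clouds use different suffixes:

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// containerURL builds the blob container endpoint for a storage account.
// Assumption: the Azure Public Cloud suffix "blob.core.windows.net".
func containerURL(account, container string) (string, error) {
	if account == "" || container == "" {
		return "", errors.New("storage account and container must both be set")
	}
	u := url.URL{
		Scheme: "https",
		Host:   account + ".blob.core.windows.net",
		Path:   "/" + container,
	}
	return u.String(), nil
}

func main() {
	// Values taken from the test configuration above.
	u, err := containerURL("kaiyanfbstorage", "kaiyan-fb-container")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(u)
}
```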

Screenshots

Logs are ingested from Event Hub into Elasticsearch with Filebeat:
Screenshot 2025-10-21 at 9 44 26 PM

Logs

I see this in the filebeat log when testing:

{"log.level":"info","@timestamp":"2025-10-28T17:28:38.082-0600","log.logger":"input.azure-eventhub.oauth2","log.origin":{"function":"github.com/elastic/beats/v7/x-pack/filebeat/input/azureeventhub.createContainerClientWithOAuth2","file.name":"azureeventhub/v2_input.go","file.line":771},"message":"successfully created container client with OAuth2 authentication","service.name":"filebeat","storage_account":"kaiyanfbstorage","container":"kaiyan-fb-container","tenant_id":"aa40685b-417d-4664-b4ec-8f7640719adb","client_id":"b7a30122-496d-4d84-9200-4f24066b6045","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2025-10-28T17:28:39.858-0600","log.logger":"input.azure-eventhub.oauth2","log.origin":{"function":"github.com/elastic/beats/v7/x-pack/filebeat/input/azureeventhub.createConsumerClientWithOAuth2","file.name":"azureeventhub/v2_input.go","file.line":727},"message":"successfully created consumer client with OAuth2 authentication","service.name":"filebeat","namespace":"kaiyan-filebeat-test-ns.servicebus.windows.net","eventhub":"kaiyan-test-logs","tenant_id":"aa40685b-417d-4664-b4ec-8f7640719adb","client_id":"b7a30122-496d-4d84-9200-4f24066b6045","ecs.version":"1.6.0"}

@botelastic botelastic bot added the needs_team label (Indicates that the issue/PR needs a Team:* label) on Oct 22, 2025
@kaiyan-sheng kaiyan-sheng added the backport-skip label (Skip notification from the automated backport with mergify) on Oct 28, 2025
Contributor

@colleenmcginnis colleenmcginnis left a comment

Starting with v9.0, there is no longer a new documentation set published with every minor release: the same page stays valid over time and shows version-related evolutions. Read more in Write cumulative documentation.

Based on the backport labels, it looks like these changes are likely targeting 9.3.0. If that's the case, you can use my suggestions below. If not, feel free to adjust my suggestions as needed.

@kaiyan-sheng kaiyan-sheng added the Team:obs-ds-hosted-services label (Label for the Observability Hosted Services team) on Oct 30, 2025
@elasticmachine
Collaborator

Pinging @elastic/obs-ds-hosted-services (Team:obs-ds-hosted-services)

@botelastic botelastic bot removed the needs_team label (Indicates that the issue/PR needs a Team:* label) on Oct 30, 2025
@kaiyan-sheng
Contributor Author

@colleenmcginnis Thank you so much for the comments! Yes, this change will likely go in for 9.3.0. Definitely no backports since it is a feature.

Contributor

@zmoog zmoog left a comment

I've reviewed the changes in the oauth2_azure branch. This is a solid first PR for adding OAuth2 authentication to the Azure Event Hub input.

Here are my observations.

I would move the authentication-related code into dedicated files, with helper function(s) that take the config and return an azcore.TokenCredential, or a client, since the current legacy authentication methods are not included in azidentity.

// Just a high-level example
func createAzureCredential(config azureInputConfig, log *logp.Logger) (azcore.TokenCredential, error) {
	credentialOptions := &azidentity.ClientSecretCredentialOptions{
		ClientOptions: azcore.ClientOptions{
			Cloud: getAzureCloud(config.AuthorityHost),
		},
	}

	return azidentity.NewClientSecretCredential(
		config.TenantID,
		config.ClientID,
		config.ClientSecret,
		credentialOptions,
	)
}
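The example above relies on a getAzureCloud helper that is not shown. One possible shape for that lookup is sketched below, assuming the three well-known authority hosts for the public, US Government, and China clouds. The cloudForAuthorityHost name is hypothetical, and it returns a plain label so the sketch compiles without the Azure SDK; a real helper would return an SDK cloud configuration instead.

```go
package main

import (
	"fmt"
	"strings"
)

// cloudForAuthorityHost maps an authority_host value to a cloud label.
// Hypothetical helper: the labels stand in for SDK cloud configurations.
func cloudForAuthorityHost(host string) (string, error) {
	// Normalize case and any trailing slash before comparing.
	switch strings.TrimRight(strings.ToLower(host), "/") {
	case "", "https://login.microsoftonline.com":
		// An empty value falls back to the documented default (Azure Public Cloud).
		return "AzurePublic", nil
	case "https://login.microsoftonline.us":
		return "AzureGovernment", nil
	case "https://login.chinacloudapi.cn":
		return "AzureChina", nil
	default:
		return "", fmt.Errorf("unsupported authority_host %q", host)
	}
}

func main() {
	cloud, err := cloudForAuthorityHost("https://login.microsoftonline.com")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cloud)
}
```

Returning an error for unknown hosts, rather than silently using the public cloud, would surface typos in authority_host at startup.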

I'm usually against premature optimizations, but in this case we have a list of authentication methods we're going to implement very soon.

The azidentity package supports many credential types (https://pkg.go.dev/github.com/Azure/azure-sdk-for-go/sdk/azidentity#readme-credential-types), but I guess our list will probably be:

  • Client secret (oauth2) — this PR!
  • Workload identity
  • Managed Identity (?)
  • Client certificate (?)

I would adopt the azidentity naming convention and use "client secret" instead of "OAuth2".

Since we'll soon have 3+ authentication types, we should probably consider adding an "auth type" configuration setting for explicit selection. Inferring the authentication type from the option values can become tricky pretty quickly—also considering that we'll have event hub and storage account.

Having an auth_type option to check can probably simplify config validation and setup logic.

To wrap things up I suggest:

  • Move the authentication-related code into dedicated files, with a single entry point that takes the config and returns a client.
  • Add auth_type (for example, client_credentials, workload_identity, etc.)
  • Adopt the azidentity naming convention for credential types.
  • We probably need to list the permissions needed to read from the Event Hubs and write to the Storage Account. Ideally, also add at least a link to Microsoft docs that explain how to set up the service principal / app registration, so we don't have to maintain it.
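To illustrate the single entry point keyed on auth_type, here is a rough sketch using a registry of builder functions. Everything in it is hypothetical: the credential interface stands in for azcore.TokenCredential so the example compiles without the Azure SDK, and the authConfig fields and builder names are invented for illustration.

```go
package main

import "fmt"

// credential is a hypothetical stand-in for azcore.TokenCredential.
type credential interface {
	Kind() string
}

// clientSecretCredential is a placeholder for the real client secret credential.
type clientSecretCredential struct{ tenantID, clientID string }

func (clientSecretCredential) Kind() string { return "client_secret" }

// authConfig holds only the fields this sketch needs.
type authConfig struct {
	AuthType, TenantID, ClientID, ClientSecret string
}

// builders maps each auth_type value to the function that creates its credential.
// Supporting a new type (for example workload identity) means adding one entry.
var builders = map[string]func(authConfig) (credential, error){
	"client_secret": func(c authConfig) (credential, error) {
		return clientSecretCredential{tenantID: c.TenantID, clientID: c.ClientID}, nil
	},
}

// newCredential is the single entry point: it selects a builder by auth_type.
func newCredential(c authConfig) (credential, error) {
	build, ok := builders[c.AuthType]
	if !ok {
		return nil, fmt.Errorf("auth_type %q has no credential builder", c.AuthType)
	}
	return build(c)
}

func main() {
	cred, err := newCredential(authConfig{AuthType: "client_secret", TenantID: "t", ClientID: "c"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cred.Kind())
}
```

With explicit selection, the setup code never has to infer the authentication method from which options happen to be filled in.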

@kaiyan-sheng
Contributor Author

Since we'll soon have 3+ authentication types, we should probably consider adding an "auth type" configuration setting for explicit selection. Inferring the authentication type from the option values can become tricky pretty quickly—also considering that we'll have event hub and storage account.

@zmoog I totally forgot we talked about this in our meeting. Yes I will add the auth_type here. I think for backwards compatibility, I will just have connection_string as the default auth_type if auth_type is not specified.

@kaiyan-sheng
Contributor Author

@zmoog While making the changes, I realized we also have the storage account auth type to consider. Should we allow customers to have different auth types for the storage account and the event hub? Is there any use case where a customer would prefer that?

@kaiyan-sheng
Contributor Author

@zmoog and I talked offline and decided to only have one auth_type for both eventhub and storage account.

@kaiyan-sheng kaiyan-sheng changed the title from "[Azure] Add Oauth2 support for eventhub filebeat input" to "[Azure] Add client secret (Oauth2) support for eventhub filebeat input" on Nov 3, 2025
@kaiyan-sheng kaiyan-sheng marked this pull request as draft November 4, 2025 01:14
