Azure Graph API SignIn Log logstash configuration#342

Open
cirosec wants to merge 6 commits into philhagen:develop from cirosec:AzureADSignInLogs

Conversation

@cirosec

@cirosec cirosec commented Dec 13, 2024

This pull request contains the configuration required to parse Azure Graph API SignIn Logs.
Together with this pull request, another pull request will be opened in the Microsoft-Extractor-Suite repo, adding an output option so that the output of that tool can be seamlessly imported into SOF-ELK.

philhagen and others added 6 commits October 30, 2023 09:15
This is a minor change: it only adjusts the sort order of Python module imports.
The objective is to make the code more readable and standardized.
# Conflicts:
#	configfiles/1010-preprocess-snare.conf
#	configfiles/1801-preprocess-azure.conf
#	configfiles/6010-snare.conf
#	configfiles/6901-aws.conf
@philhagen
Owner

Thank you! I've changed the base branch to the one that will be used for the pending updated release.

I'm wondering if it would be better to add this handling to the existing signinlogs section (

### Azure SignIn Logs, in JSON format
) rather than to its own. My thought would be to add the if [raw][category] == "GraphSignInLogs" or [raw][authenticationMethodsUsed] logic to line 60, and add the rename{} block and [raw][authenticationProcessingDetails] restructuring to the conditional block at line 60.
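A hypothetical sketch of that merged conditional, for illustration only (the existing category list is taken from this thread; the actual rename{} contents and restructuring live in the PR's configuration file and are not reproduced here):

```conf
# Sketch: extend the existing signinlogs conditional rather than adding a new one.
if [raw][category] == "SignInLogs" or [raw][category] == "ManagedIdentitySignInLogs" or [raw][category] == "NonInteractiveUserSignInLogs" or [raw][category] == "ServicePrincipalSignInLogs" or [raw][category] == "GraphSignInLogs" or [raw][authenticationMethodsUsed] {
  # ... existing signinlogs handling ...
  # plus the PR's rename{} block and the
  # [raw][authenticationProcessingDetails] restructuring
}
```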

Also, do you have any samples of these logs that I could test with? Emailing them would be fine if they cannot be shared publicly.

@philhagen
Owner

I spoke with the FOR509 team and they've been unable to find a record with category=GraphSignInLogs. However, this may be the result of various license categories with differing fields/structures, or just that Microsoft could have changed the schema without documenting it (sadly, this would not be new or a surprise).

Do you have information on the native Azure workflow to get logs with this category value? My strong preference is to handle logs in their most native form, rather than restructuring done by third party tools. However, in some cases that's inevitable - I just want to be sure it's an informed decision.

@cirosec-ffr

cirosec-ffr commented Dec 14, 2024

The data for this logstash snippet is obtained using the Graph API endpoint https://graph.microsoft.com/beta/auditLogs/signIns (as currently implemented in the Extractor Suite). An example of the Graph API output can be found here: https://learn.microsoft.com/en-us/graph/api/signin-get?view=graph-rest-1.0&tabs=http
The problem with this endpoint is that it does not return a category field, so I added the GraphSignInLogs value, which, in retrospect, seems a bit arbitrary; agreed! Nothing else implements this value, so from my point of view we can remove it.

While testing, I saw that there was already a SignInLog parser:

if [raw][category] == "SignInLogs" or [raw][category] == "ManagedIdentitySignInLogs" or [raw][category] == "NonInteractiveUserSignInLogs" or [raw][category] == "ServicePrincipalSignInLogs" {

But I was unable to make sense of the category values or the key mapping (the Microsoft documentation does not contain the property keys that many of the mapped values use, and the capitalization of keys differs), and many values present in the original logs were not parsed at all. I suspect that the parser for the sign-in logs is outdated, that Microsoft changed things in the last 2-3 years, or that there is another Microsoft endpoint that returns SignInLog data in a format not matching any of the documented tables.

After researching a bit, it looks like the https://graph.microsoft.com/beta/auditLogs/signIns endpoint returns a subset of the aadnoninteractiveusersigninlogs table, which also explains why I was unable to parse it using the existing implementation.

It would be really nice if we could merge the existing implementation and my implementation for the Graph API endpoint.

Do you know where the data the FOR509 team ingests is sourced from, and whether the current SignInLog logstash implementation is still used?

@philhagen
Owner

philhagen commented Dec 16, 2024

Thanks for the detail - this is helpful in figuring out how to proceed. I do not want to start by using data from the MES tool, but rather with the native data format from Microsoft's own export/API/etc. As you've seen, the capitalization is one of the headaches that demonstrate the reason for this: MS exports in lowerFirstCamel case, while the MES tool previously used UpperFirstCamelCase - at least for signIn resource type objects. (I believe this was or will be addressed but do not have sample data to confirm this yet.)

From the resources you've provided, these entries seem to be an entirely different object type. (Using UpperFirstCamelCase, because Microsoft 😠.) If that's the case, I suspect a separate parsing stanza would be appropriate, probably under whatever the most appropriate common, top-level parent category for these logs is considered to be. We'd need to identify a unique characteristic of these logs that would be used to detect this specific category and trigger the appropriate parsing stanza.

Could you provide a sample collected from the source (directly, not from the MES tool) so I can take a closer look? The documentation links are mostly helpful, but especially in Microsoft's case, the real-world samples are commonly mismatched with the documentation so I always try to work from a sample first.

I'll also defer to @Pierre450 (and @aNerdFromDuval) for their comment on the native sourcing of Azure log data.

@philhagen philhagen self-assigned this Dec 16, 2024
@Pierre450

Pierre450 commented Dec 16, 2024

@0xffr In the FOR509 class, I store the logs in a storage account blob and retrieve them from there. That method works with every type of Azure log (except the UAL) and is reliable.
Extracting the logs with Graph API means some level of coding which opens the possibility of changing the output even if unintentionally.
It would be interesting to compare the two outputs and see if Microsoft is being consistent (or not) in the fields.

@philhagen philhagen deleted the branch philhagen:develop January 2, 2025 02:04
@philhagen philhagen closed this Jan 2, 2025
@philhagen
Owner

philhagen commented Jan 2, 2025

SHOOT! I didn't mean to close this - it was automatic on deletion of the branch.

Could you re-file against the develop branch? Then, pending @Pierre450's test results, we can get this one integrated.

@philhagen philhagen reopened this Jan 3, 2025
@philhagen philhagen changed the base branch from feature/ubuntu-ansible to develop January 3, 2025 20:10
@philhagen
Owner

I realized that by re-creating a branch with the same name, I could re-open this PR and then change its base. Sorry about that again!

@cirosec-ffr

I sent you an e-mail with sample data from the last few days. If you need anything else, please let me know.

@cirosec-ffr

@philhagen Is there anything I can do to help you get the changes merged?

We'd need to identify a unique characteristic of these logs that would be used to detect this specific category and trigger the appropriate parsing stanza.

I absolutely agree, but based on the documentation of the data structure, I don't really see a way to generically detect whether EntraID Graph SignIn logs are being imported.

The only sign-in-logs-specific part of my configuration is this block:

if [raw][authenticationProcessingDetails] {
  ruby {
    path => "/usr/local/sof-elk/supporting-scripts/split_kv_to_fields.rb"
    script_params => {
      "source_field" => "[raw][authenticationProcessingDetails]"
      "destination_field" => "[raw][authenticationProcessingDetails]"
      "key_field" => "key"
      "val_field" => "value"
    }
  }
}
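For illustration, this restructuring flattens the Graph API's list of key/value pairs into a single object. The input shape matches the documented signIn resource; the output shape and the example keys shown here are my assumption of what split_kv_to_fields.rb produces, not taken from the script itself:

```conf
# Before (as returned by the Graph API; abbreviated, illustrative keys):
#   "authenticationProcessingDetails": [
#     { "key": "Login Hint Present", "value": "True" },
#     { "key": "Root Key Type",      "value": "Unknown" }
#   ]
#
# After the ruby filter (assumed output shape):
#   "authenticationProcessingDetails": {
#     "Login Hint Present": "True",
#     "Root Key Type": "Unknown"
#   }
```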

Since this small non-generic block is the only one of its kind in the configuration, I suggest we treat the configuration as a generic way of importing EntraID Audit Logs; they all seem to have data structures similar to the SignIn Logs.

Regarding the lowerFirstCamel notation: in the latest exports from MES this no longer seems to be true, at least for the SignIn object. In my case, the exported data is in the lowerFirstCamel format.

@Pierre450 Storing relevant logs in an Azure Storage Blob is definitely something I recommend everyone do! In practice, though, only a small fraction of our customers implement it, so we usually have to download all the logs from the various API endpoints before we can start our analysis. Relying on logs stored in Azure Blob storage is therefore not an option for us.
