NeverIncludeDomain filter #74

@lukasurb1

I'm hoping to get some clarification on the expected behavior of the NeverIncludeDomain whitelisting feature, as I'm seeing a discrepancy between my live analysis and the results from a unit test.

My Goal:

I am trying to whitelist several domains to reduce noise in my beacon analysis. My configuration includes entries with leading wildcards, like '*.cymru.com'.

My Configuration:

Here is a snippet of my NeverIncludeDomain list:

  NeverIncludeDomain:
    - '*.cymru.com'
    - '*.philips.com'
    - '*.qnap.com'
    - '*.windowsupdate.com'
    - 'push.services.mozilla.com'
 

Observed Behavior:

Despite having '*.cymru.com' in the configuration, I still see beaconing results for the root domain cymru.com in my dataset after running RITA.

Troubleshooting Step I Took:

To investigate this, I modified the TestFilterDomain unit test inside config/filter_test.go to check the logic directly. I set up the test like this:

    // Wildcard entry under test, plus an unrelated exact-match entry.
    neverIncludedDomainList := []string{
        "*.cymru.com",
        "google.com",
    }

    // NeverInclude list test
    t.Run("NeverInclude list test", func(t *testing.T) {
        cfg.Filtering.NeverIncludedDomains = neverIncludedDomainList
        // The root domain should be filtered by the "*.cymru.com" entry.
        checkCases := cfg.Filtering.FilterDomain("cymru.com")
        require.True(t, checkCases, "filter state should match expected value")
    })

To my surprise, this unit test passed, which suggests that the filtering logic itself matches a root domain against a leading-wildcard entry.
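To make the distinction concrete, here is a minimal standalone sketch of the two ways a leading-wildcard entry could be interpreted. This is not RITA's code: the function names `matchSubdomainsOnly` and `matchRootAndSubdomains` are hypothetical, and I am only assuming that one code path might behave like the first and another like the second.

```go
package main

import (
	"fmt"
	"strings"
)

// matchSubdomainsOnly (hypothetical) treats "*.example.com" as matching only
// names that end in ".example.com", so the root domain itself is not matched.
func matchSubdomainsOnly(pattern, domain string) bool {
	if !strings.HasPrefix(pattern, "*.") {
		return pattern == domain
	}
	suffix := strings.TrimPrefix(pattern, "*") // e.g. ".cymru.com"
	return len(domain) > len(suffix) && strings.HasSuffix(domain, suffix)
}

// matchRootAndSubdomains (hypothetical) treats "*.example.com" as matching
// the root domain as well as any subdomain of it.
func matchRootAndSubdomains(pattern, domain string) bool {
	if !strings.HasPrefix(pattern, "*.") {
		return pattern == domain
	}
	root := strings.TrimPrefix(pattern, "*.") // e.g. "cymru.com"
	return domain == root || strings.HasSuffix(domain, "."+root)
}

func main() {
	pattern := "*.cymru.com"
	for _, d := range []string{"cymru.com", "www.cymru.com", "notcymru.com"} {
		fmt.Printf("%-14s subdomainsOnly=%-5v rootAndSubdomains=%v\n",
			d, matchSubdomainsOnly(pattern, d), matchRootAndSubdomains(pattern, d))
	}
}
```

The two functions agree on every input except the root domain, which is exactly the case where my live results and the unit test disagree.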

My Question:

Given that the unit test shows the filtering logic works as I expect, I'm trying to understand why it's not being applied in my live analysis. My main questions are:

Could a single syntax error elsewhere in the list (for example, accidentally including a full URL instead of just a domain name) cause the entire NeverIncludeDomain list to be silently ignored during the import process?

Is there any potential difference between how the FilterDomain() function works in isolation and how the whitelist is applied during the full data import and beacon analysis pipeline?
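On the first question above, this is the kind of pre-check I could run over my own list to rule out a malformed entry before import. It is a rough sketch under my own assumptions about what a valid entry looks like (a bare domain, optionally with a leading `*.`), not a description of how RITA validates its configuration. The helper name `looksLikeBareDomain` is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// looksLikeBareDomain (hypothetical) reports whether entry resembles a plain
// domain name, optionally with a leading "*." wildcard, rather than a URL.
// It is deliberately strict: schemes, paths, ports, embedded wildcards, and
// empty labels (including a trailing dot) are all rejected.
func looksLikeBareDomain(entry string) bool {
	name := strings.TrimPrefix(entry, "*.")
	if name == "" || strings.ContainsAny(name, "/:?#@* ") {
		return false
	}
	for _, label := range strings.Split(name, ".") {
		if label == "" {
			return false
		}
	}
	return true
}

func main() {
	entries := []string{
		"*.cymru.com",
		"push.services.mozilla.com",
		"https://www.qnap.com/en", // a full URL pasted by mistake
	}
	for _, e := range entries {
		if !looksLikeBareDomain(e) {
			fmt.Printf("suspicious entry: %q\n", e)
		}
	}
}
```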

Thank you for your time and for creating such a valuable tool. Any insight you could provide would be greatly appreciated!
