Zeek Matchy Plugin

A Zeek plugin for high-performance IP address and string pattern matching using Matchy databases.

Why Matchy?

Matchy brings several advantages over traditional threat intelligence approaches in Zeek:

Memory Efficiency on Clusters

Shared memory across workers: Databases are memory-mapped, so all Zeek workers on a host share the same physical memory
Zero heap memory per-process: Unlike the Intel Framework, which loads data into each worker's heap, Matchy relies on the OS page cache
Massive scale: On a host running 32 workers, this can save gigabytes of RAM compared to per-worker copies
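
As a rough back-of-the-envelope check of the memory claim, the sketch below compares per-worker heap copies with a single shared mapping. The 500 MB database size and the 32-worker count are hypothetical figures chosen for illustration, and the result is an upper bound that assumes the whole database is resident in memory.

```python
def resident_memory_mb(db_size_mb: float, workers: int, shared: bool) -> float:
    """Upper bound on RAM holding the database across all workers on one host."""
    # A shared read-only mapping keeps one copy of the pages in the page cache;
    # per-process heap copies multiply the footprint by the number of workers.
    return db_size_mb if shared else db_size_mb * workers


# Hypothetical figures: 32 workers on one host, 500 MB database.
per_worker = resident_memory_mb(500, 32, shared=False)
mapped = resident_memory_mb(500, 32, shared=True)

print(f"per-worker copies: {per_worker:.0f} MB")
print(f"memory-mapped:     {mapped:.0f} MB")
print(f"saved:             {per_worker - mapped:.0f} MB")
```

Actual savings depend on how much of the database is touched, since only pages that are read need to be resident.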

Operational Flexibility

Hot-reloadable: Databases open in <1ms, so you can close and reopen them at runtime during updates—no Zeek restart needed
No libmaxminddb dependency: Load and query MaxMind GeoIP databases directly—one less C library to manage
Build databases offline: Use the matchy CLI in CI/CD pipelines to build databases from any source (CSV, JSON, APIs)
Simple distribution: Just copy .mxy files to your cluster—no Broker setup or Intel Framework synchronization

Performance

7M+ IP queries/second: Memory-mapped lookups with zero-copy access
3M+ pattern queries/second: Efficient glob matching (*.evil.com)
Deterministic performance: No GC pauses or unpredictable slowdowns (Rust + mmap)
Single unified API: Query IPs, CIDRs, exact strings, and wildcards through one interface

Developer Experience

Easy debugging: Query .mxy files directly with the matchy CLI—no need to inspect Zeek's internal state
Type-safe with metadata: Queries return structured JSON with arbitrary fields, not just boolean matches
Version control friendly: Keep source CSVs in git, build binary databases in CI
Cross-platform: Same .mxy file works on Linux, macOS, and BSD

Matchy excels at read-heavy workloads with infrequent updates (typical threat intel scenarios). For dynamic, frequently-changing data with complex sharing across clusters, Zeek's Intel Framework is still the better choice.

Installation

Requirements

Zeek 5.0+
Rust/Cargo (install from rustup.rs)
Git
CMake 3.15+
C++17 compiler

Build

git clone https://github.com/sethhall/zeek-matchy-plugin.git
cd zeek-matchy-plugin
mkdir build && cd build
cmake ..
make

CMake automatically:

Finds Zeek via zeek-config (if in PATH)
Installs cargo-c (if needed)
Clones and builds Matchy from GitHub
Links everything together

Install (optional)

sudo make install

Verify Installation

Check that Zeek can see the plugin:

# If using ZEEK_PLUGIN_PATH
export ZEEK_PLUGIN_PATH=/path/to/zeek-matchy-plugin/build
zeek -N Matchy::DB

# If installed system-wide
zeek -N Matchy::DB

Expected output:

Matchy::DB - Fast IP and pattern matching using Matchy databases (dynamic, version 0.1.0)

Functions are automatically available in the Matchy:: namespace:

Matchy::load_database(file) - Returns database handle
Matchy::is_valid(db) - Check if handle is valid
Matchy::query_ip(db, ip) - Query by IP address
Matchy::query_string(db, string) - Query by string/pattern

Usage

Creating a Matchy Database

First, install the Matchy CLI tool:

cargo install matchy

Then create a database:

# Create a CSV file with threat indicators
cat > threats.csv << EOF
entry,threat_level,category,description
1.2.3.4,high,malware,Known C2 server
10.0.0.0/8,low,internal,RFC1918 private network
*.evil.com,critical,phishing,Phishing domain pattern
malware.example.com,high,malware,Malware distribution site
EOF

# Build the database
matchy build threats.csv -o threats.mxy --format csv
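
If the indicators come from a feed rather than a hand-written file, the CSV can be generated in a CI job before running matchy build. The following is a minimal sketch, assuming the feed has already been parsed into dictionaries that use the same four column names as the example above. The classify helper is hypothetical and only produces a sanity report; how matchy itself interprets each entry is up to the tool.

```python
import csv
import io
import ipaddress
from collections import Counter

# Column names taken from the threats.csv example.
FIELDS = ["entry", "threat_level", "category", "description"]


def classify(entry: str) -> str:
    """Return 'ip', 'cidr', 'glob', or 'string' for one indicator (report only)."""
    try:
        ipaddress.ip_address(entry)
        return "ip"
    except ValueError:
        pass
    if "/" in entry:
        try:
            ipaddress.ip_network(entry, strict=False)
            return "cidr"
        except ValueError:
            pass
    if any(ch in entry for ch in "*?["):
        return "glob"
    return "string"


def write_threats_csv(indicators, out) -> Counter:
    """Write de-duplicated indicators as CSV and count them by kind."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    kinds = Counter()
    seen = set()
    for row in indicators:
        entry = row.get("entry", "").strip()
        if not entry or entry in seen:
            continue  # skip blanks and repeated indicators
        seen.add(entry)
        record = {field: row.get(field, "") for field in FIELDS}
        record["entry"] = entry
        writer.writerow(record)
        kinds[classify(entry)] += 1
    return kinds


feed = [
    {"entry": "1.2.3.4", "threat_level": "high", "category": "malware",
     "description": "Known C2 server"},
    {"entry": "*.evil.com", "threat_level": "critical", "category": "phishing",
     "description": "Phishing domain pattern"},
]
buf = io.StringIO()
print(dict(write_threats_csv(feed, buf)))
print(buf.getvalue(), end="")
```

In a pipeline, the same function can write to an open file named threats.csv instead of the in-memory buffer, after which the matchy build command applies unchanged.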

MatchyIntel Framework (Intel Framework Replacement)

The plugin includes MatchyIntel, a drop-in replacement for Zeek's Intel Framework that uses Matchy for high-performance matching. It automatically observes DNS queries, connection IPs, HTTP URLs, SSL/TLS SNI, and more.

Quick Start

@load Matchy/DB/intel

# Point to your threat intelligence database
redef MatchyIntel::db_path = "/opt/threat-intel/threats.mxy";

# React to matches
event MatchyIntel::match(s: MatchyIntel::Seen, metadata: string) {
    print fmt("THREAT: %s (%s) -> %s", s$indicator, s$where, metadata);
}

That's it! The framework will automatically check all DNS queries, connection IPs, HTTP hosts/URLs, and SSL SNI against your database.

Runtime Database Switching

You can change the database at runtime without restarting Zeek:

# Switch to a different database
Config::set_value("MatchyIntel::db_path", "/opt/threat-intel/updated.mxy");

# Unload the database (stop matching)
Config::set_value("MatchyIntel::db_path", "");

If the new path is invalid, the change is rejected and the current database stays loaded.
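
One way to drive this from outside Zeek is Zeek's configuration framework, which re-reads registered config files when they change. The sketch below rests on assumptions rather than documented plugin behavior: it assumes MatchyIntel::db_path is declared as an option, that a config file dedicated to this one option has been registered through Config::config_files, and that the file names shown are free to choose. It publishes the new database first and updates the config line second, so the path is valid by the time Zeek sees it.

```python
import os
import shutil

OPTION = "MatchyIntel::db_path"  # option name used in the examples above


def publish(built_db: str, live_dir: str, version: str, config_file: str) -> str:
    """Install a freshly built database and point the config file at it."""
    target = os.path.join(live_dir, f"threats-{version}.mxy")

    # Copy next to the final location, then rename: os.replace is atomic on
    # the same filesystem, so readers never see a half-written database.
    tmp_db = target + ".tmp"
    shutil.copyfile(built_db, tmp_db)
    os.replace(tmp_db, target)

    # Rewrite the config file the same way (option name, tab, unquoted value).
    tmp_cfg = config_file + ".tmp"
    with open(tmp_cfg, "w", encoding="utf-8") as fh:
        fh.write(f"{OPTION}\t{target}\n")
    os.replace(tmp_cfg, config_file)
    return target
```

Using a versioned file name per release keeps the previous database on disk, which makes rolling back a matter of rewriting the config line.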

Manual Observation

You can also manually check indicators:

# Check an IP
MatchyIntel::seen(MatchyIntel::Seen($host=1.2.3.4,
                                    $where=MatchyIntel::IN_ANYWHERE));

# Check a domain
MatchyIntel::seen(MatchyIntel::Seen($indicator="evil.example.com",
                                    $indicator_type=MatchyIntel::DOMAIN,
                                    $where=MatchyIntel::IN_ANYWHERE));

Hooks and Customization

# Filter matches before they fire
hook MatchyIntel::seen_policy(s: MatchyIntel::Seen, found: bool) {
    # Suppress matches for internal IPs
    if (s?$host && Site::is_local_addr(s$host))
        break;
}

# Customize logging
hook MatchyIntel::extend_match(info: MatchyIntel::Info, s: MatchyIntel::Seen, metadata: string) {
    # Add custom fields, modify info record, etc.
}

Log Output

Matches are logged to matchy_intel.log with fields including:

ts, uid, id - Connection context
seen.indicator, seen.indicator_type, seen.where - What was seen
metadata - JSON blob from your database
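
For downstream processing, the metadata field has to be decoded a second time because it is a JSON string nested inside the log record. The sketch below assumes Zeek is writing JSON-formatted logs (one object per line) and that nested fields appear under the dotted names listed above. The sample line is hypothetical and built in the code purely for illustration.

```python
import json


def parse_match(line: str) -> dict:
    """Decode one JSON log line and the JSON metadata nested inside it."""
    record = json.loads(line)
    raw = record.get("metadata", "")
    try:
        metadata = json.loads(raw) if raw else {}
    except json.JSONDecodeError:
        metadata = {"raw": raw}  # keep undecodable metadata for inspection
    return {
        "ts": record.get("ts"),
        "uid": record.get("uid"),
        "indicator": record.get("seen.indicator"),
        "where": record.get("seen.where"),
        "metadata": metadata,
    }


# Hypothetical log line; real values depend on your traffic and database.
sample = json.dumps({
    "ts": 1700000000.0,
    "uid": "CHhAvVGS1DHFjwGM9",
    "seen.indicator": "sub.evil.com",
    "seen.indicator_type": "MatchyIntel::DOMAIN",
    "seen.where": "MatchyIntel::IN_ANYWHERE",
    "metadata": json.dumps({"category": "phishing", "threat_level": "critical"}),
})

match = parse_match(sample)
print(match["indicator"], match["metadata"]["threat_level"])
```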

Low-Level API

For more control, use the raw BiF functions directly:

global threats_db: opaque of MatchyDB;

event zeek_init() {
    # Load the database - returns an opaque handle
    threats_db = Matchy::load_database("/path/to/threats.mxy");
    
    if (!Matchy::is_valid(threats_db)) {
        print "Failed to load database!";
        return;
    }
    
    print "Database loaded successfully";
}

event connection_new(c: connection) {
    # Query the originator IP using the database handle
    local result = Matchy::query_ip(threats_db, c$id$orig_h);
    
    if (result != "") {
        print fmt("Threat detected from %s: %s", c$id$orig_h, result);
        # Result is JSON; parse it with from_json() (available in Zeek 6.0 and later)
    }
}

event dns_request(c: connection, msg: dns_msg, query: string, qtype: count, qclass: count) {
    # Query domain name
    local result = Matchy::query_string(threats_db, query);
    
    if (result != "") {
        print fmt("Malicious domain queried: %s - %s", query, result);
    }
}

# Database is automatically cleaned up when Zeek terminates

Advanced Example with JSON Parsing (Low-Level API)

@load base/frameworks/notice

module ThreatIntel;

export {
    redef enum Notice::Type += {
        Threat_Detected
    };
    
    # Define structure matching your database fields
    type ThreatData: record {
        category: string &optional;
        threat_level: string &optional;
        description: string &optional;
    };
    
    global threats_db: opaque of MatchyDB;
}

event zeek_init() {
    threats_db = Matchy::load_database("/opt/threat-intel/threats.mxy");
    
    if (!Matchy::is_valid(threats_db)) {
        print "ERROR: Failed to load threat database";
    }
}

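event connection_new(c: connection) {
    # Skip lookups if the database failed to load in zeek_init()
    if (!Matchy::is_valid(threats_db))
        return;

    local result = Matchy::query_ip(threats_db, c$id$orig_h);

    if (result != "") {
        # Parse JSON result into typed record
        local parsed = from_json(result, ThreatData);

        if (parsed$valid) {
            local threat: ThreatData = parsed$v;
            # Fields are &optional, so fall back when the database omits one
            local category = threat?$category ? threat$category : "unknown";
            local level = threat?$threat_level ? threat$threat_level : "unknown";

            NOTICE([$note=Threat_Detected,
                    $conn=c,
                    $msg=fmt("Threat: %s (%s)", category, level),
                    $sub=fmt("IP: %s", c$id$orig_h)]);
        }
    }
}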

API Reference

`load_database(filename: string): opaque of MatchyDB`

Load a Matchy database from file and return an opaque handle.

filename: Path to the .mxy database file
Returns: Opaque database handle, or nullptr on failure; check the result with Matchy::is_valid() before querying

Note: The database is automatically closed when the handle goes out of scope or Zeek terminates. No manual cleanup needed.

`is_valid(db: opaque of MatchyDB): bool`

Check if a database handle is valid and the database is open.

db: Database handle from load_database()
Returns: T if valid and open, F otherwise

`query_ip(db: opaque of MatchyDB, ip: addr): string`

Query the database by IP address.

db: Database handle from load_database()
ip: IP address to query
Returns: JSON string with match data, or empty string if no match

Example: Matchy::query_ip(db, 1.2.3.4)

`query_string(db: opaque of MatchyDB, query: string): string`

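Look up a string in the database. The string is compared against the database's exact-string entries and glob patterns.

db: Database handle from load_database()
query: String to look up, such as a domain name; an entry like malware.example.com matches exactly, and a pattern entry like *.evil.com matches any string the glob covers
Returns: JSON string with match data, or empty string if no match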

Example: Matchy::query_string(db, "malware.example.com")
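
To make the direction of matching concrete: the query is always a plain string, while the patterns live in the database. The sketch below uses Python's fnmatch module as a stand-in for glob semantics. It is not Matchy's matcher, and Matchy may differ in details such as case handling or how * treats dots. The first_match helper and the PATTERNS list are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical database entries: one exact string and one glob pattern.
PATTERNS = ["malware.example.com", "*.evil.com"]


def first_match(query: str, patterns):
    """Return the first stored pattern that covers the query, or None."""
    for pattern in patterns:
        if fnmatchcase(query, pattern):
            return pattern
    return None


for query in ["sub.evil.com", "evil.com", "malware.example.com"]:
    print(query, "->", first_match(query, PATTERNS))
```

Under these stand-in semantics the bare domain evil.com is not covered by *.evil.com, so a database meant to catch both would need an entry for each.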

Testing

The plugin includes comprehensive tests:

cd tests
ZEEK_PLUGIN_PATH=../build zeek simple-test.zeek

All tests should PASS. See tests/README.md for details.

Example test script:

event zeek_init() {
    local db = Matchy::load_database("test.mxy");
    
    if (Matchy::is_valid(db)) {
        # Test IP query
        local ip_result = Matchy::query_ip(db, 1.2.3.4);
        if (ip_result != "") {
            print "Match:", ip_result;
            # Output: {"category":"malware","threat_level":"high",...}
        }
        
        # Test pattern query  
        local pattern_result = Matchy::query_string(db, "sub.evil.com");
        if (pattern_result != "") {
            print "Match:", pattern_result;
            # Output: {"category":"phishing","threat_level":"critical",...}
        }
        
        # Database automatically cleaned up
    }
}

Troubleshooting

Plugin not found at runtime:

export ZEEK_PLUGIN_PATH=/path/to/zeek-matchy-plugin/build
zeek -N Matchy::DB

Advanced build options:

# Use existing Matchy installation
cmake -DBUILD_MATCHY=OFF -DMATCHY_ROOT=/path/to/matchy ..

# Specify Zeek location manually
cmake -DCMAKE_MODULE_PATH=/path/to/zeek/cmake ..

License

BSD-2-Clause License. See LICENSE file.

Contributing

Issues and pull requests welcome at https://github.com/sethhall/zeek-matchy-plugin
