Create CEF processor for ingest node #126201

Open
@bhapas

Description:

Currently, many users rely on Beats and Logstash to parse and ingest CEF (Common Event Format) logs. However, there is an increasing demand for native CEF processing in Ingest Node, particularly for users who want to move away from Beats or Logstash and leverage the performance benefits and flexibility of Ingest Pipelines directly within Elasticsearch. Introducing a CEF processor within the Ingest Node would provide a significant advantage for users, simplifying log ingestion and processing workflows in Elasticsearch without the need for intermediate processing layers.

Background:

CEF Logs and Usage:
The Common Event Format (CEF) is widely adopted for event logging, especially for security and network devices. It is commonly used in various industries for structured log data, which helps in centralizing log data, monitoring, and alerting.

Current Workflows:

Users typically rely on Beats (such as Filebeat with the decode_cef processor) or Logstash with custom configurations or specific parsers to handle CEF logs. These tools add overhead and complexity for those who prefer a more direct, centralized solution within Elasticsearch.
There is growing interest from users who want to bypass Beats or Logstash and utilize Ingest Pipelines for data transformation and enrichment directly within Elasticsearch.

Proposed Solution:

Introduce a new CEF processor within the Ingest Node to handle the parsing and transformation of CEF logs. This processor would allow users to:

  • Parse and extract fields from CEF logs into structured JSON format for use in Ingest Pipelines.
  • Enable easy migration paths from Beats/Logstash-based processing to native Elasticsearch ingestion.
  • Minimise reliance on external tools for users already leveraging Elasticsearch as their primary platform.

Key Features:

CEF Log Parsing:
The processor should be capable of parsing CEF-formatted logs, extracting the pipe-delimited fields of the log header and the key-value pairs of the extension.

Example:
A CEF log entry looks like this:

CEF:0|Elastic|Vaporware|1.0.0-alpha|18|Web request|low|eventId=3457 requestMethod=POST slat=38.915 slong=-77.511 proto=TCP sourceServiceName=httpd requestContext=https://www.google.com src=89.160.20.156 spt=33876 dst=192.168.10.1 dpt=443 request=https://www.example.com/cart
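To make the parsing step concrete, here is a minimal Python sketch of how the sample line above could be split into its header fields and extension key-value pairs. It is an illustration only, not the proposed processor: the function name `parse_cef`, the header field names, and the regular expression are assumptions made for this example, and escape sequences are deliberately not handled.

```python
import re

# Illustrative names for the seven pipe-delimited header fields after "CEF:".
HEADER_FIELDS = [
    "version", "device_vendor", "device_product", "device_version",
    "device_event_class_id", "name", "severity",
]

# A key is a run of word characters followed by "=".
# A value runs lazily until the next " key=" or the end of the string.
EXTENSION_RE = re.compile(r"(\w+)=(.*?)(?=\s+\w+=|$)")


def parse_cef(line: str) -> dict:
    # Simplified: escaped pipes and escaped equals signs are not handled here.
    if not line.startswith("CEF:"):
        raise ValueError("not a CEF message")
    # At most 7 splits, so any pipe inside the extension is left untouched.
    parts = line[len("CEF:"):].split("|", 7)
    if len(parts) < 7:
        raise ValueError("incomplete CEF header")
    header = dict(zip(HEADER_FIELDS, parts[:7]))
    extension_text = parts[7] if len(parts) > 7 else ""
    extension = dict(EXTENSION_RE.findall(extension_text))
    return {"header": header, "extension": extension}


SAMPLE = (
    "CEF:0|Elastic|Vaporware|1.0.0-alpha|18|Web request|low|"
    "eventId=3457 requestMethod=POST slat=38.915 slong=-77.511 proto=TCP "
    "sourceServiceName=httpd requestContext=https://www.google.com "
    "src=89.160.20.156 spt=33876 dst=192.168.10.1 dpt=443 "
    "request=https://www.example.com/cart"
)

parsed = parse_cef(SAMPLE)
```

All values come back as strings at this stage; type conversion and field renaming would be a separate step.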

After parsing, the processor would extract structured fields such as:

CEF parsed event
{
  "observer": {
    "product": "Vaporware",
    "vendor": "Elastic",
    "version": "1.0.0-alpha"
  },
  "cef": {
    "severity": "low",
    "name": "Web request",
    "device": {
      "product": "Vaporware",
      "event_class_id": 18,
      "vendor": "Elastic",
      "version": "1.0.0-alpha"
    },
    "version": 0
  },
  "destination": {
    "port": 443,
    "ip": "192.168.10.1"
  },
  "http": {
    "request": {
      "referrer": "https://www.google.com",
      "method": "POST"
    }
  },
  "source": {
    "geo": {
      "location": {
        "lon": -77.511,
        "lat": 38.915
      }
    },
    "port": 33876,
    "service": {
      "name": "httpd"
    },
    "ip": "89.160.20.156"
  },
  "message": "CEF:0|Elastic|Vaporware|1.0.0-alpha|18|Web request|low|eventId=3457 requestMethod=POST slat=38.915 slong=-77.511 proto=TCP sourceServiceName=httpd requestContext=https://www.google.com src=89.160.20.156 spt=33876 dst=192.168.10.1 dpt=443 request=https://www.example.com/cart",
  "event": {
    "code": 18,
    "id": 3457
  },
  "url": {
    "original": "https://www.example.com/cart"
  },
  "network": {
    "transport": "TCP"
  },
  "_index": "index",
  "_id": "id",
  "_version": 1,
  "ingestMetadata": {
    "timestamp": "2025-04-03T11:04:49.532277Z"
  }
}
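The mapping from CEF extension keys to the nested fields in the example event above could be sketched as a lookup table plus a helper that expands dotted paths. The table below is an assumption inferred only from that example (including the numeric `event.id`); it is not a complete or authoritative mapping, and `set_nested` and `to_ecs` are hypothetical helper names.

```python
# Hypothetical subset of a CEF-extension-key -> (target field, converter) table,
# inferred from the example event; a real processor would cover many more keys.
ECS_MAPPINGS = {
    "src": ("source.ip", str),
    "spt": ("source.port", int),
    "dst": ("destination.ip", str),
    "dpt": ("destination.port", int),
    "proto": ("network.transport", str),
    "requestMethod": ("http.request.method", str),
    "requestContext": ("http.request.referrer", str),
    "request": ("url.original", str),
    "sourceServiceName": ("source.service.name", str),
    "slat": ("source.geo.location.lat", float),
    "slong": ("source.geo.location.lon", float),
    "eventId": ("event.id", int),
}


def set_nested(doc: dict, dotted_path: str, value) -> None:
    # Set doc["a"]["b"]["c"] = value for the path "a.b.c", creating dicts as needed.
    *parents, leaf = dotted_path.split(".")
    node = doc
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


def to_ecs(extension: dict) -> dict:
    # Keys without a mapping are dropped in this sketch.
    doc = {}
    for key, raw in extension.items():
        if key in ECS_MAPPINGS:
            path, convert = ECS_MAPPINGS[key]
            set_nested(doc, path, convert(raw))
    return doc
```

Whether unmapped keys should be dropped, kept under a `cef.extensions` style object, or rejected is a design decision this sketch does not settle.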

Compatibility with Ingest Pipelines:

The CEF processor will be fully compatible with existing Ingest Pipelines and should support chaining with other processors (e.g., geoip, date, or user_agent).
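As a sketch of what chaining might look like, the following Python snippet builds a pipeline definition in which a CEF processor runs first and a `geoip` processor then enriches one of the fields it produced. The `cef` processor name and its `field` option are hypothetical, since they are what this issue proposes; `geoip` and its `field`, `target_field`, and `ignore_missing` options already exist in Elasticsearch.

```python
import json

# Request body for PUT _ingest/pipeline/<pipeline-id>.
# The "cef" processor and its "field" option are hypothetical (proposed here);
# "geoip" and its options are existing Elasticsearch features.
pipeline = {
    "description": "Parse CEF messages, then enrich the destination address",
    "processors": [
        {"cef": {"field": "message"}},
        {
            "geoip": {
                "field": "destination.ip",
                "target_field": "destination.geo",
                "ignore_missing": True,
            }
        },
    ],
}

# Serialize to the JSON that would be sent to Elasticsearch.
body = json.dumps(pipeline, indent=2)
print(body)
```

Order matters: the enrichment processor reads `destination.ip`, so it has to come after the processor that creates that field.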

Configuration Options:

Users should be able to specify a field containing the CEF log and customize the extraction rules for any additional fields (e.g., custom event fields) if needed.

Testing and Validation:

The processor should come with comprehensive test cases, including various CEF log formats and edge cases, to ensure reliability and compatibility.
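A few of the edge cases such a test suite might cover are sketched below in Python: an escaped pipe in the header, an escaped equals sign in the extension, values containing spaces, a message without an extension, and non-CEF input. The parser here is a simplified stand-in written for this example, not the proposed implementation, and the helper names are hypothetical.

```python
import re

EXTENSION_RE = re.compile(r"(\w+)=(.*?)(?=\s+\w+=|\s*$)")
ESCAPES = {"n": "\n", "r": "\r"}


def unescape(value: str) -> str:
    # Resolve backslash escapes in extension values, e.g. \= becomes =.
    return re.sub(r"\\(.)", lambda m: ESCAPES.get(m.group(1), m.group(1)), value)


def parse_cef(line: str) -> dict:
    if not line.startswith("CEF:"):
        raise ValueError("not a CEF message")
    fields, current, i = [], [], len("CEF:")
    # Walk the header character by character so escaped pipes are honored.
    while i < len(line) and len(fields) < 7:
        ch = line[i]
        if ch == "\\" and i + 1 < len(line) and line[i + 1] in "|\\":
            current.append(line[i + 1])
            i += 2
        elif ch == "|":
            fields.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    if len(fields) < 7:
        fields.append("".join(current))  # header ended without a trailing pipe
    if len(fields) != 7:
        raise ValueError("incomplete CEF header")
    extension = {k: unescape(v) for k, v in EXTENSION_RE.findall(line[i:])}
    return {"header": fields, "extension": extension}


HEADER = "CEF:0|Elastic|Vaporware|1.0|18|Web request|low"


def run_edge_cases() -> int:
    # Returns the number of cases checked; raises AssertionError on failure.
    escaped_pipe = parse_cef(r"CEF:0|Elastic|Vapor\|ware|1.0|18|Web request|low|src=10.0.0.1")
    assert escaped_pipe["header"][2] == "Vapor|ware"

    escaped_equals = parse_cef(HEADER + r"|msg=a\=b dst=10.0.0.2")
    assert escaped_equals["extension"] == {"msg": "a=b", "dst": "10.0.0.2"}

    spaces = parse_cef(HEADER + "|msg=login failed for admin src=10.0.0.1")
    assert spaces["extension"]["msg"] == "login failed for admin"

    assert parse_cef(HEADER)["extension"] == {}

    try:
        parse_cef("<134>not a cef line")
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for non-CEF input")
    return 5
```

Real-world CEF producers often deviate from the format, so malformed headers and unescaped special characters deserve cases of their own.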

Use Cases:

Security Information and Event Management (SIEM):

Many organizations use CEF logs for security monitoring and alerting. By processing these logs directly within Elasticsearch, security teams can quickly analyze and correlate security events.

Network Device Monitoring:

Devices such as firewalls, routers, and intrusion detection systems often generate CEF logs. A native processor would make it easier to ingest and analyze these logs directly in Elasticsearch.

Enterprise Log Management:

Enterprises using CEF for various system logs will benefit from having a native processor for seamless log ingestion into Elasticsearch for analysis and search.

Related Issues:
