
Security Operations Guide

Antonios Voulvoulis edited this page Apr 15, 2026 · 14 revisions


Type: Troubleshooting / Operator Reference
Scope: Real operator scenarios, diagnosis, and response procedures
Terminology: Glossary & Vocabulary


Purpose

See also: Health Model | CLI Commands | Known Limitations

This page provides real-world operational procedures for NFTBan administrators. Every procedure starts with kernel evidence verification. Claims without evidence are not valid.


First Response: Check System Health

Before any investigation, establish the current system state:

# Step 1: Kernel-derived health (authoritative)
nftban health

# Step 2: If DEGRADED or DOWN — check findings
nftban-validate --json | jq '.status, .findings'

# Step 3: Check daemon
systemctl status nftband

# Step 4: Check kernel structure
nft list tables
nft list chain ip nftban input | grep ANCHOR

Rule: Start from kernel truth, not from symptoms. The validator tells you what is structurally wrong. Findings tell you what to fix.


Scenario: System Shows DEGRADED

Diagnosis

# What is wrong?
nftban-validate --json | jq '.findings[] | {code, severity, message, remediation}'

Common findings and fixes

| Finding | Meaning | Fix |
|---|---|---|
| VAL-SERVICE-001 | Daemon not running | systemctl start nftband |
| VAL-CHAIN-004 | Chain exists but empty (B80-3) | nftban {module} enable |
| VAL-CHAIN-002 | Helper chain missing (module enabled) | nftban {module} enable or nftban firewall rebuild |
| VAL-CONS-001 | Config says enabled, kernel says missing | nftban {module} enable or nftban firewall rebuild |
| VAL-TIMER-001 | No timers active | systemctl start nftban-maintenance.timer |
| VAL-ANCHOR-001 | Anchor missing | nftban firewall rebuild |
| VAL-GEOBAN-001 | GeoIP database missing | nftban geoban sync |

If rebuild is needed

nftban firewall rebuild
# Atomic: validates before loading. If invalid, keeps existing ruleset.
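The validate-before-load behaviour can be approximated with plain nft, whose -c (--check) flag parses a ruleset without applying it. This is a minimal sketch of that pattern, not a description of how nftban firewall rebuild works internally; the helper name safe_load and the ruleset path are hypothetical.

```shell
# Hypothetical helper: load a ruleset file only if it passes nft's dry-run check.
# Assumes the file is a complete ruleset that nft can read with -f.
safe_load() {
    ruleset=$1
    # -c / --check: parse and validate without touching the kernel ruleset
    if ! nft -c -f "$ruleset"; then
        echo "ruleset invalid, existing rules kept" >&2
        return 1
    fi
    # nft -f applies the whole file as one transaction
    nft -f "$ruleset"
}

# Usage (path is an assumption):
#   safe_load /tmp/candidate.nft
```

If the check fails, nothing is loaded and the running ruleset is left untouched.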

Scenario: Brute-Force Attack on SSH

Verify LoginMon is running

# Check module state
nftban login status

# Check daemon is processing events
journalctl -u nftband --since "10 min ago" | grep "login_failed" | tail -5

Check if bans are being issued

# Check journal for ban events
journalctl -u nftband --since "1 hour ago" | grep "EVENT.*banned" | tail -10

# Check kernel set for active bans
nft list set ip nftban blacklist_manual_ipv4
# Elements with timeout = active bans

# Check enforcement counter
nft list counter ip nftban input_blacklist_manual_drop
# Counter > 0 = drops happening on banned IPs

Important: input_blacklist_manual_drop is shared with operator manual bans and portscan bans. It proves manual-blacklist-family enforcement but cannot attribute drops to LoginMon specifically. Journal evidence is required for attribution.
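For scripted checks, the packet count can be pulled out of the counter listing. The sketch below assumes the usual nft list counter text layout, with a "packets N bytes M" line; counter_packets is a hypothetical helper, not an NFTBan command.

```shell
# Hypothetical helper: extract the packet count from `nft list counter` output on stdin.
# Prints 0 when no "packets" line is present (for example, the counter does not exist).
counter_packets() {
    awk '$1 == "packets" { print $2; found = 1; exit } END { if (!found) print 0 }'
}

# Usage:
#   drops=$(nft list counter ip nftban input_blacklist_manual_drop | counter_packets)
#   [ "$drops" -gt 0 ] && echo "manual-blacklist family is dropping packets"
```

A non-zero value proves enforcement for the manual-blacklist family only. Attribution to LoginMon still needs the journal.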

If LoginMon is not detecting

# Check source bindings
journalctl -u nftband | grep "LOGINMON.*resolved_by" | tail -5
# Expected: resolved_by=distroconf
# Warning: resolved_by=fallback or warning=hardcoded_probe

# Check if auth log exists
ls -la /var/log/secure /var/log/auth.log 2>/dev/null

# Check if daemon module is registered
journalctl -u nftband | grep "module_start.*login" | tail -3
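The source-binding check can be reduced to a single verdict. This sketch relies only on the resolved_by= and warning= tokens named above; the rest of the journal line layout is an assumption, and binding_state is a hypothetical helper.

```shell
# Hypothetical helper: classify the most recent resolved_by line read from stdin.
# Only the resolved_by=<value> and warning=hardcoded_probe tokens are inspected.
binding_state() {
    last=$(grep 'resolved_by=' | tail -n 1)
    case $last in
        *warning=hardcoded_probe*|*resolved_by=fallback*) echo "warning" ;;
        *resolved_by=distroconf*)                         echo "ok" ;;
        "")                                               echo "unknown" ;;
        *)                                                echo "review" ;;
    esac
}

# Usage:
#   journalctl -u nftband | grep "LOGINMON" | binding_state
```

"unknown" means no binding line was found at all, which usually points to the module not having started.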

Scenario: High Bot Traffic (HTTP)

Verify BotGuard is active

# Check module state
nftban botguard status

# Check classification sets (kernel evidence)
for s in http_bot_suspect http_bot_pending http_bot_ban http_bot_grey http_bot_emergency; do
    # nft -j output also carries a metainfo object; count elements of the set object only
    count=$(nft -j list set ip nftban "$s" 2>/dev/null | jq '[.nftables[] | select(.set) | .set.elem | length] | add // 0' 2>/dev/null)
    echo "$s: ${count:-0} entries"
done

Interpreting set population

| Set populated | Meaning |
|---|---|
| suspect > 0 | IPs under observation (rate exceeded threshold) |
| pending > 0 | IPs awaiting verification |
| ban > 0 | Confirmed bots; traffic dropped |
| grey > 0 | Throttled IPs (suspicious but not banned) |
| emergency > 0 | Emergency blocks (severe abuse) |
| All empty | IDLE: no HTTP bot traffic above threshold |

Empty sets on a high-traffic host: check that the BotGuard module is registered in the daemon (journalctl -u nftband | grep botguard) and that the jump rule is reachable (nft list chain ip nftban input | grep bot_guard).
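The set-population reading can be folded into a one-line summary. This is a sketch only: botguard_summary is a hypothetical helper, and the ACTIVE label is an illustrative name; IDLE follows the "all empty" case described above.

```shell
# Hypothetical helper: summarise BotGuard set counts passed as name=count arguments.
# Prints IDLE when every count is zero, otherwise lists the populated sets.
botguard_summary() {
    populated=""
    for pair in "$@"; do
        name=${pair%%=*}
        count=${pair#*=}
        # Non-numeric counts are ignored rather than treated as populated
        [ "$count" -gt 0 ] 2>/dev/null && populated="$populated $name"
    done
    if [ -z "$populated" ]; then
        echo "IDLE"
    else
        echo "ACTIVE:$populated"
    fi
}

# Usage:
#   botguard_summary http_bot_suspect=3 http_bot_pending=0 http_bot_ban=1
```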


Scenario: Feed Spike (Blacklist Surge)

Check feed sync status

# Check last sync
journalctl -u nftband | grep "SYNC.*Feeds" | tail -3

# Check loaded ranges
nft list set ip nftban blacklist_ipv4 | head -5
# Shows interval set with CIDR ranges

# Check enforcement
nft list counter ip nftban input_blacklist_drop

Attribution limitation: input_blacklist_drop is shared between feeds and geoban. Cannot determine which source caused specific drops. Sync logs provide provenance.

If feeds fail to load

# Check feed config files exist
ls /etc/nftban/conf.d/feeds/*.conf

# Check feed data files
ls -la /var/lib/nftban/feeds/

# Force sync
nftban feeds sync
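When inspecting the feed data directory, file age is a quick indicator of a sync that has silently stopped. The sketch below lists files older than a chosen number of days; stale_feeds is a hypothetical helper and the two-day default is an assumption, not an NFTBan setting.

```shell
# Hypothetical helper: list feed data files not modified within the last N days.
# Directory default comes from this page; the 2-day threshold is an assumption.
stale_feeds() {
    dir=${1:-/var/lib/nftban/feeds}
    days=${2:-2}
    # -mtime +N matches files last modified more than N whole 24-hour periods ago
    find "$dir" -type f -mtime +"$days" | sort
}

# Usage:
#   stale_feeds /var/lib/nftban/feeds 2
```

Empty output means every file was refreshed recently; any listed file is a candidate for nftban feeds sync.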

Scenario: False Positive (Legitimate IP Blocked)

Diagnose

# Check if IP is in any set
nftban check 1.2.3.4

# Search across all sets
nftban search 1.2.3.4

Resolve

# Remove ban
nftban unban 1.2.3.4

# Add to whitelist (permanent protection)
nftban whitelist add 1.2.3.4

# Verify
nftban check 1.2.3.4

Prevent recurrence

# Add to whitelist file (survives rebuild)
echo "1.2.3.4" >> /etc/nftban/whitelist.d/custom.list
nftban firewall rebuild
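Appending with >> adds a duplicate line each time the command is repeated. A minimal sketch of an idempotent append follows, assuming the whitelist file holds one bare IPv4 address per line as in the command above; whitelist_add_once is a hypothetical helper.

```shell
# Hypothetical helper: append an IPv4 address to a whitelist file only if not already listed.
whitelist_add_once() {
    file=$1
    ip=$2
    # Basic shape check (four dot-separated numeric groups); octet ranges are not validated
    if ! printf '%s\n' "$ip" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'; then
        echo "not an IPv4 address: $ip" >&2
        return 2
    fi
    # -x whole-line match, -F fixed string, so 1.2.3.4 does not match 11.2.3.45
    if grep -qxF "$ip" "$file" 2>/dev/null; then
        return 0
    fi
    printf '%s\n' "$ip" >> "$file"
}

# Usage:
#   whitelist_add_once /etc/nftban/whitelist.d/custom.list 1.2.3.4 && nftban firewall rebuild
```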

Scenario: Debugging DEGRADED State

Decision tree

nftban-validate --json | jq '.status'
    │
    ├── "protected" → system is fine
    ├── "idle" → system is fine (no traffic)
    ├── "degraded" → check findings:
    │   │
    │   ├── VAL-SERVICE-001 → daemon stopped
    │   │   └── systemctl start nftband
    │   │
    │   ├── VAL-CHAIN-* → structural problem
    │   │   └── nftban firewall rebuild
    │   │
    │   ├── VAL-CONS-001 → config/kernel mismatch
    │   │   └── nftban {module} enable
    │   │
    │   ├── VAL-TIMER-* → timers stopped
    │   │   └── systemctl start nftban-maintenance.timer
    │   │
    │   └── VAL-GEOBAN-001 → GeoIP database issue
    │       └── nftban geoban sync
    │
    └── "down" → critical failure
        └── check: nft list tables
            ├── tables exist → nftban firewall rebuild
            └── tables missing → reinstall or nftban firewall rebuild
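The tree above maps directly onto a case statement. This sketch mirrors the tree's coarse grouping (the findings table earlier on this page gives finer-grained fixes for individual codes); next_step is a hypothetical helper that prints the suggested command without running it.

```shell
# Hypothetical helper: print the suggested next command for a status and optional finding code.
next_step() {
    status=$1
    finding=${2:-}
    case $status in
        protected|idle) echo "none" ;;
        down)           echo "nft list tables" ;;
        degraded)
            case $finding in
                VAL-SERVICE-001) echo "systemctl start nftband" ;;
                VAL-CHAIN-*)     echo "nftban firewall rebuild" ;;
                VAL-CONS-001)    echo "nftban {module} enable" ;;
                VAL-TIMER-*)     echo "systemctl start nftban-maintenance.timer" ;;
                VAL-GEOBAN-001)  echo "nftban geoban sync" ;;
                *)               echo "inspect findings" ;;
            esac ;;
        *) echo "unknown status: $status" ;;
    esac
}

# Usage:
#   next_step "$(nftban-validate --json | jq -r '.status')" VAL-CHAIN-004
```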

Scenario: After Firewall Rebuild

All counters reset to zero after rebuild. This is expected.

# Verify structure is intact
nftban-validate --json | jq '.status'
# Expected: "protected" or "idle"

# Check anchors
nft list chain ip nftban input | grep ANCHOR
# All 7 must be present

# Counters will accumulate naturally as traffic flows
# Zero immediately after rebuild = NEUTRAL, not failure
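The anchor check can be made pass/fail instead of visual. The sketch below counts lines containing ANCHOR, as the grep above does, and compares against the expected 7; anchors_ok is a hypothetical helper.

```shell
# Hypothetical helper: succeed only if stdin contains the expected number of ANCHOR lines.
anchors_ok() {
    expected=${1:-7}
    # grep -c exits non-zero on zero matches but still prints 0
    found=$(grep -c 'ANCHOR' || true)
    [ "$found" -eq "$expected" ]
}

# Usage:
#   nft list chain ip nftban input | anchors_ok 7 || echo "anchors missing, rebuild required"
```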

Scenario: Daemon Crash Recovery

# Check daemon status
systemctl status nftband

# Restart daemon
systemctl restart nftband

# Verify modules re-registered
journalctl -u nftband --since "1 min ago" | grep "module_start"

# Check health (kernel structure persists across daemon restart)
nftban health

Key point: Kernel rules persist when the daemon crashes. Existing bans in timeout sets continue enforcing until they expire, and DDoS rate limits continue working. Only issuing new bans (LoginMon, BotGuard) requires the daemon.
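After a restart, the daemon may take a moment to come up, so checking the journal immediately can give a false negative. This is a minimal polling sketch using systemctl is-active; wait_active is a hypothetical helper and the retry count is an assumption.

```shell
# Hypothetical helper: poll until a systemd unit reports active, up to a number of attempts.
wait_active() {
    unit=$1
    tries=${2:-10}
    while [ "$tries" -gt 0 ]; do
        # is-active --quiet exits 0 only when the unit is active
        if systemctl is-active --quiet "$unit"; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Usage:
#   systemctl restart nftband && wait_active nftband 10 && nftban health
```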


Emergency: Locked Out of SSH

If NFTBan blocks your SSH access:

From console/IPMI/KVM

# Flush all nftban rules (chains and their policies remain, so this allows all traffic only if the base chain policy is accept)
nft flush table ip nftban
nft flush table ip6 nftban

# Or: add your IP to whitelist
nft add element ip nftban whitelist_ipv4 { YOUR_IP }

# Then rebuild properly
nftban firewall rebuild

Prevention

# Verify SSH port is in service ports
nft list set ip nftban tcp_ports_in
# Must include your SSH port

# Verify your IP is whitelisted
nft list set ip nftban whitelist_ipv4
# Must include your management IP
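Both prevention checks come down to "is this element in that set". The sketch below parses the text output of nft list set for an exact element match. set_has_element is a hypothetical helper; it handles plain comma-separated elements only and does not expand ranges or CIDR prefixes.

```shell
# Hypothetical helper: check whether `nft list set` output on stdin lists a given element.
# Exact match on the first token of each element; ranges such as 8000-8100 are not expanded.
set_has_element() {
    want=$1
    # Join lines, keep the text inside "elements = { ... }", then test one element per line
    tr '\n' ' ' \
        | sed -n 's/.*elements = {\([^}]*\)}.*/\1/p' \
        | tr ',' '\n' \
        | awk -v w="$want" '$1 == w { ok = 1 } END { exit !ok }'
}

# Usage (port 22 and the address are placeholders):
#   nft list set ip nftban tcp_ports_in | set_has_element 22 || echo "SSH port not allowed"
#   nft list set ip nftban whitelist_ipv4 | set_has_element 1.2.3.4 || echo "management IP not whitelisted"
```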

Log Locations

| Log | Path | Purpose |
|---|---|---|
| Daemon log | /var/log/nftban/nftban.log | Main daemon operations |
| Ban log | /var/log/nftban/bans.log | Ban/unban events |
| Portscan log | /var/log/nftban/portscan.log | Detection module |
| Portscan classic | /var/log/nftban/portscan-classic.log | Classic detector |
| Health incidents | /var/log/nftban/health-incidents.log | Health state changes |
| Installer log | /var/log/nftban/installer.log | Install/upgrade operations |

Log rotation is handled by the OS logrotate (/etc/logrotate.d/nftban), not by the NFTBan maintenance service.


Verification (MANDATORY)

Every operational action must end with verification:

# After any change — verify system state
nftban-validate --json | jq '.status'

# After ban/unban — verify set state
nft list set ip nftban blacklist_manual_ipv4

# After rebuild — verify structure
nft list chain ip nftban input | grep ANCHOR

# After daemon restart — verify modules
journalctl -u nftband --since "1 min ago" | grep "module_start"

Limitations

  • Validator is a snapshot. It shows state at the moment of query. Transient issues (daemon restart, rebuild in progress) may show DEGRADED temporarily.
  • LoginMon enforcement not visible in validator. The validator reports LoginMon as IDLE because journal queries are outside its scope. Use nftban login status for LoginMon-specific evidence.
  • Shared counter attribution. Drop counters for blacklist and manual blacklist cannot attribute drops to specific sources (feeds vs geoban, LoginMon vs operator ban). Journal evidence is required for attribution.
  • Counter reset on rebuild. All counters return to zero after rebuild. This is NEUTRAL, not an indication of problems.
