1. Clone the repository:

   ```shell
   git clone https://github.com/anders-wartoft/air-gap.git
   cd air-gap
   ```

2. Build the binaries:

   ```shell
   make all
   ```

   This builds upstream, downstream, and the deduplication Java application.

3. Install dependencies:

   - Go 1.18+ (for upstream/downstream)
   - Java 17+ (for deduplication)
   - Kafka 3.9+ (for event streaming)
   - Optional: Metricbeat, Jolokia for monitoring

4. Prepare configuration files:

   - Copy and edit the example configs in `config/` and `config/testcases/`. See below for details.
5. Generate keys for encryption (optional):

   - See README.md section "Keys" for key generation commands.
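The project's own key-generation commands are in README.md; as a generic sketch only (assuming standard RSA keys in PEM format, which matches the `certs/*.pem` file names used in the configs below, and hypothetical file names), a keypair can be created with OpenSSL:

```shell
# Hypothetical file names -- see README.md section "Keys" for the project's own commands
mkdir -p certs
# Generate a private key for the downstream side
openssl genpkey -algorithm RSA -pkeyopt rsa_keygen_bits:2048 -out certs/private1.pem
# Derive the matching public key for the upstream side
openssl pkey -in certs/private1.pem -pubout -out certs/server2.pem
```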
Air-gap supports both UDP (default) and TCP transports. Choose based on your network environment:
- UDP: For hardware diodes, high throughput, low latency
- TCP: For software connections, unreliable networks, connection state awareness
See Transport Configuration.md for detailed transport options and migration guide.
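The switch itself is a single config line, shown in the full config examples below; omitting it falls back to UDP:

```properties
# Optional; defaults to UDP when omitted
transport=tcp
```

The same choice can also be made per process with an environment variable such as AIRGAP_UPSTREAM_TRANSPORT (see below).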
Edit your upstream config file (e.g., config/upstream.properties):

```properties
id=Upstream_1
nic=en0
targetIP=127.0.0.1
targetPort=1234
# Use TCP instead of UDP (optional, defaults to UDP)
transport=tcp
source=kafka
bootstrapServers=192.168.153.138:9092
topic=transfer
groupID=test
publicKeyFile=certs/server2.pem
generateNewSymmetricKeyEvery=500
mtu=auto
```

Override any setting with environment variables (see README for details).
For TCP with an environment variable:

```shell
export AIRGAP_UPSTREAM_TRANSPORT=tcp
./upstream config/upstream.properties
```

Edit your downstream config file (e.g., config/downstream.properties):
```properties
id=Downstream_1
nic=en0
targetIP=0.0.0.0
targetPort=1234
# Use TCP instead of UDP (optional, defaults to UDP)
transport=tcp
bootstrapServers=192.168.153.138:9092
topic=log
privateKeyFiles=certs/private*.pem
target=kafka
mtu=auto
clientId=downstream
```

For TCP with an environment variable:

```shell
export AIRGAP_DOWNSTREAM_TRANSPORT=tcp
./downstream config/downstream.properties
```

Edit your deduplication config (e.g., config/create.properties):
```properties
RAW_TOPICS=transfer
CLEAN_TOPIC=dedup
GAP_TOPIC=gaps
BOOTSTRAP_SERVERS=192.168.153.148:9092
STATE_DIR_CONFIG=/tmp/dedup_state_16_1/
WINDOW_SIZE=10000
MAX_WINDOWS=10000
GAP_EMIT_INTERVAL_SEC=60
PERSIST_INTERVAL_MS=10000
APPLICATION_ID=dedup-gap-app
```

For details on how to run the applications as services, see README.md.
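README.md covers running the applications as services; as a minimal sketch only, a systemd unit for upstream might look like the following (the paths match the uninstall section below, but the exact layout and user are assumptions — adjust to your installation):

```ini
# /etc/systemd/system/upstream.service (hypothetical example)
[Unit]
Description=air-gap upstream
After=network.target

[Service]
ExecStart=/usr/local/bin/upstream /opt/airgap/upstream/upstream.properties
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Enable it with `sudo systemctl enable --now upstream.service`.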
For high event rates (e.g., 10,000 eps):

- PERSIST_INTERVAL_MS: 100–1000 ms (persist state every 0.1–1 second)
- COMMIT_INTERVAL_MS: 100–1000 ms (commit progress every 0.1–1 second)

Start with:

```properties
PERSIST_INTERVAL_MS=500
COMMIT_INTERVAL_MS=500
```

With these settings, state and progress are checkpointed every 0.5 seconds, so at 10,000 eps at most 5,000 events would need to be reprocessed after a crash.
Tuning tips:
- Lower values = less data loss on crash, but more I/O.
- Higher values = less I/O, but more data to reprocess after a failure.
- Monitor RocksDB and Kafka broker load; adjust if you see bottlenecks.
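The tradeoff above can be estimated directly: the worst-case number of events to reprocess after a crash is roughly the event rate times the larger of the two checkpoint intervals. A back-of-the-envelope sketch (actual replay also depends on consumer offsets and broker state):

```shell
# Worst-case replay estimate: eps * max(persist, commit) / 1000
eps=10000
persist_ms=500
commit_ms=500
max_ms=$(( persist_ms > commit_ms ? persist_ms : commit_ms ))
echo $(( eps * max_ms / 1000 ))   # 5000 events in the worst case
```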
Upstream:

```shell
go run src/cmd/upstream/main.go config/upstream.properties
```

or (after build):

```shell
./src/cmd/upstream/upstream config/upstream.properties
```

Downstream:

```shell
go run src/cmd/downstream/main.go config/downstream.properties
```

or (after build):

```shell
./src/cmd/downstream/downstream config/downstream.properties
```

Deduplicator:

```shell
java -jar java-streams/target/air-gap-deduplication-fat-<version>.jar
```

- If UDP sending fails, check static ARP and route setup.
- If performance is low, tune buffer sizes and batch settings (see README).
- If deduplication is not working, check environment variable scoping and config file paths.
- For monitoring, see doc/Monitoring.md.
See doc/Monitoring.md for instructions on using Metricbeat, Jolokia, and JMX for resource and application monitoring.
To completely uninstall air-gap and its components:

1. Stop running services:

   ```shell
   sudo systemctl stop upstream.service
   sudo systemctl stop downstream.service
   sudo systemctl stop dedup.service
   ```

2. Disable services:

   ```shell
   sudo systemctl disable upstream.service
   sudo systemctl disable downstream.service
   sudo systemctl disable dedup.service
   ```

3. Remove binaries:

   ```shell
   rm -f /opt/airgap/upstream/bin/*
   rm -f /opt/airgap/downstream/bin/*
   rm -f /opt/airgap/dedup/bin/*
   rm -f /usr/local/bin/upstream
   rm -f /usr/local/bin/downstream
   rm -f /usr/local/bin/dedup
   ```

4. Remove configuration files:

   ```shell
   rm -rf /opt/airgap/upstream/*.properties
   rm -rf /opt/airgap/downstream/*.properties
   rm -rf /opt/airgap/dedup/*.properties
   rm -rf /etc/airgap/
   ```

5. Remove keys and certificates (if used):

   ```shell
   rm -rf /opt/airgap/certs/
   ```

6. Remove systemd service files:

   ```shell
   sudo rm -f /etc/systemd/system/upstream.service
   sudo rm -f /etc/systemd/system/downstream.service
   sudo rm -f /etc/systemd/system/dedup.service
   sudo systemctl daemon-reload
   ```

7. Remove log files:

   ```shell
   rm -rf /var/log/airgap/
   ```

8. (Optional) Remove the cloned source directory:

   ```shell
   rm -rf ~/air-gap
   ```

Note: Adjust paths as needed for your installation. If you installed to custom locations, remove those as well.
The air-gap system provides detailed statistics logging to help you monitor throughput, performance, and system health. Statistics are emitted as JSON-formatted log messages with the prefix STATISTICS: or MISSING-REPORT:.
To enable statistics logging, configure the logStatistics parameter in your upstream or downstream configuration file:
```properties
logStatistics=60
```

This configures the application to log statistics every 60 seconds. Set it to 0 to disable statistics logging.
You can also override this setting using an environment variable:
```shell
export UPSTREAM_LOG_STATISTICS=60
```

Both upstream and downstream applications emit the following fields in their STATISTICS: log entries:

- `id`: The identifier of the application instance (from config)
- `time`: Current Unix timestamp (seconds since epoch) when the statistics were logged
- `time_start`: Unix timestamp when the application started (used to calculate uptime)
- `interval`: The configured statistics logging interval in seconds
- `received`: Number of events received during the last interval
- `sent`: Number of events sent during the last interval
- `filtered`: Number of events filtered (blocked) during the last interval (input filter only)
- `unfiltered`: Number of events that passed through the input filter during the last interval
- `filter_timeouts`: Number of regex timeout errors during the last interval (input filter only)
- `eps`: Events per second during the last interval (calculated as `received / interval`)
- `total_received`: Total number of events received since application start
- `total_sent`: Total number of events sent since application start
- `total_filtered`: Total number of events filtered (blocked) since application start (input filter only)
- `total_unfiltered`: Total number of events that passed through the input filter since application start
- `total_filter_timeouts`: Total number of regex timeout errors since application start (input filter only)
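Each entry is plain JSON after the prefix, so individual fields can be combined with jq. As a small sketch, this computes uptime (`time - time_start`) from a single entry (sample line copied from the upstream example; `line` and `uptime_s` are just illustrative variable names):

```shell
# Strip the "STATISTICS: " prefix, then query the JSON with jq
line='STATISTICS: {"id":"Upstream_1","time":1732723200,"time_start":1732720000,"interval":60,"received":3600,"sent":3600,"eps":60,"total_received":72000,"total_sent":72000}'
uptime_s=$(echo "${line#STATISTICS: }" | jq '.time - .time_start')
echo "$uptime_s"   # 3200 seconds since start
```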
Example upstream statistics log:

```
STATISTICS: {"id":"Upstream_1","time":1732723200,"time_start":1732720000,"interval":60,"received":3600,"sent":3600,"eps":60,"total_received":72000,"total_sent":72000}
```

Example downstream statistics log:

```
STATISTICS: {"id":"Downstream_1","time":1732723200,"time_start":1732720000,"interval":60,"received":3590,"sent":3590,"eps":59,"total_received":71800,"total_sent":71800}
```

The Java deduplication application emits MISSING-REPORT: log entries for each partition, containing:

- `partition`: The Kafka partition number
- `total_missing`: Total number of gaps/missing events detected
- `delta_missing`: Number of new gaps detected during the last interval
- `total_received`: Total number of events received on this partition
- `delta_received`: Number of events received during the last interval
- `total_emitted`: Total number of deduplicated events emitted
- `delta_emitted`: Number of events emitted during the last interval
- `eps`: Events per second received during the last interval
Example deduplication statistics log:
```
[MISSING-REPORT] [{"partition":0,"total_missing":15,"delta_missing":2,"total_received":72000,"delta_received":3600,"total_emitted":71985,"delta_emitted":3598,"eps":60}]
```

Statistics can be used to:

- Monitor throughput: Track `eps` to ensure the system is processing events at the expected rate
- Detect packet loss: Compare `total_sent` (upstream) with `total_received` (downstream)
- Identify gaps: Monitor `total_missing` and `delta_missing` in deduplication logs
- Calculate uptime: Use `time - time_start` to determine how long the application has been running
- Verify deduplication: Compare `total_received` vs `total_emitted` to see how many duplicates were filtered
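The packet-loss comparison can be scripted. A minimal sketch with the sample log lines inlined (in practice you would grep the latest STATISTICS entries out of the upstream and downstream logs; `up`, `down`, `sent`, and `recv` are illustrative names):

```shell
# Compare upstream total_sent with downstream total_received
up='STATISTICS: {"id":"Upstream_1","time":1732723200,"time_start":1732720000,"interval":60,"received":3600,"sent":3600,"eps":60,"total_received":72000,"total_sent":72000}'
down='STATISTICS: {"id":"Downstream_1","time":1732723200,"time_start":1732720000,"interval":60,"received":3590,"sent":3590,"eps":59,"total_received":71800,"total_sent":71800}'
sent=$(echo "${up#STATISTICS: }" | jq '.total_sent')
recv=$(echo "${down#STATISTICS: }" | jq '.total_received')
echo "events not yet accounted for downstream: $(( sent - recv ))"   # 200
```

Note that a nonzero difference is not necessarily loss: events may still be in flight or buffered, so compare entries logged at the same time and watch the trend rather than a single sample.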
You can extract and analyze statistics using standard log processing tools. For example, to extract statistics from logs (the prefix must be stripped before jq sees the JSON):

```shell
grep "STATISTICS:" /var/log/airgap/upstream.log | sed 's/.*STATISTICS: //' | jq .
```

Or to monitor gaps in real time:

```shell
tail -f /var/log/airgap/dedup.log | grep --line-buffered "MISSING-REPORT" | sed -u 's/.*\[MISSING-REPORT\] //' | jq '.[0].total_missing'
```

For more comprehensive monitoring solutions using Metricbeat and Jolokia, see doc/Monitoring.md.