The SD-WAN Traffic Generator includes a sophisticated Voice over IP (VoIP) simulation engine. Unlike standard HTTP traffic, this module simulates real-time RTP (Real-time Transport Protocol) streams to test Quality of Service (QoS) and SD-WAN path selection policies.
Main VoIP simulation dashboard showing active streams, aggregated MOS, and target list:

The system consists of two main components:
- Voice Orchestrator: Manages multiple simultaneous calls, handles timing between calls, and logs start/end events.
- RTP Engine (
rtp.py): A Scapy-based engine that forges raw Ethernet/IP/UDP/RTP packets to simulate specific audio codecs.
The following diagram illustrates the flow of simulated voice (RTP) calls and how ports are utilized for SD-WAN flow tracking.
sequenceDiagram
participant S as Source (Voice Orchestrator)
participant RA as Source Router
participant RB as Destination Router
participant T as Target Site (Voice Echo)
Note over S: Source Port: 31000 + CALL-ID
S->>RA: RTP Stream (G.711 / G.729)
RA->>RB: SD-WAN Path / Tunnel
RB->>T: Destination Port: 6100
Note over T: Silence Detection (5s)
T-->>RB: Echo RTP Response
RB-->>RA: Reverse Path
RA-->>S: Latency, Jitter & MOS Analysis
Located in your config/ directory, this file defines where the traffic is sent.
Format: target_ip:port|codec_name|weight|duration_sec
Example:
192.168.100.10:6100|G.711-ulaw|100|30
192.168.100.11:6100|G.729|50|60
- Target: The IP and UDP port of the receiver (usually an
sdwan-voice-echocontainer). - Codec: Display name for the UI.
- Weight: Probability of picking this target (higher = more frequent).
- Duration: How long the call lasts in seconds.
Accessible via the Voice tab in the Web UI:
- Max Simultaneous Calls: How many concurrent RTP streams to run (Max 10 recommended).
- Sleep Between Calls: Delay before starting a new call after one ends.
- Source Interface: The network interface to use (e.g.
eth0,eth1). Note: Inhostnetworking mode, it sees all physical interfaces of the machine.
Target configuration menu for selecting codecs (G.711, G.729, OPUS) and call weights:

Deploy the Voice Echo Server on your target sites. It tracks incoming calls and bounces back the RTP traffic for end-to-end path validation.
# Pull and start the echo server (Now listens on 6100 and 6101 by default)
docker run -d --name sdwan-voice-echo \
-p 6100-6101:6100-6101/udp \
--restart unless-stopped \
jsuzanne/sdwan-voice-echo:stableThe generator embeds its internal CALL-ID (e.g., CALL-0001) directly inside the RTP payload. The Echo Server decodes this ID and logs it, allowing you to trace exactly which call is hitting which target in your logs.
To provide realistic SD-WAN testing and easy flow correlation, each call now uses a deterministic source port (Range 31000+) derived from the CALL-ID.
- Deterministic Mapping:
CALL-0001→ Source Port31001CALL-0015→ Source Port31015
- SD-WAN Benefit: You can search for
src-port 31015in your SD-WAN Orchestrator flow browser to isolate and trace exactly which tunnels/circuits a specific call utilized. - Graceful Fallback: If a deterministic port is already in use by the OS, the engine automatically falls back to a random port in the 40000-65535 range.
- Echo Logic: The Echo server identifies call completion using a 5-second silence timeout per flow.
Call history table with source port mapping, essential for SD-WAN path correlation:

On every restart, the Voice simulation checks Settings → System Info → State Persistence → Voice / RTP Calls:
- Toggle OFF (default): The simulation starts in Disabled mode for safety — clean slate.
- Toggle ON + voice was active before reboot: The simulation automatically resumes with the same targets and settings.
- Toggle ON + voice was stopped before reboot: The simulation stays Disabled — it respects the pre-reboot state.
The Voice toggle is only clickable if at least one server is configured in the Voice tab. If no servers are defined, it remains grayed out.
On every restart of the container (when State Persistence is OFF or voice was inactive):
- Stats logs are truncated (empty start).
- Call counters reset to
CALL-0001. - The simulation starts in
Disabledmode for safety. This ensures the Web UI and Console are always perfectly synchronized.
To ensure lab-grade diagnostic results, the Voice Simulation module uses high-precision timing and industry-standard mathematical models.
Unlike standard application-level pings, the RTP engine measures the literal flight time of every packet.
- Precision: Uses high-resolution hardware monotonic counters (
time.perf_counter()). - Robustness: Measurements are unaffected by system clock adjustments or "time jumps" during testing.
- Total Path: Values reflect the round-trip performance of the primary link and the return path from the echo server.
Jitter is calculated strictly according to RFC 3550 (Standard RTP):
- Why it matters: Jitter is the leading indicator of "robotic" sounding voice. Tracking the inter-arrival variance helps network engineers tune Jitter Buffers on SD-WAN edges or identify congestion on service provider links.
- Reliability: Since this follows the exact RFC used by professional analyzers like Wireshark, you can cross-reference our results with packet captures for perfect correlation.
The system calculates a predictive MOS score using the Simplified ITU-T G.107 E-model.
| MOS Score | Quality | User Experience |
|---|---|---|
| 4.0 - 4.4 | Excellent | Toll-quality audio. No perceptible delay or loss. |
| 3.6 - 4.0 | Good | Business-quality. Very slight delay, perfectly usable. |
| 2.5 - 3.6 | Fair | Noticeable distortion. Occasional robotic sound or clipping. |
| < 2.5 | Poor | Severe clipping. Call dropouts likely. Unusable for business. |
The math behind the score:
- Effective Latency: We combine RTT and Jitter into an "effective" burden:
Effective Latency = RTT + (Jitter * 2) + 10ms. - Delay Impairment: Latency over 160ms begins to sharply penalize the score.
- Codec Modeling: We apply specific equipment impairment factors for the G.711 codec, ensuring the score reflects the codec being simulated in the lab.
If you are testing the generator or the echo server on Windows, keep these three points in mind:
- Network Mode:
network_mode: hostis not supported on Windows.- Edit
docker-compose.ymland comment outnetwork_mode: host. - Windows will use its internal
bridgenetworking instead.
- Edit
- Firewall: Windows Defender Firewall is strict. You must manually create an Inbound Rule for UDP Port 6100 to allow traffic to reach the Echo Server.
- WSL2 Engine: Ensure Docker is configured to use the WSL2 based engine for better Scapy performance.
Aggregated QoS statistics per target including average loss, RTT, Jitter, and MOS scores:

The Voice Monitoring tab shows calls in real-time.
- Active Calls: Shows currently running streams.
- Recent History: Shows the last 500 events (Newest First).
Generator logs:
docker compose logs -f sdwan-voice-genOutput example:
[CALL-0012] 📞 CALL STARTED: 192.168.203.100:6100 | G.711-ulaw | 15s
If you enable DEBUG=true in your environment, you will also see the explicit source port being used for flow tracking:
[11:00:00] [CALL-0012] ⚙️ Source Port: 31012 (Flow tracking enabled)
Echo Server (Target) logs:
docker compose logs -f sdwan-voice-echoOutput example (with Call ID tracing):
📞 [22:42:22] Incoming call: CALL-0012 from 192.168.217.5:54321
✅ [22:42:42] Call CALL-0012 finished (last from 192.168.217.5:54321)
- Unknown in Echo logs? Your script or container is likely from an older version or a native Python execution without the Call ID injection.
- Skipping call? Check if the destination IP is pingable from the generator machine.