Commit d824bcd

Tweak
1 parent c97b27e commit d824bcd

1 file changed: +49 -1 lines changed

NEWS

Lines changed: 49 additions & 1 deletion
@@ -39,8 +39,56 @@ New Functionality
   This entire feature can be disabled by loading the new
   ``policy/protocols/conn/disable-unknown-ip-proto-support.zeek`` policy script.
 
+- Broker's message I/O buffering now operates on per-peering granularity at the
+  sender (it was previously global) and provides configurable overflow handling
+  when a fast sender overwhelms a slow receiver, via the following new constants
+  in the ``Broker`` module:
+
+      const peer_buffer_size = 2048 &redef;
+      const peer_overflow_policy = "disconnect" &redef;
+      const web_socket_buffer_size = 512 &redef;
+      const web_socket_overflow_policy = "disconnect" &redef;
+
+  When a send buffer overflows (i.e., it is full when a node tries to transmit
+  another message), the sender may unpeer the slow receiver (policy
+  ``disconnect``, the default), drop the newest message in the buffer
+  (``drop_newest``), or drop the oldest (``drop_oldest``). Buffer sizes are
+  measured in number of messages, not bytes. Note that "sender" and "receiver"
+  here are independent of the direction in which Zeek originally established the
+  peering. After disconnects Zeek automatically tries to re-establish peering
+  with the slow node, in case it recovers.
+
+  Zeek notifies you in two ways of the fact that such disconnects occur:
+
+  * A cluster.log entry indicates for the sending node that a slow peered node
+    has been removed. Here node ``worker01`` has removed a peered ``proxy01``:
+
+        1733468802.626622 worker01 removed due to backpressure overflow: 127.0.0.1:42204/tcp (proxy01)
+
+  * A labeled counter metric ``zeek_broker_backpressure_disconnects_total`` in
+    the telemetry framework tracks the number of times such disconnects have
+    occurred between respective nodes. For example this indicates the same
+    disconnect as above:
+
+        zeek_broker_backpressure_disconnects_total{endpoint="worker01",peer="proxy01"} 1
+
+  To implement custom handling of a backpressure-induced disconnect, add a
+  ``Broker::peer_removed`` event, as follows:
+
+      event Broker::peer_removed(endpoint: Broker::EndpointInfo, msg: string)
+      {
+          if ( "caf::sec::backpressure_overflow" !in msg )
+              return;
+
+          # The local node has disconnected the given endpoint,
+          # add your logic here.
+      }
+
+  These new policies fix a problem in which misbehaving nodes could trigger
+  cascading "lockups" of nodes, each ceasing to transmit any messages.
+
 - Zeek now includes a PostgreSQL protocol analyzer. This analyzer is enabled
-  by default. The analyzer's events and its ``postgresql.log`` should be
+  by default. The analyzer's events and its ``postgresql.log`` should
   considered preliminary and experimental until the arrival of Zeek's next
   long-term-stable release (8.0).
 

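As a rough illustration of how a site might use the constants and event described in the NEWS entry above, the following Zeek script is a minimal sketch, not a recommended configuration. The chosen values (a buffer of 4096 messages, ``drop_oldest`` for WebSocket clients) and the wording of the warning message are assumptions made for this example. It also assumes that ``Broker::EndpointInfo`` exposes an ``id`` field and that ``Reporter::warning`` is an acceptable way to surface the event at your site.

```zeek
# Hypothetical site-local tuning, e.g. loaded from local.zeek.
# All values below are illustrative assumptions, not recommendations.

# Allow more queued messages per peering before overflow handling kicks in.
# Buffer sizes count messages, not bytes.
redef Broker::peer_buffer_size = 4096;

# Keep the default behavior for cluster peers: unpeer a slow receiver.
redef Broker::peer_overflow_policy = "disconnect";

# For WebSocket clients, discard the oldest queued message on overflow
# instead of disconnecting the client.
redef Broker::web_socket_overflow_policy = "drop_oldest";

event Broker::peer_removed(endpoint: Broker::EndpointInfo, msg: string)
	{
	# Only react to removals caused by a send-buffer overflow.
	if ( "caf::sec::backpressure_overflow" !in msg )
		return;

	# Surface the disconnect in reporter.log, in addition to the
	# cluster.log entry and the telemetry counter.
	Reporter::warning(fmt("backpressure disconnect of peer %s: %s",
	                      endpoint$id, msg));
	}
```

Because the peer policy stays at ``disconnect``, the handler can still fire for slow cluster peers. Switching that policy to ``drop_newest`` or ``drop_oldest`` would trade such disconnects for silently dropped messages.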