
Commit b99a3e2

start a doc on amqp_dist
1 parent 59faab5 commit b99a3e2

7 files changed

Lines changed: 366 additions & 16 deletions

File tree

.gitignore

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,4 @@
11
ebin
22
_build/
33
rebar3.*
4+
doc/*.tex

config/sys.config

Lines changed: 1 addition & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,9 +1,7 @@
11
[
22
{amqp_dist, [
33
{connections, [
4-
"amqp://guest:guest@rabbitmq:5676/lx1"
5-
,"amqp://guest:guest@rabbitmq:5676/lx2"
6-
,"amqp://guest:guest@rabbitmq:5676/lx3"
4+
"amqp://guest:guest@localhost:5672"
75
]}
86
]}
97

doc/amqp_dist.org

Lines changed: 201 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,201 @@
1+
* DistErl
2+
3+
Erlang nodes need to agree on a way to talk to each other over the
4+
network. That agreed-upon way is called a distribution protocol.
5+
6+
Erlang's default distribution protocol runs over TCP and relies on EPMD (Erlang
7+
Port Mapper Daemon) to find nodes. When a node starts up, it registers itself with a
8+
little helper process (EPMD) running on the same machine, and other
9+
nodes ask EPMD "hey, how do I reach the node named foo@mymachine?"
10+
EPMD answers with a port number, and the connection proceeds from
11+
there. It's a bit like a phone book: you look up a name, get a number,
12+
dial it.
13+
14+
** Simplified example
15+
Malls, business parks, etc. have directories of what stores or businesses are there and where to find them. The default for most (the EPMD) is a physical board with names and locations on it.
16+
17+
Alternatives could be QR codes that can be scanned for that information, a mall app folks can load on their phone to navigate, or TV screens that the owners can update in real time.
18+
19+
* Why alternatives?
20+
21+
** Security
22+
The default setup sends a cookie (a shared secret) in a way that might not be acceptable in a locked-down environment. You might want TLS from the very first byte, before any handshake.
23+
24+
** Different environments
25+
In a Kubernetes cluster, for example, pods discover each other through DNS or a service registry, not through EPMD. A custom module can speak that language natively.
26+
27+
** Avoiding an extra process
28+
EPMD is a separate OS-level daemon. Some deployment setups prefer not to have it running at all, especially in containers where you want one process per container.
29+
30+
** Custom transports
31+
The default distribution uses TCP. If you wanted to run node-to-node communication over something else — a Unix domain socket, a shared memory channel, or a message bus — you could do that by implementing the right callbacks.
32+
33+
* Distribution contract
34+
** Registration
35+
"Hi, I'm a node named foo@bar.com and can be reached on port 4369"
36+
** Node Lookup
37+
"Do you know how to reach bob@loblaw.com"
38+
** Connections
39+
Managing sockets (listening and accepting connections)
40+
41+
* DistErl over AMQP
42+
** Why use AMQP
43+
Traditional EPMD-based distribution creates direct connections between connecting nodes, creating either a fully-connected mesh or (using hidden nodes) a hub-spoke arrangement.
44+
45+
With AMQP, each node instead connects to the AMQP broker (we recommend RabbitMQ obviously), and inter-node traffic flows through the broker via exchanges, queues and their bindings.
46+
47+
While this introduces the AMQP broker as a potential SPOF (single point of failure), the broker is presumably already part of your application's infrastructure, so it does not introduce a new risk.
48+
49+
The benefits include interesting routing options now and more observability of inter-node communication in a single place (the broker).
50+
51+
** Node Startup
52+
When the node =foo@bar.com= starts up, it publishes its presence to a pre-determined exchange.
53+
54+
** Inter-node communication
55+
Instead of TCP-based "connect to IP =A= on port =B=", nodes will "publish to exchange =A= with routing key =B=" where =A= is a direct exchange (where routing key = binding key) and =B= is the routing key of the destination node.
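
As a rough illustration of "publish to exchange =A= with routing key =B=", here is a minimal sketch using the =amqp_client= library. The module name =publish_sketch= and the helper names are hypothetical, and this is not necessarily how =amqp_dist= structures its own publishing code.

#+begin_src erlang
-module(publish_sketch).
-export([publish_method/2, send/4]).

-include_lib("amqp_client/include/amqp_client.hrl").

%% Build the AMQP method that plays the role of "IP and port":
%% Exchange is the direct exchange, RoutingKey identifies the destination node.
publish_method(Exchange, RoutingKey) ->
    #'basic.publish'{exchange = Exchange, routing_key = RoutingKey}.

%% Publish Payload (a binary) on an already-open channel.
send(Channel, Exchange, RoutingKey, Payload) when is_binary(Payload) ->
    amqp_channel:cast(Channel,
                      publish_method(Exchange, RoutingKey),
                      #amqp_msg{payload = Payload}).
#+end_src

With a direct exchange, the broker delivers the message to whichever queue was bound with a binding key equal to =RoutingKey=.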
56+
57+
58+
59+
* amqp_dist
60+
** BEAM startup requirements
61+
While distribution is being brought up, the module can only use code from the application itself, =kernel=, and =stdlib=.
62+
63+
The module controlling distribution interactions should be suffixed =_dist=: we called ours =amqp_dist= accordingly.
64+
65+
** Infrastructure to set up
66+
67+
- A listener entity (a process or port)
68+
- An acceptor process to accept incoming connections via the listening entity
69+
70+
*** Per Connection
71+
Once a connection is accepted, the module needs to create:
72+
- a connection supervisor process (handles handshake for setting up the connection)
73+
- a distribution controller (process or port) for putting data onto the connection
74+
75+
Both should be linked so they're cleaned up when the connection goes down.
76+
77+
*** Example dist module
78+
From https://www.erlang.org/doc/apps/erts/alt_dist.html
79+
80+
#+begin_example
81+
An example implementation of a distribution module can be found [[https://www.erlang.org/doc/apps/erts/assets/gen_tcp_dist.erl][here]]
82+
83+
It implements the distribution over TCP/IP using the gen_tcp API with
84+
distribution controllers implemented by processes. This instead of
85+
using port distribution controllers as the ordinary TCP/IP
86+
distribution uses.
87+
#+end_example
88+
89+
** Required Callbacks
90+
91+
#+begin_src erlang
92+
-export([listen/1
93+
,accept/1
94+
,accept_connection/5
95+
,setup/5
96+
,close/1
97+
,select/1
98+
,is_node_name/1
99+
,address/0
100+
]).
101+
#+end_src
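
Of these callbacks, =is_node_name/1= is the easiest to show in isolation. The following is a minimal sketch, assuming a valid node name is an atom with a non-empty name and a non-empty host around a single =@=. The module name =node_name_sketch= is hypothetical and this is not necessarily the check =amqp_dist= performs.

#+begin_src erlang
-module(node_name_sketch).
-export([is_node_name/1]).

%% true only for atoms shaped like 'name@host' with both parts non-empty.
is_node_name(Node) when is_atom(Node) ->
    case string:split(atom_to_list(Node), "@", all) of
        [[_ | _], [_ | _]] -> true;
        _ -> false
    end;
is_node_name(_NotAnAtom) ->
    false.
#+end_src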
102+
103+
*** listen(Name)
104+
Called once, when Erlang distribution is brought up, to listen for incoming connection requests
105+
106+
=Name= is the username part of a =Name@Host= full node name (can be =atom()= or =string()=).
107+
108+
Returns ={ok, {Socket, Address, Creation}}= on success, wrapping a 3-tuple:
109+
110+
- =Socket= is a handle that will be passed to the =accept/1= callback later. It could be a =#socket{}=, but in our case it is a =#fake_socket{}= record.
111+
- =Address= is a =#net_address{}= record (defined in =kernel/include/net_address.hrl=) describing the node
112+
- =Creation= is an integer between =1..3=; we chose 3.
113+
114+
**** =#fake_socket{}=
115+
#+begin_src erlang
116+
-record(fake_socket, {read = 0,
117+
write = 0,
118+
pending = 0,
119+
pid = self() :: pid(),
120+
name :: term(),
121+
mypid :: pid()
122+
}).
123+
124+
125+
{ok, Pid} = amqp_dist_acceptor:start(self(), Name) % start a gen_server
126+
127+
#fake_socket{name=Name, mypid=Pid} % Name from listen/1 arg, Pid is the amqp_dist_acceptor gen_server
128+
#+end_src
129+
130+
**** =#net_address{}=
131+
#+begin_src erlang
132+
#net_address{address = []
133+
,host = inet:gethostname()
134+
,protocol = amqp
135+
,family = amqp
136+
}
137+
#+end_src
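
To show how the pieces of the =listen/1= return value might fit together, here is a minimal sketch. The module name =listen_sketch= and the ={fake_socket, Name}= stand-in handle are hypothetical (the real module starts =amqp_dist_acceptor= and builds a =#fake_socket{}= record). Note that =inet:gethostname/0= returns ={ok, Hostname}=, so this sketch unwraps it before storing the host.

#+begin_src erlang
-module(listen_sketch).
-export([listen/1]).

-include_lib("kernel/include/net_address.hrl").

%% Assemble the {ok, {Handle, Address, Creation}} shape expected from listen/1.
listen(Name) ->
    {ok, Host} = inet:gethostname(),
    Address = #net_address{address = [],
                           host = Host,
                           protocol = amqp,
                           family = amqp},
    Handle = {fake_socket, Name},   % hypothetical stand-in for #fake_socket{}
    Creation = 3,
    {ok, {Handle, Address, Creation}}.
#+end_src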
138+
139+
*** accept(Listen)
140+
141+
Accepts new connection attempts from other Erlang nodes.
142+
143+
#+begin_src erlang
144+
accept(Listen) ->
145+
spawn_opt(?MODULE, start_accept, [self(), Listen], [link, {priority, max}]).
146+
#+end_src
147+
148+
=accept_loop= receives connection tuples ={connection, Tag, Node, Connection, Queue}= from =amqp_dist_acceptor=
149+
150+
The loop sends =Kernel ! {accept, self(), {Tag, Node, Connection, Queue, Listen}, amqp, amqp}= to ask the kernel process to accept the connection, then waits for the kernel to reply with the supervising process in a ={Kernel, controller, SupervisorPid}= message. The =accept= tuple has the shape ={accept, AcceptPid, Socket, Family, Proto}=, where =Family= and =Proto= match the =#net_address{}= returned from =listen/1=.
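
The message exchange described above can be sketched as a small receive loop. This is an illustration only: the module name =accept_loop_sketch= is hypothetical, and the final ={self(), controller}= hand-off to the supervisor follows the pattern used by OTP's example distribution modules rather than anything confirmed about =amqp_dist=.

#+begin_src erlang
-module(accept_loop_sketch).
-export([start/2, accept_loop/2]).

start(Kernel, Listen) ->
    spawn_link(?MODULE, accept_loop, [Kernel, Listen]).

accept_loop(Kernel, Listen) ->
    receive
        {connection, Tag, Node, Connection, Queue} ->
            %% The "socket" is whatever term identifies this connection attempt.
            Socket = {Tag, Node, Connection, Queue, Listen},
            Kernel ! {accept, self(), Socket, amqp, amqp},
            receive
                {Kernel, controller, SupervisorPid} ->
                    %% Tell the connection supervisor it may start the handshake.
                    SupervisorPid ! {self(), controller};
                {Kernel, unsupported_protocol} ->
                    exit(unsupported_protocol)
            end,
            accept_loop(Kernel, Listen)
    end.
#+end_src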
151+
152+
=net_kernel= will call =amqp_dist:accept_connection/5= which will spawn a process into the =do_accept/6= function to perform the needed handshake. A new record =#hs_data{}= tracks the handshake information.
153+
154+
** =amqp_dist_acceptor=
155+
This =gen_server= accepts AMQP payloads from other nodes to connect.
156+
157+
After initializing, it starts AMQP connection(s) to the configured brokers via =start_connections/0=. These settings are fetched with =application:get_env/3=; the app's =env= might look like:
158+
#+begin_src erlang
159+
{env,[{heartbeat_period_ms, 30000}
160+
,{heartbeat_timeout_ms, 45000}
161+
,{connection_timeout_ms, 10000}
162+
,{pause_before_reconnect_ms, 3500}
163+
,{server_call_timeout_ms, 750}
164+
,{connections, ["amqp://guest:guest@broker.add.re.ss:5672"]}
165+
]}
166+
#+end_src
167+
168+
Once the broker connection is established:
169+
1. an AMQP channel is started
170+
2. the exchange =amq.headers= is configured
171+
- The headers exchange (=amq.match=, also available as =amq.headers= in RabbitMQ) routes on message header attributes rather than routing keys; see [[https://www.rabbitmq.com/tutorials/amqp-concepts#exchange-headers][here]]
172+
3. an exclusive queue is declared, named =list_to_binary(["amqp_dist_acceptor-", atom_to_list(node()), "-", pid_to_list(self())])=
173+
4. the queue is bound to the exchange with a header argument ={<<"distribution.ping">>, bool, true}=
174+
5. start consuming from the queue
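
The five steps above can be sketched with the =amqp_client= library roughly as follows. This is a sketch under assumptions, not the acceptor's actual code: the module name =acceptor_setup_sketch= is hypothetical, =amq.headers= is treated as predeclared by RabbitMQ (so it is not declared here), and =no_ack = true= is an arbitrary choice for brevity.

#+begin_src erlang
-module(acceptor_setup_sketch).
-export([queue_name/0, setup/1]).

-include_lib("amqp_client/include/amqp_client.hrl").

%% Queue name built from the node name and the calling process's pid.
queue_name() ->
    list_to_binary(["amqp_dist_acceptor-", atom_to_list(node()),
                    "-", pid_to_list(self())]).

%% Uri is a string such as "amqp://guest:guest@localhost:5672".
setup(Uri) ->
    {ok, Params} = amqp_uri:parse(Uri),
    {ok, Connection} = amqp_connection:start(Params),
    {ok, Channel} = amqp_connection:open_channel(Connection),       % step 1
    Exchange = <<"amq.headers">>,                                    % step 2
    Queue = queue_name(),
    #'queue.declare_ok'{} =                                          % step 3
        amqp_channel:call(Channel, #'queue.declare'{queue = Queue,
                                                    exclusive = true}),
    #'queue.bind_ok'{} =                                             % step 4
        amqp_channel:call(Channel,
                          #'queue.bind'{queue = Queue,
                                        exchange = Exchange,
                                        arguments = [{<<"distribution.ping">>, bool, true}]}),
    #'basic.consume_ok'{} =                                          % step 5
        amqp_channel:subscribe(Channel,
                               #'basic.consume'{queue = Queue, no_ack = true},
                               self()),
    {ok, Connection, Channel, Queue}.
#+end_src

Because the queue is exclusive, the broker removes it when the declaring connection closes, so a restarted node starts from a clean queue.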
175+
176+
Once the broker connection is up and the channel and queue are configured, =amqp_dist_acceptor= starts a heartbeat timer (default 60s) which publishes a message with headers:
177+
#+begin_src erlang
178+
,reply_to = QueueName
179+
,headers = [{<<"distribution.ping">>, bool, true}
180+
,{<<"node.start">>, timestamp, Start}
181+
]
182+
#+end_src
183+
184+
These headers should match the bindings of every other node's queue bound to the exchange.
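
A heartbeat publish along these lines might look like the sketch below. The module name =heartbeat_sketch=, the empty payload, and the empty routing key are assumptions made for illustration; only the =reply_to= and =headers= values come from the description above.

#+begin_src erlang
-module(heartbeat_sketch).
-export([heartbeat_msg/2, publish_heartbeat/3]).

-include_lib("amqp_client/include/amqp_client.hrl").

%% QueueName is this node's acceptor queue; Start is the node start time
%% as an integer timestamp.
heartbeat_msg(QueueName, Start) ->
    Props = #'P_basic'{reply_to = QueueName,
                       headers = [{<<"distribution.ping">>, bool, true},
                                  {<<"node.start">>, timestamp, Start}]},
    #amqp_msg{props = Props, payload = <<>>}.   % empty payload is an assumption

%% A headers exchange ignores the routing key, so it is left at its default.
publish_heartbeat(Channel, QueueName, Start) ->
    amqp_channel:cast(Channel,
                      #'basic.publish'{exchange = <<"amq.headers">>},
                      heartbeat_msg(QueueName, Start)).
#+end_src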
185+
186+
*** AMQP Message handling
187+
**** New node present
188+
When a remote node publishes its heartbeat and the local node is seeing it for the first time, the =gen_server= determines whether to auto-connect to the node (via the =auto_connect_nodes= env param). If so, it spawns a call to =net_kernel:connect_node(RemoteNode)= to establish a connection to the remote node. Ultimately this calls =amqp_dist:select(RemoteNode)=, which calls =amqp_dist_acceptor:is_up(RemoteNode)= to report whether the remote node is known and the "connection" is established in =net_kernel=.
189+
190+
**** Remote node wants to connect
191+
When the =amqp_dist_acceptor= receives a payload off AMQP, it will be the heartbeat of another node.
192+
193+
The payload is a term_to_binary-encoded two-tuple ={amqp_dist, connect}=. When received, a 5-tuple will be sent to the =amqp_dist= acceptor process ={connection, Label, Node, Connection, RemoteQueue}=.
194+
** =amqp_dist_node=
195+
=gen_server= that handles sending and receiving data between the local node and a connected remote node.
196+
197+
Once the handshake is completed, messages between nodes can begin. While not necessary, =amqp_dist= spawns an input handler process =amqp_dist:dist_cntrlr_input_setup/3= to register itself with =amqp_dist_node= as the receiver process for data from the remote node.
198+
199+
For data from the local node to send to the remote node, =erlang:dist_ctrl_get_data(DHandle)= will be called and if data is returned, =amqp_dist_node:send/2= will take care of publishing the data to the correct remote node's AMQP queue (as the routing key).
200+
201+
Arbitrary Erlang terms are encoded using =base64:encode(term_to_binary(Term))= for sending and decoded in reverse.
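
The encoding round trip can be sketched in a few lines. The module and function names here are hypothetical; only the =base64:encode(term_to_binary(Term))= composition comes from the description above.

#+begin_src erlang
-module(dist_codec_sketch).
-export([encode/1, decode/1]).

%% Serialize any Erlang term, then base64-encode it for publishing.
encode(Term) ->
    base64:encode(term_to_binary(Term)).

%% Reverse the steps: base64-decode, then deserialize.
decode(Payload) when is_binary(Payload) ->
    binary_to_term(base64:decode(Payload)).
#+end_src

For any term =T=, =decode(encode(T))= yields =T= again.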

doc/codebeam2026.org

Lines changed: 150 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,150 @@
1+
#+options: ':nil *:t -:t ::t <:t H:3 \n:nil ^:{} arch:headline
2+
#+options: author:t broken-links:nil c:nil creator:nil
3+
#+options: d:(not "LOGBOOK") date:t e:t email:nil expand-links:t f:t
4+
#+options: inline:t num:t p:nil pri:nil prop:nil stat:t tags:t
5+
#+options: tasks:t tex:t timestamp:t title:t toc:t todo:t |:t
6+
#+title: CodeBEAM 2026: DistErl over AMQP
7+
#+date: <2026-03-18 Wed>
8+
#+author: James Aimonetti
9+
#+email: james@jamesaimonetti.com
10+
#+language: en
11+
#+select_tags: export
12+
#+exclude_tags: noexport
13+
#+creator: Emacs 28.2 (Org mode 9.8)
14+
#+cite_export:
15+
#+startup: beamer
16+
#+LaTeX_CLASS: beamer
17+
#+LaTeX_CLASS_OPTIONS: [bigger]
18+
#+OPTIONS: H:2
19+
#+COLUMNS: %40ITEM %10BEAMER_env(Env) %9BEAMER_envargs(Env Args) %4BEAMER_col(Col) %10BEAMER_extra(Extra)
20+
#+latex_header: \mode<beamer>{\usetheme{Madrid}}
21+
* CodeBEAM 2026: DistErl over AMQP
22+
** Hi and welcome
23+
James Aimonetti (Eye Moe Net Tea)
24+
- KAZOO Architect at 2600Hz/OOMA
25+
- Erlanger since 2007
26+
** What is KAZOO?
27+
- Telecom platform started in 2010
28+
- Distributed telecom nodes (FreeSWITCH and Kamailio)
29+
- Distributed control plane (KAZOO)
30+
- Distributed data store (CouchDB)
31+
- Message bus to tie it together (RabbitMQ)
32+
** What is AMQP (Advanced Message Queuing Protocol)?
33+
- [[https://www.rabbitmq.com/tutorials/amqp-concepts][Network protocol]]
34+
- Decouples the publisher (sender) from the consumer (receiver)
35+
- Provides exchanges, queues, and flexible routing strategies
36+
- TCP connection to broker, virtual channels for publishers and consumers
37+
- Exchanges can route messages
38+
- directly to queues
39+
- fanned out to all queues
40+
- or use binding (consumer) and routing (publisher) keys for flexible reception
41+
* Erlang Distribution
42+
** Erlang Distribution
43+
- Connect to remote nodes
44+
- Send and receive Erlang terms directly between nodes
45+
- Process location is transparent (encoded in the PID)
46+
** Default DistErl
47+
- Uses a separate program EPMD (Erlang Port Mapper Daemon)
48+
- Maintains a directory of running VMs on the server and their ports
49+
50+
#+begin_example
51+
> epmd -names
52+
epmd: up and running on port 4369 with data:
53+
name kazoo_apps at port 11500
54+
name node1 at port 38957
55+
name rabbit at port 25672
56+
#+end_example
57+
58+
- Remote nodes ask EPMD how to reach a node
59+
- Nodes connect over TCP, using cookies for authentication
60+
- Fully connected mesh (or hub-and-spoke for "hidden" nodes)
61+
62+
** Why not use EPMD-based DistErl?
63+
- EPMD provides insight into your internal BEAM VMs, further attack surface
64+
- Cookie-based auth in VMs isn't for security (read more from [[https://erlef.org/blog/eef/epmd-public-exposure][The EEF]] on how to secure EPMD)
65+
- Mesh networking doesn't scale well (50 is an oft-repeated limit; 1500 if you have FAANG infra to play with)
66+
- EPMD complicates matters in docker/kubernetes
67+
- Observing inter-node communication from the outside is challenging
68+
- More firewall ports to manage
69+
** Extensible distribution
70+
- Alternative carrier protocols are supported
71+
- TCP/IP is common
72+
- =uds_dist= provides distribution over Unix domain sockets for Sun Solaris 2
73+
- Erlang drivers are complicated - we didn't touch this and just use TCP/IP
74+
- Distribution modules provide well-defined callbacks
75+
** Distribution modules
76+
- Details on finding other nodes
77+
- Create a listen port (or an approximation)
78+
- Connect to other nodes
79+
- Perform the handshake and cookie verification
80+
- =dist_util= makes a lot of this straightforward
81+
* Distribution over AMQP
82+
** Why AMQP?
83+
- In KAZOO clusters, RabbitMQ is already running
84+
- Engineering and operation teams are familiar, existing tooling
85+
- Single TCP connection to broker
86+
- Publishes and consumes messages from other nodes
87+
- Segment clusters using RabbitMQ virtual hosts on same broker
88+
** =amqp_dist=
89+
- Erlang library: https://github.com/2600hz/erlang-amqp_dist
90+
** =amqp_dist=
91+
Configure your AMQP broker(s)
92+
#+begin_example
93+
❯ cat config/sys.config
94+
[
95+
{amqp_dist, [
96+
{connections, [
97+
"amqp://guest:guest@localhost:5672"
98+
]}
99+
]}
100+
101+
].
102+
#+end_example
103+
** =amqp_dist=
104+
Start some nodes
105+
#+begin_example
106+
❯ erl -pa _build/default/lib/*/ebin \
107+
-proto_dist amqp \
108+
-no_epmd \
109+
-name node01@your.host.com \
110+
-setcookie change_me \
111+
-config ./config/sys.config
112+
#+end_example
113+
** Module architecture
114+
- =amqp_dist=
115+
- Provides the callbacks for net_kernel
116+
- =listen(Name)= when VM brings up distribution, starts the whole shebang
117+
- =amqp_dist_acceptor=
118+
- Handles broker connections and node heartbeats
119+
- Tracks new nodes joining/leaving
120+
- Receives when remote nodes want to connect
121+
- =amqp_dist_node=
122+
- started after a node handshake is completed
123+
- handles inter-node message passing
124+
- Erlang terms are encoded using term_to_binary and base64 encoding before publishing
125+
** Failure modes
126+
- Broker down: all nodes disconnected
127+
- KAZOO architecture already has handles for this too
128+
- Clustered rabbit possible, multi-broker works too
129+
- [[https://www.rabbitmq.com/docs/flow-control][Broker flow control]]
130+
- Rabbit can slow "busy" channels and connections
131+
- More noticeable than TCP backpressure when it's active
132+
- Two hops introduce latency
133+
- Throughput is capped at the broker
134+
** Benefits
135+
- Access controls and vhosts
136+
- Inter-node traffic is observable at the broker
137+
- Piggy-backs on existing infrastructure and knowledge
138+
- Flexible routing options (currently direct to queue) available
139+
- Single AMQP port to manage
140+
* Wrapup
141+
** AltDistErl
142+
- Alternative distribution modules are straightforward to write!
143+
- Plug in your favorite protocols to carry DistErl
144+
- AMQP-based distribution can simplify setup if you already have a broker in your infra
145+
** Thanks
146+
- =amqp_dist=: https://github.com/2600hz/erlang-amqp_dist
147+
148+
#+begin_center
149+
Questions?
150+
#+end_center

doc/codebeam2026.pdf

198 KB
Binary file not shown.

rebar.config

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11

2-
{deps, [{amqp_client, "4.0.3"}
2+
{deps, [{amqp_client, "4.2.1"}
33
]
44
}.
55
