Saturday, May 29, 2010

Raw Socket Programming in Erlang: Reading Packets Using PF_PACKET

BSD has BPF, Solaris has DLPI and Linux, well, has had many interfaces. The latest uses a Linux-specific protocol family, PF_PACKET. PF_PACKET can receive whole packets from the network as well as generate them, like a combination of BPF and the BSD raw socket interface.

PCAP is an abstraction over these different interfaces. epcap uses a system process linked to the PCAP library to read Ethernet frames and send them as messages into Erlang using the port interface. Using procket to open a PF_PACKET socket, I've been playing with reading packets directly off the network and generating them as well.

The PF_PACKET interface is used by passing options to socket().
int socket(int domain, int type, int protocol);
  • The protocol family is, of course, PF_PACKET.
  • The type may be either SOCK_RAW or SOCK_DGRAM. SOCK_RAW will return the whole packet, including the ethernet header. A process sending a packet must prepend a link layer header. A socket with type SOCK_DGRAM will strip off the link layer header and generate a valid header for outgoing packets.
  • The protocol is selected from one of the values in linux/if_ether.h.
    #define ETH_P_IP    0x0800
    #define ETH_P_ALL   0x0003
    ETH_P_ALL will retrieve all network packets and ETH_P_IP just the IP packets. The values are in host-endian format and will need to be converted to network byte order before being used as arguments to socket().
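The byte order conversion described above can be sketched in Erlang with the bit syntax. This is only an illustration: the module name and the protocol/1 helper are hypothetical and not part of procket. htons/1 mirrors the C function of the same name by writing the value in big-endian order and reading the same two bytes back in the host's native order.

```erlang
-module(byteorder_sketch).
-export([htons/1, protocol/1]).

%% Values as they appear in linux/if_ether.h (host byte order).
-define(ETH_P_IP, 16#0800).
-define(ETH_P_ALL, 16#0003).

%% Equivalent of C's htons(): lay the 16-bit value out in network
%% (big-endian) order, then read those two bytes back as a native integer.
htons(N) when is_integer(N), N >= 0, N =< 16#FFFF ->
    <<Swapped:16/native>> = <<N:16/big>>,
    Swapped.

%% Hypothetical helper: map an atom to the converted protocol number.
protocol(ip) -> htons(?ETH_P_IP);
protocol(all) -> htons(?ETH_P_ALL).
```

On a little-endian host the two bytes are swapped, so htons(16#0800) gives 16#0008; on a big-endian host the value comes back unchanged.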

Receiving Packets using recvfrom()

To send and receive packets from a socket using PF_PACKET, the normal connection-less socket operations are used: sendto() and recvfrom(). By default, socket operations will block, unless the O_NONBLOCK flag is set using fcntl(). The gen_udp module in Erlang internally calls recvfrom(), so it can deal with raw sockets. Another example of using gen_udp in this way is for sending ICMP packets and reading the ICMP ECHO replies. Alternatively, I've added a recvfrom/2 function to the procket NIF for testing.

Sniffing Packets in Erlang

To read packets from the network device, either gen_udp or the NIF recvfrom/2 can be used. Using gen_udp:

-define(ETH_P_IP, 16#0008).
-define(ETH_P_ALL, 16#0300).

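sniff() ->
    % Open the PF_PACKET socket; the port argument (0) is unused
    {ok, S} = procket:listen(0, [
            {protocol, ?ETH_P_ALL},
            {type, raw},
            {family, packet}
        ]),
    % Wrap the file descriptor in a passive gen_udp socket
    {ok, S1} = gen_udp:open(0, [binary, {fd, S}, {active, false}]),
    loop(S1).

loop(S) ->
    % Blocks until a packet arrives
    Data = gen_udp:recv(S, 2048),
    error_logger:info_report([{data, Data}]),
    loop(S).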
The definitions of ETH_P_IP and ETH_P_ALL are byte-swapped relative to linux/if_ether.h: on a little-endian host, these are the values that end up in network (big-endian) byte order when passed to socket(). The port is irrelevant and is set to 0 in procket:listen/2. The type is raw but can also be set to dgram.

Using the NIF, the process must poll the socket. procket:recvfrom/2 returns one of:
  • the atom nodata, if the socket returns EAGAIN;
  • the tuple {ok, Binary}, where Binary is the packet data;
  • a tuple holding the value of errno, e.g., {error, {errno, strerror(errno)}}.
The return values will probably change in the future.
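sniff() ->
    {ok, S} = procket:listen(0, [
            {protocol, ?ETH_P_ALL},
            {type, raw},
            {family, packet}
        ]),
    loop(S).

loop(S) ->
    case procket:recvfrom(S, 2048) of
        nodata ->
            % Nothing to read yet: pause, then poll again
            % (the 100 ms interval is an arbitrary choice)
            timer:sleep(100),
            loop(S);
        {ok, Data} ->
            error_logger:info_report([{data, Data}]),
            loop(S);
        Error ->
            % Any other return value ends the loop
            error_logger:error_report([{error, Error}])
    end.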
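
With the socket type set to raw, each packet read this way starts with the Ethernet header. As a rough sketch of what can be done with the binary once it has been extracted from the result, the header can be taken apart with a binary pattern. The module and function names are hypothetical, and the sketch assumes plain Ethernet II frames without VLAN tags.

```erlang
-module(ether_sketch).
-export([decode/1]).

-define(ETH_P_IP, 16#0800).

%% Split a raw Ethernet II frame into its header fields and payload.
%% Binary matching reads multi-byte integers as big-endian by default,
%% so Type is compared against the header-file value, not the swapped one.
decode(<<Dst:6/bytes, Src:6/bytes, Type:16, Payload/binary>>) ->
    {ok, [{dst, Dst}, {src, Src}, {type, type(Type)}, {payload, Payload}]};
decode(_) ->
    %% Fewer than 14 bytes: not a complete Ethernet header
    {error, truncated}.

%% Name the one EtherType used in this post; pass others through.
type(?ETH_P_IP) -> ipv4;
type(Other) -> Other.
```

Calling ether_sketch:decode(Data) from the {ok, Data} clause of the NIF loop would log the parsed fields instead of the raw bytes.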

Monday, May 24, 2010

ICMP Ping in Erlang

(Also see ICMP Ping in Erlang, part 2)

ICMP ECHO Packet Structure

RFC 792 describes an ICMP ECHO packet as:
  • Type:8
  • Code:8
  • Checksum:16
  • Identifier:16
  • Sequence Number:16
  • Data1:8
  • ...
  • DataN:8

The number after the colon represents the number of bits in the field.
  • The type field for ICMP ECHO is set to 8. The response (ICMP ECHO REPLY) has a value of 0.
  • The code is 0.
  • The checksum is a one's complement checksum that covers both the ICMP header and the data portion of the packet. An Erlang version looks like:
    makesum(Hdr) -> 16#FFFF - checksum(Hdr).
    checksum(Hdr) ->
        lists:foldl(fun compl/2, 0, [ W || <<W:16>> <= Hdr ]).
    compl(N) when N =< 16#FFFF -> N;
    compl(N) -> (N band 16#FFFF) + (N bsr 16).
    compl(N,S) -> compl(N+S).
  • The identifier and sequence number allow clients on a host to differentiate their packets, for example, if multiple instances of ping are running. The client will usually increment the sequence number for each ICMP ECHO packet sent.
  • Data is the payload. Traditionally, it holds a struct timeval so the client can calculate the delay without having to maintain state, but any value can be used, such as the output of erlang:now/0. The remainder is padded with ASCII characters.
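
One way to sanity check the checksum functions above is to sum a packet that already carries its checksum: the one's complement sum should then come out as 16#FFFF. The sketch below repeats the functions from the post and adds two hypothetical helpers, pad/1 and valid/1. The padding is an addition, not part of the original code: the binary comprehension silently drops a trailing odd byte, whereas RFC 792 pads odd-length data with a zero octet for the checksum.

```erlang
-module(icmp_sum_sketch).
-export([makesum/1, checksum/1, valid/1]).

%% Checksum functions as given in the post, except that checksum/1
%% pads its input (see pad/1 below).
makesum(Hdr) -> 16#FFFF - checksum(Hdr).

checksum(Hdr) ->
    lists:foldl(fun compl/2, 0, [ W || <<W:16>> <= pad(Hdr) ]).

%% Fold any carry out of the low 16 bits back into the sum.
compl(N) when N =< 16#FFFF -> N;
compl(N) -> (N band 16#FFFF) + (N bsr 16).
compl(N, S) -> compl(N + S).

%% Addition to the original: odd-length input gets one zero byte
%% appended so the last byte takes part in the sum.
pad(Bin) when byte_size(Bin) rem 2 =:= 1 -> <<Bin/binary, 0>>;
pad(Bin) -> Bin.

%% A packet that carries a correct checksum sums to 16#FFFF.
valid(Packet) -> checksum(Packet) =:= 16#FFFF.
```

A receiver could call valid/1 on the ICMP portion of a reply before trusting its contents.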
The description of an ICMP packet in Erlang is very close to the specification. For ICMP ECHO:
<<8:8, 0:8, Checksum:16, Id:16, Sequence:16, Payload/binary>>
The ICMP ECHO reply is the same packet returned, with the type field set to 0 and an updated checksum:
<<0:8, 0:8, Checksum:16, Id:16, Sequence:16, Payload/binary>>
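Because the two layouts differ only in the type field, the same patterns can be used to decide whether a received packet answers a particular request. The following is a minimal sketch with hypothetical names; it relies on Id and Seq being bound before the case expression, so the pattern only matches a reply carrying the same identifier and sequence number. It does not verify the checksum.

```erlang
-module(icmp_match_sketch).
-export([classify/3]).

%% Classify an ICMP packet relative to the request identified by Id/Seq.
classify(Id, Seq, Packet) ->
    case Packet of
        %% Id and Seq are already bound, so they are compared by value.
        <<0:8, 0:8, _Checksum:16, Id:16, Seq:16, Payload/binary>> ->
            {reply, Payload};
        %% Any ECHO request, for instance our own packet seen on loopback.
        <<8:8, 0:8, _Checksum:16, _:16, _:16, _/binary>> ->
            echo_request;
        _ ->
            other
    end.
```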

Opening a Socket

Sending out ICMP packets requires opening a raw socket. Aside from the issue of having the appropriate privileges, Erlang does not have native support for handling raw sockets. I used procket to handle the privileged socket operations and pass the file descriptor into Erlang. Once the socket is returned to Erlang, we can perform operations on it as an unprivileged user. Since there isn't a gen_icmp module, we need some way of calling sendto()/recvfrom() on the socket. gen_udp uses sendto(), so we can misuse it (with some quirks) for our ICMP packets.
% Get an ICMP raw socket
{ok, FD} = procket:listen(0, [{protocol, icmp}]),
% Use the file descriptor to create an Erlang socket structure
{ok, S} = gen_udp:open(0, [binary, {fd, FD}]),
The port is meaningless, so 0 is passed in as an argument. We build the packet twice: first with a zeroed checksum field, then with the computed checksum in place.
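-define(ICMP_ECHO, 8).  % ICMP ECHO type, as described above

make_packet(Id, Seq) ->
    {Mega,Sec,USec} = erlang:now(),
    Payload = list_to_binary(lists:seq(32, 75)),
    % First pass: checksum field zeroed, used only to compute the checksum
    CS = makesum(<<?ICMP_ECHO:8, 0:8, 0:16, Id:16, Seq:16, Mega:32, Sec:32, USec:32, Payload/binary>>),
    % Second pass: the packet that is actually sent
    <<
        8:8,    % Type
        0:8,    % Code
        CS:16,  % Checksum
        Id:16,  % Id
        Seq:16, % Sequence
        Mega:32, Sec:32, USec:32,   % Payload: time
        Payload/binary              % Payload: ASCII padding
    >>.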
The packet can be sent via the raw socket using gen_udp:send/4, with the port again set to 0.
ok = gen_udp:send(S, IP, 0, Packet)
Since we're abusing gen_udp, we can wait for a message to be sent to the process:
receive
    {udp, S, _IP, _Port, <<_:20/bytes, Data/binary>>} ->
        {ICMP, <<Mega:32/integer, Sec:32/integer, Micro:32/integer, Payload/binary>>} = icmp(Data),
        error_logger:info_report([
            {type, ICMP#icmp.type},
            {code, ICMP#icmp.code},
            {checksum, ICMP#icmp.checksum},
            {id, ICMP#icmp.id},    % field name assumed from the id: line in the sample output
            {sequence, ICMP#icmp.sequence},
            {payload, Payload},
            {time, timer:now_diff(erlang:now(), {Mega, Sec, Micro})}
        ])
after
    % Give up if no reply arrives within 5 seconds
    5000 ->
        error_logger:error_report([{noresponse, Packet}])
end
In the above code snippet, you may have noticed that the first 20 bytes of the received message are stripped off. Compare the ICMP packet we sent with the response handed to the process by gen_udp:
icmp: <<8,0,186,30,80,228,0,0,0,0,4,250,0,12,16,77,0,1,69,0,32,33,34,35,36,
    response: <<69,0,0,84,101,155,64,0,64,1,154,44,192,168,220,187,192,168,220,
While the process sent a 64-byte ICMP packet, gen_udp hands it an 84-byte packet, which includes the 20-byte IPv4 header. An example of an Erlang ping is included with procket on GitHub. The example will just print out the packets using error_logger:info_report/1:
1> icmp:ping("").

=INFO REPORT==== 24-May-2010::16:21:37 ===
    type: 0
    code: 0
    checksum: 52034
    id: 14837
    sequence: 0
    payload: <<" !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJK">>
    time: 16790
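
Stripping a fixed 20 bytes, as the receive clause above does, assumes an IPv4 header without options. A slightly more defensive variant could read the header length from the IHL field, which counts 32-bit words. This is only a sketch with hypothetical names, not the code shipped with procket.

```erlang
-module(ipv4_strip_sketch).
-export([payload/1]).

%% Return the bytes that follow the IPv4 header. The first byte holds
%% the version (4) in its high nibble and the header length in 32-bit
%% words (IHL) in its low nibble; 5 words is the minimum legal header.
payload(<<4:4, IHL:4, _/binary>> = Packet) when IHL >= 5 ->
    HdrLen = IHL * 4,
    case Packet of
        <<_Hdr:HdrLen/bytes, Rest/binary>> -> {ok, Rest};
        _ -> {error, truncated}
    end;
payload(_) ->
    {error, not_ipv4}.
```

For the response shown earlier, the first byte is 69 (16#45), so the header is 5 words long and the result is the same as stripping 20 bytes.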