Monday, June 14, 2010

Fun with Raw Sockets in Erlang: Sending ICMP Packets

I've covered pinging other hosts before using the IPPROTO_ICMP protocol and raw sockets, all in Erlang.

Now I'd like to go through the exercise of sending ICMP echo packets using the Linux PF_PACKET interface. The process is somewhat tedious and complicated, so this post will reflect this, but it should be helpful since documentation for PF_PACKET tends to be a bit sparse.

Even if you're only interested in the PF_PACKET C interface, this tutorial should be helpful. But you'll have to read a bit and mentally censor the Erlang bits.

Erlang Binaries vs C structs

To provide a direct interface to sendto(), I've added an NIF interface for Erlang in procket.

The procket sendto() interface is system dependent, relying on the layout of your computer's struct sockaddr. struct sockaddr is typically constructed as follows:
struct sockaddr {
    sa_family_t     sa_family;      /* unsigned short int */
    char            sa_data[14];    /* buffer holding data, dependent on socket type */
};
The data held in the sa_data member of struct sockaddr varies based on the different socket types. For example, for a typical internet socket, a struct sockaddr_in socket address is used:
struct sockaddr_in {
    sa_family_t     sin_family;
    in_port_t       sin_port;
    struct in_addr  sin_addr;
    char            sin_zero[8];
};
Of course, these structures will vary by platform. On Linux, aside from a few inscrutable macros, the layout is similar to those shown above. BSDs, such as Mac OS X, add another structure member holding the size of the structure:
u_int8_t sin_len;
The appearance and placement of this member will cause a lot of portability problems if you need to get code running on different OSes. That's only natural, since in this tutorial we are bypassing the normal library interfaces.

So, just remember, PF_PACKET is pretty much a Linux specific interface, so we will be concentrating on the Linux eccentricities.

An Erlang Interface to sendto()

According to the man page, sendto() takes the following arguments:
ssize_t sendto(
        int s,
        const void *buf,
        size_t len,
        int flags,
        const struct sockaddr *to,
        socklen_t tolen
);
  • s is the file descriptor, representing the socket returned by socket().
  • buf is the payload to be sent in the packet.
  • len is the size of the buffer in bytes.
  • flags is the result of OR'ing together integers which affect the behaviour of the socket. Typically, flags is set to 0.
  • struct sockaddr is a buffer based on the type of socket. It is cast to the "generic" sockaddr structure. Different types of socket addresses are, for example, sockaddr_in for Internet sockets, sockaddr_un for Unix (local) sockets and sockaddr_ll for link layer sockets. We'll be looking at sockaddr_in sockets in this section and sockaddr_ll sockets when investigating sending out packets using the PF_PACKET raw socket interface later on.
It's worth noting that
sendto(socket, buf, buflen, flags, NULL, 0)
is equivalent to
send(socket, buf, buflen, flags)
and that
sendto(socket, buf, buflen, 0, NULL, 0)
is equivalent to
write(socket, buf, buflen)
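As a minimal sketch of the relationship between sendto() with a NULL address and send(), the C fragment below pushes the same message through both calls over a connected Unix datagram socket pair. The helper name roundtrip() is made up for this illustration and is not part of procket.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Send msg over a connected datagram socket pair, using either send()
 * or sendto() with a NULL address. The bytes read by the peer are
 * stored in out. Returns the number of bytes received, or -1 on error. */
static ssize_t roundtrip(int use_sendto, const char *msg, char *out, size_t outlen)
{
    int sv[2];
    ssize_t n;
    size_t len;

    assert(msg != NULL && out != NULL);
    len = strlen(msg);

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;

    if (use_sendto)
        n = sendto(sv[0], msg, len, 0, NULL, 0);   /* no address given */
    else
        n = send(sv[0], msg, len, 0);

    if (n == (ssize_t)len)
        n = recv(sv[1], out, outlen, 0);
    else
        n = -1;

    close(sv[0]);
    close(sv[1]);
    return n;
}
```

Calling roundtrip(0, "ping", ...) and roundtrip(1, "ping", ...) should deliver the same four bytes to the peer.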
With a bit of tweaking (the procket NIF may need a few changes), we'll be able to use sendto() to do both send()s and write()s in the future (both can be used once the socket has been bound using bind()). The procket Erlang sendto/4 interface looks like this:
sendto(Socket, Packet, Flags, Sockaddr)
  • Socket is an integer returned from procket:open/1 representing the file descriptor.
  • Packet is a binary holding the packet payload.
  • Flags is the result of OR'ing the socket options. See the sendto() man page for the possible parameters.
  • Sockaddr is an Erlang binary representation of the sockaddr structure for the type of socket in use.

An Example of Using sendto/4

In the original example of sending an ICMP echo packet from Erlang, we (mis-)used gen_udp to send and receive ICMP packets. Here we use the sendto/4 NIF instead. To send the ICMP packet using sendto/4, we must create the struct sockaddr_in as an Erlang binary. In linux/in.h, the structure is defined as:
struct sockaddr_in {
    sa_family_t     sin_family; /* Address family: 2 bytes */
    in_port_t       sin_port;   /* Port number: 2 bytes */
    struct in_addr  sin_addr;   /* Internet address: 4 bytes */

    /* Pad to size of `struct sockaddr'. */
    unsigned char   sin_zero[8];
};
Both sa_family_t and in_port_t are 2 bytes. The total size of the struct is 16 bytes. The Erlang binary used to represent this is:
<<
?PF_INET:16/native,             % sin_family
0:16,                           % sin_port
IP1:8, IP2:8, IP3:8, IP4:8,     % sin_addr
0:64                            % sin_zero
>>
  • The value of the PF_INET macro (or 2) is taken from bits/socket.h. The value of the different PF_* macros is always in native endian format.
  • Since we are sending an ICMP packet, the port has no meaning and is set to 0.
  • IP1 through IP4 refer to the components of an IPv4 address, represented in Erlang as a 4-tuple of bytes such as {192,168,10,1}.
  • The sin_zero member is always set to 8 zeroed bytes.
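To see that the hand-packed 16 bytes really match what the C library would produce, here is a small sketch that builds the address both ways. It assumes the Linux layout described above (no sin_len member); the function names are invented for this example.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Pack the address by hand, mirroring the Erlang binary:
 * family (native endian), port 0, 4 address bytes, 8 zero bytes. */
static void sockaddr_in_by_hand(unsigned char buf[16], const unsigned char ip[4])
{
    unsigned short family = AF_INET;    /* like ?PF_INET:16/native */

    assert(sizeof(struct sockaddr_in) == 16);   /* Linux layout assumed */
    memset(buf, 0, 16);                 /* sin_port and sin_zero stay zero */
    memcpy(buf, &family, sizeof(family));
    memcpy(buf + 4, ip, 4);             /* sin_addr, network byte order */
}

/* Build the same address through the normal struct interface. */
static struct sockaddr_in sockaddr_in_by_struct(const char *dotted)
{
    struct sockaddr_in sa;

    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_port = 0;                    /* meaningless for ICMP */
    inet_pton(AF_INET, dotted, &sa.sin_addr);
    return sa;
}
```

On Linux, a memcmp() of the two results for {192,168,10,1} and "192.168.10.1" should report no difference. On a BSD the hand-packed version would be wrong because of the extra length byte.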
The corresponding NIF function can be found in procket.c:
static ERL_NIF_TERM
nif_sendto(ErlNifEnv *env, int argc, const ERL_NIF_TERM argv[])
{
    int sockfd = -1;
    int flags = 0;

    ErlNifBinary buf;   /* packet payload */
    ErlNifBinary sa;    /* packed socket address, e.g. struct sockaddr_in */

    if (!enif_get_int(env, argv[0], &sockfd))
        return enif_make_badarg(env);

    if (!enif_inspect_binary(env, argv[1], &buf))
        return enif_make_badarg(env);

    if (!enif_get_int(env, argv[2], &flags))
        return enif_make_badarg(env);

    if (!enif_inspect_binary(env, argv[3], &sa))
        return enif_make_badarg(env);

    /* The address binary is passed through as is; its size becomes tolen. */
    if (sendto(sockfd, buf.data, buf.size, flags, (struct sockaddr *)sa.data, sa.size) == -1)
        return enif_make_tuple(env, 2,
            atom_error,
            enif_make_tuple(env, 2,
            enif_make_int(env, errno),
            enif_make_string(env, strerror(errno), ERL_NIF_LATIN1)));

    return atom_ok;
}
The nif_sendto() function takes the Erlang binary and casts it to a sockaddr structure.

Tedium, or the Perils of Constructing Packets by Hand

When requesting a file descriptor using socket(), the PF_PACKET interface allows the user to construct either whole ethernet frames (using the SOCK_RAW type) or cooked packets to which the kernel will prepend ethernet headers (using the SOCK_DGRAM type). I had some problems with SOCK_DGRAM packets which I'll probably talk about in another blog post. But for now, I'll describe how to create ICMP echo packets using the PF_PACKET SOCK_RAW type. To get a file descriptor with the appropriate settings from procket:
{ok, FD} = procket:listen(0, [{protocol, 16#0008}, {type, raw}, {family, packet}])
Notice that, since I'm on a little endian platform, I byte swapped the definition of ETH_P_IP to big endian format.
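In C the byte swap is normally left to htons(), which avoids hard coding the swapped value. A minimal sketch, with the ETH_P_IP value repeated under a made-up name so that it does not depend on Linux headers:

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same value as ETH_P_IP in linux/if_ether.h, repeated here under a
 * hypothetical name so the sketch builds without Linux headers. */
#define SKETCH_ETH_P_IP 0x0800

/* Return the protocol in network (big endian) byte order. */
static uint16_t eth_p_ip_network_order(void)
{
    return htons(SKETCH_ETH_P_IP);
}

/* Copy the in-memory bytes of a 16-bit value into out[2]. */
static void bytes_of(uint16_t v, unsigned char out[2])
{
    assert(out != NULL);
    memcpy(out, &v, sizeof(v));
}
```

Whatever the host byte order, the two bytes in memory come out as 08 00. On a little endian host the integer returned by htons() reads as 0x0008, which is the 16#0008 passed to procket above.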

Retrieving the Interface Index

To figure out the index of our interface, we need to call an ioctl(). Conveniently, procket provides an NIF ioctl() interface. The C ioctl() interface is defined as:
int ioctl(int d, int request, ...);
  • d is the file descriptor.
  • request is an integer representing a device dependent instruction.
  • The remaining argument to ioctl() is usually a buffer holding a device dependent structure. In this case, we will pass an ifreq structure. The buffer acts as both the input and output for the ioctl.
struct ifreq, as defined in net/if.h, is composed of 2 unions.
struct ifreq
{
# define IFHWADDRLEN    6
    union
    {
        char ifrn_name[IFNAMSIZ];   /* Interface name, e.g. "en0".  */
    } ifr_ifrn;

    union
    {
        struct sockaddr ifru_addr;
        struct sockaddr ifru_dstaddr;
        struct sockaddr ifru_broadaddr;
        struct sockaddr ifru_netmask;
        struct sockaddr ifru_hwaddr;
        short int ifru_flags;
        int ifru_ivalue;
        int ifru_mtu;
        struct ifmap ifru_map;
        char ifru_slave[IFNAMSIZ];  /* Just fits the size */
        char ifru_newname[IFNAMSIZ];
        __caddr_t ifru_data;
    } ifr_ifru;
};
But to get the interface index, we're only interested in these struct members:
struct ifreq {
    char ifrn_name[16];
    int  ifr_ifindex;
};
# define ifr_name   ifr_ifrn.ifrn_name  /* interface name   */
# define ifr_ifindex    ifr_ifru.ifru_ivalue    /* interface index      */
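In Erlang terms, we can pass in the full 32 byte structure (only 4 bytes of the second union are actually used). The sizes here assume a 32 byte struct ifreq; on 64-bit Linux the structure is typically larger, so check sizeof(struct ifreq) on your platform. On input, if we are interested in using the "eth0" interface:
<<
"eth0", 0:96,   % ifrn_name, 16 bytes
0:128           % ifr_ifru union for the response, 16 bytes
>>
On output:
<<
"eth0", 0:96,   % ifrn_name, 16 bytes
Ifr:32/native,  % interface index, in host byte order
0:96            % unused
>>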
So, to retrieve the value in Erlang:
{ok, <<_Ifname:16/bytes, Ifr:32/native, _/binary>>} = procket:ioctl(S,
            ?SIOCGIFINDEX,   % request; macro name assumed, 16#8933 on Linux
            list_to_binary([Dev, <<0:((16*8) - (length(Dev)*8)), 0:128>>]))
  • Dev is a list holding the device name, such as "eth0" or "ath0".
  • Ifr is the part of the binary holding the interface index returned by the ioctl().
The corresponding NIF function can be found in procket.c:
static ERL_NIF_TERM
nif_ioctl(ErlNifEnv *env, int argc, const ERL_NIF_TERM argv[])
{
    int s = -1;
    int req = 0;
    ErlNifBinary ifr;   /* buffer holding the ifreq structure */

    if (!enif_get_int(env, argv[0], &s))
        return enif_make_badarg(env);

    if (!enif_get_int(env, argv[1], &req))
        return enif_make_badarg(env);

    if (!enif_inspect_binary(env, argv[2], &ifr))
        return enif_make_badarg(env);

    /* Make the binary writable so ioctl() can store its result in it. */
    if (!enif_realloc_binary(env, &ifr, ifr.size))
        return enif_make_badarg(env);

    if (ioctl(s, req, ifr.data) < 0)
        return error_tuple(env, strerror(errno));

    return enif_make_tuple(env, 2,
            atom_ok,
            enif_make_binary(env, &ifr));
}
The nif_ioctl() function takes, as arguments, the socket descriptor and a binary buffer representing the ifreq structure. The binary is made writable, passed to ioctl() and returned to the caller.

Preparing the ICMP Packet

Unlike the other examples of sending an ICMP packet, we'll need to prepare more than the ICMP header and payload. Because we are sending directly out on the interface, we have to add the ethernet and IPv4 headers.

Ethernet Header

The ethernet header is composed of 6 bytes each for the destination and source MAC addresses and two bytes for the ethernet type.
  • Destination MAC Address:48
  • Source MAC Address:48
  • Type:16
The list of ethernet types can be found in linux/if_ether.h. The Erlang specification for this message format would be (assuming the destination mac address is 00:aa:bb:cc:dd:ee and the source mac address is 00:11:22:33:44:55):
<<
16#00, 16#aa, 16#bb, 16#cc, 16#dd, 16#ee,   % destination MAC address
16#00, 16#11, 16#22, 16#33, 16#44, 16#55,   % source MAC address
16#08, 16#00                                % type: ETH_P_IP
>>
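The same 14 bytes can be assembled in C. This is only a sketch; ether_header() and SKETCH_ETHER_HDR_LEN are names invented here, and the type is written out byte by byte so the result does not depend on host endianness.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ETHER_HDR_LEN 14

/* Write destination MAC, source MAC and ethernet type into out. */
static void ether_header(unsigned char out[SKETCH_ETHER_HDR_LEN],
                         const unsigned char dst[6],
                         const unsigned char src[6],
                         uint16_t type)
{
    assert(out != NULL && dst != NULL && src != NULL);
    memcpy(out, dst, 6);
    memcpy(out + 6, src, 6);
    out[12] = (unsigned char)(type >> 8);    /* big endian on the wire */
    out[13] = (unsigned char)(type & 0xff);
}
```

Called with the two MAC addresses above and a type of 0x0800, it produces the same byte sequence as the Erlang binary.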

IPv4 Header

The IPv4 header is:
  • Version:4
  • IHL:4
  • ToS:8
  • Total Length:16
  • Identification:16
  • Flags:3
  • Fragment Offset:13
  • Time to Live:8
  • Protocol:8
  • Checksum:16
  • Source Address:32
  • Destination Address:32
I won't bother to explain each field. See RFC 791 for details. Constructing an Erlang IPv4 header involves declaring the header once with the checksum field set to zero, performing a checksum on the header, then incorporating the checksum in the 2 byte checksum field.
IPv4 = <<
4:4, 5:4, 0:8, 84:16,
Id:16, 0:1, 1:1, 0:1,
0:13, TTL:8, ?IPPROTO_ICMP:8, 0:16,
SA1:8, SA2:8, SA3:8, SA4:8,
DA1:8, DA2:8, DA3:8, DA4:8
>>
  • Id is a hint for reconstructing fragmented packets by the receiving host.
  • IPPROTO_ICMP is a macro set to 1. The value is defined in netinet/in.h.
  • The checksum field is set to 0 for checksumming purposes. After the checksum has been calculated, the resulting value is placed in this field.
  • The TTL is set to 64. Packets with a time to live of 0 are discarded.
  • SA1 to SA4 are the bytes representing the IPv4 source address.
  • DA1 to DA4 are the bytes representing the IPv4 destination address.
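The checksum step described above (zero the field, sum the header, store the result) can be sketched in C as follows. This is the usual one's complement checksum over 16-bit big endian words, not the code the pkt module uses; the function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One's complement checksum over len bytes, read as big endian
 * 16-bit words. An odd trailing byte is padded with zero. */
static uint16_t inet_checksum(const unsigned char *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    assert(data != NULL);
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (len & 1)
        sum += (uint32_t)data[len - 1] << 8;

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}

/* Zero the checksum field (bytes 10 and 11) of a 20 byte IPv4 header,
 * compute the checksum and store it in the field. */
static void ipv4_set_checksum(unsigned char hdr[20])
{
    uint16_t sum;

    hdr[10] = 0;
    hdr[11] = 0;
    sum = inet_checksum(hdr, 20);
    hdr[10] = (unsigned char)(sum >> 8);
    hdr[11] = (unsigned char)(sum & 0xff);
}
```

Feeding it the IPv4 header of the echo request from the tcpdump capture at the end of this post (4500 0054 1caa 4000 4001, checksum zeroed, c0a8 d5d5 c0a8 d501) yields f1d6, the checksum seen on the wire. Running inet_checksum() over a header that already contains a valid checksum returns 0, which is how a receiver verifies it.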

ICMP Header

I won't go over constructing the ICMP header, since it's been covered in the earlier post on pinging hosts from Erlang.

Finally Sending the Packet

We have a raw PF_PACKET socket, the index of the interface to use for the sendto() operation and a binary representing the ICMP packet and payload, so the pieces are in place to send out the ping. We could bind() the interface and then use write() or send() to push out packets. In this example, we'll instead specify the link layer socket address structure holding the routing information for each packet.
struct sockaddr_ll {
    unsigned short sll_family;   /* Always AF_PACKET */
    unsigned short sll_protocol; /* Physical layer protocol */
    int            sll_ifindex;  /* Interface number */
    unsigned short sll_hatype;   /* Header type */
    unsigned char  sll_pkttype;  /* Packet type */
    unsigned char  sll_halen;    /* Length of address */
    unsigned char  sll_addr[8];  /* Physical layer address */
};
  • sll_family is, as the comment says, always PF_PACKET in host endian format.
  • sll_protocol is usually either ETH_P_ALL or ETH_P_IP. It is passed in big endian format but is defined in the header file in host endian format. For many Linux installs, this will be little endian, so it will need to be byte swapped.
  • sll_halen is the length of the physical layer address. Although up to 8 bytes are allowed for the physical layer address, only 6 bytes are used for ethernet.
<<
?PF_PACKET:16/native,   % sll_family: PF_PACKET
16#0:16,                % sll_protocol: Physical layer protocol, big endian
Interface:32/native,    % sll_ifindex: Interface number
0:16,                   % sll_hatype: Header type
0:8,                    % sll_pkttype: Packet type
0:8,                    % sll_halen: address length

0:8,                    % sll_addr[8]: physical layer address
0:8,                    % sll_addr[8]: physical layer address
0:8,                    % sll_addr[8]: physical layer address
0:8,                    % sll_addr[8]: physical layer address
0:8,                    % sll_addr[8]: physical layer address
0:8,                    % sll_addr[8]: physical layer address

0:8,                    % sll_addr[8]: physical layer address
0:8                     % sll_addr[8]: physical layer address
>>
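In C the same address would normally be filled in through the struct itself. The sketch below zeroes everything, then sets the interface index and the family (the latter because the man page asks for it). link_layer_address() is a name invented for this example, and the header is Linux specific.

```c
#include <assert.h>
#include <linux/if_packet.h>
#include <string.h>
#include <sys/socket.h>

/* Build a link layer address that only names the outgoing interface.
 * Protocol, hardware type and hardware address are left as zero. */
static struct sockaddr_ll link_layer_address(int ifindex)
{
    struct sockaddr_ll sll;

    assert(ifindex > 0);
    memset(&sll, 0, sizeof(sll));
    sll.sll_family = AF_PACKET;     /* host byte order */
    sll.sll_ifindex = ifindex;      /* host byte order */
    return sll;
}
```

The resulting structure is 20 bytes on Linux, the same size as the Erlang binary above, and can be passed to sendto() with a cast to struct sockaddr *.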
From trial and error, only sll_ifindex needs to be set. Even sll_family does not seem to be required in this context, although the man page suggests it is. (Otherwise, sll_halen would be set to 6 and the first 6 bytes of sll_addr to the MAC address of the destination ethernet device.) The source and destination appear to be read directly from the ethernet header.

The pkt module will construct an ethernet frame and send it on the network. The function interface is a bit cumbersome, forcing you to specify the MAC and IP address of both the source and destination, but it allows spoofing packets from different IP/MAC combinations.
    {"eth0", {16#00,16#11,16#22,16#33,16#44,16#55}, {192,168,213,213}},
    {{16#00,16#aa,16#bb,16#cc,16#dd,16#ee}, {192,168,213,1}}
The first argument is a 3-tuple representing the network interface, source MAC and IP address. The second argument is a 2-tuple representing the destination MAC and IP address. Looking at the output from tcpdump:
# tcpdump -n -s 0 -XX -i ath0
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ath0, link-type EN10MB (Ethernet), capture size 65535 bytes
18:58:40.077600 IP 192.168.213.213 > 192.168.213.1: ICMP echo request, id 7338, seq 0, length 64
        0x0000:  0011 2233 4455 00aa bbcc ddee 0800 4500  ....>....Y.&..E.
        0x0010:  0054 1caa 4000 4001 f1d6 c0a8 d5d5 c0a8  .T..@.@.........
        0x0020:  d501 0800 ea06 1caa 0000 0000 04fc 0007  ................
        0x0030:  2ba0 0001 2e02 2021 2223 2425 2627 2829  +......!"#$%&'()
        0x0040:  2a2b 2c2d 2e2f 3031 3233 3435 3637 3839  *+,-./0123456789
        0x0050:  3a3b 3c3d 3e3f 4041 4243 4445 4647 4849  :;<=>?@ABCDEFGHI
        0x0060:  4a4b                                     JK
18:58:40.078464 IP 192.168.213.1 > 192.168.213.213: ICMP echo reply, id 7338, seq 0, length 64
        0x0000:  00aa bbcc ddee 0011 2233 4455 0800 4500  ...Y.&....>...E.
        0x0010:  0054 86ee 0000 4001 c792 c0a8 d501 c0a8  .T....@.........
        0x0020:  d5d5 0000 f206 1caa 0000 0000 04fc 0007  ................
        0x0030:  2ba0 0001 2e02 2021 2223 2425 2627 2829  +......!"#$%&'()
        0x0040:  2a2b 2c2d 2e2f 3031 3233 3435 3637 3839  *+,-./0123456789
        0x0050:  3a3b 3c3d 3e3f 4041 4243 4445 4647 4849  :;<=>?@ABCDEFGHI
        0x0060:  4a4b                                     JK


  1. Hi, I have a small doubt, can't erlang directly retrieve the raw packets using socket options? do we need a procket kind of stuff? to get and set the TCP datagram?

  2. Hey Marutha!

    If you want to manipulate the whole packet (TCP, IP, ... headers), you are bypassing the OS TCP/IP stack and so would have to use something like procket.

    Don't confuse raw packets with the raw option in inet:getopts/2 and inet:setopts/2. These basically allow you to pass arbitrary structs to setsockopt(2) and getsockopt(2). See, for example, socket(7) for socket options.

  4. Thanks for your inputs Michael. I work on windows, does procket work with windows too? If not what should I do to make it work on windows.

  5. Install Ubuntu in a VM :) procket uses a Linux specific interface. Can you provide some details about what you need to do? Feel free to email me, my address is on my github page!