Tuesday, March 9, 2010

When the Bugs Have Bugs

A few months ago, I found that compiling a particular regular expression would crash beam:
N = 819,
re:compile([lists:duplicate(N, $(), lists:duplicate(N, $))]).
After going through a bit of effort, I figured out how to compile a debug version of beam. And then, of course, I discovered the clever minds behind Erlang have already thought about this and made it easy. Essentially, after compiling Erlang:
# Recommended if you are a vi user
# Yes, the debugger forces you to use emacs
cat >> ~/.emacs
(setq viper-mode t)
(require 'viper)
^D

export ERL_TOP=$(pwd)
cd erts/emulator
make debug FLAVOR=plain # or smp
cd ~-
bin/cerl -debug -gdb # -smp
After reading through the source code and adding a few printf's, I tracked the bug down to an incorrect test in PCRE. The magic number (819) apparently comes from:
819 x 5 bytes (capturing bracket) + 3 bytes (opening bracket) = 4098 bytes
The compile workspace is 4096 bytes, so there is a 2-byte overflow. Well, today Philip Hazel, the author of PCRE, corrected the bug. Awesome!! Thanks, Philip!
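The arithmetic can be written down as a quick check. This is only a sketch of the numbers quoted above, not PCRE's actual bookkeeping; the constant and function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Figures quoted in the post; the names are hypothetical. */
enum {
    WORKSPACE_BYTES   = 4096, /* size of the compile workspace */
    BYTES_PER_CAPTURE = 5,    /* cost of each nested capturing bracket */
    OPENING_BYTES     = 3     /* cost of the opening bracket */
};

/* The magic number: 819 nested groups land exactly 2 bytes past the end. */
static_assert(819 * BYTES_PER_CAPTURE + OPENING_BYTES == WORKSPACE_BYTES + 2,
              "819 groups overflow the workspace by 2 bytes");

/* Bytes needed for n nested capturing groups. */
size_t workspace_needed(size_t n)
{
    return n * BYTES_PER_CAPTURE + OPENING_BYTES;
}

/* Bytes written past the end of the workspace (0 if everything fits). */
size_t workspace_overflow(size_t n)
{
    size_t need = workspace_needed(n);

    return need > (size_t) WORKSPACE_BYTES ? need - WORKSPACE_BYTES : 0;
}
```

By this count 818 nested groups still fit (4093 bytes), and 819 is the first value that spills over.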
So here I am making the world safer one bug at a time, preparing a patch for Erlang. Except when I went to test the fix on Mac OS X, beam crashed. Ouch. This time:
% works!
N = 611,
re:compile([lists:duplicate(N, $(), lists:duplicate(N, $))]).

% booo! crashes!!
N = 612,
re:compile([lists:duplicate(N, $(), lists:duplicate(N, $))]).
Except, beam didn't crash when running inside gdb. I figured out the debug beam was non-smp and, after compiling a debug smp version, I got the longest backtrace EVAH.

Yet the same code works with an SMP Erlang on Solaris.
Blah, debugging threaded code is a pain. If someday, someone figures out how to do something malicious with this, please send me a postcard from whatever island retreat you've purchased with all your stolen credit cards or DoS extortions.

Monday, February 15, 2010

Erlang and Excessively Long Hostnames

I've been reading through the Erlang source code, trying to get familiar with it and looking for small bugs.
inet_gethost is a port that handles name lookups. The source code for it is in:
$ERL_TOP/erts/etc/common/inet_gethost.c
inet_gethost is surprisingly complicated, mainly because of portability concerns and probably due to age as well (the code looks to be 10+ years old). inet_gethost works by starting up a master process that forks a pool of slave processes and waits for data coming from stdin. When it reads a packet from the Erlang side, it sends the data to the slave over a pipe. The slave does a gethostbyname() (or the IPv6 equivalents), blocking in the lookup, then writes the response.
Setting the environment variable "ERL_INET_GETHOST_DEBUG" will print out some extra debug messages. The values in the code range from 0 (debug disabled) to 5:
export ERL_INET_GETHOST_DEBUG=5
erl
To test how inet_gethost handles some simple edge cases, we run the following:
1> inet:gethostbyname(lists:duplicate(3,"n")).
inet_gethost[4924] (DEBUG):Saved domainname .
inet_gethost[4924] (DEBUG):Created worker[4925] with fd 3
inet_gethost[4924] (DEBUG):Saved domainname .
inet_gethost[4925] (DEBUG):Worker got request, op = 1, proto = 1, data = nnn.
inet_gethost[4925] (DEBUG):Starting gethostbyname(nnn)
inet_gethost[4925] (DEBUG):gethostbyname error 1
{error,nxdomain}
Increasing the number of characters in the domain name reveals something interesting:
4> inet:gethostbyname(lists:duplicate(100000,"n")).
inet_gethost[4924] (DEBUG):Saved domainname .
inet_gethost[4924] (DEBUG):Saved domainname .
inet_gethost[4924] (DEBUG):reap_children: res = -1, errno = 10.
inet_gethost[4924] (DEBUG):End of file while reading from pipe.
inet_gethost[4924]: WARNING:Malformed reply (header) from worker process 4925.
inet_gethost[4924] (DEBUG):Killing worker[4925] with fd 3, serial 4
{error,timeout}
On Mac OS X, the CPU usage for inet_gethost shoots up as well.
The weird error ("Malformed reply ...") happens because the slave process crashes. If you look at the two outputs, the error message that should be displayed after "Saved domainname." ("Worker got request ...") is never printed. That's because of an overflow that happens when the debug output is printed (buff is only 2048 bytes):
static void debugf(char *format, ...)
{
    char buff[2048];
    char *ptr;
    va_list ap;

    va_start(ap,format);
    sprintf(buff,"%s[%d] (DEBUG):",program_name,(int) getpid());
    ptr = buff + strlen(buff);
    vsprintf(ptr,format,ap);
    strcat(ptr,"\r\n");
    write(2,buff,strlen(buff));
    va_end(ap);
}
Replacing vsprintf() with vsnprintf() fixes that bug, but inet_gethost will still crash on Mac OS X Snow Leopard.
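A bounded version might look something like the sketch below. It is not the patch that went into Erlang: debug_vformat and debug_format are hypothetical helpers split out so the formatting can be exercised without writing to stderr, and program_name is hard-coded here for the sake of a self-contained example.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

static const char *program_name = "inet_gethost"; /* assumption: set elsewhere in the real code */

/* Format one debug line into buff, truncating rather than overflowing.
 * Room is always kept for the trailing "\r\n" and the NUL terminator.
 * Returns the length of the finished line. */
size_t debug_vformat(char *buff, size_t len, const char *format, va_list ap)
{
    size_t used;

    assert(len >= 3);                   /* "\r\n" plus NUL */
    buff[0] = '\0';
    /* Each bounded write leaves at most len - 3 characters in buff. */
    snprintf(buff, len - 2, "%s[%d] (DEBUG):", program_name, (int) getpid());
    used = strlen(buff);
    vsnprintf(buff + used, len - 2 - used, format, ap);
    buff[len - 3] = '\0';               /* still terminated if formatting failed */
    used = strlen(buff);
    memcpy(buff + used, "\r\n", 3);     /* copies the NUL as well */
    return used + 2;
}

/* Variadic wrapper around debug_vformat for callers that own the buffer. */
size_t debug_format(char *buff, size_t len, const char *format, ...)
{
    va_list ap;
    size_t n;

    va_start(ap, format);
    n = debug_vformat(buff, len, format, ap);
    va_end(ap);
    return n;
}

/* Same interface as the original debugf(), minus the overflow. */
void debugf(const char *format, ...)
{
    char buff[2048];
    va_list ap;
    size_t n;
    ssize_t rc;

    va_start(ap, format);
    n = debug_vformat(buff, sizeof(buff), format, ap);
    va_end(ap);
    rc = write(2, buff, n);
    (void) rc;                          /* nothing useful to do if stderr is gone */
}
```

With this, a 100000 character hostname is simply cut off at the end of the 2048 byte buffer, and the line still ends in "\r\n".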
A small program that calls gethostbyname() directly on Mac OS X shows that this is not an Erlang bug.
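The test program itself isn't reproduced here (the syslog output only shows it was run as ./gho). A minimal sketch of what such a test might look like follows; long_hostname and try_lookup are hypothetical names, and the 'n' filler simply mirrors the Erlang example above.

```c
#include <assert.h>
#include <netdb.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Build a hostname consisting of len copies of 'n'. The caller frees it. */
char *long_hostname(size_t len)
{
    char *name = malloc(len + 1);

    if (name == NULL)
        return NULL;
    memset(name, 'n', len);
    name[len] = '\0';
    return name;
}

/* Resolve the name with plain gethostbyname(), with no Erlang involved.
 * Returns 1 if it resolved, 0 if it did not, -1 if out of memory. */
int try_lookup(size_t len)
{
    char *name = long_hostname(len);
    struct hostent *he;

    if (name == NULL)
        return -1;
    he = gethostbyname(name);
    printf("gethostbyname(%lu x 'n'): %s\n", (unsigned long) len,
           he != NULL ? "resolved" : "failed");
    free(name);
    return he != NULL;
}
```

Calling try_lookup(3) and then try_lookup(100000) from main() corresponds to the two lookups made from the Erlang shell.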

Looks like a bug in gethostbyname() on Mac OS X Snow Leopard, while doing a multicast DNS lookup:
Feb 15 12:42:48 ack mDNSResponder[18]:  77: ERROR: read_msg - hdr.datalen 70001 (11171) > 70000
Feb 15 12:42:48 ack ./gho[4852]: dnssd_clientstub write_all(4) failed -1/70028 32 Broken pipe
Feb 15 12:42:48 ack ./gho[4852]: dnssd_clientstub deliver_request ERROR: write_all(4, 70028 bytes) failed
Feb 15 12:42:48 ack ./gho[4852]: dnssd_clientstub write_all(4) failed -1/28 32 Broken pipe

Sunday, February 7, 2010

Erlang Ternary Operators

tidier came up with a cool suggestion for some code. One of the case statements:
Valid = case makesum(Hdr) of
    0 -> true;
    _ -> false
end,
was replaced with:
Valid = makesum(Hdr) =:= 0,
Maybe it's succinctness at the expense of readability, but I liked it.
Sometimes, when I'm programming in Erlang, I miss the C-style ternary operators. You know, the ones that look like this:
i = ( (i == 0) ? 1 : 0);
It's an easy way of toggling between values. In Erlang, if you want to flip between true and false, you could write it as:
NewValue = case Value of
    true -> false;
    false -> true
end
But the tidier way reveals a simpler version:
NewValue = Value =/= true
The new version isn't semantically identical to the old version though.
true = foo =/= true
The old version would throw a case_clause exception if Value were 'foo'; the new version quietly returns true for anything other than true. In both cases, Value may already be constrained by guards, so it may be OK.

Tuesday, January 26, 2010

SoDS and TXT Records

Sometimes the sods client (sdt) will return an error "Invalid base64 encoded packet". If run with the default options, sdt will use TXT records and it's likely that someone, in between you and the sods server, is re-writing the TXT records.

In this particular case, it might be the DNS hosting service that I used for the test domain (GoDaddy) inserting an SPF record. Thanks a bunch for that.

But I've seen hotel networks where TXT records are MITM'ed, for some sort of nefarious Active Directory scheme.

Anyway, the fix is to run sdt with the "-t" flag set to use NULL or CNAME records ("-t null" or "-t cname").

Monday, January 25, 2010

wwallo is a tag cloud for where you are

wwallo uses the HTML5 geolocation capabilities of your browser to generate a tag cloud for wherever you are (or, at least, where it guesses you are). Right now, wwallo gathers data just by looking at Twitter; in my experience that means your location is marked by people bitching about work and school, dropping the f-bomb, boasting about how drunk they've gotten or will get and trying to be clever (aka twitter "trends"). Also, celebrity gossip. Awesomeness. But maybe it's just where I live.

So, besides the insight into that little slice of life outside your dark, linux infested man cave and creeping on the cute neighbourhood girls (not that I condone such behaviour, you perv o.O), what is wwallo useful for? Aside from marvelling about the inaccuracy of Twitter's geographic data and meditating on the 504's resulting from their "RESTful" API?

wwallo, like a lot of web applications nowadays, is just a bunch of (well, 2, sort of) RESTful interfaces exposed as an API that a Javascript application calls ("exposes" an API is sort of funny, teetering between jargon and a criminal warrant for indecent usage). The application, in this case, runs in a browser.

Well, you can see for yourself. I'll wait.

It's pretty simple, isn't it? None of the complications you've probably been dreading if you've worked with any enterprise web applications <cough>SOAP<cough>. The interfaces are simple and predictable, therefore safer. With many web apps, even the developer can't tell you all the paths for input and output of data (hence, the need for tools like LAPSE in the Java world (BTW, check out the URL for LAPSE, that username is awesome and inspires, in me, just a bit of jealousy)).

You can try out wwallo yourself by visiting the website. It's running on a test server for now but maybe someday it'll move to somewhere permanent.

Using wwallo's RESTful Interface


If for some insane reason you want to integrate your web application with wwallo, it's simple. Given the user's location, use a GET request:
http://wwallo.com/geocode/<lat>,<lon>,<radius>km
The radius must be in kilometers (don't even ask, just multiply by 1.6 already).

If you don't know the user's location, you can call into wwallo using the following and wwallo will figure out the location by the IP address:
http://wwallo.com/geocode/null,<radius>km
But if you're proxying the user's request from your service, well, you've just made a tag cloud for wherever your server, not your user, is.
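Building the two URL forms from a client could be sketched like this. The helper names are made up, and printing the coordinates with six decimal places and the radius as a whole number of kilometers is an assumption based on the "pos" value in the sample response, not a documented requirement of the service.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define WWALLO_GEOCODE "http://wwallo.com/geocode/"

/* URL for a known location. Returns 0 on success, -1 if buf is too small. */
int wwallo_url(char *buf, size_t len, double lat, double lon, unsigned radius_km)
{
    int n;

    assert(buf != NULL && len > 0);
    n = snprintf(buf, len, WWALLO_GEOCODE "%f,%f,%ukm", lat, lon, radius_km);
    return (n < 0 || (size_t) n >= len) ? -1 : 0;
}

/* URL that lets wwallo guess the location from the caller's IP address. */
int wwallo_url_by_ip(char *buf, size_t len, unsigned radius_km)
{
    int n;

    assert(buf != NULL && len > 0);
    n = snprintf(buf, len, WWALLO_GEOCODE "null,%ukm", radius_km);
    return (n < 0 || (size_t) n >= len) ? -1 : 0;
}
```

For example, wwallo_url(url, sizeof(url), 45.0, -69.5, 5) produces http://wwallo.com/geocode/45.000000,-69.500000,5km.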

wwallo will return a JSON data structure. The beauty of webmachine is that it can, very simply, be made to return any presentation format; however, at the moment, wwallo only returns JSON.
"pos" is the location based on your IP address. "address" is, obviously, the source IP address. "tweet", "author" and "image" are arrays sharing a common offset, so the same index in each refers to the same tweet. "count" maps each word, after processing through a stemming library, to the number of times it appears.
{
    "pos": "45.000000,-69.500000,5km",
    "address": "1.1.1.11",
    "country": "Antartica",
    "city": "Penguinistaville",
    "tweet": [
        "What am I going to spend my Saturdays doing now that college football is done??",
        "@SI_PeterKing dick butkis still my fav name! http://myloc.me/2KNBX",
        "@SI_PeterKing I the ray guy sent u that tweet using a different name! Lol http://myloc.me/2KNqq",
        "I bought my 1st snow shovel today. Not really sure how I feel about that."
    ],
    "author": [
        "clint_on_barley",
        "Botha420",
        "Botha420",
        "clint_on_barley"
    ],
    "image": [
        "http://a1.twimg.com/profile_images/579214364/Photo_on_2009-11-23_at_10.20__5_normal.jpg",
        "http://a1.twimg.com/profile_images/280041192/1041_normal.gif",
        "http://a1.twimg.com/profile_images/280041192/1041_normal.gif",
        "http://a1.twimg.com/profile_images/579214364/Photo_on_2009-11-23_at_10.20__5_normal.jpg"
    ],
    "count": {
        "that": "5",
        "be": "2",
        "big": "2",
        "know": "2",
        "love": "2",
        "not": "2",
        "1st": "1",
        "abl": "1",
        "about": "1",
        "after": "1",
        "appear": "1",
        "appl": "1",
        "around": "1",
        "becom": "1"
    }
}
In conclusion, wwallo allows you to voyeuristically live your life through the lives of others who aren't leading very interesting lives either. Maybe the next version of wwallo should be a tagcloud for where you want to be, geocoordinates set to Tahiti.

And did I mention that webmachine and jQuery are freaking awesome?

Monday, January 11, 2010

Continuing the procket clean up

Some small changes to procket. If you're playing along at home, the changes don't affect the procket:listen/1 and procket:listen/2 interface.

The procket NIF now requires the elements to be acted on (the path to the Unix socket and/or the file descriptor of the listener on the Unix socket) to be explicitly passed. The Erlang code must track the path to the Unix socket, the fd listening on the Unix socket and the privileged socket returned from procket_cmd.

Since the procket NIF does not internally track any state or allocate memory, it can handle multiple file descriptors up to the process's resource limits.

The changes from the commit message:

open/2 -> open/1 : vestigial protocol arg removed, stick to streams. Erlang module changed to match (along with the bizarre passing in of the port as a protocol, who did that? o_O)

Returns the socket descriptor listening on the Unix socket: {ok, FD}

poll/0 -> poll/1 : takes the socket descriptor, returns {ok, Socket}

close/0 -> close/2 : close(SocketPath, SocketDescriptor), closes the socket descriptor and deletes the socket path, returns "ok"

Saturday, January 9, 2010

Sockets with Privileges, or Avoiding Setuid Erlang

There are a few things that have to be handled in process, or are at least easier to handle that way: PAM, GSS-API, file descriptors .... While it may be possible to handle these using a separate process, like an Erlang port, it's more comfortable to handle them within the virtual machine. Previously, the only way to do this would have been by using a linked-in driver; though I haven't tried writing one, they have the reputation of being complex and error prone, an easy way of reducing the reliability of the Erlang VM.

With the introduction of the Erlang NIF interface, there now exists a simple, though synchronous, method of interfacing with C libraries. Using the NIF interface is pretty easy.

One of the problems with using a virtual machine is dealing with actions that require heightened privileges. In a standalone program, you would simply run as root, do whatever needs to be done as root, then drop your privileges. With a language running in a virtual machine, the VM does way too much before your code would be able to drop its privs. The result is unpredictable.

Ways of Handling Privileged Sockets


One example is requesting a socket on a low port.

For completeness, there are many ways of handling privileged sockets without resorting to setuid:
  1. Use the firewalling capability of the OS, like iptables, to forward packets from the low port to the unrestricted port on which your application is running.
  2. Use a separate device, like a load balancer, router, or firewall, to forward packets from the low port to your service.
  3. Run an application in userspace that binds the low port and forwards packets; this could be a dedicated proxy like squid or nginx or a small shim like socat.
  4. Run as root. All the power of Erlang combined with complete system access! What could go wrong?

    For certain tasks, this may be the way to go though. It'd be interesting (and very likely awesome) to see a version of cfengine (or puppet or chef) written in Erlang.
There may be times that doing any of the above is not too convenient. procket provides a way for the Erlang VM to grab a privileged socket without too much hassle.

procket consists of an NIF shared library and a setuid executable. On Unix systems, file descriptors are allowed to be passed between processes using local (Unix) domain sockets. procket works by creating a local domain socket from within the Erlang VM using an NIF, listening on the socket, then spawning a setuid executable that binds the privileged socket. The setuid executable then drops its privs and passes the file descriptor back to the VM over the local socket.

To handle the actual setup of the local domain sockets, I used libancillary (which quite rightly bills itself as "black magic on Unix domain sockets").
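For the curious, the mechanism underneath is sendmsg()/recvmsg() with an SCM_RIGHTS control message. The sketch below shows that raw mechanism only; it is neither libancillary's API nor procket's code, and send_fd, recv_fd and fd_ctrl are names invented for the example.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control-message buffer sized and aligned for exactly one descriptor. */
union fd_ctrl {
    struct cmsghdr hdr;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Send descriptor fd over the connected Unix socket sock.
 * Returns 0 on success, -1 on error. */
int send_fd(int sock, int fd)
{
    char byte = 0;                      /* one byte of ordinary data carries the message */
    struct iovec iov;
    union fd_ctrl ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    iov.iov_base = &byte;
    iov.iov_len = 1;
    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;       /* "the payload is file descriptors" */
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from sock. Returns the new descriptor or -1. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov;
    union fd_ctrl ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    iov.iov_base = &byte;
    iov.iov_len = 1;
    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The receiver gets a new descriptor number that refers to the same open socket, which is why the privileged binding survives the trip from the setuid helper back into the VM.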

The Erlang NIF


The NIF consists of a C file that is compiled into a shared library and an Erlang module to load the library. The NIF interface is set up using a macro:
ERL_NIF_INIT(procket, nif_funcs, load, reload, NULL, NULL)
The first argument is the name of the module to be called from within Erlang. The second argument points to the struct holding our exports. In this case:
static ErlNifFunc nif_funcs[] = {
        {"open", 2, sock_open},
        {"poll", 0, poll},
        {"close", 0, sock_close}
};
The export "open" in the procket module, called with an arity of 2, calls our C function sock_open.

The NIF can optionally call other functions based on certain events (as defined in the ERL_NIF_INIT() macro), such as on load, upgrade, etc. I've only handled load and reload to allocate memory for the data structures.

NIFs can hold state. While it would be more predictable and safer to require arguments to be returned and passed in again on each function call (the NIF holds the file descriptor and path to the Unix socket), for expediency, I didn't do it. And even in this small example, it led to a few annoying bugs. Let that be a lesson.

The module interface has three actions: open, poll and close.

"open" takes a path (a string) to the location of the Unix domain socket and a port number. For safety, the socket should probably reside somewhere only the user running the Erlang VM has access to (by default, the socket will be placed under a directory in /tmp or wherever you have TMPDIR set to). On some platforms, Unix sockets are created with mode 777, which could lead to hijinks on multi-user systems.

"open" creates the Unix socket, sets the socket to non-blocking, then listens on it.

"poll" attempts to accept a connection on the Unix socket. If there is a connected client, it will receive the message (our file descriptor) and return a tuple in the format {ok, FD}.

"close" removes the socket and cleans up the data structure and pipe file descriptor.

The socket path and Unix socket file descriptor are stored in a data structure within the NIF, so the caller doesn't need to provide them when using poll() or close(). Arguably, both poll() and close() should have an arity of 2 as well (path, pipe file descriptor); that way the module would be stateless and would be able to handle as many sockets as required by the program. Probably a mistake that should be corrected in the future.
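Stripped of the NIF plumbing, what "open" does to the Unix socket can be sketched in plain C. This is an illustration under assumptions rather than procket's source: make_socket_path and unix_listen are hypothetical names, and /tmp is hard-coded instead of honoring TMPDIR.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Create a private directory and write "<dir>/sock" into path.
 * mkdtemp() creates the directory atomically, with mode 0700, so other
 * users cannot reach the socket whatever mode the socket itself gets. */
int make_socket_path(char *path, size_t len)
{
    char dir[] = "/tmp/procket-XXXXXX";
    int n;

    assert(path != NULL);
    if (mkdtemp(dir) == NULL)
        return -1;
    n = snprintf(path, len, "%s/sock", dir);
    return (n < 0 || (size_t) n >= len) ? -1 : 0;
}

/* Create the Unix socket at path, make it non-blocking and listen on it.
 * Returns the listening descriptor or -1. */
int unix_listen(const char *path)
{
    struct sockaddr_un sa;
    int fd, flags;

    if (strlen(path) >= sizeof(sa.sun_path))
        return -1;
    memset(&sa, 0, sizeof(sa));
    sa.sun_family = AF_UNIX;
    memcpy(sa.sun_path, path, strlen(path) + 1);

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ||
        bind(fd, (struct sockaddr *) &sa, sizeof(sa)) < 0) {
        close(fd);
        return -1;
    }
    if (listen(fd, 1) < 0) {
        unlink(path);
        close(fd);
        return -1;
    }
    return fd;
}
```

Because the listener is non-blocking, an accept() made before the setuid helper has connected fails with EAGAIN instead of stalling the caller, which is what lets "poll" return straight away.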

The Setuid Binary


procket_cmd.c is spawned from the procket Erlang module with root privileges, using setuid or sudo. It essentially:
  1. parses the command line arguments (potentially dangerous since we are parsing user input and performing actions, such as memory allocation, based on it)
  2. opens the socket and performs any additional operations required by the protocol
  3. drops its privileges
  4. connects to the Unix socket and writes the socket file descriptor
  5. exits
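
The bind and the privilege drop are the security-sensitive steps. A minimal sketch, not procket_cmd's actual code: bind_udp and drop_privs are hypothetical names, and the sketch only covers the setuid case (under sudo the real uid is already root, so the helper would also need to be told which user to become).

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Open a UDP socket and bind it to port on all IPv4 interfaces.
 * Ports below 1024 need root. Returns the descriptor or -1. */
int bind_udp(unsigned short port)
{
    struct sockaddr_in sa;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_ANY);
    sa.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *) &sa, sizeof(sa)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Give up setuid privileges by switching to the real gid and uid.
 * Returns 0 on success, -1 if the drop failed or did not stick. */
int drop_privs(void)
{
    /* Group first: once the uid is dropped, changing the gid may no
     * longer be permitted. */
    if (setgid(getgid()) < 0 || setuid(getuid()) < 0)
        return -1;
    return (geteuid() == getuid() && getegid() == getgid()) ? 0 : -1;
}
```

The already-bound descriptor keeps working after the drop, so everything from that point on, including the write to the Unix socket, runs unprivileged.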

On error, the "procket" command will print out some debug output.

The Erlang Module


The Erlang module's interface mirrors the C NIF: open/2, poll/0 and close/0. To make it easier to use, these exports are wrapped by listen/1 and listen/2.
listen(Port) ->
    listen(Port, []).
listen(Port, Options) when is_integer(Port), is_list(Options) ->
    Opt = case proplists:lookup(pipe, Options) of
        none -> Options ++ [{pipe, mktmp:dir() ++ "/sock"}];
        _ -> Options
    end,
    ok = open(proplists:get_value(pipe, Opt), Port),
    Cmd = make_args(Port, Opt),
    case os:cmd(Cmd) of
        [] ->
            poll();
        Error ->
            {error, procket_cmd, Error}
    end.
listen/2 composes the command line for the procket binary, atomically creates a unique path for the Unix socket if one was not provided (to prevent symlink race conditions), runs the setuid executable then polls the Unix socket.

Preparing command line arguments to be passed to an executable is, admittedly, pretty ugly and risky, in a sort of retro, pre-millennial CGI sort of way. Passing the arguments from the program is fine, but allowing any sort of user input is just nasty. I once hacked a web app by putting a ";some command here" into the password field of an account signup (account details, I discovered, were set up by running a command in the shell).

An Example


Here's how you would use procket:
$ cd procket
$ erl -pa ebin
Erlang R13B03 (erts-5.7.4) [source] [rq:1] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.7.4  (abort with ^G)
1> {ok, FD} = procket:listen(53, [{progname, "sudo priv/procket"},{protocol, udp}]).
{ok,9}
netstat should display a socket listening on port 53/udp.
2> {ok, S} = gen_udp:open(53, [{fd,FD}]).
{ok,#Port<0.929>}
gen_udp:open/2 and gen_tcp:listen/2 both take a file descriptor as an option.
3> receive M -> M end.
Use netcat to send some data to the Erlang process:
nc -u localhost 53
blah
^C
At your Erlang prompt, you should see something like:
{udp,#Port<0.929>,{127,0,0,1},47483,"blah\n"}
There is also an example of an echo server using procket in the source, see the README.

At any rate, I hope this will be marginally useful to other Erlang n00bs. In the future, I am going to play with setting socket options and requesting raw sockets. This will allow creating ICMP packets, like ping, within Erlang, as well as doing some packet manipulation.