Tuesday, January 26, 2010

SoDS and TXT Records

Sometimes the sods client (sdt) will return an error "Invalid base64 encoded packet". If run with the default options, sdt will use TXT records and it's likely that someone, in between you and the sods server, is re-writing the TXT records.

In this particular case, it might be the DNS hosting service that I used for the test domain (GoDaddy) inserting an SPF record. Thanks a bunch for that.

But I've seen hotel networks where TXT records are MITM'ed, for some sort of nefarious Active Directory scheme.

Anyway, the fix is to run sdt with the "-t" flag set to use NULL or CNAME records ("-t null" or "-t cname").

Monday, January 25, 2010

wwallo is a tag cloud for where you are

wwallo uses the HTML5 geolocation capabilities of your browser to generate a tag cloud for wherever you are (or, at least, where it guesses you are). Right now, wwallo gathers data just by looking at Twitter; in my experience that means your location is marked by people bitching about work and school, dropping the f-bomb, bragging about how drunk they've gotten or will get and trying to be clever (twitter "trends"). Also, celebrity gossip. Awesomeness. But maybe it's just where I live.

So, beside the insight into that little slice of life outside your dark, linux infested man cave and creeping on the cute neighbourhood girls (not that I condone such behaviour, you perv o.O), what is wwallo useful for? Aside from marvelling at the inaccuracy of Twitter's geographic data and meditating on the 504's resulting from their "RESTful" API?

wwallo, like a lot of web applications nowadays, is just a bunch of (well, 2, sort of) RESTful interfaces exposed as an API that a JavaScript application calls ("exposes" an API is sort of funny, teetering between jargon and a criminal warrant for indecent usage). The application, in this case, runs in a browser.

Well, you can see for yourself. I'll wait.

It's pretty simple, isn't it? None of the complications you've probably been dreading if you've worked with any enterprise web applications <cough>SOAP<cough>. The interfaces are simple and predictable, therefore safer. With many web apps, even the developer can't tell you all the paths for input and output of data (hence, the need for tools like LAPSE in the Java world (BTW, check out the URL for LAPSE, that username is awesome and inspires, in me, just a bit of jealousy)).

You can try out wwallo yourself by visiting the website. It's running on a test server for now but maybe someday it'll move to somewhere permanent.

Using wwallo's RESTful Interface

If for some insane reason you want to integrate your web application with wwallo, it's simple. Given the user's location, use a GET request:<lat>,<lon>,<radius>km
The radius must be in kilometers (don't even ask, just multiply by 1.6 already).

If you don't know the user's location, you can call into wwallo using the following and wwallo will figure out the location by the IP address:,<radius>km
But if you're proxying the user's request from your service, well, you've just made a tag cloud for wherever your server, not your user, is.
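
If you do know the location, the query suffix can be assembled with a plain format string. This is only a sketch: wwallo_query and miles_to_km are hypothetical helper names, the six-decimal coordinate format is an assumption copied from the "pos" field of the sample response, and the base URL is deliberately left out.

```c
#include <stdio.h>
#include <string.h>

/* Statute miles to kilometers (1 mile is 1.609344 km). */
double miles_to_km(double miles)
{
    return miles * 1.609344;
}

/* Write "<lat>,<lon>,<radius>km" into buf.
 * Returns 0 on success, -1 if the buffer is too small or formatting fails.
 * Assumption: six decimal places, as in the "pos" field of the response. */
int wwallo_query(char *buf, size_t len, double lat, double lon,
                 unsigned radius_km)
{
    int n = snprintf(buf, len, "%f,%f,%ukm", lat, lon, radius_km);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The caller appends the resulting string to the wwallo URL before issuing the GET request.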

wwallo will return a JSON data structure. The beauty of webmachine is that it can, very simply, be made to return any presentation format; however, at the moment, wwallo only returns JSON.
"pos" is the location based on your IP address. "address" is, obviously, the source IP address. "tweet", "author" and "image" are arrays, sharing a common offset. "count" contains each word, after processing through a stemming library.
{
    "pos": "45.000000,-69.500000,5km",
    "address": "",
    "country": "Antartica",
    "city": "Penguinistaville",
    "tweet": [
        "What am I going to spend my Saturdays doing now that college football is done??",
        "@SI_PeterKing dick butkis still my fav name!",
        "@SI_PeterKing I the ray guy sent u that tweet using a different name! Lol",
        "I bought my 1st snow shovel today. Not really sure how I feel about that.",
        ...
    ],
    "author": [ ... ],
    "image": [ ... ],
    "count": {
        "that": "5",
        ...
    }
}

In conclusion, wwallo allows you to voyeuristically live your life through the lives of others who aren't leading very interesting lives either. Maybe the next version of wwallo should be a tagcloud for where you want to be, geocoordinates set to Tahiti.

And did I mention that webmachine and jQuery are freaking awesome?

Monday, January 11, 2010

Continuing the procket clean up

Some small changes to procket. If you're playing along at home, the changes don't affect the procket:listen/1 and procket:listen/2 interface.

The procket NIF now requires the elements to be acted on (the path to the Unix socket and/or the file descriptor of the listener on the Unix socket) to be explicitly passed. The Erlang code must track the path to the Unix socket, the fd listening on the Unix socket and the privileged socket returned from procket_cmd.

Since the procket NIF does not internally track any state or allocate memory, it can handle multiple file descriptors up to the process's resource limits.

The changes from the commit message:

open/2 -> open/1 : vestigial protocol arg removed, stick to streams. Erlang module changed to match (along with the bizarre passing in of the port as a protocol, who did that? o_O)

Returns the socket descriptor listening on the Unix socket: {ok, FD}

poll/0 -> poll/1 : takes the socket descriptor, returns {ok, Socket}

close/0 -> close/2 : close(SocketPath, SocketDescriptor), closes the socket descriptor and deletes the socket path, returns "ok"

Saturday, January 9, 2010

Sockets with Privileges, or Avoiding Setuid Erlang

There are a few things that require in-process handling, or are at least easier to deal with in process: PAM, GSS-API, file descriptors .... While it may be possible to handle these using a separate process, like an Erlang port, it's more comfortable to handle them within the virtual machine. Previously, the only way to do this would have been by using a linked-in driver; though I haven't tried writing one, they have the reputation of being complex and error-prone, an easy way of reducing the reliability of the Erlang VM.

With the introduction of the Erlang NIF interface, there now exists a simple, though synchronous, method of interfacing with C libraries. Using the NIF interface is pretty easy.

One of the problems with using a virtual machine is dealing with actions that require heightened privileges. In a standalone program, you would simply run as root, do whatever needs to be done as root, then drop your privileges. With a language running in a virtual machine, the VM does way too much before your code would be able to drop its privs. The result is unpredictable.

Ways of Handling Privileged Sockets

One example is requesting a socket on a low port.

For completeness, there are many ways of handling privileged sockets without resorting to setuid:
  1. Use the firewalling capability of the OS, like iptables, to forward packets from the low port to the unrestricted port on which your application is running.
  2. Use a separate device, like a load balancer, router, or firewall, to forward packets from the low port to your service.
  3. Run an application in userspace that binds the low port and forwards packets; this could be a dedicated proxy like squid or nginx or a small shim like socat.
  4. Run as root. All the power of Erlang combined with complete system access! What could go wrong?

    For certain tasks, this may be the way to go though. It'd be interesting (and very likely awesome) to see a version of cfengine (or puppet or chef) written in Erlang.

There may be times when none of the above is convenient. procket provides a way for the Erlang VM to grab a privileged socket without too much hassle.

procket consists of an NIF shared library and a setuid executable. On Unix systems, file descriptors are allowed to be passed between processes using local (Unix) domain sockets. procket works by creating a local domain socket from within the Erlang VM using an NIF, listening on the socket, then spawning a setuid executable that binds the privileged socket. The setuid executable then drops its privs and passes the file descriptor back to the VM over the local socket.
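
For the curious, here is roughly what passing a descriptor over a Unix socket looks like when done by hand with sendmsg(2) and recvmsg(2). This is a minimal sketch, not procket's code and not the libancillary API: send_fd and recv_fd are hypothetical names, and error handling is kept to the bare minimum.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send file descriptor `fd` over the connected Unix socket `sock`.
 * Returns 0 on success, -1 on error. */
int send_fd(int sock, int fd)
{
    char byte = 0;                      /* one byte of ordinary data carries the fd */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {                             /* union keeps the buffer aligned for cmsghdr */
        struct cmsghdr align;
        char buf[CMSG_SPACE(sizeof(int))];
    } ctl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&ctl, 0, sizeof(ctl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;       /* "the payload is file descriptors" */
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one file descriptor from `sock`.
 * Returns the new descriptor, or -1 on error. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        struct cmsghdr align;
        char buf[CMSG_SPACE(sizeof(int))];
    } ctl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd = -1;

    memset(&ctl, 0, sizeof(ctl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;

    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len < CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The descriptor that comes out of recv_fd is a new number in the receiving process, but it refers to the same open socket (or file, or pipe) the sender held.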

To handle the actual setup of the local domain sockets, I used libancillary (which quite rightly bills itself as "black magic on Unix domain sockets").

The Erlang NIF

The NIF consists of a C file that is compiled into a shared library and an Erlang module to load the library. The NIF interface is set up using a macro:
ERL_NIF_INIT(procket, nif_funcs, load, reload, NULL, NULL)
The first argument is the name of the module to be called from within Erlang. The second argument points to the struct holding our exports. In this case:
static ErlNifFunc nif_funcs[] = {
        {"open", 2, sock_open},
        {"poll", 0, poll},
        {"close", 0, sock_close}
};
The export "open" in the procket module, called with an arity of 2, calls our C function sock_open.

The NIF can optionally call other functions based on certain events (as defined in the ERL_NIF_INIT() macro), such as on load, upgrade, etc. I've only handled load and reload to allocate memory for the data structures.

NIFs can hold state. While it would be more predictable and safer to require arguments to be returned and passed in again on each function call (the NIF holds the file descriptor and path to the Unix socket), for expediency, I didn't do it. And even in this small example, it led to a few annoying bugs. Let that be a lesson.

The module interface has three actions: open, poll and close.

"open" takes a path (a string) to the location of the Unix domain socket and a port number. For safety, the socket should probably reside somewhere only the user running the Erlang VM has access to (by default, the socket will be placed under a directory in /tmp or wherever you have TMPDIR set to). On some platforms, Unix sockets are created with mode 777, which could lead to hijinks on multi-user systems.

"open" creates the Unix socket, sets the socket to non-blocking, then listens on it.

"poll" attempts to accept a connection on the Unix socket. If there is a connected client, it will receive the message (our file descriptor) and return a tuple in the format {ok, FD}.

"close" removes the socket and cleans up the data structure and pipe file descriptor.

The socket path and Unix socket file descriptor are stored in a data structure within the NIF, so the caller doesn't need to provide them when using poll() or close(). Arguably, both poll() and close() should have an arity of 2 as well (path, pipe file descriptor); that way the module would be stateless and would be able to handle as many sockets as required by the program. Probably a mistake that should be corrected in the future.

The Setuid Binary

The binary built from procket_cmd.c is spawned from the procket Erlang module with root privileges, using setuid or sudo. It essentially:
  1. parses the command line arguments (potentially dangerous since we are parsing user input and performing actions, such as memory allocation, based on it)
  2. opens the socket and performs any additional operations required by the protocol
  3. drops its privileges
  4. connects to the Unix socket and writes the socket file descriptor
  5. exits
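
Steps 2 and 3 are the heart of it: bind while still privileged, then give the privileges up for good. Here is a minimal sketch of that sequence, assuming a setuid-root binary and a UDP socket; drop_privs and bind_then_drop are hypothetical names, not functions from procket_cmd.c.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Permanently give up setuid/setgid privileges by switching the effective
 * ids back to the real ids. Returns 0 on success, -1 on error. */
int drop_privs(void)
{
    /* Group first: after the uid is dropped, setgid() may be refused. */
    if (setgid(getgid()) < 0)
        return -1;
    if (setuid(getuid()) < 0)
        return -1;

    /* Paranoia: check that the effective ids really match the real ids. */
    if (geteuid() != getuid() || getegid() != getgid())
        return -1;

    return 0;
}

/* Bind a UDP socket to `port` on all interfaces, then drop privileges.
 * Returns the bound descriptor, or -1 on error. */
int bind_then_drop(unsigned short port)
{
    struct sockaddr_in sa;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;

    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_port = htons(port);
    sa.sin_addr.s_addr = htonl(INADDR_ANY);

    if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0 ||
        drop_privs() < 0) {
        close(fd);
        return -1;
    }

    return fd;
}
```

Note that when the helper is started through sudo the real uid is already root, so switching back to the real ids drops nothing; a helper run that way would have to change to some designated unprivileged user instead, which this sketch does not attempt.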

On error, the "procket" command will print out some debug output.

The Erlang Module

The Erlang module's interface mirrors the C NIF: open/2, poll/0 and close/0. To make it easier to use, these exports are wrapped by listen/1 and listen/2.
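%% listen/1: bind Port using the default options.
listen(Port) ->
    listen(Port, []).

listen(Port, Options) when is_integer(Port), is_list(Options) ->
    %% Use the caller's Unix socket path, or create one in a fresh temp dir.
    Opt = case proplists:lookup(pipe, Options) of
        none -> Options ++ [{pipe, mktmp:dir() ++ "/sock"}];
        _ -> Options
    end,
    ok = open(proplists:get_value(pipe, Opt), Port),
    Cmd = make_args(Port, Opt),
    %% No output from the helper means success; anything else is an error.
    case os:cmd(Cmd) of
        [] ->
            poll();
        Error ->
            {error, procket_cmd, Error}
    end.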
listen/2 composes the command line for the procket binary, atomically creates a unique path for the Unix socket if one was not provided (to prevent symlink race conditions), runs the setuid executable then polls the Unix socket.

Preparing command line arguments to be passed to an executable is, admittedly, pretty ugly and risky, in a sort of retro, pre-millennial CGI sort of way. Passing the arguments from the program is fine, but allowing any sort of user input is just nasty. I once hacked a web app by putting a ";some command here" into the password field of an account signup (account details, I discovered, were set up by running a command in the shell).
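
If user input ever did have to end up on a command line, one defensive option is to whitelist characters before anything gets near a shell. A minimal sketch of the idea in C follows; is_safe_arg is a hypothetical helper, not something procket does, and the allowed character set is an assumption that happens to cover port numbers and simple paths.

```c
#include <stddef.h>

/* Return 1 if `s` is non-empty, does not start with '-' (so it cannot be
 * mistaken for an option) and contains only letters, digits and the
 * characters '.', '_', '/' and '-'. Return 0 otherwise. */
int is_safe_arg(const char *s)
{
    size_t i;

    if (s == NULL || s[0] == '\0' || s[0] == '-')
        return 0;

    for (i = 0; s[i] != '\0'; i++) {
        char c = s[i];
        int ok = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
                 (c >= '0' && c <= '9') ||
                 c == '.' || c == '_' || c == '/' || c == '-';

        if (!ok)
            return 0;                   /* shell metacharacter, space, etc. */
    }

    return 1;
}
```

A password field containing ";some command here" fails the check on its first character. Better still is to skip the shell entirely and hand the executable an argument vector.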

An Example

Here's how you would use procket:
$ cd procket
$ erl -pa ebin
Erlang R13B03 (erts-5.7.4) [source] [rq:1] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.7.4  (abort with ^G)
1> {ok, FD} = procket:listen(53, [{progname, "sudo priv/procket"},{protocol, udp}]).
netstat should display a socket listening on port 53/udp.
2> {ok, S} = gen_udp:open(53, [{fd,FD}]).
gen_udp:open/2 and gen_tcp:listen/2 both take a file descriptor as an option.
3> receive M -> M end.
Use netcat to send some data to the Erlang process:
nc -u localhost 53
At your Erlang prompt, you should see something like:
There is also an example of an echo server using procket in the source, see the README.

At any rate, I hope this will be marginally useful to other Erlang n00bs. In the future, I am going to play with setting socket options and requesting raw sockets. This will allow creating ICMP packets, like ping, within Erlang, as well as doing some packet manipulation.

Monday, January 4, 2010

SoDS and Domain Names

As an experiment in further obscuring traffic, I made a small change to the sods client and server to take multiple domain names from the command line. The maximum number of domains is hard coded to 256, just because.

The domain names can either be subdomains, if you have only one domain (e.g., ...) or unique domains (e.g., ...).

The sods server needs to be started with the same list of domains (e.g., ...) or the DNS requests will be rejected. If you want to disable this behaviour, start the sods server with the domain name set to "any". (The sods server checks the domain name to prevent it from answering DNS scans, otherwise it wouldn't be too stealthy. Of course, if someone knows your domain name, they can easily scan for sods, spoof requests, etc.)
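
The check amounts to a suffix match of the query name against the configured list. Here is a sketch of how that might look, which is not the sods source: domain_allowed is a hypothetical name, and the case-insensitive comparison and label-boundary test are my assumptions about what a sensible check would do.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Return 1 if `qname` equals, or is a subdomain of, one of the `n`
 * configured domains, or if any configured domain is the wildcard "any".
 * Return 0 otherwise. */
int domain_allowed(const char *qname, const char *const *domains, size_t n)
{
    size_t qlen = strlen(qname);
    size_t i;

    for (i = 0; i < n; i++) {
        size_t dlen = strlen(domains[i]);

        if (strcmp(domains[i], "any") == 0)
            return 1;                   /* domain checking disabled */

        if (dlen == 0 || dlen > qlen)
            continue;

        /* DNS names compare case-insensitively. */
        if (strcasecmp(qname + (qlen - dlen), domains[i]) != 0)
            continue;

        /* Only match on a label boundary, so "badexample.net" does not
         * pass for "example.net". */
        if (qlen == dlen || qname[qlen - dlen - 1] == '.')
            return 1;
    }

    return 0;
}
```

With 256 domains at most, a linear scan per request is cheap enough.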

I wonder if DNS servers ever throttle by domain. It would make more sense to restrict queries by client IP address, though even this could be worked around on a non-switched network, like most public wifi internet access, by ARP'ing fake MAC/IP addresses, sending out UDP packets from these IPs and sniffing the responses. On a switched network, the same effect is achievable in tandem with ettercap. Maybe a future feature to think about.