The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Thursday, March 08, 2007

Scaling daemons

I'm still deep in programming.

So now I'm writing a daemon.

The first problem was getting MySQL to contact the daemon, and I forgot: I'm working under Unix, and interprocess communication sucks under Unix. You have pipes, but they only work between processes that have a common ancestor. You can get around that by using named pipes, which don't need the common ancestor, but there's still a limit to the amount of data that can sit in the pipe, and if one side isn't listening (say, the daemon) then the other side is blocked. No good.
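To make the "other side is blocked" complaint concrete, here's a minimal sketch of a writer that refuses to hang on a named pipe. The helper name fifo_try_send() is made up for illustration; the only real calls are open(), write() and close(). With O_NONBLOCK, open() on a FIFO with no reader fails with ENXIO instead of waiting around.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: try to write one message to a FIFO without blocking.
 * Returns the number of bytes written, or -1 with errno set.
 * errno == ENXIO means nobody has the FIFO open for reading. */
ssize_t fifo_try_send(const char *path, const void *buf, size_t len)
{
  int     fd;
  int     saved;
  ssize_t rc;

  /* O_NONBLOCK makes open() fail right away when there is no reader */
  fd = open(path, O_WRONLY | O_NONBLOCK);
  if (fd < 0)
    return -1;

  /* can still fail with EAGAIN if the pipe is already full */
  rc    = write(fd, buf, len);
  saved = errno;
  close(fd);
  errno = saved;   /* keep write()'s errno, not close()'s */
  return rc;
}
```

That avoids the hang, but it just trades blocking for "try again later," and the limit on how much data fits in the pipe is still there.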

Oh, I could try using message queues. But they, too, have problems: no automatic reclamation of system resources when one side (or both!) crashes. They're not identified by name and there are no tools to list the existing message queues or delete them! And they can't be used with the multiplexing I/O API (select() or poll(), which I'll probably be using if I'm dealing with tons of connections).

The same problems exist for shared memory by the way, plus a whole slew of synchronization problems between unrelated processes, which probably mandates the use of semaphores, which again have problems similar to those of message queues and shared memory.

Told you interprocess communication under Unix sucks.

Leaving sockets. Since they use regular file descriptors, they work with the multiplexed I/O API, but I hate using select(), since you end up scanning through arrays. The code that uses select() typically looks like:

while(1)
{
  FD_ZERO(&list);
  
  for (i = 0 ; i < files_count ; i++)
    FD_SET(files[i],&list);
  
  rc = select(FD_SETSIZE,&list,NULL,NULL,NULL);
  
  if (rc < 0)	/* select() returned an error */
  {
    handle_error(errno);
    continue;
  }
  else if (rc > 0)	/* we got some */
  {
    for (i = 0 ; i < files_count ; i++)
    {
      if (FD_ISSET(files[i],&list))
      {
        if (files[i] == listen_socket)
        {
          len = sizeof(remote_addr);
          connection = accept(listen_socket,&remote_addr,&len);
          
          /*----------------------------------
          ; oh great, we need to add this to the end
          ; of the files array, but that readjusts the 
          ; file_count variable ... buyer beware ... 
          ;-------------------------------------*/
          
          add_to_list(files,connection);
        }
        
        /*-----------------------------------
        ; oh bloody hell, we're listening to 
        ; MySQL as well ... sigh.
        ;----------------------------------*/
        
        else if (files[i] == mysql_connection)
        {
          handle_that_mess(mysql_connection);
        }
        
        /*---------------------------------------
        ; otherwise it's a connection from outside
        ;--------------------------------------*/
        
        else
        {
          /*-----------------------------------
          ; oh man, we need to find the data
          ; associated with this connection, so
          ; that means another scan of some other
          ; list ... Aiiiiieeeeeeeeeeeeeeee!

I've been down this route before, and it resulted in some of the most convoluted code I've ever written. And looking at poll(), it doesn't appear much better.
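For comparison, here's a rough sketch of the same idea using poll(). The function first_readable() and the 64-entry limit are made up for illustration. You no longer rebuild an fd_set on every pass, but once poll() returns you still walk the whole array looking for whoever has data.

```c
#include <poll.h>
#include <unistd.h>

#define MAX_FILES 64   /* arbitrary limit for this sketch */

/* Hypothetical helper: return the first descriptor in files[] that is
 * readable, or -1 on error, timeout, or too many descriptors. */
int first_readable(const int *files, int files_count, int timeout_ms)
{
  struct pollfd fds[MAX_FILES];
  int           i;
  int           rc;

  if (files_count < 0 || files_count > MAX_FILES)
    return -1;

  for (i = 0 ; i < files_count ; i++)
  {
    fds[i].fd      = files[i];
    fds[i].events  = POLLIN;
    fds[i].revents = 0;
  }

  rc = poll(fds,(nfds_t)files_count,timeout_ms);
  if (rc <= 0)
    return -1;

  /* still a linear scan to find out which one is ready */
  for (i = 0 ; i < files_count ; i++)
    if (fds[i].revents & POLLIN)
      return fds[i].fd;

  return -1;
}
```

And you still have to map the descriptor that comes back to whatever data is associated with that connection, which is the part that made the select() version so ugly.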

I could get around using select() or poll() by creating a multithreaded or multiprocess application, but that's a whole new can of worms I'm opening up (deadlocks or race conditions anyone?) in addition to the problems I mentioned above about interprocess communication.

In looking around for a usable solution, I came across epoll, which is a new multiplexing I/O API in the newer Linux kernels. Reading over the documentation, it looks like you add file descriptors to an “epoll queue” (which itself is a file descriptor), then you call epoll_wait(), which fills in an array of events for just the file descriptors that are ready for reading or writing! It saves scanning through an entire list of file descriptors continuously asking “do you have data?”
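The basic sequence described above can be sketched like this. The helper wait_one_readable() is a made-up name, and creating and destroying the queue on every call is only to keep the example self-contained; a real daemon would create the queue once and keep it around. The epoll_create(), epoll_ctl() and epoll_wait() calls are the actual Linux API.

```c
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: wait until fd is readable.
 * Returns fd when it is ready, or -1 on error or timeout. */
int wait_one_readable(int fd, int timeout_ms)
{
  struct epoll_event ev;
  struct epoll_event ready;
  int                queue;
  int                n;

  /* the size argument is only a hint, but it must be greater than zero */
  queue = epoll_create(1);
  if (queue < 0)
    return -1;

  memset(&ev,0,sizeof(ev));
  ev.events  = EPOLLIN;    /* tell us when there is data to read */
  ev.data.fd = fd;         /* handed back to us by epoll_wait()  */

  if (epoll_ctl(queue,EPOLL_CTL_ADD,fd,&ev) < 0)
  {
    close(queue);
    return -1;
  }

  /* only descriptors that are actually ready show up in the result */
  n = epoll_wait(queue,&ready,1,timeout_ms);
  close(queue);

  return (n == 1) ? ready.data.fd : -1;
}
```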

What sold me was looking at the definition of the event structure:

typedef union epoll_data {
	void *ptr;
	int   fd;
	__uint32_t u32;
	__uint64_t u64;
} epoll_data_t;

struct epoll_event {
	__uint32_t events; /* Epoll events */
	epoll_data_t data; /* User data variable */
};

User data variable?

I get a pointer?

Associated with a file descriptor?

No way?!

Define a few structures with function pointers, and boom! The main loop now looks like:

void mainloop(int queue)
{
  struct epoll_event list[10];
  int                events;
  int                i;
  struct foo        *data;	/* pointer: it's assigned from data.ptr below */

  while(1)
  {
    events = epoll_wait(queue,list,10,TIMEOUT);
    if (events < 0)
      continue;	/* error, but we ignore for now */
    for (i = 0 ; i < events ; i++)
    {
      data = list[i].data.ptr;
      (*data->fn)(&list[i]);	/* call our function */
    }
  }
}
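Here's one guess at what the "few structures with function pointers" might look like. Only struct foo and its fn member come from the loop above; the other fields, on_readable(), watch() and dispatch_once() are assumptions made up for this sketch. One thing to watch: epoll_data_t is a union, so once you store a pointer in it you can't also store the descriptor there, which is why the descriptor lives in the structure.

```c
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

struct foo
{
  void  (*fn)(struct epoll_event *);  /* matches the call in mainloop()     */
  int     fd;                         /* data.ptr replaces data.fd, so keep it */
  size_t  bytes_seen;                 /* hypothetical per-connection state  */
};

/* Hypothetical handler: read whatever is there and count it. */
static void on_readable(struct epoll_event *ev)
{
  struct foo *data = ev->data.ptr;
  char        buf[256];
  ssize_t     n;

  n = read(data->fd,buf,sizeof(buf));
  if (n > 0)
    data->bytes_seen += (size_t)n;
}

/* Hypothetical helper: register fd with the queue, tied to *data. */
int watch(int queue, struct foo *data, int fd)
{
  struct epoll_event ev;

  data->fn         = on_readable;
  data->fd         = fd;
  data->bytes_seen = 0;

  memset(&ev,0,sizeof(ev));
  ev.events   = EPOLLIN;
  ev.data.ptr = data;      /* comes back untouched from epoll_wait() */

  return epoll_ctl(queue,EPOLL_CTL_ADD,fd,&ev);
}

/* One pass of the main loop, so it can be driven step by step. */
int dispatch_once(int queue, int timeout_ms)
{
  struct epoll_event list[10];
  int                events;
  int                i;

  events = epoll_wait(queue,list,10,timeout_ms);
  for (i = 0 ; i < events ; i++)
  {
    struct foo *data = list[i].data.ptr;
    (*data->fn)(&list[i]);
  }
  return events;
}
```

The listening socket, the MySQL connection and each outside connection would each get their own structure and their own handler, so the loop itself never has to know which is which.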

Man, this now becomes easy. No more constantly checking file descriptors or maintaining lists of file descriptors. I'm in heaven with this stuff. Even better, this method scales beautifully.



Copyright © 1999-2014 by Sean Conner. All Rights Reserved.
