Scaling daemons

Thursday, March 08, 2007

I'm still deep in programming.

So now I'm writing a daemon.

The first problem was getting MySQL to contact the daemon, and I forgot: I'm working under Unix, and interprocess communication sucks under Unix. You have pipes, but they only work between processes that have a common ancestor. You can get around that with named pipes, which drop the common-ancestor requirement, but there's still a limit to the amount of data that can sit in the pipe, and if one side isn't listening (say, the daemon) then the other side blocks. No good.
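To make the "other side is blocked" point concrete, here is a minimal sketch of what a writer sees when nothing is reading a named pipe. A plain open() for writing would simply hang until a reader shows up; with O_NONBLOCK it fails right away with ENXIO instead. The scratch directory, the file name daemon.fifo, and both function names are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Try to open a FIFO for writing without blocking.
 * Returns the descriptor, or -1 with errno set; ENXIO means
 * no process currently has the FIFO open for reading. */
int open_fifo_writer(const char *path)
{
  return open(path, O_WRONLY | O_NONBLOCK);
}

/* Hypothetical self-check: create a FIFO in a scratch directory and
 * return the errno from opening it with no reader (0 if it opened). */
int probe_fifo_without_reader(void)
{
  char dir[] = "/tmp/fifo-demo-XXXXXX";
  char path[64];
  int  fd;
  int  err = 0;

  if (mkdtemp(dir) == NULL)
    return errno;

  snprintf(path, sizeof(path), "%s/daemon.fifo", dir);
  if (mkfifo(path, 0600) < 0)
  {
    err = errno;
    rmdir(dir);
    return err;
  }

  fd = open_fifo_writer(path);
  if (fd < 0)
    err = errno;	/* expected: ENXIO, nobody is listening */
  else
    close(fd);

  unlink(path);
  rmdir(dir);
  return err;
}
```

So the choice with a named pipe is between hanging until the daemon appears and polling for it to appear, and neither is much fun.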

Oh, I could try using message queues (the System V kind). But they, too, have problems: no automatic reclamation of system resources when one side (or both!) crashes. They're identified by numeric keys instead of names, and the only tools for listing or deleting existing message queues are ipcs and ipcrm, not the usual file commands. And they can't be used with the multiplexing I/O API (select() or poll(), which I'll probably be using if I'm dealing with tons of connections).

The same problems exist for shared memory, by the way, plus a whole slew of synchronization problems between unrelated processes, which probably mandates the use of semaphores, which, again, have the same problems as message queues and shared memory.

Told you interprocess communication under Unix sucks.

That leaves sockets. Since they use regular file descriptors, they work with the multiplexed I/O API, but I hate using select(), since you end up scanning through arrays. The code that uses select() typically looks like:

while(1)
{
  FD_ZERO(&list);
  
  for (i = 0 ; i < files_count ; i++)
    FD_SET(files[i],&list);
  
  rc = select(FD_SETSIZE,&list,NULL,NULL,NULL);
  
  if (rc < 0)	/* select() returned an error */
  {
    handle_error(errno);
    continue;
  }
  else if (rc > 0)	/* we got some */
  {
    for (i = 0 ; i < files_count ; i++)
    {
      if (FD_ISSET(files[i],&list))
      {
        if (files[i] == listen_socket)
        {
          len = sizeof(remote_addr);
          connection = accept(listen_socket,(struct sockaddr *)&remote_addr,&len);
          
          /*----------------------------------
          ; oh great, we need to add this to the end
          ; of the files array, but that readjusts the 
          ; file_count variable ... buyer beware ... 
          ;-------------------------------------*/
          
          add_to_list(files,connection);
        }
        
        /*-----------------------------------
        ; oh bloody hell, we're listening to 
        ; MySQL as well ... sigh.
        ;----------------------------------*/
        
        else if (files[i] == mysql_connection)
        {
          handle_that_mess(mysql_connection);
        }
        
        /*---------------------------------------
        ; otherwise it's a connection from outside
        ;--------------------------------------*/
        
        else
        {
          /*-----------------------------------
          ; oh man, we need to find the data
          ; associated with this connection, so
          ; that means another scan of some other
          ; list ... Aiiiiieeeeeeeeeeeeeeee!

I've been down this route before, and it resulted in some of the most convoluted code I've ever written. And looking at poll(), it doesn't appear much better.
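For comparison, a rough sketch of the same pattern with poll(). The fd_set rebuilding goes away, but the scan does not: poll() reports how many descriptors are ready, and you still walk the array to find out which ones. The names first_readable and demo_poll are invented for this example, and the two pipes stand in for real connections.

```c
#include <poll.h>
#include <unistd.h>

/* Wait on `count` descriptors and return the index of the first
 * readable one, or -1 on timeout or error. Note the linear scan
 * after poll() returns. */
int first_readable(struct pollfd *fds, nfds_t count, int timeout_ms)
{
  nfds_t i;
  int    rc = poll(fds, count, timeout_ms);

  if (rc <= 0)
    return -1;

  for (i = 0 ; i < count ; i++)
    if (fds[i].revents & (POLLIN | POLLHUP))
      return (int)i;

  return -1;
}

/* Hypothetical demo: two pipes, with data written only to the second.
 * Returns the index reported as readable, or -2 on setup failure. */
int demo_poll(void)
{
  struct pollfd fds[2];
  int           a[2];
  int           b[2];
  int           idx;

  if (pipe(a) < 0)
    return -2;
  if (pipe(b) < 0)
  {
    close(a[0]);
    close(a[1]);
    return -2;
  }

  fds[0].fd = a[0]; fds[0].events = POLLIN; fds[0].revents = 0;
  fds[1].fd = b[0]; fds[1].events = POLLIN; fds[1].revents = 0;

  if (write(b[1], "x", 1) != 1)
    idx = -2;
  else
    idx = first_readable(fds, 2, 1000);

  close(a[0]); close(a[1]);
  close(b[0]); close(b[1]);
  return idx;
}
```

And once you know the index, you still need some other table to map that descriptor back to whatever state belongs to the connection.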

I could get around using select() or poll() by creating a multithreaded or multiprocess application, but that's a whole new can of worms I'm opening up (deadlocks or race conditions anyone?) in addition to the problems I mentioned above about interprocess communication.

In looking around for a usable solution, I came across epoll, which is a new multiplexing I/O API in the newer Linux kernels. Reading over the documentation, it looks like you add file descriptors to an “epoll queue” (which itself is a file descriptor), then you call epoll_wait(), which fills in an array of events for just the file descriptors that are ready for reading or writing! It saves scanning through an entire list of file descriptors, continuously asking “do you have data?”
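A minimal sketch of that add-then-wait flow, using two pipes in place of sockets. It assumes a Linux system where epoll_create1() is available (the older epoll_create() with a size hint would serve the same purpose); watch_for_read and demo_epoll are names made up for this example.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Register fd for readability on the epoll instance `queue`. */
int watch_for_read(int queue, int fd)
{
  struct epoll_event ev = {0};

  ev.events  = EPOLLIN;
  ev.data.fd = fd;
  return epoll_ctl(queue, EPOLL_CTL_ADD, fd, &ev);
}

/* Hypothetical demo: watch two pipes, write to only one of them.
 * Returns 1 if epoll_wait() reported exactly the pipe with data,
 * 0 if it reported anything else, -1 on setup failure. */
int demo_epoll(void)
{
  struct epoll_event ready[4];
  int                quiet[2];
  int                busy[2];
  int                queue;
  int                result = -1;

  queue = epoll_create1(0);
  if (queue < 0)
    return -1;
  if (pipe(quiet) < 0)
  {
    close(queue);
    return -1;
  }
  if (pipe(busy) < 0)
  {
    close(quiet[0]); close(quiet[1]); close(queue);
    return -1;
  }

  if (   watch_for_read(queue, quiet[0]) == 0
      && watch_for_read(queue, busy[0])  == 0
      && write(busy[1], "x", 1) == 1)
  {
    /* only descriptors with something to report come back */
    int n = epoll_wait(queue, ready, 4, 1000);
    result = (n == 1 && ready[0].data.fd == busy[0]) ? 1 : 0;
  }

  close(quiet[0]); close(quiet[1]);
  close(busy[0]);  close(busy[1]);
  close(queue);
  return result;
}
```

The idle pipe never shows up in the result array, so there is nothing to skip over.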

What sold me was looking at the definition of the event structure:

typedef union epoll_data {
	void *ptr;
	int   fd;
	__uint32_t u32;
	__uint64_t u64;
} epoll_data_t;

struct epoll_event {
	__uint32_t events; /* Epoll events */
	epoll_data_t data; /* User data variable */
};

User data variable?

I get a pointer?

Associated with a file descriptor?

No way?!

Define a few structures with function pointers, and boom! The main loop now looks like:

void mainloop(int queue)
{
  struct epoll_event list[10];
  int                events;
  int                i;
  struct foo        *data;

  while(1)
  {
    events = epoll_wait(queue,list,10,TIMEOUT);
    if (events < 0)
      continue;	/* error, but we ignore for now */
    for (i = 0 ; i < events ; i++)
    {
      data = list[i].data.ptr;
      (*data->fn)(&list[i]);	/* call our function */
    }
  }
}
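The post doesn't show struct foo or how it gets attached to a descriptor, so the following is only a guess at the shape: a structure whose fn member is the callback, registered through the data.ptr field of the event. The names struct handler, on_readable, register_handler, dispatch_once and demo_dispatch are all hypothetical, and a pipe stands in for a real connection.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical per-descriptor state; `fn` is what the loop calls. */
struct handler
{
  void (*fn)(struct epoll_event *);
  int    fd;
  int    bytes;	/* total bytes read so far */
};

/* Callback: recover our own state from the event and read the data. */
static void on_readable(struct epoll_event *ev)
{
  struct handler *h = ev->data.ptr;
  char            buf[64];
  ssize_t         n = read(h->fd, buf, sizeof(buf));

  if (n > 0)
    h->bytes += (int)n;
}

/* Attach the handler to the queue; the pointer rides in data.ptr. */
int register_handler(int queue, struct handler *h)
{
  struct epoll_event ev = {0};

  ev.events   = EPOLLIN;
  ev.data.ptr = h;
  return epoll_ctl(queue, EPOLL_CTL_ADD, h->fd, &ev);
}

/* One pass of the main loop: wait, then call each ready handler. */
int dispatch_once(int queue, int timeout_ms)
{
  struct epoll_event list[10];
  int                events = epoll_wait(queue, list, 10, timeout_ms);
  int                i;

  for (i = 0 ; i < events ; i++)
  {
    struct handler *h = list[i].data.ptr;
    (*h->fn)(&list[i]);
  }
  return events;
}

/* Hypothetical demo: returns the number of bytes the handler read. */
int demo_dispatch(void)
{
  struct handler h;
  int            p[2];
  int            queue;

  if (pipe(p) < 0)
    return -1;
  queue = epoll_create1(0);
  if (queue < 0)
  {
    close(p[0]); close(p[1]);
    return -1;
  }

  h.fn    = on_readable;
  h.fd    = p[0];
  h.bytes = 0;

  if (register_handler(queue, &h) == 0 && write(p[1], "hello", 5) == 5)
    dispatch_once(queue, 1000);

  close(queue);
  close(p[0]); close(p[1]);
  return h.bytes;
}
```

Because data is a union, storing a pointer means the fd member is no longer usable, which is why the sketch keeps the descriptor inside the structure itself.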

Man, this now becomes easy. No more constantly checking file descriptors or maintaining lists of file descriptors. I'm in heaven with this stuff. Even better, this method scales beautifully.

The Boston Diaries
