Thursday, March 08, 2007
Scaling daemons
I'm still deep in programming.
So now I'm writing a daemon.
The first problem was getting MySQL to contact the daemon, and I forgot: I'm working under Unix, and interprocess communication sucks under Unix. You have pipes, but they only work between processes that have a common ancestor. You can get around that with named pipes, which drop the common-ancestor requirement, but there's still a limit to the amount of data that can sit in the pipe, and if one side isn't listening (say, the daemon) then the other side blocks. No good.
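To make the "other side is blocked" complaint concrete, here is a minimal sketch, assuming Linux FIFO semantics: opening a named pipe for writing normally blocks until a reader shows up, while adding O_NONBLOCK makes the open fail with ENXIO instead. The helper name fifo_has_reader() is made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: probe a named pipe without blocking.
 * Returns 1 if some process has the FIFO open for reading,
 * 0 if nobody is listening, and -1 on any other error. */
int fifo_has_reader(const char *path)
{
    int fd;

    assert(path != NULL);

    /* Without O_NONBLOCK this open() would sit here until a reader appears. */
    fd = open(path, O_WRONLY | O_NONBLOCK);
    if (fd >= 0) {
        close(fd);
        return 1;
    }

    /* ENXIO is how the kernel reports "no reader on this FIFO". */
    return (errno == ENXIO) ? 0 : -1;
}
```

A writer could call this before committing to a blocking open, but it is only a probe: the reader can still go away right afterwards, so it does not remove the underlying problem.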
Oh, I could try using message queues. But they, too, have problems: no automatic reclamation of system resources when one side (or both!) crashes. They're not identified by name and there are no tools to list the existing message queues or delete them! And they can't be used with the multiplexing I/O API (select() or poll(), which I'll probably be using if I'm dealing with tons of connections).
The same problems exist for shared memory, by the way, plus a whole slew of synchronization problems between unrelated processes, which probably mandates the use of semaphores, which in turn have the same problems as message queues and shared memory.
Told you interprocess communication under Unix sucks.
Leaving sockets. Since they use regular file descriptors, they work with the multiplexed I/O API, but I hate using select(), since you end up scanning through arrays. The code that uses select() typically looks like:
    while(1)
    {
      FD_ZERO(&list);
      for (i = 0 ; i < files_count ; i++)
        FD_SET(files[i],&list);

      rc = select(FD_SETSIZE,&list,NULL,NULL,NULL);
      if (rc < 0)       /* select() returned an error */
      {
        handle_error(errno);
        continue;
      }
      else if (rc > 0)  /* we got some */
      {
        for (i = 0 ; i < files_count ; i++)
        {
          if (FD_ISSET(files[i],&list))
          {
            if (files[i] == listen_socket)
            {
              len        = sizeof(remote_addr);
              connection = accept(listen_socket,(struct sockaddr *)&remote_addr,&len);

              /*----------------------------------
              ; oh great, we need to add this to the end
              ; of the files array, but that readjusts the
              ; files_count variable ... buyer beware ...
              ;-------------------------------------*/

              add_to_list(files,connection);
            }

            /*-----------------------------------
            ; oh bloody hell, we're listening to
            ; MySQL as well ... sigh.
            ;----------------------------------*/

            else if (files[i] == mysql_connection)
            {
              handle_that_mess(mysql_connection);
            }

            /*---------------------------------------
            ; otherwise it's a connection from outside
            ;--------------------------------------*/

            else
            {
              /*-----------------------------------
              ; oh man, we need to find the data
              ; associated with this connection, so
              ; that means another scan of some other
              ; list ... Aiiiiieeeeeeeeeeeeeeee!
I've been down this route before, and it resulted in some of the most convoluted code I've ever written. And looking at poll(), it doesn't appear much better.
I could get around using select() or poll() by creating a multithreaded or multiprocess application, but that opens up a whole new can of worms (deadlocks or race conditions, anyone?) in addition to the problems I mentioned above about interprocess communication.
In looking around for a usable solution, I came across epoll, which is a new multiplexing I/O API in the newer Linux kernels. Reading over the documentation, it looks like you add file descriptors to an “epoll queue” (which itself is a file descriptor), then you call epoll_wait(), which fills in an array of events for just the file descriptors that are ready for reading or writing! It saves scanning through an entire list of file descriptors, continuously asking “do you have data?”
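As a rough sketch of that add-then-wait flow (not the daemon's actual code), the following creates an epoll queue, registers one descriptor for reading, and asks which descriptor is ready. The helper name first_ready_fd() and its timeout parameter are assumptions made for the example.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: watch a single descriptor and return the descriptor
 * that epoll reports as readable, or -1 on error or timeout. */
int first_ready_fd(int watched_fd, int timeout_ms)
{
    struct epoll_event ev = {0};
    struct epoll_event ready[1];
    int queue;
    int count;

    assert(watched_fd >= 0);

    /* The "epoll queue" is itself a file descriptor; the size argument is
     * only a hint but must be greater than zero. */
    queue = epoll_create(1);
    if (queue < 0)
        return -1;

    ev.events  = EPOLLIN;       /* interested in "ready for reading" */
    ev.data.fd = watched_fd;    /* handed back to us by epoll_wait() */
    if (epoll_ctl(queue, EPOLL_CTL_ADD, watched_fd, &ev) < 0) {
        close(queue);
        return -1;
    }

    /* Only ready descriptors are reported, so there is nothing to scan. */
    count = epoll_wait(queue, ready, 1, timeout_ms);
    close(queue);

    return (count == 1) ? ready[0].data.fd : -1;
}
```

A real daemon would create the queue once and keep it around rather than building and tearing it down per call; the single function just keeps the three calls visible in one place.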
What sold me was looking at the definition of the event structure:
    typedef union epoll_data
    {
      void       *ptr;
      int         fd;
      __uint32_t  u32;
      __uint64_t  u64;
    } epoll_data_t;

    struct epoll_event
    {
      __uint32_t   events;  /* Epoll events */
      epoll_data_t data;    /* User data variable */
    };
User data variable?
I get a pointer?
Associated with a file descriptor?
No way?!
Define a few structures with function pointers, and boom! The main loop now looks like:
    void mainloop(int queue)
    {
      struct epoll_event  list[10];
      int                 events;
      int                 i;
      struct foo         *data;   /* per-descriptor record stored in data.ptr */

      while(1)
      {
        events = epoll_wait(queue,list,10,TIMEOUT);
        if (events < 0)
          continue;  /* error, but we ignore for now */

        for (i = 0 ; i < events ; i++)
        {
          data = list[i].data.ptr;
          (*data->fn)(&list[i]);  /* call our function */
        }
      }
    }
Man, this now becomes easy. No more constantly checking file descriptors or maintaining lists of them. I'm in heaven with this stuff. Even better, this method scales beautifully.
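For the "few structures with function pointers" part, here is one possible shape, purely as a sketch. The post only names struct foo and its fn member; the fd and bytes_seen fields, drain_handler(), watch() and dispatch_once() are hypothetical names invented for this example.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Assumed layout: only the name "struct foo" and the "fn" member come from
 * the main loop; everything else here is illustrative. */
struct foo {
    void (*fn)(struct epoll_event *);   /* handler to call when fd is ready */
    int    fd;                          /* the descriptor being watched     */
    int    bytes_seen;                  /* example per-connection state     */
};

/* Example handler: recover our record from the event and read from its fd. */
void drain_handler(struct epoll_event *ev)
{
    struct foo *self = ev->data.ptr;
    char        buf[64];
    ssize_t     n = read(self->fd, buf, sizeof buf);

    if (n > 0)
        self->bytes_seen += (int)n;
}

/* Register a record with the epoll queue, storing the pointer as user data. */
int watch(int queue, struct foo *item)
{
    struct epoll_event ev = {0};

    ev.events   = EPOLLIN;
    ev.data.ptr = item;     /* comes back untouched from epoll_wait() */
    return epoll_ctl(queue, EPOLL_CTL_ADD, item->fd, &ev);
}

/* One pass of the main loop: wait, then call each ready record's handler. */
int dispatch_once(int queue, int timeout_ms)
{
    struct epoll_event list[10];
    int                events = epoll_wait(queue, list, 10, timeout_ms);
    int                i;

    for (i = 0; i < events; i++) {
        struct foo *data = list[i].data.ptr;

        assert(data != NULL && data->fn != NULL);
        (*data->fn)(&list[i]);
    }
    return events;
}
```

The listening socket, the MySQL connection and each outside connection would each get their own record and handler, so the loop never has to ask which kind of descriptor it is looking at or search a second list for its state.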