thundering herd; bleh

Jul 03, 2007 18:06

Imagine the following scenario:
  • Parent process opens listening socket, but will never accept() on it. Opening in parent so future forked child processes will inherit the fd, and because it's a low port (perhaps) and we want to open it before we drop root.
  • Child processes (one per CPU) inherit listening fd, but they're event-based, handling many ( Read more... )

tech, mogilefs, lazyweb


Comments 31

taral July 4 2007, 01:48:36 UTC
Message queues?


brad July 4 2007, 02:14:44 UTC
Didn't Linux get them only recently? That means I'd have to implement something else for portability anyway, so it's not quite worth it.


taral July 4 2007, 02:51:05 UTC
Message queues? No, they've been around for a while (2.6.6) and are very portable.


grumpy_sysadmin July 4 2007, 07:25:51 UTC
Or, you know, since POSIX defined them. Back before the dawn of time. Just cause Linux is late to the game...



taral July 4 2007, 01:54:36 UTC
There's no simple fix to this problem. The correct solution is a single-producer (epoll waiter)/multiple-consumer model with a single queue for transmission of work to the workers.


brad July 4 2007, 02:11:19 UTC
Isn't that what I described?


taral July 4 2007, 02:50:13 UTC
Not quite. Since the workers are blocking on the queue, only one of them will be woken up when work is available, and work is guaranteed to be picked up by the first available worker. You don't have to do anything about "best".
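A minimal sketch of that single-producer/multiple-consumer shape, assuming threads stand in for the worker processes and a plain loop stands in for the epoll waiter. The names `run_pool` and `STOP` are made up for illustration. Each item put on the queue is handed to exactly one blocked worker, so nothing has to pick a "best" one:

```python
import queue
import threading

def run_pool(jobs, num_workers=4):
    """Feed jobs from one producer to several workers blocked on one queue."""
    work = queue.Queue()
    results = queue.Queue()
    STOP = object()  # sentinel telling a worker to exit

    def worker(worker_id):
        while True:
            item = work.get()  # blocks; each item is delivered to one worker only
            if item is STOP:
                break
            results.put((worker_id, item))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for t in threads:
        t.start()

    # Single producer: in the real design this would be the epoll waiter.
    for job in jobs:
        work.put(job)
    for _ in threads:
        work.put(STOP)
    for t in threads:
        t.join()

    handled = []
    while not results.empty():
        handled.append(results.get())
    return handled
```

Which worker ends up with which job is not deterministic; the only guarantee shown here is that every job is handled exactly once.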



crucially July 4 2007, 02:09:50 UTC
Accept in the parent and pass the fd over to the child that should have it?

Something like http://search.cpan.org/~addi/File-FDpasser-0.09/ does it.
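For a rough idea of what fd passing looks like, here is a sketch in Python rather than Perl, assuming Python 3.9+ on a Unix system (for `socket.send_fds` and `socket.recv_fds`). It is not the File::FDpasser API. To stay self-contained it runs in one process: a socketpair stands in for the parent/child channel and a pipe stands in for the accepted connection. `pass_fd_demo` is a made-up name.

```python
import os
import socket

def pass_fd_demo():
    """Send a descriptor across a Unix socket and read through the received copy."""
    # Stand-in for the channel between parent and child.
    parent_end, child_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    # Stand-in for an accepted client connection.
    read_fd, write_fd = os.pipe()
    try:
        # "Parent" side: the fd travels as SCM_RIGHTS ancillary data.
        socket.send_fds(parent_end, [b"fd"], [read_fd])
        # "Child" side: receives a new descriptor for the same open file.
        msg, fds, _flags, _addr = socket.recv_fds(child_end, 1024, 1)
        received_fd = fds[0]
        os.write(write_fd, b"hello")
        data = os.read(received_fd, 5)
        os.close(received_fd)
        return msg, data
    finally:
        parent_end.close()
        child_end.close()
        os.close(read_fd)
        os.close(write_fd)
```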


brad July 4 2007, 02:13:55 UTC
As long as I have a socket to the child anyway, I don't gain anything that I can see from having the child do the accept.

Either way I have to:

1) wait for readability
2) notify child
3) accept

It might even be better to have the child accept, as that's spread out between multiple CPUs.
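The three steps with the child doing the accept might look roughly like this. It is only a sketch: a thread stands in for the child process, a pipe is the assumed notification channel, and `notify_then_accept` is a made-up name.

```python
import os
import select
import socket
import threading

def notify_then_accept():
    """Parent waits for readability and pokes one worker, which then accepts."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # any free loopback port
    listener.listen(8)
    notify_r, notify_w = os.pipe()
    received = []

    def worker():
        os.read(notify_r, 1)          # sleep until the parent notifies us
        conn, _ = listener.accept()   # 3) accept happens in the worker
        received.append(conn.recv(16))
        conn.close()

    t = threading.Thread(target=worker)
    t.start()

    # A test client so there is something to accept.
    client = socket.create_connection(listener.getsockname())
    client.sendall(b"ping")

    select.select([listener], [], [])  # 1) wait for readability
    os.write(notify_w, b"x")           # 2) notify exactly one worker
    t.join()

    client.close()
    listener.close()
    os.close(notify_r)
    os.close(notify_w)
    return received[0]
```

Only the notified worker wakes up, so the listening socket never has to sit in every worker's event set.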

With fd passing, I'm just introducing portability and dependency problems.


(The comment has been removed)

crucially July 4 2007, 03:39:37 UTC
I am pretty sure there is a syscall to figure out how many pending connections there are, but I am too lazy to look it up in my Stevens right now.



trim the herd ext_53623 July 4 2007, 03:01:04 UTC
How about not having every child put that listening socket fd in its select/epoll/queue set?

Instead have a token (or a few of them) that get passed among the child processes; if a child has the token then they put the fd in their select/epoll/etc., otherwise they don't. Once they accept a new connect they pass the token to the "next" child process. That at least lets you control the size of the herd, which will be equal to the number of tokens in circulation.
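A toy model of the token bookkeeping, as a sketch only. It ignores the actual IPC for handing a token over and just tracks which workers would currently have the listening fd in their event set. `TokenRing` and its methods are hypothetical names.

```python
class TokenRing:
    """Track which workers hold a token and therefore watch the listening fd."""

    def __init__(self, num_workers, num_tokens):
        if not 0 < num_tokens <= num_workers:
            raise ValueError("need 1..num_workers tokens")
        self.num_workers = num_workers
        self.holders = set(range(num_tokens))  # first workers start with tokens

    def herd(self):
        """Workers that would wake up on a new connection."""
        return sorted(self.holders)

    def accepted_by(self, worker):
        """Worker accepted a connection; pass its token to the next free worker."""
        if worker not in self.holders:
            raise ValueError("worker holds no token")
        self.holders.remove(worker)
        nxt = (worker + 1) % self.num_workers
        while nxt in self.holders:  # skip workers that already hold a token
            nxt = (nxt + 1) % self.num_workers
        self.holders.add(nxt)
        return nxt
```

With 4 workers and 2 tokens, `TokenRing(4, 2).herd()` starts as `[0, 1]`, and the herd stays at size 2 no matter how many connections are accepted.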



You're overblowing it baudehlo July 4 2007, 03:02:31 UTC
Thundering herd is an issue when you have a big herd (e.g. in Apache's 500 children situation).

How many children are you talking about here? In Qpsmtpd I basically launch one per CPU, and while yes, 4 processes get notified about readability, it's really not a big deal.

I did try the send_fd/read_fd method, and the pipe to children method, but neither really helped any as far as I could see.

Perhaps it's time to do some real world testing to see if it's really an issue for your app.



