http webserver using select()
| Hi,
I have read some books about network programming.
Still, I have some questions:
1. Let's assume that we have an HTTP server written using non-blocking I/O
and select().
How is it possible to serve many requests simultaneously?
I mean, we have, for example, a static binary file that is 700 MB long
(and many other, much smaller files that are, for example, HTML
web pages).
Let's assume that the server found a new socket descriptor with select()
and accepted it.
After receiving data from the client and parsing the request header, we
found that the client wants this big file, and we started to send it in
chunks.
I think that this situation should block sending data to other
clients (until this 700 MB file is sent), doesn't it?
If not, how is it possible to serve many connections simultaneously?
By "binding" the file descriptor of the open file to the accepted socket
descriptor with a struct, queuing some chunks of data in a FIFO, and then
iterating to the next socket descriptor returned by select()?
--
THR
| |
| Barry Margolin 2006-09-17, 9:59 pm |
| In article <1158538340.649323.167050@d34g2000cwd.googlegroups.com>,
"Thr" <dataskin@gmail.com> wrote:
> I think that this situation should block sending data for other
> clients, does it (until this 700 MB file is send) ?
>
> If not, how is it possible to serve simultaneously many connections ?
>
> "Binding" file descriptor with open file to accepted socket descriptor
> with struct{} and queuing some chunks of data with fifo, and then -
> iterating to the next socket descriptor returned by select()?
This is usually done using multiple processes or threads. Each one
handles one request, and the operating system or thread library arranges
for all the processes or threads to run concurrently.
If you really wanted to do this in a single-threaded server, the server
would have to make sure it looks for new requests periodically, even
while it's handling a request. And if there are multiple outstanding
requests, it would have to rotate through all of them, sending bits of
each response, rather than attempting to handle each one to completion
before looking for new work.
Programming something like this is likely to be very tricky, which is
why it's usually done with threads or processes. Then all the
multiplexing is handled automatically for you.
--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***
| |
| David Schwartz 2006-09-18, 4:00 am |
|
Thr wrote:
> 1. let's assume that we have http server written using nonblocking i/o
> and select().
> How is it possible to server many requests simultaneously?
By never blocking.
> I mean - we have, for example, static binary file that is 700 MB long
> (and many other, much smaller files, that are, for example, html
> webpages)
> .
> Let's assume that server found new socket descriptor with select(), and
> accepted it..
> after receiving data from client, and after parsing request header we
> found that he want this big file, and we started to send him chunks of
> data..
>
> I think that this situation should block sending data for other
> clients, does it (until this 700 MB file is send) ?
>
> If not, how is it possible to serve simultaneously many connections ?
You just keep track of what work you were doing for each client and
make sure all your sockets are non-blocking.
You generally keep a memory buffer of a reasonable size, say 64KB. When
you get a 'select' hit for 'write' on that socket, you try to send the
data left in the buffer and update the buffer information with how much
data you sent. If the buffer gets empty, you refill it.
It's a bit tricky: you need to keep some kind of data structure for
each file descriptor indicating the status of that descriptor. Then
when you receive data from that descriptor, you check if it changes the
state of that client, and if so you modify the state in the data
structure.
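A minimal sketch of the buffer and per-descriptor bookkeeping described above, written in Python, whose socket module wraps the same system calls. The names `Client`, `on_writable` and `BUF_SIZE` are invented for illustration, and a real server would also track request parsing and error states.

```python
import io
import socket

BUF_SIZE = 64 * 1024  # "a reasonable size, say 64KB"


class Client:
    """Per-descriptor state: the file being sent and the bytes not yet sent."""

    def __init__(self, sock, fileobj):
        sock.setblocking(False)
        self.sock = sock
        self.file = fileobj
        self.buf = b""       # data read from the file but not yet sent
        self.done = False    # True once the whole file has been sent

    def on_writable(self):
        """Call when select() reports the socket writable; never waits on the socket."""
        if not self.buf:
            self.buf = self.file.read(BUF_SIZE)   # refill an empty buffer
            if not self.buf:                      # end of file
                self.done = True
                return
        try:
            sent = self.sock.send(self.buf)       # may accept only part of buf
        except BlockingIOError:
            return                                # try again on the next hit
        self.buf = self.buf[sent:]                # remember what is left
```

A server would keep one such object per accepted socket, for example in a dict keyed by descriptor, and call `on_writable` only for sockets that select reported as writable. Note that reading from a regular file can still wait on the disk, which this sketch does not address.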
I don't agree with the view that multiple threads simplify this.
Whether you're using one thread or many, you still need to keep exactly
the same information in order to know what to do when you receive some
data or need to get some more together to send.
DS
| |
| Hubble 2006-09-18, 4:00 am |
|
Thr wrote:
> I think that this situation should block sending data for other
> clients, does it (until this 700 MB file is send) ?
>
> If not, how is it possible to serve simultaneously many connections ?
>
> "Binding" file descriptor with open file to accepted socket descriptor
> with struct{} and queuing some chunks of data with fifo, and then -
> iterating to the next socket descriptor returned by select()?
>
The trick lies in the accept call, which generates a new file descriptor
for each accepted connection. So you have one fd for the listening
socket and other fds for each connection. Using select, you can detect
which of the connections are readable/writable and whether the
listening socket will accept a new connection.
The basic scheme is:
  generate socket fd via socket(2) -> listening socket
  use bind(2) to set addr/port info
  use listen(2)
  loop
    generate fd_set containing 1 bits for the listening socket and each
      connection socket
    call select(2)
    when the fd_set bit is set for the listening socket, call accept(2)
      and remember the new connection
    for other 1 bits in fd_set, handle I/O:
      - if an fd is ready for read, read *all* pending data (this is
        important)
      - if you detect an EOF, close or shut down the socket
      - if an fd is ready for write, you can write data (if you want)
      - if read has reached EOF and you do not want to write more
        data, close the socket and remove it from the open connections
  end loop
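The scheme above can be sketched in Python, whose select module wraps select(2). This is an illustrative echo server under simple assumptions, not a complete HTTP server: `make_listener`, `loop_once` and the `pending` dict are made-up names, and data still queued for a client that closes its side is simply dropped.

```python
import select
import socket


def make_listener(host="127.0.0.1", port=0):
    """socket(2) -> bind(2) -> listen(2); port 0 lets the OS pick a free port."""
    lsock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    lsock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    lsock.bind((host, port))
    lsock.listen(16)
    lsock.setblocking(False)
    return lsock


def loop_once(lsock, pending, timeout=1.0):
    """One pass of the loop. `pending` maps each connection to its unsent bytes."""
    want_read = [lsock] + list(pending)
    want_write = [c for c, out in pending.items() if out]
    readable, writable, _ = select.select(want_read, want_write, [], timeout)

    for s in readable:
        if s is lsock:
            try:
                conn, _addr = lsock.accept()   # new connection: remember it
            except BlockingIOError:
                continue
            conn.setblocking(False)
            pending[conn] = b""
            continue
        try:
            data = s.recv(4096)
        except BlockingIOError:
            continue
        except ConnectionError:
            data = b""
        if data:
            pending[s] += data                 # echo: queue what we received
        else:                                  # EOF: forget the connection
            del pending[s]
            s.close()

    for s in writable:
        if s not in pending:                   # closed in the read pass above
            continue
        try:
            sent = s.send(pending[s])          # may send only part
        except BlockingIOError:
            continue
        except ConnectionError:
            del pending[s]
            s.close()
            continue
        pending[s] = pending[s][sent:]
```

Calling `loop_once` in a `while True:` loop gives the full server. A connection is put in the write set only while it has unsent data, otherwise select would return immediately on every pass.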
The I/O handling is tricky, since the documentation is not really
clear. It says basically that select will signal when a read I/O channel
is ready to read and a write I/O channel can accept data. If you
implement this using non-blocking I/O, be aware that select is
level-triggered: if you read the data only partially, the next select
call reports the fd as readable again immediately, for as long as unread
data remains. Edge-triggered notification, where you are signalled only
when *new* data arrives, is a feature of other interfaces, not of select.
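How select treats partially read data is easy to check with a small experiment. On POSIX systems select reports a descriptor as readable for as long as unread data remains in the socket buffer, whether or not anything new has arrived. The sketch uses a local socket pair; `readable_now` is a helper name made up for this example.

```python
import select
import socket


def readable_now(sock):
    """Return True if select() reports sock readable without waiting."""
    r, _, _ = select.select([sock], [], [], 0)
    return bool(r)


a, b = socket.socketpair()
a.sendall(b"0123456789")      # ten bytes arrive, once
first = readable_now(b)       # data is pending
part = b.recv(4)              # read only part of it
second = readable_now(b)      # no new data has arrived since
rest = b.recv(16)             # drain the remaining six bytes
third = readable_now(b)       # nothing left to read
```

Here `second` is still true after the partial read, and `third` is false once the buffer is empty. Draining a non-blocking socket until it would block is therefore an optimisation that saves calls to select, not a requirement.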
A much simpler approach works without select:
  socket/bind/listen
  while (newfd = accept) {
    fork
    in child: handle I/O via newfd (you can dup this to stdin/stdout)
  }
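A hedged Python rendering of the fork-per-connection outline, for Unix only. `serve_in_child` and `accept_forever` are illustrative names, and `handler` is whatever function should talk to one client using ordinary blocking I/O.

```python
import os
import socket


def serve_in_child(conn, handler):
    """Fork; the child runs handler(conn) and exits, the parent gets the child's pid."""
    pid = os.fork()
    if pid == 0:                       # child process
        code = 0
        try:
            handler(conn)
        except Exception:
            code = 1
        finally:
            conn.close()
            os._exit(code)             # never fall back into the parent's loop
    conn.close()                       # parent: the child owns the connection now
    return pid


def accept_forever(lsock, handler):
    """socket/bind/listen is assumed done; loop on accept and fork per client."""
    while True:
        conn, _addr = lsock.accept()
        serve_in_child(conn, handler)
        # Reap finished children so they do not linger as zombies.
        try:
            while os.waitpid(-1, os.WNOHANG)[0] > 0:
                pass
        except ChildProcessError:
            pass
```

The parent must close its copy of the connection, otherwise the client never sees end of file after the child exits.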
Hubble.
| |
| Nils O. Selåsdal 2006-09-18, 4:00 am |
| Thr wrote:
> I think that this situation should block sending data for other
> clients, does it (until this 700 MB file is send) ?
Depends on how you program the thing.
What one usually does is place the fd (set it to non-blocking first) in
the write set for select.
When select returns, you check and service all descriptors as usual.
"As usual" for the one in question here means reading a chunk from the
file and writing that chunk (doing any preprocessing if needed) to
the fd. You need to note how much of it was actually written so you
know where to start writing the next time select indicates you can.
That's at least the basic idea. You need a bit of furniture to sustain
the state needed when you resume writing after select returns, and
perhaps some mildly clever buffer management for the data read from the
file and sent to the client.
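One simple way to hold that state is a single file offset per connection, re-reading from the file at that offset each time instead of keeping a buffer. This is a sketch of that variant, not necessarily what any particular server does; `send_next_chunk` and `chunk_size` are illustrative names.

```python
import socket


def send_next_chunk(sock, f, offset, chunk_size=16 * 1024):
    """Send one chunk of file f starting at offset; return the new offset.

    Returns the same offset if the socket cannot take data right now, and
    None once the end of the file has been reached.
    """
    f.seek(offset)
    chunk = f.read(chunk_size)
    if not chunk:
        return None                 # whole file sent
    try:
        sent = sock.send(chunk)     # non-blocking: may accept only part
    except BlockingIOError:
        return offset               # nothing written; retry later
    return offset + sent            # resume here on the next select hit
```

The trade-off is an extra read of any bytes the socket did not accept, in exchange for having only one integer of state per client.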
| |
| Nils O. Selåsdal 2006-09-18, 4:00 am |
| Nils O. Selåsdal wrote:
> Thr wrote:
> Depends on how you program the thing.
>
> What one usually does is place the fd(set it to nonblocking first) in
> the write set for select.
"usual" was perhaps not the best word here. For such servers it's
also common to fork a child or spawn a thread that handles each client.
| |
| Hubble 2006-09-18, 4:00 am |
| Nils O. Selåsdal wrote:
> Nils O. Selåsdal wrote:
>
> "usual" was perhaps not the best word here. For such servers it's
> also common to fork a child or spawn a thread that handles each client.
AFAIK, squid (www.squid-cache.org) is a famous example which uses
neither threads nor fork and achieves very good performance.
Apache (www.apache.org) uses pre-forking of processes and passes
accepted fd's via (OS dependent) ioctl's.
If you have many connections open, there are alternative system calls
like poll(2) on some OSs. These avoid scanning FD_SETs, which can only
be done in linear time.
http://www.usenix.org/events/usenix..._html/node4.html
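For comparison, this is roughly what the poll(2) interface looks like from Python on systems that provide it. `wait_readable` is an illustrative helper. Interest is registered per descriptor and the result lists only the descriptors that are ready, so the caller does not scan a bit set, although the kernel still has to examine every registered descriptor on each call.

```python
import select
import socket


def wait_readable(socks, timeout_ms=1000):
    """Return the sockets that poll(2) reports readable within timeout_ms."""
    poller = select.poll()
    by_fd = {}
    for s in socks:
        poller.register(s, select.POLLIN)   # interest is stated per descriptor
        by_fd[s.fileno()] = s
    # poll() returns only the (fd, event) pairs that are ready.
    return [by_fd[fd] for fd, ev in poller.poll(timeout_ms) if ev & select.POLLIN]
```

A long-running server would create the poll object once and register or unregister descriptors as connections come and go, rather than rebuilding it on every call as this helper does.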
Hubble.
| |
| Nils O. Selåsdal 2006-09-18, 7:59 am |
| Hubble wrote:
> Nils O. Selåsdal wrote:
>
> AFAIK, squid (www.squid-cache.org) is a famous example which does use
> neither thread nor fork and achieves very good performance.
Yaws is a better one, but performance wasn't my point :-)
For some applications you do need a separate execution path per
client, as backends (e.g. PHP) that run in the same process as
the serving one can make blocking calls.
| |
| Rick Jones 2006-09-18, 10:01 pm |
| Barry Margolin <barmar@alum.mit.edu> wrote:
> If you really wanted to do this in a single-threaded server, the
> server would have to make sure it looks for new requests
> periodically, even while it's handling a request. And if there are
> multiple outstanding requests, it would have to rotate through all
> of them, sending bits of each response, rather than attempting to
> handle each one to completion before looking for new work.
And at least one web server out there - Zeus - does this and does it
quite well.
> Programming something like this is likely to be very tricky, which is
> why it's usually done with threads or processes. Then all the
> multiplexing is handled automatically for you.
When done with threads, at the cost of the programmer having to
protect critical data structures being accessed by multiple threads.
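As a small illustration of that cost, here is a hypothetical hit counter shared by all request threads in a thread-per-client server. `HitCounter` is an invented example, not taken from any server mentioned in this thread. Every access to the shared dict goes through a lock, which a single-threaded event loop would not need.

```python
import threading


class HitCounter:
    """Shared by all request threads; the lock guards the read-modify-write."""

    def __init__(self):
        self._lock = threading.Lock()
        self._hits = {}

    def record(self, path):
        with self._lock:
            self._hits[path] = self._hits.get(path, 0) + 1

    def count(self, path):
        with self._lock:
            return self._hits.get(path, 0)
```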
rick jones
--
denial, anger, bargaining, depression, acceptance, rebirth...
where do you want to be today?
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...
| |
| James Antill 2006-09-18, 10:01 pm |
| On Mon, 18 Sep 2006 18:23:19 +0000, Rick Jones wrote:
> Barry Margolin <barmar@alum.mit.edu> wrote:
>
> And at least one web server out there - Zeus - does this and does it
> quite well.
If anything I'd say it was more likely; the obvious ones that come to my
mind are:
And-httpd
lighttpd
thttpd
...and there are quite a few others. Also, "single-threaded" is a bad
description, as it often means "cannot take advantage of multiple CPUs",
whereas both And-httpd and lighttpd can use multiple tasks to serve
requests from a single listening socket (and thttpd runs CGIs in
separate processes).
--
James Antill -- james@and.org
http://www.and.org/and-httpd
| |
| David Schwartz 2006-09-18, 10:01 pm |
|
Nils O. Selåsdal wrote:
> For some applications you do need a separate execution path per
> client, as backends (e.g. PHP) that runs in the same process as
> the serving one can do blocking calls.
My experience has been that this just doesn't work. Consider what
happens when you have two such scripts running in the same process and
they start calling 'chdir'. Something always goes wrong.
DS
| |
| William Ahern 2006-09-18, 10:01 pm |
| On Sun, 17 Sep 2006 21:14:24 -0400, Barry Margolin wrote:
> This is usually done using multiple processes or threads. Each one
> handles one request, and the operating system or thread library arranges
> for all the processes or threads to run concurrently.
"Usually" is a strong word. thttpd, Zeus, Boa, X15... there might be more
examples which use a single-process, event based model than not.
> If you really wanted to do this in a single-threaded server, the server
> would have to make sure it looks for new requests periodically, even while
> it's handling a request. And if there are multiple outstanding requests,
> it would have to rotate through all of them, sending bits of each
> response, rather than attempting to handle each one to completion before
> looking for new work.
Why? Assuming you have one CPU, what's the difference? All it comes down
to is request latency, and unless you're blocking on an operation why
accept more requests when clearly you cannot handle more (otherwise you'd
have already finished it!).
> Programming something like this is likely to be very tricky, which is why
> it's usually done with threads or processes. Then all the multiplexing is
> handled automatically for you.
Tricky, certainly. You typically cannot [easily] control paging/swapping
of virtual memory, and there are other such issues which make writing
a strictly "non-blocking" server nearly impossible (e.g. spotty
non-blocking support for file-based I/O). Nonetheless, even those
applications which don't deal with paging issues typically outperform the
"traditional" designs, sometimes by leaps and bounds.
| |
| William Ahern 2006-09-18, 10:01 pm |
| On Mon, 18 Sep 2006 13:37:03 +0200, Nils O. Selåsdal wrote:
> For some applications you do need a separate execution path per client, as
> backends (e.g. PHP) that runs in the same process as the serving one can
> do blocking calls.
Or you could use a scripting language like Lua which has support for
co-routines such that the interpreter has a built-in notion of yielding
and resuming, controllable by the underlying C code.
Xavante is a "non-blocking" HTTP server where client state is maintained
implicitly in a dormant thread within the interpreter. I/O operations
are managed in a non-blocking fashion by the hosting C application. The
overhead would be about as much as storing a cookie and related
information in a SQL database.
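The idea of client state living implicitly in a suspended coroutine can be sketched with Python generators, which stand in for Lua coroutines here. This is not Xavante's code: `handle_client` and `drive` are invented names, and `drive` is a toy replacement for the hosting event loop that feeds input chunks and accepts a few bytes per write.

```python
def handle_client():
    """A coroutine: its local variables hold the client's state between resumes."""
    request = b""
    while b"\r\n\r\n" not in request:
        request += yield ("read", None)     # suspend until bytes arrive
    path = request.split(b" ")[1]
    body = b"you asked for " + path
    while body:
        sent = yield ("write", body)        # suspend until the socket is writable
        body = body[sent:]


def drive(coro, incoming, max_write=8):
    """Toy event loop: feeds `incoming` to reads, takes max_write bytes per write."""
    chunks = iter(incoming)
    output = b""
    try:
        op, data = next(coro)               # run to the first yield
        while True:
            if op == "read":
                op, data = coro.send(next(chunks))
            else:
                n = min(max_write, len(data))
                output += data[:n]
                op, data = coro.send(n)
    except StopIteration:                   # handler finished (or input ran out)
        return output
```

For example, `drive(handle_client(), [b"GET /big", b".iso HTTP/1.0\r\n", b"\r\n"])` resumes the handler three times for input and three times for output. The handler itself reads like straight-line blocking code, with no explicit state structure.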
I understand Barry's point (and I suspect he already knew all about
event-oriented designs). Why re-implement what the operating system does
for you? Sometimes you can do it more intelligently, more efficiently, and
in a manner which is more conducive to productively solving the issues at
hand.
Linux's new splice(), tee() and vmsplice() system calls will raise the
ceiling on how efficient event-based servers can be.
| |
| David Schwartz 2006-09-19, 7:02 pm |
|
William Ahern wrote:
> Why? Assuming you have one CPU, what's the difference? All it comes down
> to is request latency, and unless you're blocking on an operation why
> accept more requests when clearly you cannot handle more (otherwise you'd
> have already finished it!).
The difference is that if you try to complete one request before
working on another, you will slow down to the speed of the network.
Consider what happens if one request requires sending 1GB to the
client.
> Tricky, certainly. You typically cannot [easily] control paging/swapping
> of virtual memory, and there are other such issues which make for writing
> a strictly "non-blocking" server near impossible (i.e. spotty non-blocking
> support for file-base I/O). Nonetheless, even those applications which
> don't deal with paging issues typically outperform the "traditional"
> designs, sometimes by leaps and bounds.
It depends what you consider traditional.
DS
| |
|
| >AFAIK, squid (www.squid-cache.org) is a famous example which does use
>neither thread nor fork and achieves very good performance.
Does this mean that the squid process is limited to only run on one CPU at
any one time and so 'wasting' any multiple CPUs installed in the host?
(okay there may be not much computational work to be done in a web cache).
So what I'm asking is: If you don't create any new threads or fork a new
process is there any other way your code can utilise multiple CPUs (and have
code running concurrently)? I didn't think there was (on Unix).
Thanks for an interesting thread!
Cheers,
Jason
| |
| David Schwartz 2006-09-19, 10:01 pm |
|
Jason wrote:
> Does this mean that the squid process is limited to only run on one CPU at
> any one time and so 'wasting' any multiple CPUs installed in the host?
> (okay there may be not much computational work to be done in a web cache).
Not really. Most of squid's work is in I/O, and the kernel's I/O code
is multi-threaded. If, for example, a network packet arrives and
there's another CPU, that other CPU can process the packet, verify its
checksum, do the TCP stuff, and put it in a buffer for squid to pick it
up, all without the squid process being interrupted.
In addition, if squid uses functions like 'sendfile', a significant
fraction of its work is done by the kernel itself. Once you tell the
kernel to send several files, it can continue working on that using all
available CPUs.
> So what I'm asking is: If you don't create any new threads or fork a new
> process is there any other way your code can utilise multiple CPUs (and have
> code running concurrently)? I didn't think there was (on Unix).
There are basically two ways. One is the implicit concurrent processing
in the kernel, through interrupts, timers, and calls like 'sendfile'.
Another is a process pool architecture. In a process pool architecture,
you pre-fork, but then you don't necessarily dedicate a process to a
request (or a request to a particular process). The request can either
migrate across processes as needed or each process can concurrently
handle more than one request (using a select/poll loop).
Although it really is hard to do right now (due to the lack of
libraries, tools, and experience needed to get it right), a process
pool architecture could potentially have many of the advantages of
multi-threaded and process-per-client architectures with few of the
disadvantages. You don't need to fork for each request, and a bug can't
(or at least is not likely to) take down all current requests.
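A very reduced sketch of the pre-fork part of such a pool, for Unix only. `prefork` is an illustrative name. To keep it short, each worker handles one connection at a time with blocking calls; the fuller design described above would run a select/poll loop inside each worker instead.

```python
import os
import socket


def prefork(lsock, nworkers, handler):
    """Fork nworkers children that all accept() on the same listening socket."""
    pids = []
    for _ in range(nworkers):
        pid = os.fork()
        if pid == 0:                            # worker process
            try:
                while True:
                    conn, _addr = lsock.accept()    # one waiting worker gets it
                    try:
                        handler(conn)
                    finally:
                        conn.close()
            finally:
                os._exit(0)                     # a failed worker exits quietly
        pids.append(pid)
    return pids                                 # parent keeps the pids to manage the pool
```

No fork happens per request, and a crash in one worker affects only the connection it was serving. The parent is expected to signal and reap the workers at shutdown.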
DS
| |
| Barry Margolin 2006-09-19, 10:01 pm |
| In article <pan.2006.09.19.01.49.58.115971@25thandClement.com>,
William Ahern <william@25thandClement.com> wrote:
> On Sun, 17 Sep 2006 21:14:24 -0400, Barry Margolin wrote:
>
> "Usually" is a strong word. thttpd, Zeus, Boa, X15... there might be more
> examples which use a single-process, event based model than not.
>
>
> Why? Assuming you have one CPU, what's the difference? All it comes down
> to is request latency, and unless you're blocking on an operation why
> accept more requests when clearly you cannot handle more (otherwise you'd
> have already finished it!).
The same reason for any other multiprocessing performed by most
computers. By your logic, if two compute-bound programs are running on
a single-CPU computer, there's no reason to timeshare between them, you
should run them sequentially.
So if a request comes in for a multi-gigabyte file, should all the other
requests have to wait for it to finish, or get to a point in the
transfer where send() would block?
| |
| Barry Margolin 2006-09-19, 10:01 pm |
| In article <pan.2006.09.19.02.24.21.98741@25thandClement.com>,
William Ahern <william@25thandClement.com> wrote:
> On Mon, 18 Sep 2006 13:37:03 +0200, Nils O. Selåsdal wrote:
>
> Or you could use a scripting language like Lua which has support for
> co-routines such that the interpreter has a built-in notion of yielding
> and resuming, controllable by the underlying C code.
Aren't co-routines "separate execution paths"? I never intended to
imply that the threading used needs to be implemented by the kernel;
user-mode threading libraries (which appeared long before thread support
was added to most Unix kernels) are just as effective.
| |
| Rick Jones 2006-09-20, 7:00 pm |
| David Schwartz <davids@webmaster.com> wrote:
> In addition, if squid uses functions like 'sendfile', a significant
> fraction of its work is done by the kernel itself. Once you tell the
> kernel to send several files, it can continue working on that using
> all available CPUs.
That presupposes that the size of the file(s) being sent is <
SO_SNDBUF size, yes? And that the file's blocks are already in the FS
cache.
As for the work to get data out the system, not to trivialize it but
the "only" difference between send() and sendfile() is the avoidance
of the copy.
rick jones
| |
| Alex Fraser 2006-09-20, 7:00 pm |
| "Rick Jones" <rick.jones2@hp.com> wrote in message
news:LCeQg.201$Di.149@news.cpqcorp.net...
> David Schwartz <davids@webmaster.com> wrote:
>
> That presupposes that the size of the file(s) being send is <
> SO_SNDBUF size yes? And that the file's blocks are already in the FS
> cache.
I can't see why you think either of these must be the case.
> As for the work to get data out the system, not to trivialize it but
> the "only" difference between send() and sendfile() is the avoidance
> of the copy.
And the avoidance of some kernel <-> user mode transitions, and the ability
of the kernel to optimise the process.
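To make the comparison concrete, here is a hedged sketch of driving sendfile from an event loop, using Python's `os.sendfile` wrapper as available on Linux. `sendfile_some` is an illustrative name. One call replaces a read() into a user buffer followed by a send(), but on a non-blocking socket it still transfers only as much as the socket buffer accepts, so the caller keeps an offset and comes back on the next writable event.

```python
import os
import socket


def sendfile_some(sock, fileobj, offset, remaining):
    """Ask the kernel to copy file bytes straight to the socket; return bytes sent."""
    try:
        return os.sendfile(sock.fileno(), fileobj.fileno(), offset, remaining)
    except BlockingIOError:
        return 0      # socket buffer full: wait for the next writable event
```

The caller adds the return value to its saved offset and stops once the offset reaches the file size.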
Alex
| |
| David Schwartz 2006-09-20, 7:01 pm |
|
Rick Jones wrote:
> As for the work to get data out the system, not to trivialize it but
> the "only" difference between send() and sendfile() is the avoidance
> of the copy.
Ack, you are correct. I was thinking of the Win32 version of the
function, which is a true asynchronous send done without process
intervention.
DS
| |
| William Ahern 2006-09-20, 7:01 pm |
| On Wed, 20 Sep 2006 00:04:03 -0400, Barry Margolin wrote:
> In article <pan.2006.09.19.02.24.21.98741@25thandClement.com>,
> William Ahern <william@25thandClement.com> wrote:
>
>
> Aren't co-routines "separate execution paths"? I never intended to imply
> that the threading used needs to be implemented by the kernel; user-mode
> threading libraries (which appeared long before thread support was added
> to most Unix kernels) are just as effective.
Absolutely. However, when people suggest threads along with multiple
processes this typically implies to the person a suggestion to use
pthreads or some other extant, portable interface.
Nonetheless, the fact remains that the fastest web servers out there are
event-based--and so far single-process, single-threaded in most
environments. For the same reason that a web server written in
hand-optimized assembly would be faster: you can make more efficient use
of your resources. Though, I don't buy the argument(s) that using
threads over an event-oriented design is easier to accomplish for typical
HTTP server duties, and this is why I originally spoke up. I'm
usually happy to join the rest of the crowd when somebody approaches the
group wanting to write a network server and the group dispenses with the
"use fork() and you'll do just fine" advice.
Personally I use a hybrid approach. A mix of preemptive threads and/or
processes supporting an event loop with possibly special purpose
"threading" on top for those particular areas that might require it (e.g.
CGI through higher-level languages). I've done this for HTTP, RTSP and
SMTP and have easily outperformed existing, well-worn applications
(Postfix, Apache, QTSS, etc) before I even got to profiling and optimizing
sub-components (and I don't tend to sacrifice features).
| |
| William Ahern 2006-09-20, 7:01 pm |
| On Wed, 20 Sep 2006 00:00:53 -0400, Barry Margolin wrote:
<snip>
> So if a request comes in for a multi-gigabyte file, should all the other
> requests have to wait for it to finish, or get to a point in the transfer
> where send() would block?
No. I apologize. I misread your original post and myself.
I'm not trying to say this at all. I've gotten into the habit of thinking
of abstract things like "send a file over HTTP" as a set of discrete
events. And when you suggested that you'd need to check for incoming
requests when you were processing an existing one, I thought to myself
"why would I need to check for new requests when parsing an incoming
header", an operation which is CPU bound (though small and finite), and
thus wouldn't necessitate any type of concurrency. Which is pretty silly
on my part.