Non-blocking sockets
|
| Hi,
I'm trying to communicate with a given server via HTTP. I'm sending
the request and receiving the response. But there is one problem...
how do I know how much data to expect?
A solution that came to mind is to use non-blocking I/O. I tried that,
but read() or recv() returns before any data is read, which is no use.
How can I read the entire response from the server without knowing how
much data to expect?
Thanks
Daniel
| |
| Rainer Weikusat 2008-03-25, 7:25 pm |
| Gigi <dandago@gmail.com> writes:
> I'm trying to communicate with a given server via HTTP. I'm sending
> the request and receiving the response. But there is one problem...
> how do I know how much data to expect?
>
> A solution that came to mind is to use non-blocking I/O. I tried that,
> but read() or recv() returns before any data is read, which is no use.
>
> How can I read the entire response from the server without knowing how
> much data to expect?
Not at all. The amount of data to expect is either implicit in the
response type (e.g. 204 or 304 responses), implicitly signalled by the
server closing the connection, explicitly given by a
Content-Length header, or marked by a zero-sized 'chunk of response
data' (chunked transfer-coding).
If you don't need persistent connections, using HTTP/1.0 and always
closing a connection [shutdown(fd, SHUT_WR)] after a single request
has been sent 'should' cause the server to always use the 'closing the
connection' method. In particular, you wouldn't have to deal with
chunked transfer-coding then.
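
The 'closing the connection' approach described above can be sketched
roughly as follows. This is only a minimal illustration, assuming an
already connected blocking stream socket; send_all() and
request_until_close() are hypothetical helper names, not part of any
library, and a server is not guaranteed to react well to the half-close.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Send all of buf on fd; returns 0 on success, -1 on error. */
int send_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = send(fd, buf, len, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted, try again */
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/*
 * Hypothetical helper: send one request, half-close the sending side,
 * then read until the peer closes the connection (recv() returns 0).
 * Returns the number of bytes stored in resp, or -1 on error.
 * Bytes that do not fit into cap are read and dropped.
 */
ssize_t request_until_close(int fd, const char *req, char *resp, size_t cap)
{
    char scratch[4096];
    size_t total = 0;

    if (send_all(fd, req, strlen(req)) < 0)
        return -1;
    if (shutdown(fd, SHUT_WR) < 0)  /* tell the server we are done sending */
        return -1;

    for (;;) {
        ssize_t n = recv(fd, scratch, sizeof scratch, 0);
        if (n == 0)
            break;                  /* orderly close: response is complete */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        size_t room = cap - total;
        size_t take = (size_t)n < room ? (size_t)n : room;
        memcpy(resp + total, scratch, take);
        total += take;
    }
    return (ssize_t)total;
}
```

A caller would pass something like "GET / HTTP/1.0\r\n\r\n" as req; the
return value covers headers and body together, so the headers still have
to be split off at the first "\r\n\r\n".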
| |
|
| On Mar 25, 7:01 pm, Rainer Weikusat <rweiku...@mssgmbh.com> wrote:
> Gigi <dand...@gmail.com> writes:
>
>
>
> Not at all. The amount of data to expect is either implicit in the
> response type (eg 204 or 304 responses), implicitly signalled by the
> server closing the connection, explicitly given by a
> Content-Length-header or marked by a zero-sized 'chunk of response
> data' (chunked transfer-coding).
>
> If you don't need persistent connections, using HTTP/1.0 and always
> closing a connection [shutdown(fd, SHUT_WR)] after a single request
> has been sent 'should' cause the server to always use the 'closing the
> connection' method. Especially, you wouldn't have to deal with chunked
> transfer-coding then.
Hi,
Thanks for the reply, but I didn't quite understand what I need to do
to get things working. As in, if the amount of data to expect is
somehow sent by the server, how do I access this amount?
Right now I'm setting my socket descriptor to non-blocking and
attempting to recv(), but it's failing with EAGAIN (Resource
temporarily unavailable) and not receiving any data. On the other hand
I can use blocking I/O, but this brings me back to the problem that I
don't know how much data to expect.
| |
|
| > Thanks for the reply, but I didn't quite understand what I need to do
> to get things working. As in, if the amount of data to expect is
> somehow sent by the server, how do I access this amount?
As Rainer said, you should take it from the 'Content-Length' header
of the HTTP response.
This HTTP header tells you how many bytes of body follow the end of
the HTTP headers.
So you should:
1- read the HTTP headers, i.e. when receiving data keep on reading the
stream until you reach "\r\n\r\n", which is the header termination
character sequence.
2- parse the headers received and look for Content-Length, then extract
the length from it
3- resume reading as many bytes as the header specified (some of them
may already have arrived along with the headers).
The other method described by Rainer is:
0- specify HTTP/1.0 as your protocol when sending the request to the
HTTP server
1- read the HTTP headers, i.e. when receiving data keep on reading the
stream until you reach "\r\n\r\n", which is the header termination
character sequence.
2- keep reading all bytes sent by the server until it closes the
connection
Of course, you should get RFC 2616, the HTTP/1.1 specification; it
covers headers, parsing, etc.
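
Step 2 of the Content-Length method (the first list above) might look
something like this. It is a simplified sketch, not a full RFC 2616
parser: content_length() is a hypothetical helper, it assumes the header
block has already been read into memory, and it ignores folded header
lines and integer overflow.

```c
#include <ctype.h>
#include <string.h>
#include <strings.h>

/*
 * Hypothetical helper: look for a Content-Length field in a header block
 * of hdr_len bytes (status line plus header lines, each ending in "\r\n").
 * Returns the length, or -1 if the field is absent or malformed.
 */
long content_length(const char *hdr, size_t hdr_len)
{
    static const char name[] = "Content-Length:";
    const size_t name_len = sizeof name - 1;
    const char *p = hdr;
    const char *end = hdr + hdr_len;

    while (p < end) {
        /* Find the end of the current line without needing a NUL. */
        const char *eol = memchr(p, '\n', (size_t)(end - p));
        size_t line_len = eol ? (size_t)(eol - p) : (size_t)(end - p);

        /* Header field names are case-insensitive. */
        if (line_len > name_len && strncasecmp(p, name, name_len) == 0) {
            const char *v = p + name_len;
            const char *line_end = p + line_len;
            long value = 0;
            int digits = 0;

            while (v < line_end && (*v == ' ' || *v == '\t'))
                v++;                /* skip whitespace after the colon */
            while (v < line_end && isdigit((unsigned char)*v)) {
                value = value * 10 + (*v - '0');
                digits++;
                v++;
            }
            return digits ? value : -1;
        }
        if (!eol)
            break;
        p = eol + 1;
    }
    return -1;
}
```

A return value of -1 means the response length has to be determined some
other way, for example by the server closing the connection.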
>
> Right now I'm setting my socket descriptor to non-blocking and
> attempting to recv(), but it's failing with EAGAIN (Resource
> temporarily unavailable) and not receiving any data. On the other hand
> I can use blocking I/O, but this brings me back to the problem that I
> don't know how much data to expect.
Async vs. sync I/O has nothing to do with your initial problem: it is
only a different way to process incoming data. A byte stream (like TCP)
is just that: it does not convey any information about message sizes,
separators, etc. So do not expect to find message boundaries or sizes
through the socket API (unless you use UDP, but that's a different
story).
HTTP, running on top of TCP, gives you this information in the headers
or by closing the connection.
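
As for the EAGAIN failure on the non-blocking socket: it only means that
no data has arrived yet, not that something went wrong. One common way
to deal with it is to wait with poll() and retry. The sketch below
assumes a POSIX system; set_nonblocking() and recv_when_ready() are
hypothetical helper names.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Put fd into non-blocking mode; returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/*
 * Hypothetical helper: recv() on a non-blocking socket, waiting up to
 * timeout_ms for data whenever recv() reports EAGAIN/EWOULDBLOCK.
 * Returns bytes read, 0 on orderly close, -1 on error,
 * or -2 if nothing arrived within the timeout.
 */
ssize_t recv_when_ready(int fd, void *buf, size_t len, int timeout_ms)
{
    for (;;) {
        ssize_t n = recv(fd, buf, len, 0);
        if (n >= 0)
            return n;
        if (errno == EINTR)
            continue;
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;              /* a real error */

        /* No data yet: sleep until the socket becomes readable. */
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        int rc = poll(&pfd, 1, timeout_ms);
        if (rc == 0)
            return -2;              /* timed out */
        if (rc < 0 && errno != EINTR)
            return -1;
    }
}
```

This still returns however many bytes happen to be available, so the
HTTP framing rules are needed to decide when the response is complete.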
-- paulo
| |
| Scott Lurndal 2008-03-25, 7:25 pm |
| Gigi <dandago@gmail.com> writes:
>Thanks for the reply, but I didn't quite understand what I need to do
>to get things working. As in, if the amount of data to expect is
>somehow sent by the server, how do I access this amount?
Parse the HTTP headers. Look for the Content-Length: header.
c.f. http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html
scott
| |
|
| OK regarding content-length I got the idea... but that only gives you
how much data there is in the payload. First you need to know how much
data is in the header.
What I've been doing so far is calling recv() and giving it a certain
number... an expected size for the whole data (header + payload). So
from what you're saying I suppose I have to call recv() repeatedly
with 1 as the size to receive? (i.e. receive one byte at a time until
\r\n\r\n is reached)
I don't know if I'm understanding you correctly, but if I am, isn't
this inefficient? I mean, you're calling a system call for every byte,
which I suppose causes considerable overhead because of continuous
context switching.
| |
| David Schwartz 2008-03-25, 7:25 pm |
| On Mar 25, 3:26 pm, Gigi <dand...@gmail.com> wrote:
> What I've been doing so far was call recv() and giving it a certain
> number... an expected size for the whole data (header + payload). So
> from what you're saying I suppose I have to call recv() repeatedly
> with 1 as the size to receive? (i.e. receive one byte at a time until
> \r\n\r\n is reached)
> I don't know if I'm understanding you correctly, but if I am, isn't
> this inefficient? I mean, you're calling a system call for every byte,
> which I suppose causes considerable overhead because of continuous
> context switching.
That would be horrible, don't do that. Use the following logic:
1) Call 'recv', passing it a reasonable number of bytes to get, and
append what arrives to what you already have.
2) See if you have a complete header by searching for a '\r\n\r\n' in
the data you have so far. (Caution: do not use 'strstr' until/unless you
zero-terminate the data!)
3) If you don't have a complete header, you definitely don't have a
complete response. Go to 1.
4) If you do have a complete header, check whether you have all the
data too. If not, go to 1.
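
Steps 1 to 3 above could be sketched like this. It is an illustration
under assumptions, not a definitive implementation: header_end() and
read_header() are hypothetical names, the socket is assumed to be
blocking, and the search is length-bounded so no zero terminator is
needed. Because the search runs over everything accumulated so far, a
terminator split across two recv() calls is still found.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Length-bounded search for "\r\n\r\n". Returns the offset just past the
 * terminator, or -1 if it does not occur in the first len bytes of buf.
 */
ssize_t header_end(const char *buf, size_t len)
{
    for (size_t i = 0; i + 4 <= len; i++)
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0)
            return (ssize_t)(i + 4);
    return -1;
}

/*
 * Hypothetical helper: keep calling recv() until the header is complete.
 * *used receives the total number of bytes now in buf (the header plus
 * any body bytes that arrived with it). Returns the header length, or -1
 * on error, on early close, or if the header does not fit into cap.
 */
ssize_t read_header(int fd, char *buf, size_t cap, size_t *used)
{
    *used = 0;
    while (*used < cap) {
        ssize_t n = recv(fd, buf + *used, cap - *used, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            return -1;              /* peer closed before the header ended */
        *used += (size_t)n;

        /* Rescan from the start; simple, and fine for small headers. */
        ssize_t end = header_end(buf, *used);
        if (end >= 0)
            return end;
    }
    return -1;
}
```

After read_header() returns h, the bytes buf[h] up to buf[*used - 1] are
the start of the body, and step 4 compares *used - h with the expected
body length.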
DS
| |
|
| On Mar 25, 11:31 pm, David Schwartz <dav...@webmaster.com> wrote:
> On Mar 25, 3:26 pm, Gigi <dand...@gmail.com> wrote:
>
>
> That would be horrible, don't do that. Use the following logic:
>
> 1) Call 'recv' passing it a reasonable amount of bytes to get.
>
> 2) See if you have a complete header, by searching for a '\r\n\r\n' in
> the data you got. (Caution, do not use 'strstr' until/unless you zero-
> terminate the data!)
>
> 3) If you don't have a complete header, you definitely don't have a
> complete request.
>
> 4) If you do have a complete header, check if you have all the
> data too. If not, go to 1.
>
> DS
OK, so let's say I receive 8 bytes, and the \r\n\r\n is at the
beginning of it. Let's assume that the payload is less than 4 bytes
long (for example, when issuing a HEAD request). Then I am expecting
to read more data than I will actually receive, which means recv()
will block.
Another problem is that the \r\n\r\n can be split over two 8-byte
fragments, but that's not really a big deal... a bit of extra code
will cater for this situation.
Of course, your method will work perfectly for a response with a
decent payload. But if it can work universally, even for HEAD
requests, then it would be so much better.
| |
| Barry Margolin 2008-03-25, 10:16 pm |
| In article
<60793e0f-7ee8-4111-ba5d-7c3b7f2b237f@s8g2000prg.googlegroups.com>,
Gigi <dandago@gmail.com> wrote:
> OK, so let's say I receive 8 bytes, and the \r\n\r\n is at the
> beginning of it. Let's assume that the payload is less than 4 bytes
> long (for example, when issuing a HEAD request). Then I am expecting
> to read more data than I will actually receive, which means recv()
> will block.
recv() only blocks if there's NOTHING available. If you give it a
1000-byte buffer and the server sends 40 bytes, recv() will return
those 40 bytes immediately and you won't block.
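
This behaviour is easy to check locally. The small demo below is only a
sketch: recv_partial_demo() is a hypothetical function that uses a
socketpair() in place of a real server connection, with blocking sockets
and no recv() flags.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical demo: write `sent` bytes into one end of a local stream
 * socket pair, then ask for up to `wanted` bytes on the other end.
 * Returns what recv() reported, or -1 on setup failure.
 */
ssize_t recv_partial_demo(size_t sent, size_t wanted)
{
    int sv[2];
    char out[1000] = {0};
    char in[1000];
    ssize_t n = -1;

    /* sent == 0 is rejected: recv() would then block with nothing to read. */
    if (sent == 0 || sent > sizeof out || wanted > sizeof in)
        return -1;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    if (send(sv[1], out, sent, 0) == (ssize_t)sent)
        n = recv(sv[0], in, wanted, 0);   /* returns what is available */

    close(sv[0]);
    close(sv[1]);
    return n;
}
```

Calling recv_partial_demo(40, 1000) should return 40 without blocking,
even though 1000 bytes were requested.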
--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE don't copy me on replies, I'll read them in the group ***
| |
| David Schwartz 2008-03-25, 10:16 pm |
| On Mar 25, 3:59 pm, Gigi <dand...@gmail.com> wrote:
> OK, so let's say I receive 8 bytes, and the \r\n\r\n is at the
> beginning of it. Let's assume that the payload is less than 4 bytes
> long (for example, when issuing a HEAD request). Then I am expecting
> to read more data than I will actually receive, which means recv()
> will block.
You can't receive 8 bytes unless the other end sends 8 bytes. If you
mean the 'recv' will block trying to receive 8 bytes, no, it will not.
The default behavior is only to block if no data is available.
> Another problem is that the \r\n\r\n can be split over two 8-byte
> fragments, but that's not really a big deal... a bit of extra code
> will cater for this situation.
I'm not sure what you mean. You can always get data broken up any
which way from a TCP connection, that's the nature of TCP. You might
get a single byte with just a '\r' and then a single byte with just a
'\n'. TCP is a byte-stream protocol.
> Of course, your method will work perfectly for a response with a
> decent payload. But if it can work universally, even for HEAD
> requests, then it would be so much better.
I'm not sure I understand why you think it wouldn't. Is it just
because you think 'recv' will normally block even if data is
available? If so, what do you think MSG_WAITALL is for?
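
For contrast, this is roughly what asking recv() to wait for a full
amount looks like. It is a sketch with a hypothetical helper name,
recv_exact(), and it is only appropriate when the exact number of bytes
still to come is already known (for example from Content-Length).

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: block until exactly len bytes have arrived.
 * MSG_WAITALL asks recv() to wait for the full amount, but it can still
 * return early (signal, error, peer close), so the loop picks up the rest.
 * Returns len on success, a smaller count if the peer closed first,
 * or -1 on error.
 */
ssize_t recv_exact(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t got = 0;

    while (got < len) {
        ssize_t n = recv(fd, p + got, len - got, MSG_WAITALL);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            break;                  /* peer closed the connection */
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

Without MSG_WAITALL, a plain recv() on a blocking socket returns as soon
as any data is available, which is the default behaviour described
above.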
DS
| |
|
| Turns out you're right... I was under the wrong impression that recv()
blocks if it finds no data, or not enough data... in fact it was my
own code that was blocking (I had a recv() in a while loop to make it
non-failing).
Thanks for your help.