Sockets debugging tools
T Koster

2005-02-15, 4:01 am

Hi folks,

I'm looking for a tool to test my networking code. Currently I am
limited to testing using clients on the LAN, so all my packets deliver
almost instantaneously with 100% probability, thereby never giving my
application a chance to test its fail-safe mechanisms.

These LAN conditions are obviously not typical of normal usage, so I'm
looking for a tool, some kind of proxy or something, that can be
configured to slow traffic down and even occasionally drop packets, so
that these mechanisms can be tested.

Thanks,
Thomas

Andrei Voropaev

2005-02-15, 8:59 am

On 2005-02-15, T Koster <reply-to-group@use.net> wrote:
> Hi folks,
>
> I'm looking for a tool to test my networking code. Currently I am
> limited to testing using clients on the LAN, so all my packets deliver
> almost instantaneously with 100% probability, thereby never giving my
> application a chance to test its fail-safe mechanisms.
>
> These LAN conditions are obviously not typical of normal usage, so I'm
> looking for a tool, some kind of proxy or something, that can be
> configured to slow traffic down and even occasionally drop packets, so
> that these mechanisms can be tested.


Do you use TCP? Then you probably won't find anything :) Maybe I'm
wrong, but I can't think of anything in the network that needs to be
emulated in order to test the fail-safe mechanisms of a program that
uses TCP. Oh well. Sometimes I establish a connection and then pull out
the network cable, to see how the program reacts to it. In all other
cases everything depends on the peer. So I program my peer to emulate
all possible problems (i.e. peer crash, peer deadlock, remote host
crash).

Generally speaking, emulating network problems tests the networking
code in the OS. And that does no good for your program (unless you write
networking code for the OS :) Your program should rely on the fact that
the networking code in the OS will do everything possible to deliver
your data to the peer host, and that if it fails your program will be
notified (this is for TCP). So you only need to worry about your peer
application actually getting the data from the system (if that is
important to you; if not, you can simply give your data to the OS
and forget about it :)

The same should be true for UDP. There you don't get a notification from
the system that your data couldn't be delivered to the peer host, so if
delivery is important to you, you arrange confirmations from the
peer, etc. Again, you can program your peer not to send a confirmation,
to emulate a lost UDP packet. (Though I think that even on a LAN this
happens on its own very often :)

--
Minds, like parachutes, function best when open

Anders Mikkelsen

2005-02-15, 9:00 am

T Koster wrote:
> Hi folks,
>
> I'm looking for a tool to test my networking code. Currently I am
> limited to testing using clients on the LAN, so all my packets deliver
> almost instantaneously with 100% probability, thereby never giving my
> application a chance to test its fail-safe mechanisms.
>
> These LAN conditions are obviously not typical of normal usage, so I'm
> looking for a tool, some kind of proxy or something, that can be
> configured to slow traffic down and even occasionally drop packets, so
> that these mechanisms can be tested.
>
> Thanks,
> Thomas
>


Nistnet might help you...

http://www-x.antd.nist.gov/nistnet/

Regards,
Anders

T Koster

2005-02-16, 4:04 am

Andrei Voropaev wrote:
> On 2005-02-15, T Koster <reply-to-group@use.net> wrote:
>
>
> Do you use TCP? Then you probably won't find anything :) Maybe I'm
> wrong, but I can't think of anything in the network that needs to be
> emulated in order to test the fail-safe mechanisms of a program that
> uses TCP. Oh well. Sometimes I establish a connection and then pull out
> the network cable, to see how the program reacts to it. In all other
> cases everything depends on the peer. So I program my peer to emulate
> all possible problems (i.e. peer crash, peer deadlock, remote host
> crash).


Yes, my connections are TCP, and yes those sorts of things I mostly
wouldn't need to worry about.

However, since my application uses non-blocking sockets and the select
system call, I need to be able to test conditions like when send doesn't
manage to send an entire message, or its buffer fills (through lack of
or delayed ACKs from the peer due to poor network conditions) and
reports that it would block. In these cases, my application *should* be
storing the remaining portion of the message in its own user-space
buffer to send later, when the select system call says that it's okay to
send again. Currently I have no way to force this behaviour without
coding some sort of command to the application protocol almost like
"ping -f" in behaviour, to try to flood things.

(I'm talking about 'messages' on a TCP stream, which some may complain
about, but a 'message' is simply one line of text in the stream in this
case.)
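
For illustration, the user-space buffering described above might look
roughly like the sketch below. It is written in Python for brevity (the
calls map closely onto the C socket API), and the class and method names
are invented for this example rather than taken from any library.

```python
import select
import socket


class PendingSender:
    """Queue bytes in user space and flush them when the socket allows it."""

    def __init__(self, sock):
        sock.setblocking(False)
        self.sock = sock
        # Bytes accepted from the application but not yet by the kernel.
        self.pending = bytearray()

    def queue(self, data):
        """Append data and push as much as the kernel will take right now."""
        self.pending += data
        self.flush()

    def flush(self):
        """Send what we can; return the number of bytes still pending."""
        while self.pending:
            try:
                sent = self.sock.send(self.pending)
            except BlockingIOError:
                # EAGAIN/EWOULDBLOCK: the kernel buffer is full, so keep
                # the remainder and retry once select() reports writable.
                break
            del self.pending[:sent]
        return len(self.pending)

    def wait_writable(self, timeout):
        """Wait in select() until the socket is writable or time runs out."""
        _, writable, _ = select.select([], [self.sock], [], timeout)
        return bool(writable)
```

In an event loop the socket would normally be placed in select's write
set only while `pending` is non-empty, and `flush` called when select
reports it writable.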

> Generally speaking, emulating network problems tests the networking
> code in the OS. And that does no good for your program (unless you write
> networking code for the OS :) Your program should rely on the fact that
> the networking code in the OS will do everything possible to deliver
> your data to the peer host, and that if it fails your program will be
> notified (this is for TCP).


Yes, it is how these notifications are handled that I need to test, but
network conditions on my LAN are too good to raise most of these error
conditions, certainly too good to get 100% code coverage.

> So you only need to worry about your peer
> application actually getting the data from the system (if that is
> important to you; if not, you can simply give your data to the OS
> and forget about it :)


<snip UDP>

So, there is a possibility that my program is too paranoid about failure
to send a message by the OS? It is true that almost all sample sockets
programming code I find on the 'net doesn't seem to worry too much about
it. I've been trying to read through some IRC server source code, to
see how they do it, considering they generally handle giant loads, but
so far they all seem to stem from the same ancient ircd and the code is
fairly messy and difficult to read. I'll get there eventually. Luckily
this is a hobby project for me or I would have exceeded a hundred
deadlines already :)

Thanks,
Thomas

David Schwartz

2005-02-16, 4:04 am


"T Koster" <reply-to-group@use.net> wrote in message
news:PezQd.162849$K7.72847@news-server.bigpond.net.au...

> So, there is a possibility that my program is too paranoid about failure
> to send a message by the OS? It is true that almost all sample sockets


You need to stop talking about messages in connection with TCP. Phrases
like "failure to send a message" have no meaning in terms of TCP. Seriously,
thinking this way will eventually bite you. Your thinking will ultimately be
reflected in your code.

TCP sends bytes. The 'send' call could send all the bytes you asked it
to send, some of the bytes you asked it to send, or none of the bytes you
asked it to send.
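
The same holds on the receiving side: one recv may return half a line,
or several lines at once. As a minimal sketch, assuming the "one line of
text per message" convention mentioned earlier in the thread, lines can
be rebuilt from arbitrary chunks like this (Python, with a hypothetical
class name chosen for the example):

```python
class LineAssembler:
    """Rebuild newline-terminated lines from arbitrary stream chunks."""

    def __init__(self):
        # Bytes of the line that has not been terminated yet.
        self._partial = bytearray()

    def feed(self, chunk):
        """Add bytes as returned by recv(); return the completed lines."""
        self._partial += chunk
        # Everything before the last newline is complete; the tail
        # (possibly empty) is kept until more bytes arrive.
        *lines, rest = bytes(self._partial).split(b"\n")
        self._partial = bytearray(rest)
        return lines
```

Feeding it `b"hel"` yields no lines; feeding `b"lo\nwor"` next yields
the single line `b"hello"` and keeps `b"wor"` for later.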

DS


Andrei Voropaev

2005-02-16, 9:00 am

On 2005-02-16, T Koster <reply-to-group@use.net> wrote:
[...]
>
> However, since my application uses non-blocking sockets and the select
> system call, I need to be able to test conditions like when send doesn't
> manage to send an entire message, or its buffer fills (through lack of
> or delayed ACKs from the peer due to poor network conditions) and
> reports that it would block. In these cases, my application *should* be
> storing the remaining portion of the message in its own user-space
> buffer to send later, when the select system call says that it's okay to
> send again. Currently I have no way to force this behaviour without
> coding some sort of command to the application protocol almost like
> "ping -f" in behaviour, to try to flood things.


You are wrong. This is perfectly possible by programming your peer not
to read from the socket :) (Emulating a deadlock of the peer.) In this
case, once the system buffers are filled, your call to send will accept
only part (or even none) of the data. Note that system buffers are
fairly large, so you may need to pass lots of data before they fill up.
But they do fill up. Tested :)
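
A small way to see this without a second machine, sketched in Python
under the assumption that a local socketpair stands in for the TCP
connection (the helper name is made up for the example): the peer end
simply never reads, and the sender writes until the kernel refuses.

```python
import socket


def fill_until_would_block(sock, chunk=b"x" * 4096):
    """Send on a non-blocking socket until the kernel refuses more data.

    Returns the number of bytes the kernel accepted before reporting
    EAGAIN/EWOULDBLOCK, which Python surfaces as BlockingIOError.
    """
    sock.setblocking(False)
    accepted = 0
    while True:
        try:
            accepted += sock.send(chunk)
        except BlockingIOError:
            return accepted
```

Calling it on one end of `socket.socketpair()` while the other end stays
idle returns once the buffers are full; how many bytes that takes
depends on the buffer sizes configured on the system.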

>
> So, there is a possibility that my program is too paranoid about failure
> to send a message by the OS? It is true that almost all sample sockets
> programming code I find on the 'net doesn't seem to worry too much about
> it. I've been trying to read through some IRC server source code, to
> see how they do it, considering they generally handle giant loads, but
> so far they all seem to stem from the same ancient ircd and the code is
> fairly messy and difficult to read. I'll get there eventually. Luckily
> this is a hobby project for me or I would have exceeded a hundred
> deadlines already :)


Well. Your program should be ready to handle any error conditions
returned by recv or send calls. (If you use select, poll or similar,
then you have to watch for error flags there as well.) See man 7 tcp
and man 7 ip for the errors. Since TCP is a reliable protocol, the only
thing that may prevent it from sending data is the absence of a route to
the peer. You can emulate this either by pulling the network cable out,
or by changing routing tables (not tested :) Anything else in the
network should not be a concern for you. (Though others may prove me
wrong :)

--
Minds, like parachutes, function best when open

T Koster

2005-02-16, 4:01 pm

Andrei Voropaev wrote:
> On 2005-02-16, T Koster <reply-to-group@use.net> wrote:
> [...]
>
>
> You are wrong. This is perfectly possible by programming your peer not
> to read from the socket :) (Emulating a deadlock of the peer.) In this
> case, once the system buffers are filled, your call to send will accept
> only part (or even none) of the data. Note that system buffers are
> fairly large, so you may need to pass lots of data before they fill up.
> But they do fill up. Tested :)


Hmmm okay. So to start filling up the system's I/O buffers I just
freeze the client program, and watch how my program handles send's
inability to send?

If so, what would happen if the client program never springs back to
life, and the socket just remains in a permanent would-block state?
After a while the OS will time-out the connection, right? Where is this
error raised? The tcp(7) man page only specifies ETIMEDOUT but no
indication of which calls raise it. Is it send and friends?

>
> Well. Your program should be ready to handle any error conditions
> returned by recv or send calls. (If you use select, poll or similar,
> then you have to watch for error flags there as well.) See man 7 tcp
> and man 7 ip for the errors. Since TCP is a reliable protocol, the only
> thing that may prevent it from sending data is the absence of a route to
> the peer. You can emulate this either by pulling the network cable out,
> or by changing routing tables (not tested :) Anything else in the
> network should not be a concern for you. (Though others may prove me
> wrong :)


In addition to the ETIMEDOUT error, I'm unsure about whether I need to
look out for EPIPE or not. One man page (send(2)) tells me it is raised
when "the local end has been shut down on a connected socket." Surely
the local end only shuts down if I call close or shutdown on the socket,
right? Thus, I already know when the local end has been shut down and
such a socket would never be sent to anyway, or am I missing something?
Another man page (tcp(7)) has something else to say about EPIPE: "The
other end closed the socket unexpectedly or a read is executed on a shut
down socket." This is totally different from what send(2) has to say,
but sounds more likely. ip(7)'s story is similar to tcp(7), but not
identical, plus I find the following unreassuring note under the BUGS
section: "There are too many inconsistent error values." :(

ECONNRESET is elusive too: neither the ip(7) nor tcp(7) man pages mention
it. It only appears to be mentioned in send(2). Can it be expected to
be raised by recv also? It makes more sense to me that recv should find
out that a connection has been reset by the peer than send.

I would never have thought sockets programming on Linux would be so
poorly-documented. I'm starting to read some FreeBSD man pages to see
if they contain the details I'm missing.

Thomas

Tim

2005-02-17, 4:01 am

T Koster <reply-to-group@use.net> writes:
> These LAN conditions are obviously not typical of normal usage, so I'm
> looking for a tool, some kind of proxy or something, that can be
> configured to slow traffic down and even occasionally drop packets,
> so that these mechanisms can be tested.


Dummynet. I've used it for this and it works great.

You can configure latency, loss rate, and bandwidth cap.

http://info.iet.unipi.it/~luigi/ip_dummynet/

tim

Andrei Voropaev

2005-02-17, 9:02 am

On 2005-02-16, T Koster <reply-to-group@use.net> wrote:
> Andrei Voropaev wrote:

[...]
>
> Hmmm okay. So to start filling up the system's I/O buffers I just
> freeze the client program, and watch how my program handles send's
> inability to send?
>
> If so, what would happen if the client program never springs back to
> life, and the socket just remains in a permanent would-block state?
> After a while the OS will time-out the connection, right? Where is this
> error raised? The tcp(7) man page only specifies ETIMEDOUT but no
> indication of which calls raise it. Is it send and friends?


Nope. The OS won't time out the connection in this case. Your
application is responsible for handling such situations, for example by
expecting confirmation messages from the peer within a certain time and
closing the connection if the confirmation does not come (yep, even with
TCP *sigh* :)
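
As a rough sketch of such an application-level timeout (Python, with a
hypothetical helper name; the 4096-byte read size is an arbitrary choice
for the example), select can bound how long to wait for the
confirmation:

```python
import select
import socket


def wait_for_reply(sock, timeout):
    """Wait up to `timeout` seconds for the peer to send something.

    Returns the bytes received, b"" if the peer closed the connection,
    or None if nothing arrived in time, in which case the caller may
    decide to give up and close the socket.
    """
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None
    return sock.recv(4096)
```

A caller would typically close the connection when this returns None or
an empty result.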

As for the ETIMEDOUT error, this happens when there has been no route
to the peer for a long time (pulled out network cable). The time is
defined by the OS. Usually it is a few minutes. The error is returned
when you call one of the socket functions (recv, send, etc.). Supposedly
the indication of this error is also visible in poll (POLLERR) and
select (err set). In this case you can check the error using getsockopt
with the SO_ERROR option. I use this for handling non-blocking
connections on Linux.
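
The getsockopt check might be sketched as follows in Python (the two
helper names are invented for the example). Reading SO_ERROR returns the
pending error code and clears it, with 0 meaning that nothing is
pending.

```python
import errno
import os
import socket


def pending_socket_error(sock):
    """Fetch and clear the socket's pending error; 0 means none."""
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)


def describe(err):
    """Turn an errno value into a readable string."""
    if err == 0:
        return "no pending error"
    return f"{errno.errorcode.get(err, err)}: {os.strerror(err)}"
```

After poll or select flags the descriptor, `describe(pending_socket_error(sock))`
gives a string naming the error, for example one beginning with
`ETIMEDOUT` for a timed-out connection.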

[...]

> In addition to the ETIMEDOUT error, I'm unsure about whether I need to
> look out for EPIPE or not. One man page (send(2)) tells me it is raised
> when "the local end has been shut down on a connected socket." Surely
> the local end only shuts down if I call close or shutdown on the socket,
> right? Thus, I already know when the local end has been shut down and
> such a socket would never be sent to anyway, or am I missing something?
> Another man page (tcp(7)) has something else to say about EPIPE: "The
> other end closed the socket unexpectedly or a read is executed on a shut
> down socket." This is totally different from what send(2) has to say,
> but sounds more likely. ip(7)'s story is similar to tcp(7), but not
> identical, plus I find the following unreassuring note under the BUGS
> section: "There are too many inconsistent error values." :(


EPIPE is important. You can't control it. Usually you get EPIPE when you
are writing data and the remote peer crashes. In this case your OS gets
an EOF back from the peer and closes the "local end of the pipe". But if
your program is not reading at that moment, it gets no indication of
that. And during a write it'll get SIGPIPE because the pipe has already
been closed by the OS. That's why I usually use send with the
MSG_NOSIGNAL option to convert SIGPIPE to EPIPE. There are a few other
situations that lead to the same scenario.
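
A minimal sketch of the MSG_NOSIGNAL approach, in Python with a made-up
helper name. Two assumptions to note: MSG_NOSIGNAL is not defined on
every platform, so the code falls back to 0, and the Python interpreter
already ignores SIGPIPE by default, so the flag matters mainly in C; it
is kept here to mirror the C call.

```python
import errno
import socket

# MSG_NOSIGNAL is not available everywhere; fall back to no flag.
NOSIGNAL = getattr(socket, "MSG_NOSIGNAL", 0)


def send_quietly(sock, data):
    """Send data, returning (bytes_sent, errno) instead of raising."""
    try:
        return sock.send(data, NOSIGNAL), 0
    except OSError as exc:
        # For example errno.EPIPE once the peer has gone away.
        return 0, exc.errno
```

The caller can then compare the second element against `errno.EPIPE` or
`errno.ECONNRESET` and tear the connection down.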

>
> ECONNRESET is elusive too: neither the ip(7) nor tcp(7) man pages mention
> it. It only appears to be mentioned in send(2). Can it be expected to
> be raised by recv also? It makes more sense to me that recv should find
> out that a connection has been reset by the peer than send.


Quite the opposite. ECONNRESET is something that happens during sending.
When TCP attempts to deliver some data to the peer, and the peer has no
idea what to do with the data, it sends back RST to indicate that the
connection must be reopened. That situation is again normal for the case
when the peer application crashes while your application is sending
data. In practice though I've never seen this error. Usually I get
EPIPE :) I guess this is so because computers are fast now. I call send
to pass some data, and this goes into the system buffer. Then it is
passed to the peer. In reply comes RST. The local end gets closed and
now I'm trying to send the second portion. Boom. SIGPIPE :)

>
> I would never have thought sockets programming on Linux would be so
> poorly-documented. I'm starting to read some FreeBSD man pages to see
> if they contain the details I'm missing.


Man pages are written on the assumption that those who read them know
how the protocol works. In other words, if you really want to write good
networking code you have to read some good books first, and then use man
pages only to refresh your memory or to find out the specifics of the
implementation for a given OS :) I guess the authoritative book on
network programming is the one by Stevens, "Unix Network Programming",
vol. 1.

--
Minds, like parachutes, function best when open

T Koster

2005-02-17, 9:02 am

Andrei Voropaev wrote:
> On 2005-02-16, T Koster <reply-to-group@use.net> wrote:
>
> Nope. The OS won't time out the connection in this case. Your
> application is responsible for handling such situations, for example by
> expecting confirmation messages from the peer within a certain time and
> closing the connection if the confirmation does not come (yep, even with
> TCP *sigh* :)


Okay. Will do.

> As for the ETIMEDOUT error, this happens when there has been no route
> to the peer for a long time (pulled out network cable). The time is
> defined by the OS. Usually it is a few minutes. The error is returned
> when you call one of the socket functions (recv, send, etc.). Supposedly
> the indication of this error is also visible in poll (POLLERR) and
> select (err set). In this case you can check the error using getsockopt
> with the SO_ERROR option. I use this for handling non-blocking
> connections on Linux.


Aha. I'll just stick to looking out for ETIMEDOUT from the socket I/O
calls, along with the others.

> [...]
>
>
> EPIPE is important. You can't control it. Usually you get EPIPE when you
> are writing data and the remote peer crashes. In this case your OS gets
> an EOF back from the peer and closes the "local end of the pipe". But if
> your program is not reading at that moment, it gets no indication of
> that. And during a write it'll get SIGPIPE because the pipe has already
> been closed by the OS. That's why I usually use send with the
> MSG_NOSIGNAL option to convert SIGPIPE to EPIPE. There are a few other
> situations that lead to the same scenario.


Yeah, I'm also using MSG_NOSIGNAL to suppress the SIGPIPE. So to recap:
EPIPE is send's analogue to recv returning 0, or rather, I found out
that recv also raises SIGPIPE/EPIPE if you try to call it on a socket it
already returned 0 for, so to have recv returning 0 is like getting a
forward notification that the connection is closed for reading and
calling recv again will raise SIGPIPE/EPIPE, just like calling send on a
socket closed for writing raises SIGPIPE/EPIPE. Correct?
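
The recv half of that recap is easy to check empirically. In the sketch
below (Python on a local socketpair, which is an assumption standing in
for a TCP connection that was closed cleanly; the helper name is made
up) repeated recv calls after the peer has closed keep returning an
empty result rather than raising EPIPE. EPIPE and SIGPIPE belong to the
write side.

```python
import socket


def recv_after_eof(sock, attempts=3):
    """Call recv() several times and collect what each call returns."""
    return [sock.recv(1024) for _ in range(attempts)]
```

With the peer end of a socketpair already closed, every entry in the
returned list is an empty bytes object, the Python equivalent of recv
returning 0.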

>
> Quite the opposite. ECONNRESET is something that happens during sending.
> When TCP attempts to deliver some data to the peer, and the peer has no
> idea what to do with the data, it sends back RST to indicate that the
> connection must be reopened. That situation is again normal for the case
> when the peer application crashes while your application is sending
> data. In practice though I've never seen this error. Usually I get
> EPIPE :) I guess this is so because computers are fast now. I call send
> to pass some data, and this goes into the system buffer. Then it is
> passed to the peer. In reply comes RST. The local end gets closed and
> now I'm trying to send the second portion. Boom. SIGPIPE :)


Got it. You send data to a computer that has rebooted in the meantime,
and it doesn't expect that data coming to that port, so it sends
something back with the RST flag set, causing ECONNRESET.

>
> Man pages are written on the assumption that those who read them know
> how the protocol works. In other words, if you really want to write good
> networking code you have to read some good books first, and then use man
> pages only to refresh your memory or to find out the specifics of the
> implementation for a given OS :) I guess the authoritative book on
> network programming is the one by Stevens, "Unix Network Programming",
> vol. 1.


As soon as uni starts again I'll borrow it from the library.

Thanks,
Thomas
