Replacing read() by a debugging function
| A. Farber 2008-01-28, 4:33 am |
| Hello,
I'm programming a daemon which calls read() in several spots.
When the daemon is being run with "-d", I'd like to see
the bytes being read() and currently do it this way:
n = read(pc->fd, pch->body + pc->nread,
sizeof(pch->body) - pc->nread);
....
if (debug)
dump_buffer(n, pch->body + pc->nread,
sizeof(pch->body) - pc->nread);
(full source code:
http://rtmpd.cvs.sourceforge.net/rt....c?revision=1.7
)
The "if (debug)" check slows down each read() call.
Ok, I could use an #ifdef NDEBUG but I'd rather have
the "-d" as a run-time option.
So my question is: is it possible to replace the actual
read() syscall in a portable way (want my program
to run on Linux, OpenBSD and Cygwin/MinGW)?
Something like this at the beginning of main():
if (debug)
read = read_and_dump;
Has anyone done/seen this?
Thank you
Alex
| |
| fnegroni 2008-01-28, 4:33 am |
| What you want to do is effectively change the function used to perform
the recv() at runtime.
Within the C language, you can use function pointers.
You must define your customised_recv() function to abide by the
contract for recv() (that is, it takes the same arguments
and returns the same values under all conditions).
What you would do is substitute each call to recv() you have at the
moment to use your function pointer variable.
You can then set up the function pointer variable in main() to point
to the appropriate implementation of recv(). Provided the two
implementations abide by the same contract, it should be transparent to
the calling code whether you are using the debug or retail version of
recv.
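A minimal sketch of that idea, applied to read() since that is what the original question asks about. The names read_and_dump, read_fn and select_reader are hypothetical, chosen for illustration only; the hex dump to stderr is likewise an assumption about what the debug output should look like.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical debugging variant: same contract as read(), plus a hex dump. */
static ssize_t read_and_dump(int fd, void *buf, size_t count)
{
    ssize_t n = read(fd, buf, count);
    const unsigned char *p = buf;

    /* Dump only the n bytes actually read, never the whole buffer.
     * On error (n == -1) nothing is printed, so errno is left untouched. */
    for (ssize_t i = 0; i < n; i++)
        fprintf(stderr, "%02x ", p[i]);
    if (n > 0)
        fputc('\n', stderr);
    return n;
}

/* Every call site goes through this pointer instead of calling read(). */
static ssize_t (*read_fn)(int, void *, size_t) = read;

/* Call once at startup, e.g. after parsing "-d". */
static void select_reader(int debug)
{
    read_fn = debug ? read_and_dump : read;
}
```

A call site then reads n = read_fn(fd, buf, len); and behaves the same in both modes, apart from the dump.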
| |
| David Schwartz 2008-01-28, 8:23 am |
| On Jan 28, 2:13 am, "A. Farber" <Alexander.Far...@gmail.com> wrote:
> The "if (debug)" check slows down each read() call.
Do you have any evidence of this, or are you just assuming? It's hard to
imagine a perfectly predictable branch after a system call could have
a measurable effect.
> Has anyone done/seen this?
Looks like premature optimization to me.
DS
| |
| fnegroni 2008-01-28, 8:23 am |
| I wouldn't classify this as premature optimisation.
It is more of a virtualisation of semantics.
After all, C++ uses this all over the place... ;-)
| |
| Eric Sosman 2008-01-28, 8:23 am |
| A. Farber wrote:
> [...]
> n = read(pc->fd, pch->body + pc->nread,
> sizeof(pch->body) - pc->nread);
> ...
> if (debug)
> dump_buffer(n, pch->body + pc->nread,
> sizeof(pch->body) - pc->nread);
>
> (full source code:
> http://rtmpd.cvs.sourceforge.net/rt....c?revision=1.7
> )
>
> The "if (debug)" check slows down each read() call.
By how much? (You've measured it, haven't you?)
--
Eric Sosman
esosman@ieee-dot-org.invalid
| |
| David Schwartz 2008-01-28, 7:24 pm |
| On Jan 28, 3:45 am, fnegroni <f.e.negr...@googlemail.com> wrote:
> I wouldn't classify this as premature optimisation.
> It is more of a virtualisation of semantics.
> After all, C++ uses this all over the place... ;-)
C++ uses this all over the place, but not as an optimization. The more
natural, logical, maintainable semantic should be used.
The OP's rationale was:
> The "if (debug)" check slows down each read() call.
This is an unsupported guess. I would be quite surprised if a
perfectly predictable branch following a system call were measurably
more expensive than an indirect call.
If he had an indirect function call and wanted to change it to a
branch because he thought that would be faster, I'd make the same
argument. The two constructs are likely to be nearly equivalent in
speed and it is far from obvious which is faster. Changing from one to
the other on performance grounds is crazy.
DS
| |
| fnegroni 2008-01-28, 7:24 pm |
| The OP also states he would rather:
quote:
So my question is: is it possible to replace the actual
read() syscall in a portable way (want my program
to run on Linux, OpenBSD and Cygwin/MinGW)?
Something like this at the beginning of main():
if (debug)
read = read_and_dump;
Has anyone done/seen this?
And therefore function pointers achieve exactly the intended goal, bar
optimisation.
| |
| A. Farber 2008-01-28, 7:24 pm |
| On Jan 28, 3:45 pm, fnegroni <f.e.negr...@googlemail.com> wrote:
> And therefore function pointers achieve exactly the intended goal, bar
> optimisation.
Hello fnegroni,
you have replied 3 times in my thread, thank you.
But I don't get your point. In the 1st reply you have
rephrased my original question and in the following
just written "look above".
So my question again: is there a way to replace
a syscall? I'm trying the following, but it doesn't work:
4DEL03468:~ {542} cat replace-write.c
#include <stdio.h>
#include <unistd.h>
ssize_t write2(int d, const void *buf, size_t nbytes) {
fprintf(stderr, "ok, replaced\n");
}
int main(int argc, char *argv[]) {
write = write2;
write(STDERR_FILENO, "blah\n", 5);
}
4DEL03468:~ {543} gcc replace-write.c -o replace-write.exe
replace-write.c: In function `main':
replace-write.c:9: error: invalid lvalue in assignment
And yes Eric, I haven't measured yet how much does
an "if(debug)" last (I probably can do it with gprof?)
but I'm sure it takes more than nothing, esp. in a loop.
Regards
Alex
| |
|
|
> int main(int argc, char *argv[]) {
> write = write2;
> write(STDERR_FILENO, "blah\n", 5);
>
> }
>
> 4DEL03468:~ {543} gcc replace-write.c -o replace-write.exe
> replace-write.c: In function `main':
> replace-write.c:9: error: invalid lvalue in assignment
>
> And yes Eric, I haven't measured yet how much does
> an "if(debug)" last (I probably can do it with gprof?)
> but I'm sure it takes more than nothing, esp. in a loop.
>
> Regards
> Alex
Not that way.
write is _not_ a function pointer you can assign to. As your compiler
says, 'write' is not an lvalue: it designates the function itself, and
you cannot change that.
Use a function pointer, like:
ssize_t (*write_ptr)(int, const void *, size_t);
int main(int argc, char *argv[])
{
write_ptr = write; /* default: syscall is used */
int debug = 0;
/* ... do whatever is needed with the conf ... */
if ( debug ) write_ptr = write2;
write_ptr(STDERR_FILENO, "blah\n", 5);
return 0;
}
A function pointer is an lvalue that holds a function address.
write and write2 are symbols referencing addresses i.e. they _are_ the
addresses :-)
Regarding the debug function, read() is a syscall, hence it _is_ slow
(i.e. context switching etc.).
Your debug function can only be slower if it does I/O, yep I know -
that's the main purpose of debugging :-)
There are other schemes you may be interested in, like ring buffers -
maybe irrelevant for you - anyhow, here is an example:
http://www.visibleworkings.com/trac...ring-buffer.pdf
Or even memory-mapped I/O in some cases: once a shared memory
segment is mapped to a file, writing to the file is like writing to memory.
Ring buffers are much simpler and more common, though.
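To make the ring-buffer idea concrete, here is a rough sketch of an in-memory trace that keeps only the most recent records and does no I/O on the hot path. It is not the design from the linked paper; the names trace_ring, trace_put, trace_count and trace_get and the sizes TRACE_SLOTS and TRACE_BYTES are all made up for illustration.

```c
#include <stddef.h>
#include <string.h>

#define TRACE_SLOTS 8     /* hypothetical capacity: keep the last 8 records */
#define TRACE_BYTES 16    /* hypothetical cap on bytes kept per record */

struct trace_rec {
    size_t len;                       /* bytes stored in data[] */
    unsigned char data[TRACE_BYTES];
};

struct trace_ring {
    struct trace_rec slot[TRACE_SLOTS];
    size_t next;                      /* total records ever written */
};

/* Record up to TRACE_BYTES of buf, overwriting the oldest slot when full. */
static void trace_put(struct trace_ring *r, const void *buf, size_t len)
{
    struct trace_rec *rec = &r->slot[r->next % TRACE_SLOTS];

    rec->len = len < TRACE_BYTES ? len : TRACE_BYTES;
    if (rec->len > 0)
        memcpy(rec->data, buf, rec->len);
    r->next++;
}

/* Number of records currently held (at most TRACE_SLOTS). */
static size_t trace_count(const struct trace_ring *r)
{
    return r->next < TRACE_SLOTS ? r->next : TRACE_SLOTS;
}

/* Fetch the i-th oldest surviving record, 0 <= i < trace_count(r). */
static const struct trace_rec *trace_get(const struct trace_ring *r, size_t i)
{
    size_t first = r->next - trace_count(r);

    return &r->slot[(first + i) % TRACE_SLOTS];
}
```

The daemon would call trace_put() after each read() and only walk the ring with trace_get() when something goes wrong, so the per-call cost is one small memcpy.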
Cheers,
Paulo
| |
| A. Farber 2008-01-28, 7:24 pm |
| On Jan 28, 5:16 pm, ppi <vod...@gmail.com> wrote:
> int main(int argc, char *argv[])
> {
> write_ptr = write; /* default: syscall is used */
>
> int debug = 0;
>
> ... do whatever is needed with the conf ...
>
> if ( debug ) write_ptr = write2;
>
> write_ptr(STDERR_FILENO, "blah\n", 5);
> return 0;
>
> }
Ah ok, thanks!
| |
| David Schwartz 2008-01-28, 7:24 pm |
| On Jan 28, 7:29 am, "A. Farber" <Alexander.Far...@gmail.com> wrote:
> And yes Eric, I haven't measured yet how much does
> an "if(debug)" last (I probably can do it with gprof?)
> but I'm sure it takes more than nothing, esp. in a loop.
You really have no idea what you're talking about and have actively
ignored all the expert advice you've been given. I'll try one more
time just on the off chance that you're actually reading and thinking:
1) Actually, it will take very little in a loop. The branch will
*always* be predicted accurately since it will either always be taken
or never be taken. The cost of the system call will likely swamp the cost
of a predictable branch by about two orders of magnitude.
2) Why would you compare it to "nothing"? The alternative to the "if"
is not nothing but an indirect function call. Indirect function calls
are not free. In the debug case, the indirect function call may
prevent the 'if(debug)' from inlining. It's not clear what effect this
will have. It will almost certainly be barely measurable, but it's not
immediately clear what the net effect will be.
So what you are "sure" about is almost certainly wrong, as I've now
explained twice. There is simply no way you can know off-hand which
will perform better, and there's every reason to expect that there
would be no significant difference and barely an insignificant one.
DS
| |
| David Schwartz 2008-01-28, 7:24 pm |
| On Jan 28, 6:09 am, Eric Sosman <esos...@ieee-dot-org.invalid> wrote:
> By how much? (You've measured it, haven't you?)
In my quick, unscientific tests, by about -2 to 8 cycles per system
call, depending upon the compiler version and flags. This translates
to about a maximum of an almost 2% difference in the absolute worst-
case combination with the tightest imaginable loop, reading a single
byte from /dev/zero repeatedly and throwing the data away.
Assuming you don't read from /dev/zero, read more than a single byte,
and actually do something with the data read, the difference drops to
almost nothing. And, again, with some combinations of compiler
versions and flags, the indirect function call was worse.
I only tested on a P3. With gcc-4.1.1 and "-O3" (but no other
optimization flags), the indirect function call was actually slower
than the predictable branch.
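For anyone who wants to repeat this kind of measurement, a sketch along the following lines should do. It is not the code used for the numbers above, just one plausible way to set up the same worst case: one-byte reads from /dev/zero, once with a never-taken branch and once through a function pointer. The names time_branch, time_indirect, now and read_fn are hypothetical, and the results will vary with CPU, compiler and flags.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

static volatile int debug;    /* stays 0, so the branch is never taken */
static volatile long sink;    /* stands in for the debug dump */

/* volatile pointer so the compiler cannot turn this into a direct call */
static ssize_t (*volatile read_fn)(int, void *, size_t) = read;

/* Monotonic wall-clock time in seconds. */
static double now(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Variant A: direct read() followed by a predictable branch. */
static double time_branch(int fd, long iters)
{
    char c;
    double t0 = now();

    for (long i = 0; i < iters; i++) {
        ssize_t n = read(fd, &c, 1);
        if (debug)
            sink += n;
    }
    return now() - t0;
}

/* Variant B: read() reached through a function pointer, no branch. */
static double time_indirect(int fd, long iters)
{
    char c;
    double t0 = now();

    for (long i = 0; i < iters; i++) {
        ssize_t n = read_fn(fd, &c, 1);
        (void)n;
    }
    return now() - t0;
}
```

Open /dev/zero with open("/dev/zero", O_RDONLY), run each variant for a few million iterations, and repeat the runs several times in alternating order before trusting any difference.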
DS
| |
| Eric Sosman 2008-01-28, 7:24 pm |
| A. Farber wrote:
> [...]
> And yes Eric, I haven't measured yet how much does
> an "if(debug)" last (I probably can do it with gprof?)
> but I'm sure it takes more than nothing, esp. in a loop.
Alex, you must measure or at the very least estimate.
Let's turn it around: Suppose someone came up with a
way to eliminate the test's speed penalty altogether, to
make its cost exactly zero. Unfortunately, the method is
just a wee bit intricate and will take three months of an
expert programmer's time to implement. Is it worth doing?
Unless you have some idea of the size of the potential gain,
you have NO way to decide whether to do the work or not.
In my opinion, you're trying to improve your car's fuel
economy by giving it a coat of wax to make it slipperier
and reduce wind resistance. It might actually improve things,
but not by enough that you'd be able to notice.
--
Eric Sosman
esosman@ieee-dot-org.invalid
| |
| Scott Lurndal 2008-01-28, 7:24 pm |
| "A. Farber" <Alexander.Farber@gmail.com> writes:
>On Jan 28, 3:45 pm, fnegroni <f.e.negr...@googlemail.com> wrote:
>
>Hello fnegroni,
>
>you have replied 3 times in my thread, thank you.
>
>But I don't get your point. In the 1st reply you have
>rephrased my original question and in the following
>just written "look above".
>
>So my question again: is there a way to replace
>a syscall? I'm trying the following, but it doesn't work:
>
>4DEL03468:~ {542} cat replace-write.c
>#include <stdio.h>
>#include <unistd.h>
>
>ssize_t write2(int d, const void *buf, size_t nbytes) {
> fprintf(stderr, "ok, replaced\n");
>}
>
>int main(int argc, char *argv[]) {
> write = write2;
> write(STDERR_FILENO, "blah\n", 5);
>}
>
>4DEL03468:~ {543} gcc replace-write.c -o replace-write.exe
>replace-write.c: In function `main':
>replace-write.c:9: error: invalid lvalue in assignment
>
>And yes Eric, I haven't measured yet how much does
>an "if(debug)" last (I probably can do it with gprof?)
>but I'm sure it takes more than nothing, esp. in a loop.
>
>Regards
>Alex
Traditionally system calls have been defined 'weak' such that
they can be overridden by an application. Thus, you simply define
your own 'read' function, and internally call either "_read" or
perhaps "pread" to get the real system call functionality.
scott
| |
| Scott Lurndal 2008-01-28, 7:24 pm |
| David Schwartz <davids@webmaster.com> writes:
>On Jan 28, 6:09 am, Eric Sosman <esos...@ieee-dot-org.invalid> wrote:
>
>
>
>In my quick, unscientific tests, by about -2 to 8 cycles per system
>call, depending upon the compiler version and flags. This translates
>to about a maximum of an almost 2% difference in the absolute worst-
>case combination with the tightest imaginable loop, reading a single
>byte from /dev/zero repeatedly and throwing the data away.
>
>Assuming you don't read from /dev/zero, read more than a single byte,
>and actually do something with the data read, the difference drops to
>almost nothing. And, again, with some combinations of compiler
>versions and flags, the indirect function call was worse.
>
>I only tested on a P3. With gcc-4.1.1 and "-O3" (but no other
>optimization flags), the indirect function call was actually slower
>than the predictable branch.
And it should be, particularly if it's hitting cold i-cache. A
function call that will at a minimum push a word onto the stack and
branch out of the current locality domain will be significantly
slower than a predicted branch.
scott
| |
| David Schwartz 2008-01-28, 7:24 pm |
| On Jan 28, 12:04 pm, sc...@slp53.sl.home (Scott Lurndal) wrote:
> And it should be, particularly if it's hitting cold i-cache. A
> function call that will at a minimum push a word onto the stack and
> branch out of the current locality domain will be significantly
> slower than a predicted branch.
That was my intuition as well. My guess is that the calls to functions
in dynamically-loaded libraries are always indirect jumps, so I was
actually comparing the cost of an *additional* indirect jump to the
cost of the branch. (Rather than an indirect jump to a direct jump as
I expected.)
But I'm actually still not sure. I'd have to look through the
generated assembly code to be sure. I think there are very few people
in this world who could have a good intuition of how these things
would compare and have justified confidence that their intuition is
correct.
However, one thing reasonable people can and should agree on is that
intuitively, one would expect the difference to be just barely
measurable and definitely not noticeable under any realistic scenario.
My tests suggest that in a realistic case, the difference would be
about one part in 500 and could go either way depending on several
factors (CPU, compiler, optimization flags, tightness of loops, cache
warmness, and so on).
It's definitely not worth anywhere near the effort that's already gone
into investigating it from a purely practical standpoint. However,
gaining a better understanding of what actually happens and seeing if
testing bears out your intuition often helps you hone your skills, so
it's not really wasted.
DS