select rtns ready;read returns 0
| michael potter 2004-11-16, 3:56 am |
| I am seeing a condition in my code that causes a small looping
condition.
select returns that a pipe is ready to read, but when the pipe is read
there is no data to be read. The program returns to the select which
immediately returns that there is data to be read ...
I have added logic to detect this condition and sleep for one second.
This sleep seems to allow the data that is supposed to be in the pipe
to "catch up" with whatever select is looking at to determine that
there is data in the pipe. Adding the one-second sleep probably frees
up the CPU so it can deliver the data.
I am getting ready to add some more logic to gather more information
about what I am seeing. It does not happen very often so it is a
difficult problem to track. The one second sleep has turned it into a
low priority problem too.
Can someone in the group give me some insight on what I am seeing?
Not surprisingly, it seems to happen when the CPU is busy.
I have seen this on AIX 5.x and Solaris 9.x.
--
Michael Potter
pottmi@gmail.com
| |
| Barry Margolin 2004-11-16, 3:56 am |
| In article <2379dacc.0411151800.6070435e@posting.google.com>,
pottmi@gmail.com (michael potter) wrote:
> I am seeing a condition in my code that causes a small looping
> condition.
>
> select returns that a pipe is ready to read, but when the pipe is read
> there is no data to be read. The program returns to the select which
> immediately returns that there is data to be read ...
When read() returns 0, it means that you have reached EOF. In the case
of a pipe, it means that the writing process has closed its end of the
pipe. There's nothing more to read, so you should exit the loop.
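A minimal sketch of a select/read loop that treats a zero return from read() as EOF and stops, rather than going back to select(). The helper name drain_until_eof is hypothetical and is not taken from the original poster's code, which was never shown in the thread.

```c
#include <assert.h>
#include <errno.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: wait on fd with select() and read until EOF.
 * Returns the total number of bytes read, or -1 on error. */
long drain_until_eof(int fd)
{
    char buf[4096];
    long total = 0;

    /* FD_SET is only defined for descriptors below FD_SETSIZE. */
    assert(fd >= 0 && fd < FD_SETSIZE);

    for (;;) {
        fd_set rfds;
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);

        /* Block until fd is readable: data is present or EOF was reached. */
        if (select(fd + 1, &rfds, NULL, NULL, NULL) < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }

        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) {
            total += n;            /* got data, keep going */
        } else if (n == 0) {
            break;                 /* EOF: every writer has closed the pipe */
        } else if (errno != EINTR && errno != EAGAIN) {
            return -1;             /* real error */
        }
    }
    return total;
}
```

Without the `n == 0` branch, the loop would return to select(), which reports the descriptor readable again at once, and the program would spin.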
--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
| |
| michael potter 2004-11-16, 8:56 pm |
| read will return 0 if O_NDELAY is set, but as far as i can tell it is
not set, so...
Let's assume that read is not returning zero. I would like to explain
why select and read are in a tight loop. It could be explained if the
buffer on the pipe is small so it takes many reads to get all the
data.
What is the buffer size on an unnamed pipe?
Can I increase it?
My goal is to increase efficiency by reducing the number of reads it
takes to get all the data from the pipe.
Thanks for the "listening" and responding.
| |
| Alex Fraser 2004-11-17, 3:56 am |
| "michael potter" <pottmi@gmail.com> wrote in message
news:2379dacc.0411161544.49a1b972@posting.google.com...
> read will return 0 if O_NDELAY is set, but as far as i can tell it is
> not set, so...
A read() on a pipe will return zero when all data have been read and there
are no writers, irrespective of O_NDELAY (or O_NONBLOCK). If you select() on
a pipe in this condition, it will be marked as ready for reading.
> Lets assume that read is not returning zero. [...]
Is this a hypothetical situation?
> What is the buffer size on an unnamed pipe?
Implementation defined, but at least 512 bytes, and probably more these days
in most cases.
> Can I increase it?
I don't think so, at least not in a portable manner. Note that bigger
buffers don't always increase throughput.
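Since the size is implementation defined, one way to find out what a given system does is to measure it. The sketch below fills a fresh pipe through a non-blocking write end and counts the bytes accepted before the kernel refuses more. The function name measure_pipe_capacity and the 512-byte chunk size are assumptions made for illustration. Note that the 512-byte minimum in POSIX applies to PIPE_BUF, the atomic write size, which is not necessarily the total capacity measured here.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: estimate how many bytes a fresh pipe will buffer
 * by writing to its non-blocking write end until the kernel refuses.
 * Returns the byte count, or -1 on error. */
long measure_pipe_capacity(void)
{
    int p[2];
    char chunk[512] = {0};   /* 512 never exceeds PIPE_BUF, so each write is all-or-nothing */
    long total = 0;

    if (pipe(p) < 0)
        return -1;

    int flags = fcntl(p[1], F_GETFL);
    if (flags < 0 || fcntl(p[1], F_SETFL, flags | O_NONBLOCK) < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    for (;;) {
        ssize_t n = write(p[1], chunk, sizeof chunk);
        if (n > 0) {
            total += n;
        } else if (n < 0 && errno == EINTR) {
            continue;                /* interrupted, try again */
        } else if (n < 0 && errno == EAGAIN) {
            break;                   /* pipe is full */
        } else {
            total = -1;              /* unexpected result */
            break;
        }
    }

    close(p[0]);
    close(p[1]);
    return total;
}
```

The result is specific to the system it runs on and says nothing about whether a larger buffer would improve throughput.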
Alex
| |
| Barry Margolin 2004-11-17, 3:56 am |
| In article <2379dacc.0411161544.49a1b972@posting.google.com>,
pottmi@gmail.com (michael potter) wrote:
> Thanks for the "listening" and responding.
Then why didn't you actually read my response? I told you why read
returns 0 -- you've reached EOF.
--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
| |
| michael potter 2004-11-17, 3:59 pm |
| "Alex Fraser" <me@privacy.net> wrote in message news:<2vvnh6F2pufviU1@uni-berlin.de>...
> "michael potter" <pottmi@gmail.com> wrote in message
> news:2379dacc.0411161544.49a1b972@posting.google.com...
>
> A read() on a pipe will return zero when all data have been read and there
> are no writers, irrespective of O_NDELAY (or O_NONBLOCK). If you select() on
> a pipe in this condition, it will be marked as ready for reading.
I am not saying what you wrote was wrong.
Here is an edited fragment from the read man page on AIX:
------------
When attempting to read from an empty pipe:
If some process has the pipe open for writing:
If O_NDELAY is set, the read subroutine returns a value of 0.
------------
> Is this a hypothetical situation?
No, I should have been clear: assume there is a writer and there is
data and read is not returning zero...
> Implementation defined, but at least 512 bytes, and probably more these days
> in most cases.
>
> I don't think so, at least not in a portable manner. Note that bigger
> buffers don't always increase throughput.
>
> Alex
I have gotten what I want out of this thread: some idea of what to
look for while trying to explain the tight loop that I am seeing. A
large amount of data with a small buffer might explain it.
The way I detect loops in my select is to check the time every N
returns from select. If the time does not increase after N returns,
then I report a potential loop in the log file for my application.
This method has its faults, but it is low overhead.
Thanks for the discussion.
| |
| Michael Fuhr 2004-11-17, 8:57 pm |
| pottmi@gmail.com (michael potter) writes:
> here is an edited fragment from the read man page on aix:
Here's a more complete excerpt:
When attempting to read from an empty pipe (first-in-first-out (FIFO)):
* If no process has the pipe open for writing, the read returns 0
to indicate end-of-file.
* If some process has the pipe open for writing:
o If O_NDELAY and O_NONBLOCK are clear (the default), the read
blocks until some data is written or the pipe is closed by all
processes that had opened the pipe for writing.
o If O_NDELAY is set, the read subroutine returns a value of 0.
o If O_NONBLOCK is set, the read subroutine returns a value of -1
and sets the global variable errno to EAGAIN.
AIX appears to treat O_NDELAY and O_NONBLOCK differently; on many
systems they behave the same way and one may be #define'd as the
other. People using other systems are probably accustomed to the
behavior AIX uses for O_NONBLOCK. Is there a reason you're using
O_NDELAY instead of O_NONBLOCK?
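To illustrate the O_NONBLOCK branch of the excerpt, here is a rough sketch that sets O_NONBLOCK with fcntl() and classifies the result of a single read(). The names set_nonblocking, classify_read, and the read_result enum are hypothetical. The O_NDELAY behavior is not shown, because with it "empty" and "EOF" both come back as 0.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

enum read_result { READ_DATA, READ_EMPTY, READ_EOF, READ_ERROR };

/* Hypothetical helper: add O_NONBLOCK to fd's status flags.
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Hypothetical helper: perform one read() and say what it meant.
 * The raw return value of read() is stored in *nread. */
enum read_result classify_read(int fd, char *buf, size_t len, ssize_t *nread)
{
    assert(buf != NULL && nread != NULL && len > 0);

    ssize_t n = read(fd, buf, len);
    *nread = n;

    if (n > 0)
        return READ_DATA;      /* bytes were available */
    if (n == 0)
        return READ_EOF;       /* no data and no writers */
    if (errno == EAGAIN)
        return READ_EMPTY;     /* no data, but a writer still has the pipe open */
    return READ_ERROR;
}
```

With O_NONBLOCK, the "writer exists but nothing to read yet" case is reported as -1 with EAGAIN, so a return of 0 can only mean end-of-file.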
I don't know how AIX's select() behaves in the face of these settings,
but are you sure you're using it correctly? Could we see an example
of your code?
--
Michael Fuhr
http://www.fuhr.org/~mfuhr/
| |
| Alex Fraser 2004-11-17, 8:57 pm |
| "michael potter" <pottmi@gmail.com> wrote in message
news:2379dacc.0411170912.4142d2cd@posting.google.com...
> "Alex Fraser" <me@privacy.net> wrote in message
news:<2vvnh6F2pufviU1@uni-berlin.de>...
[snip]
>
> no, i should have been clear: assume there is a writer and there is
> data and read is not returning zero...
Why assume something that can easily be established as a fact (or not)?
[snip]
> The way I detect loops in my select is to check the time every N
> returns from select. If the time does not increase after N returns,
> then I report a potential loop in the log file for my application.
> this method has its faults, but it is low overhead.
Can't you reset the "select() returns" counter every time you successfully
read/write some data on any descriptor?
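A rough sketch of that idea, assuming a small counter structure that the caller updates after every return from select(). The names spin_detector, spin_init, and spin_note_wakeup are hypothetical, and this counts wakeups rather than checking the clock as the original poster's detector does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical detector: counts consecutive select() returns that moved no data. */
struct spin_detector {
    unsigned idle_wakeups;   /* wakeups since the last successful read/write */
    unsigned limit;          /* report a potential loop at this many */
};

void spin_init(struct spin_detector *d, unsigned limit)
{
    assert(d != NULL);
    d->idle_wakeups = 0;
    d->limit = limit;
}

/* Call once per return from select(). made_progress is true if any bytes
 * were read or written on any descriptor during this pass.
 * Returns true when `limit` consecutive wakeups made no progress. */
bool spin_note_wakeup(struct spin_detector *d, bool made_progress)
{
    assert(d != NULL);

    if (made_progress) {
        d->idle_wakeups = 0;         /* real work happened: not a spin */
        return false;
    }
    if (d->idle_wakeups < d->limit)
        d->idle_wakeups++;           /* saturate instead of overflowing */
    return d->idle_wakeups >= d->limit;
}
```

Because the counter resets whenever data moves, a burst of many small reads on a busy pipe is not mistaken for a loop; only repeated wakeups that transfer nothing are reported.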
Alex
| |
| Alex Fraser 2004-11-17, 8:57 pm |
| "Michael Fuhr" <mfuhr@fuhr.org> wrote in message
news:419bb225$1_2@omega.dimensional.com...
> pottmi@gmail.com (michael potter) writes:
>
>
> Here's a more complete excerpt:
>
> When attempting to read from an empty pipe (first-in-first-out (FIFO)):
>
> * If no process has the pipe open for writing, the read returns 0
> to indicate end-of-file.
>
> * If some process has the pipe open for writing:
> o If O_NDELAY and O_NONBLOCK are clear (the default), the read
> blocks until some data is written or the pipe is closed by all
> processes that had opened the pipe for writing.
> o If O_NDELAY is set, the read subroutine returns a value of 0.
> o If O_NONBLOCK is set, the read subroutine returns a value of -1
> and sets the global variable errno to EAGAIN.
So for a pipe with O_NDELAY set, it's impossible to tell the difference
between "no writer" and "one or more writers but no data at this time"? That
doesn't seem terribly useful(!). What am I missing?
Alex
| |
| Barry Margolin 2004-11-18, 3:56 am |
| In article <419bb225$1_2@omega.dimensional.com>,
mfuhr@fuhr.org (Michael Fuhr) wrote:
> pottmi@gmail.com (michael potter) writes:
>
>
> Here's a more complete excerpt:
>
> When attempting to read from an empty pipe (first-in-first-out (FIFO)):
>
> * If no process has the pipe open for writing, the read returns 0
> to indicate end-of-file.
>
> * If some process has the pipe open for writing:
> o If O_NDELAY and O_NONBLOCK are clear (the default), the read
> blocks until some data is written or the pipe is closed by all
> processes that had opened the pipe for writing.
> o If O_NDELAY is set, the read subroutine returns a value of 0.
> o If O_NONBLOCK is set, the read subroutine returns a value of -1
> and sets the global variable errno to EAGAIN.
But since he's using select() to wait for data to be available in the
pipe, these last two cases shouldn't be relevant.
--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
| |
| Michael Fuhr 2004-11-18, 3:56 am |
| Barry Margolin <barmar@alum.mit.edu> writes:
> In article <419bb225$1_2@omega.dimensional.com>,
> mfuhr@fuhr.org (Michael Fuhr) wrote:
>
> But since he's using select() to wait for data to be available in the
> pipe, these last two cases shouldn't be relevant.
True, assuming that he's using select() correctly and that AIX's
select() behaves as on other systems with respect to marking a
descriptor as readable. I requested that he post some code so we
can check out the first assumption before challenging the second.
--
Michael Fuhr
http://www.fuhr.org/~mfuhr/
| |
| Geoff Clare 2004-11-18, 3:59 pm |
| "Alex Fraser" <me@privacy.net> wrote, on Wed, 17 Nov 2004:
> So for a pipe with O_NDELAY set, it's impossible to tell the difference
> between "no writer" and "one or more writers but no data at this time"? That
> doesn't seem terribly useful(!). What am I missing?
You're not missing anything. This misfeature of O_NDELAY is exactly
the reason for it being withdrawn from the UNIX standards in 1988 (XPG3)
and replaced with O_NONBLOCK.
--
Geoff Clare <nospam@gclare.org.uk>
| |
| michael potter 2004-11-18, 8:58 pm |
| Guys,
Thanks for all your comments. I think the select/read
combination is working as documented. I think I was fooled by an
unusually large amount of data being written to the pipe and the weak
nature of my loop detection algorithm.
As for posting my code: I have a wrapper around select so that it is
easier to port to other systems. Are you interested in seeing that
wrapper?
I would not mind getting feedback on it. I am a self-taught C/Unix
programmer (thank you, K&R and Richard Stevens) and would not mind a
critique.
Thanks,
potter.
| |
| Michael Fuhr 2004-11-19, 3:56 am |
| pottmi@gmail.com (michael potter) writes:
> Thanks for all your comments. I think the select/read
> combination is working as documented. I think I was fooled by an
> unusually large amount of data being written to the pipe and the weak
> nature of my loop detection algorithm.
I wonder if the writer is closing the pipe, which, as Barry has
pointed out, would cause read() to return 0. Although O_NDELAY
causes read() to return 0 if a writer has the pipe open but there's
no data in the pipe, if you're using select() correctly then you
shouldn't encounter this case. Select() should mark the descriptor
as readable only if data is present or upon detecting EOF, so if
read() returns 0 then you've reached EOF; for a FIFO this means
that all writers have closed. With other file types you'd typically
close the descriptor because no more data will arrive, but with a
FIFO you can keep reading and eventually get new data when another
process opens the FIFO for writing. However, between the EOF and
the next writer's open(), select() will report that the descriptor
is readable and read() will return 0, which is what might be causing
your tight loop. I'm basing this analysis on experiments I performed
on Solaris 9.
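The EOF state described above can be reproduced with a short sketch. It uses an anonymous pipe rather than a FIFO for simplicity, which is an assumption; the helper name readable_now is hypothetical. It polls with a zero timeout so that it never blocks.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if select() reports fd readable right now,
 * 0 if it does not, and -1 on error. Never blocks (zero timeout). */
int readable_now(int fd)
{
    fd_set rfds;
    struct timeval tv = { 0, 0 };

    /* FD_SET is only defined for descriptors below FD_SETSIZE. */
    assert(fd >= 0 && fd < FD_SETSIZE);

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);

    if (select(fd + 1, &rfds, NULL, NULL, &tv) < 0)
        return -1;
    return FD_ISSET(fd, &rfds) ? 1 : 0;
}
```

On an empty pipe with the write end still open, readable_now() should return 0. Once the write end is closed, it should return 1 every time it is called, while read() keeps returning 0, which is the tight loop in question.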
I don't know of a reliable way to avoid the tight loop. You could
close and immediately re-open the FIFO upon detecting EOF, which
should cause select() to block until data is present or you reach
EOF again. Unfortunately this introduces a race condition: if a
writer opens the FIFO before the reader's close() then you could
lose data.
Can anybody find fault with my analysis or come up with a way to
avoid the tight loop that occurs after EOF and before the next
writer opens the FIFO? If not then I wonder if introducing a sleep()
might indeed be the best workaround.
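One approach that was not raised in this thread, offered here only as a sketch: the reader can hold its own write descriptor on the FIFO, so the number of writers never drops to zero and the EOF state is never reached. The function names open_fifo_keepalive and fifo_readable_now are hypothetical, and the behavior should be checked on the target system (the thread concerns AIX and Solaris, where this sketch has not been verified).

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open the FIFO at path for reading and also hold a
 * write descriptor on it, so the writer count never reaches zero.
 * Creates the FIFO if it does not exist. Returns the read descriptor
 * (left in O_NONBLOCK mode) and stores the keep-alive write descriptor
 * in *keepalive_wfd. Returns -1 on error. */
int open_fifo_keepalive(const char *path, int *keepalive_wfd)
{
    assert(path != NULL && keepalive_wfd != NULL);

    if (mkfifo(path, 0600) < 0 && errno != EEXIST)
        return -1;

    /* O_NONBLOCK lets this open() return without waiting for a writer. */
    int rfd = open(path, O_RDONLY | O_NONBLOCK);
    if (rfd < 0)
        return -1;

    /* A reader (rfd) now exists, so this open() does not block. */
    int wfd = open(path, O_WRONLY);
    if (wfd < 0) {
        close(rfd);
        return -1;
    }

    *keepalive_wfd = wfd;   /* never written to; it only keeps EOF away */
    return rfd;
}

/* Returns 1 if select() reports fd readable right now, 0 if not, -1 on error. */
int fifo_readable_now(int fd)
{
    fd_set rfds;
    struct timeval tv = { 0, 0 };

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    if (select(fd + 1, &rfds, NULL, NULL, &tv) < 0)
        return -1;
    return FD_ISSET(fd, &rfds) ? 1 : 0;
}
```

The trade-off is that the reader can no longer detect when the last real writer goes away, since read() will not return 0 while the keep-alive descriptor is open.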
--
Michael Fuhr
http://www.fuhr.org/~mfuhr/