no child process - waitpid
| Sudheer.Nair@gmail.com 2005-05-14, 4:05 pm |
| Hi,
I have a program that fork/exec's commands. It works properly on
solaris/hp/aix, but on linux sometimes I get a "No Child Process" error
from waitpid. Is there anything that might be causing this?
My machine (Red Hat):
Linux 2.4.21-15.EL #1 Thu Apr 22 00:27:41 EDT 2004 i686 i686 i386
GNU/Linux
code snippet -
int ProcStream::open(const char *argv[])
{
    if (pipe(pfd)) {
        mh.print(9, "pipe() failed : ", strerror(errno));
        return -1;
    }
    pid = fork();
    if (pid < 0) {
        return -1;
    } else if (pid) {
        // Parent.
        // Close the write end of the pipe.
        ::close(pfd[1]);
        // Associate a stream ptr with the pipe descriptor.
#ifdef AIX
        // AIX stuff
#else
        ifsptr = new ifstream(pfd[0]);
#endif
        if (!*ifsptr) {
            mh.print(9, "ifsptr NULL");
            return -1;
        }
        int status = 0;
        // 0 => block until the child finishes its execution.
        if (waitpid(pid, &status, 0) < 0) {
            mh.print(9, "waitpid failed: ",
                     strerror(errno));
            return -1;
        }
        if (WIFEXITED(status)) {
            // Status was reported for the child and
            // the child terminated normally.
            int exit_status = WEXITSTATUS(status);
            // Exit statuses 0 and 11 are ok.
            if (exit_status && exit_status != 11) {
                return -1;
            }
        } else {
            return -1;
        }
    } else {
        // Child.
        // Close the read end of the pipe.
        ::close(pfd[0]);
        // Associate stdout of the child with the write end
        // of the pipe. The parent can then read from its
        // read end whatever the child writes to its stdout.
        dup2(pfd[1], STDOUT_FILENO);
        // Redirect stderr to the CLIerror log file.
        if (fd == -1) {
            _exit(127);
        }
        dup2(fd, STDERR_FILENO);
        execvp(argv[0], const_cast<char *const *>(argv));
        // If we reach here, exec failed.
        _exit(127);
    }
    return 0;
}
| |
| Patrick Plattes 2005-05-14, 4:05 pm |
| Sudheer.Nair@gmail.com wrote:
> Hi,
>
> I have a program that fork/exec's commands. It works properly on
> solaris/hp/aix, but on linux sometimes I get a "No Child Process" error
> from waitpid. Anything that might be causing this ?
I have two different ideas.
1.) The child exits before you can call waitpid. Try to slow down the
child for testing ( sleep(1) ).
2.) Maybe the action for SIGCHLD is set to SIG_IGN (see man
waitpid for more details).
good luck,
patrick
| |
| Jens.Toerring@physik.fu-berlin.de 2005-05-14, 4:05 pm |
| Patrick Plattes <newsgroup@erdbeere.net> wrote:
> Sudheer.Nair@gmail.com wrote:
> I have two different ideas.
> 1.) The child exits before you can call waitpid. Try to slow down the
> child for testing ( sleep(1) ).
That can't be the problem since then the child's exit status would
still be around to be fetched by waitpid().
> 2.) Maybe the action for SIGCHLD is set to SIG_IGN (see man
> waitpid for more details).
But then why is it working on the other systems? The only thing I can
think of at the moment is that the OP is using threads and some other
thread already reaped the exit status of that child (or messed up the
'pid' variable;-) and it worked on the other systems only by chance...
Regards, Jens
--
\ Jens Thoms Toerring ___ Jens.Toerring@physik.fu-berlin.de
\__________________________ http://www.toerring.de
| |
| Fletcher Glenn 2005-05-14, 4:05 pm |
| Jens.Toerring@physik.fu-berlin.de wrote:
> Patrick Plattes <newsgroup@erdbeere.net> wrote:
>
> That can't be the problem since then the child's exit status would
> still be around to be fetched by waitpid().
>
> But then why is it working on the other systems? The only thing I can
> think of at the moment is that the OP is using threads and some other
> thread already reaped the exit status of that child (or messed up the
> 'pid' variable;-) and it worked on the other systems only by chance...
>
> Regards, Jens
It's possible that the parent code is executing before the child process
is established. Try adding a one-second sleep (to yield the processor)
before executing waitpid().
--
Fletcher Glenn
| |
| Jens.Toerring@physik.fu-berlin.de 2005-05-14, 4:05 pm |
| Fletcher Glenn <fletcher@removethisfoglight.com> wrote:
> It's possible that the parent code is executing before the child process
> is established. Try adding a one-second sleep (to yield the processor)
> before executing waitpid().
I never thought about this possibility. But is it really possible?
Once fork() returns I would have assumed that the new child already
exists - or where would fork() take the PID of the child from?
Regards, Jens
--
\ Jens Thoms Toerring ___ Jens.Toerring@physik.fu-berlin.de
\__________________________ http://www.toerring.de
| |
| Gordon Burditt 2005-05-14, 4:05 pm |
| >>>1.) The child exits before you can call waitpid. Try to slow down the
This shouldn't matter. waitpid() is still supposed to work if the
child has already terminated. When waitpid() is called from a
SIGCHLD handler, at least one child should already be terminated
before waitpid() is called. When waitpid() is called from a polling
loop with WNOHANG, any children it picks up should be already
terminated, as it won't block waiting for one.
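The first point can be tried out directly. This is only a sketch under stated assumptions: reap_after_exit() is a made-up name, the function forces the default SIGCHLD disposition itself, and the one-second sleep is merely a crude way to make sure the child is already a zombie before waitpid() runs.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits at once with `code`,
 * wait long enough for it to have terminated, and only then reap it.
 * Returns the child's exit status, or -1 on any failure. */
int reap_after_exit(int code)
{
    int status = 0;
    pid_t pid;

    assert(code >= 0 && code <= 255);
    signal(SIGCHLD, SIG_DFL);   /* assumption: default disposition */
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);            /* child terminates immediately */
    sleep(1);                   /* by now the child should be a zombie */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With the default disposition the late waitpid() is still expected to return the child's PID and status, which is the behaviour described above.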
>
>It's possible that the parent code is executing before the child process
>is established. Try adding a one-second sleep (to yield the processor)
>before executing waitpid().
This shouldn't be necessary. If it were, Linux would break
(intermittently) on just about any code that does a fork/exec/wait,
and this would be noticed immediately, probably on boot. Linux
works much, much better than that.
Gordon L. Burditt
| |
| Barry Margolin 2005-05-14, 4:05 pm |
| In article <3ejndqF3gsvfU1@uni-berlin.de>,
Jens.Toerring@physik.fu-berlin.de wrote:
> Patrick Plattes <newsgroup@erdbeere.net> wrote:
>
> But then why is it working on the other systems? The only thing I can
> think of at the moment is that the OP is using threads and some other
> thread already reaped the exit status of that child (or messed up the
> 'pid' variable;-) and it worked on the other systems only by chance...
Different versions of Unix treat SIG_IGN for SIGCHLD differently. On
some systems it causes the child to be reaped automatically. On others
it just prevents the SIGCHLD signal from being delivered to the parent,
but the parent is still required to call a wait() function to reap the
child.
So I think that Linux is the former variety, and waitpid() will fail if
the child had already exited by the time the parent calls waitpid().
--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
| |
| Jens.Toerring@physik.fu-berlin.de 2005-05-14, 4:05 pm |
| Barry Margolin <barmar@alum.mit.edu> wrote:
> In article <3ejndqF3gsvfU1@uni-berlin.de>,
> Jens.Toerring@physik.fu-berlin.de wrote:
> Different versions of Unix treat SIG_IGN for SIGCHLD differently. On
> some systems it causes the child to be reaped automatically. On others
> it just prevents the SIGCHLD signal from being delivered to the parent,
> but the parent is still required to call a wait() function to reap the
> child.
> So I think that Linux is the former variety, and waitpid() will fail if
> the child had already exited by the time the parent calls waitpid().
Good point! I played around a bit with this and got some interesting
(and a bit disturbing) results. Here's the test program:
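#include <stdio.h>
#include <unistd.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>

int main( void )
{
    pid_t pid;

    /* Ignore SIGCHLD; remove this line to test the default disposition. */
    signal( SIGCHLD, SIG_IGN );
    if ( ( pid = fork( ) ) == 0 ) {
        /* Child: yield the time slice, then exit immediately. */
        usleep( 0 );
        _exit( 0 );
    }
    /* Parent: prints the child's PID on success, -1 on failure. */
    printf( "waitpid returned %d\n", ( int ) waitpid( pid, NULL, 0 ) );
    return 0;
}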
I played around with removing the calls of signal() and/or
usleep() (usleep() is used since even when called with 0 it will
yield its time slice to the other process) and found that when
the signal disposition of SIGCHLD is set to SIG_IGN and the
usleep() call is removed then waitpid() won't find a child. In
all other cases waitpid() returned successfully with the child's
PID. So it looks as if under Linux 2.4.21 (which the OP also is
using) the child process gets reaped automatically if a) the signal
disposition for SIGCHLD is set to SIG_IGN and b) the child exits
before the parent had a chance to run (in all cases the child
process was scheduled first). But if the parent has got a time
slice after the fork() call before the child dies the child will
stay around as a zombie. On the other hand under Linux 2.4.26
(and 2.6.10) this seems to have changed - here waitpid() seems
always to return successfully whatever the signal disposition and
whether or not the child process yields its time slice.
So it's possible the problem the OP has might be related to some
inconsistency in the behaviour of certain versions of Linux with
SIG_IGN set for SIGCHLD and cases where the child process exits
immediately...
Regards, Jens
--
\ Jens Thoms Toerring ___ Jens.Toerring@physik.fu-berlin.de
\__________________________ http://www.toerring.de
| |
| phil_gg04@treefic.com 2005-05-15, 3:57 am |
| > So it looks as if under Linux 2.4.21 (which the OP also is
> using) the child process gets reaped automatically if a) the signal
> disposition for SIGCHLD is set to SIG_IGN and b) the child exits
> before the parent had a chance to run (in all cases the child
> process was scheduled first). But if the parent has got a time
> slice after the fork() call before the child dies the child will
> stay around as a zombie.
The Linux waitpid() man page says the following, which I think is
essentially what you've described (and sensible, I think):
    if a wait() or waitpid() call is made while SIGCHLD is being
    ignored, the call behaves just as though SIGCHLD were not being
    ignored, that is, the call blocks until the next child terminates
    and then returns the PID and status of that child.
We really need the OP to tell us what he had done with SIGCHLD...
--Phil.
| |
| Sudheer.Nair@gmail.com 2005-05-16, 8:57 am |
| Thanks for your inputs. I had earlier tried putting a sleep(1) and it
did work. But I was under the impression that waitpid can be
successfully called after the child has exited and that I was not
seeing the problem occurring just by chance :)
| |
| phil_gg04@treefic.com 2005-05-16, 3:59 pm |
| > I was under the impression that waitpid can be
> successfully called after the child has exited
It can, you are right.
The exception is if the parent process is ignoring SIGCHLD, which is
not the default behaviour.
If you are not explicitly ignoring SIGCHLD (e.g. with
signal(SIGCHLD, SIG_IGN)) then the behaviour that you described is still
mysterious.
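While the cause is still unknown, the caller can at least tell "No child processes" (ECHILD) apart from other waitpid() failures. A minimal sketch, assuming a POSIX system; wait_child() and the WAIT_* constants are hypothetical names introduced here for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical result codes for wait_child() below. */
enum { WAIT_OK = 0, WAIT_NO_CHILD = 1, WAIT_ERROR = -1 };

/* Hypothetical wrapper: wait for `pid`, retrying when interrupted by a
 * signal. Returns WAIT_OK if the child was reaped and *status is valid,
 * WAIT_NO_CHILD if waitpid() failed with ECHILD (the child was already
 * reaped, automatically or by someone else), WAIT_ERROR otherwise. */
int wait_child(pid_t pid, int *status)
{
    pid_t r;

    assert(status != NULL);
    do {
        r = waitpid(pid, status, 0);
    } while (r < 0 && errno == EINTR);

    if (r == pid)
        return WAIT_OK;
    if (r < 0 && errno == ECHILD)
        return WAIT_NO_CHILD;
    return WAIT_ERROR;
}
```

Logging the WAIT_NO_CHILD case separately would show how often the child disappears before the parent gets to it, which might help narrow down whether SIGCHLD handling or another thread is responsible.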
--Phil.