Mysterious CPU consumption
phil_gg04@treefic.com

2005-05-30, 8:57 pm

Dear Unix Experts,

I have a bug somewhere that causes my code to consume all available CPU
time, but it does it in a bizarre way. When I look at the output of
"top", I see that overall CPU time is approximately 30% user and 70%
system, and the load average is just over 1. But none of the listed
processes shows any significant %CPU: no process "owns" the excessive
processing time.

Of course I know which process is responsible for it - it's the code
I'm currently hacking, and killing it does return things to normal. I
suspect that the problem is a loop that is calling select() over and
over again or something like that, and I'll eventually track it down.
But the fact that this CPU time is not being attributed to it is
mysterious, and I wonder if it is giving me a clue about what is going
wrong. Has anyone else ever seen this behaviour?

This is with Linux, 2.6.3 kernel.

Cheers, Phil.

Måns Rullgård

2005-05-30, 8:57 pm

phil_gg04@treefic.com writes:

> Dear Unix Experts,
>
> I have a bug somewhere that causes my code to consume all available CPU
> time, but it does it in a bizarre way. When I look at the output of
> "top", I see that overall CPU time is approximately 30% user and 70%
> system, and the load average is just over 1.


Sounds like it's stuck in a loop doing system calls repeatedly.
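
For illustration only, here is one way such a loop can come about:
select() called with a zero timeout on a descriptor that never becomes
ready. This is a hypothetical reproduction, not Phil's code; the name
spin_on_select and the 0.2 second duration are made up for the example.

```python
import os
import select
import time


def spin_on_select(duration_s=0.2):
    """Call select() with a zero timeout in a tight loop for duration_s seconds.

    Returns (calls, user_seconds, system_seconds) consumed by this process.
    """
    # Nothing is ever written to the pipe, so the read end is never readable.
    read_end, write_end = os.pipe()
    try:
        before = os.times()
        deadline = time.monotonic() + duration_s
        calls = 0
        while time.monotonic() < deadline:
            # Timeout 0 means "poll and return at once": the loop never sleeps.
            ready, _, _ = select.select([read_end], [], [], 0)
            calls += 1
            if ready:
                break
        after = os.times()
    finally:
        os.close(read_end)
        os.close(write_end)
    return calls, after.user - before.user, after.system - before.system


if __name__ == "__main__":
    calls, user_s, system_s = spin_on_select()
    print(f"{calls} select() calls, user {user_s:.3f}s, system {system_s:.3f}s")
```

Because every iteration enters the kernel, a loop like this tends to
show up as a mix of user and system time rather than as pure user time.
Passing a real timeout (or None, to block) lets the process sleep
instead.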

> But none of the listed processes shows any significant %CPU: no
> process "owns" the excessive processing time.
>
> Of course I know which process is responsible for it - it's the code
> I'm currently hacking, and killing it does return things to normal. I
> suspect that the problem is a loop that is calling select() over and
> over again or something like that, and I'll eventually track it down.
> But the fact that this CPU time is not being attributed to it is
> mysterious, and I wonder if it is giving me a clue about what is going
> wrong. Has anyone else ever seen this behaviour?
>
> This is with Linux, 2.6.3 kernel.


The behavior you describe is normal with old versions of "top" and an
NPTL-enabled kernel/libc. What happens is that a thread other than
the first is using the CPU time, and the older tools don't know where
to look to find it. Upgrade your procps package and/or try "ps ux -T".
That command will list all the threads of each process, and should
show the CPU usage correctly.
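
The per-thread figures can also be read directly from /proc. The
following is a minimal sketch, assuming a Linux kernel that exposes
/proc/<pid>/task/<tid>/stat; the function names parse_thread_stat and
thread_cpu_seconds are invented for the example.

```python
import os


def parse_thread_stat(stat_line):
    """Return (name, utime_ticks, stime_ticks) from one task stat line."""
    # The name is wrapped in parentheses and may itself contain spaces or ')',
    # so cut between the first '(' and the last ')'.
    name = stat_line[stat_line.index("(") + 1:stat_line.rindex(")")]
    fields = stat_line[stat_line.rindex(")") + 1:].split()
    # fields[0] is field 3 (state); utime and stime are fields 14 and 15.
    return name, int(fields[11]), int(fields[12])


def thread_cpu_seconds(pid):
    """Map each thread ID of pid to (name, user_seconds, system_seconds)."""
    ticks_per_second = os.sysconf("SC_CLK_TCK")
    result = {}
    for tid in os.listdir(f"/proc/{pid}/task"):
        try:
            with open(f"/proc/{pid}/task/{tid}/stat") as f:
                name, utime, stime = parse_thread_stat(f.read())
        except OSError:
            continue  # the thread exited between listdir() and open()
        result[int(tid)] = (name, utime / ticks_per_second, stime / ticks_per_second)
    return result


if __name__ == "__main__":
    for tid, (name, user_s, system_s) in sorted(thread_cpu_seconds(os.getpid()).items()):
        print(f"{tid:>8} {name:<16} user {user_s:.2f}s system {system_s:.2f}s")
```

Calling thread_cpu_seconds() twice a few seconds apart and comparing
the results should show which thread is accumulating the time.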

To find the actual bug, try using strace to pinpoint the exact system
call. If you're lucky, it's one that's called from few places in your
code.
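
If the trace is long, tallying the calls makes the culprit easier to
spot. A small sketch, assuming the trace was captured with something
like "strace -f -p <pid>" and saved to a file; count_syscalls is a name
made up here, and strace's own -c option gives a similar summary.

```python
import re
from collections import Counter

# Matches "select(4, ..." as well as the per-thread prefixes strace adds
# with -f: "[pid  1234] select(4, ..." on stderr, "1234 select(4, ..." with -o.
SYSCALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?(\w+)\(")


def count_syscalls(lines):
    """Count how often each system call name starts a line of strace output."""
    counts = Counter()
    for line in lines:
        match = SYSCALL_RE.match(line)
        if match:  # skips signal lines and "<... resumed>" continuations
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    import sys

    with open(sys.argv[1]) as trace:
        for name, n in count_syscalls(trace).most_common(5):
            print(f"{n:>8} {name}")
```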

--
Måns Rullgård
mru@inprovide.com

phil_gg04@treefic.com

2005-05-30, 8:57 pm

>> overall CPU time is approximately 30% user and 70% system
> The behavior you describe is normal with old versions of "top" and an
> NPTL enabled kernel/libc. What happens is that another thread than
> the first is using the CPU time, and the tools don't know where to
> look to find out.


Ah, an instrumentation failure! Thanks Måns, I'll upgrade and see if
that reports it properly. This is indeed a multi-threaded application.

Cheers, Phil.

Kurtis D. Rader

2005-06-01, 3:57 am

On Mon, 30 May 2005 06:49:33 -0700, phil_gg04 wrote:

> I have a bug somewhere that causes my code to consume all available CPU
> time, but it does it in a bizarre way. When I look at the output of
> "top", I see that overall CPU time is approximately 30% user and 70%
> system, and the load average is just over 1. But none of the listed
> processes shows any significant %CPU: no process "owns" the excessive
> processing time.


There is another possible explanation besides the instrumentation defect
suggested by Måns Rullgård. The scenario you describe may be due to
short-lived processes. If processes begin and terminate in less than the
sampling interval of the monitoring tools, the CPU time they consume can
be hard to account for. The symptoms you report suggest that tasks (e.g.,
processes or threads) are being spawned and exiting almost immediately.

phil_gg04@treefic.com

2005-06-01, 4:00 pm

>> the load average is just over 1. But none of the listed
> may be due to short lived processes


Thanks Kurtis, that's possible. But I don't think it's the problem in
my case, because the process IDs allocated to new processes are only
increasing at a sensible rate. If there were short-lived
processes I'd see big gaps between the PIDs of other processes.
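
The same check can be made numerically instead of by eyeballing PIDs. A
rough sketch, assuming Linux, where /proc/stat carries a "processes"
line counting tasks created since boot (this should include new threads
as well as forks); the function names are invented for the example.

```python
import time


def parse_fork_count(proc_stat_text):
    """Return the 'processes' counter from the contents of /proc/stat."""
    for line in proc_stat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == "processes":
            return int(parts[1])
    raise ValueError("no 'processes' line found")


def forks_per_second(interval_s=1.0, path="/proc/stat"):
    """Sample the counter twice and return the creation rate in between."""
    with open(path) as f:
        first = parse_fork_count(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        second = parse_fork_count(f.read())
    return (second - first) / interval_s


if __name__ == "__main__":
    print(f"{forks_per_second():.1f} new tasks per second")
```

A rate that stays near zero while the CPU is busy would argue against
the short-lived-process explanation.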

--Phil.
