
 

Author EA causes Apache to stall
Ronan Mullally

2006-10-30, 7:48 pm

Hi,

I maintain the infrastructure for a fairly high-volume vBulletin forum
(~500,000+ hits per day). I'm having trouble with eaccelerator, which I've
seen on versions 0.9.5 and 0.9.5-rc2svn272 (IIRC).

The site runs on a:

dual Xeon (with hyper threading enabled) Dell 2850
Gentoo Linux, 2.6.16.x kernel
Apache2 2.0.58 (worker MPM)
MySQL 4.0
PHP 4.4.4

There are also a handful of very low volume sites hosted on the same
server, but I doubt these are implicated.

Since installing eaccelerator about a month ago there have been several
occasions where the site has just stopped answering queries. Apache keeps
running, will accept connections on port 80, but nothing is returned.

The only thing in the apache logs is:

*** glibc detected *** double free or corruption (!prev): 0x11707a28 ***
*** glibc detected *** malloc(): memory corruption (fast): 0x3580b828 ***
[Sun Oct 22 19:50:14 2006] [notice] child pid 22608 exit signal Aborted (6)

Which I believe relates to libc detecting malloc errors - however these
occur regardless of whether eaccelerator is installed or not.
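
To compare how often these aborts happen with and without eaccelerator loaded, a small sketch like the following could tally them per log file. The function name `count_aborts` is hypothetical; the search patterns are taken directly from the log lines quoted above, and the path to the error log is whatever your Apache install uses.

```shell
# Count glibc corruption reports and aborted children in an Apache error log.
# Usage: count_aborts /path/to/error_log
count_aborts() {
    log="$1"
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so "|| true" keeps the function usable under "set -e".
    glibc=$(grep -c 'glibc detected' "$log" || true)
    aborted=$(grep -c 'exit signal Aborted (6)' "$log" || true)
    echo "glibc=$glibc aborted=$aborted"
}
```

Running it against a log from a period with EA enabled and another from a period without would show whether the abort rate really is the same in both cases.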

The only abnormal indication I can find prior to this occurring is in the
usage of the database. At seemingly random intervals some threads will
get 'stuck':

Id       User      Host/IP    DB    Time  Cmd    Query or State
--       ----      -------    --    ----  ---    --------------
7822165  root      localhost  site     0  Query  show full processlist
7910927  username  localhost  site     1  Sleep
7875593  rtg       localhost  rtg     77  Sleep
7899826  username  localhost  site  3736  Sleep
7899828  username  localhost  site  3736  Sleep

DB threads owned by user "username" *should* have a lifetime of no more than
a couple of seconds. These "stuck" threads will run indefinitely until one
is killed, then they all exit.
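
A minimal sketch for spotting such stuck threads automatically, assuming the process list is available on stdin in the whitespace-separated layout shown above (Id, User, Host/IP, DB, Time, Cmd, then the optional query or state). The function name `stuck_threads` and the default 60-second threshold are hypothetical, chosen only for illustration.

```shell
# Print "Id Time" for every sleeping thread owned by a given user whose
# Time column exceeds a threshold in seconds.
# Usage: stuck_threads USER [LIMIT_SECONDS] < processlist.txt
stuck_threads() {
    user="$1"
    limit="${2:-60}"
    # Header and separator rows never match because their User field
    # is "User" or "----" rather than the requested user name.
    awk -v user="$user" -v limit="$limit" '
        $2 == user && $6 == "Sleep" && ($5 + 0) > (limit + 0) { print $1, $5 }
    '
}
```

Fed the listing above with user "username" and a limit of 60, it reports threads 7899826 and 7899828, both at 3736 seconds, and skips the short-lived thread 7910927.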

At some stage, after a few hours, or sometimes a couple of days the server
will simply stop answering requests. Judging by the network I/O stats this
occurs suddenly - there's no gradual tailing off in traffic or (presumably)
performance. An apache restart is required to get things moving again.

When I disable eaccelerator and reload apache this problem ceases (the
site ran fine for 2 weeks without EA), so this is not a problem with the
underlying system - it's rock solid without EA. It could conceivably be
a coding problem with the site itself which is exacerbated by eaccelerator.

The EA logfile shows nothing but a lot of "hit" lines, a few dozen
"cached" lines and regular "mtime is in the future" warnings. I've purged
the disk cache to be sure there's nothing bogus there, but the problem
persists.

Eaccelerator was built manually rather than using the gentoo ebuild and
was compiled with:

./configure --enable-eaccelerator \
  --with-php-config=/usr/lib/php4/bin/php-config \
  --without-eaccelerator-encoder \
  --without-eaccelerator-loader

Its running config is:

extension="eaccelerator.so"
eaccelerator.shm_size="16"
eaccelerator.cache_dir="/var/www/cache"
eaccelerator.log_file="/var/log/eaccelerator.log"
eaccelerator.enable="1"
eaccelerator.optimizer="1"
eaccelerator.check_mtime="1"
eaccelerator.debug="1"
eaccelerator.filter=""
eaccelerator.shm_max="640K"
eaccelerator.shm_ttl="3600"
eaccelerator.shm_prune_period="120"
eaccelerator.shm_only="0"
eaccelerator.compress="1"
eaccelerator.compress_level="9"
eaccelerator.allowed_admin_path="..../control.php"

AFAIK the shared memory cache typically runs at ~80% utilisation (it's not
possible to access control.php once the problem strikes). I've verified
that this occurs both with the optimizer enabled and disabled. The site
ran for a full week without the optimizer enabled without crashing, but
keeled over last night with the same symptoms.

I run EA on several other servers, all dual CPU Dell 2850s, but running
64bit Debian rather than 32bit Gentoo. These systems primarily run Horde
and IMP for webmail and haven't exhibited this problem in 3-4 weeks of
operation.

Can anybody offer any suggestions as to how to proceed? If not, is there
any useful debugging that can be done the next time this occurs
before I restart apache?

Thanks in advance,


-Ronan


-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on apache Geronimo
http://sel.as-us.falkag.net/sel?cmd...3057&dat=121642
Bart Vanbrabant

2006-10-30, 7:48 pm

Thomas Love

2006-10-30, 7:48 pm

Just some cold comfort, but I get these deadlocks very predictably
under certain conditions, specifically, dual Xeon, high load, and lots
of use of the eA usercache (eaccelerator_get() and
eaccelerator_put()), and interestingly, only when those functions are
_not_ locked in my application by an flock() or whatever.

All apache processes get stuck in "W" state (sending reply). When this
happens I also get those glibc errors and, more consistently, "PHP
crashed on line n of x".

If you see those glibc errors even when eA isn't around, that would
eliminate them as any fault of eA's. But I only ever see them when
these hangs occur, and they have happened in emalloc(), not malloc(),
all the times I remember.

I've also seen the sleeping MySQL queries problem but I've never
thought there was a relation.

I think it may be precipitated by the cache filling up, or that may be
a symptom. It could be either given that _put() allocates space before
locking for insertion (which is when space may be freed).

At any rate this is definitely only an issue for me under very high
load and seems to be related to high-concurrency, unlocked use of the
usercache (I imagine output caching counts too). If you don't use the
usercache then that should really narrow things down.

Just to help eliminate one possibility, why not double your shm size
and see whether that makes any difference.

And finally, since eA serializes access to its shm through only one
fully blocking lock, 4 CPU cores is just a false sense of security.
I'd really like to see parallel non-blocking spinlocks implemented.

Cheers
Thomas

Ronan Mullally

2006-10-31, 4:11 am

Hi Thomas,

Thanks for your response.

On Tue, 31 Oct 2006, Thomas Love wrote:

> Just to help eliminate one possibility, why not double your shm size
> and see whether that makes any difference.


I've just re-installed EA using the gentoo ebuild rather than a manual
compile. That might just fix it. If not, I'll increase the shm size.


-Ronan


Copyright 2008 codecomments.com