Markus, 2007-08-19, 4:27 am

Rayiner Hashem wrote:
>
> Obviously. But why fork() and not any of a number of other mechanisms
> for creating processes?
Because fork very cheaply _copies_ an already existing process. If
you want many pseudo-threads of the same type (e.g. worker threads for
answering http requests concurrently) you just initialize a master
copy (a process) up to a certain point and then fork as many threads
as you like (often under control of a housekeeper process which
decides how many threads are needed and when). This is much cheaper
than creating a new process which would have to go through all the
setup activity of language runtime, reading config files and so on,
again.
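
To make the pre-fork idea concrete, here is a minimal sketch in Python
(not from the original discussion; load_config and prefork are made-up
names, and it assumes a Unix system where os.fork() is available). The
master does its setup once, and every forked worker simply inherits
the result instead of redoing it:

```python
import os

def load_config():
    # Hypothetical stand-in for expensive one-time setup
    # (runtime start-up, reading config files and so on).
    return {"greeting": "hello"}

def prefork(n_workers):
    config = load_config()            # done once, in the master only
    children = []
    for i in range(n_workers):
        r, w = os.pipe()
        pid = os.fork()
        if pid == 0:                  # child: inherits config for free
            os.close(r)
            os.write(w, f"{config['greeting']} from worker {i}".encode())
            os._exit(0)               # leave without running parent cleanup
        os.close(w)                   # parent keeps only the read end
        children.append((pid, r))
    replies = []
    for pid, r in children:
        with os.fdopen(r, "rb") as f:
            replies.append(f.read().decode())
        os.waitpid(pid, 0)            # reap the worker
    return replies
```

A real server would have the workers loop on accept() rather than send
one message and exit; the pipe is only there so the master can show
what each worker saw.
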
Mind you: fork() is cheap, exec() is expensive. spawn() (which is
often only fork()/exec()) is expensive too. If I understand things
right, CreateProcess is more like spawn(). Cygwin goes (or did go,
AFAIR) through a number of contortions to actually copy a process
image during startup to emulate fork() on a system which doesn't have
fork() (and thus does not efficiently support sharing of memory pages
between basically identically cloned processes).
>
> You can achieve the same effect by using an anonymous memory-mapped
> area for the heap, then copy-on-write mapping it in the new process.
Yes, certainly. It requires one to actually change the language
runtime a good bit. Then everything is possible. I'm refraining from
adding "But ..." since I'm sure you can fill in that part for
yourself. Emulating what fork() does in user space is just not as
efficient as a fork() tightly integrated with the OS can be.
I'd like to add that many people blame Windows for not having a real
fork() in the Win32 subsystem. I think they are missing the point
here: it's a difference in philosophy. Windows has, and has had from
the beginning, very usable multithreading support. Instead of
achieving parallelism with multiple processes (as is traditionally
done in Unix), in Windows one can use multiple threads. There are pros
and cons: Multiple threads make shared data easy. Unfortunately that
also opens up a whole host of new problems: race conditions, locking,
often the problem of a global heap (which leads to involuntary sharing
too) and so on. Parallelism by fork()ing has the advantage of
isolation: One worker process going down won't leave you with
corruption in the other processes. And as Apache does: After doing a
certain amount of work, one can just scrap the worker and start a new
one, with the effect that any creeping corruption in data, any memory
leaks and the like are dumped along with the old worker. This option
greatly enhances stability in a number of usage scenarios (like
basically stateless servers).
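
The scrap-and-restart scheme can be sketched roughly like this in
Python. This is an illustration only, not how Apache implements it:
worker, serve, max_requests and the global "leak" list are all names
invented for the example, and it assumes a Unix system with
os.fork(). The list stands in for state that slowly goes bad inside a
worker; it only ever grows in the child's copy of memory and vanishes
when that child exits:

```python
import os

leak = []  # stands in for state that slowly degrades inside a worker

def worker(gen, jobs, w):
    """Child: handle a bounded batch of jobs, then exit for good."""
    for job in jobs:
        leak.append(job)  # grows only in this child's copy of memory
        os.write(w, f"{gen}:{job}:{len(leak)}\n".encode())
    os._exit(0)

def serve(jobs, max_requests):
    """Housekeeper: start a fresh worker for every max_requests jobs."""
    handled = []
    for gen, pos in enumerate(range(0, len(jobs), max_requests)):
        r, w = os.pipe()
        pid = os.fork()
        if pid == 0:
            os.close(r)
            worker(gen, jobs[pos:pos + max_requests], w)
        os.close(w)
        with os.fdopen(r) as f:
            for line in f:
                g, job, leaked = line.strip().split(":")
                handled.append((int(g), job, int(leaked)))
        os.waitpid(pid, 0)  # the old worker and its "leak" are gone now
    return handled
```

Each tuple returned by serve() records which worker generation handled
the job and how large that worker's private "leak" had grown; the
housekeeper's own copy of the list is never touched.
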
I think Jon is wrong here,
>
> The OS will happily run multiple threads concurrently regardless of
> the type of GC you have. Obviously the GC has to be thread safe, but
> this is accomplished as easily as putting a mutex around the alloc
> routine.
... but this is a problem: Manipulating mutexes usually is a system
call (unless you can fashion a cheap user-space mutex with
instructions that are guaranteed not to be interrupted) and that is
expensive. An expensive malloc(), on the other hand, is the death of
all programs that malloc() generously. Functional languages among
them.
> A more reasonable implementation is to give each thread a
> local allocation buffer, so the mutex only comes into play when the
> allocation buffer fills up. This is how most multithreaded Lisp
> systems do things.
Actually that is better :-). Much better.
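
For illustration, a toy version of the local-allocation-buffer scheme
in Python. This is a sketch under assumptions, not how any particular
Lisp system does it: Heap, alloc, chunk and demo are invented names,
"addresses" are just integers from a bump pointer, and it assumes no
single request is larger than one chunk. The point is only that the
shared mutex is taken once per refill, not once per allocation:

```python
import threading

class Heap:
    def __init__(self, chunk=64):
        self.lock = threading.Lock()      # guards the shared heap pointer
        self.next_free = 0
        self.chunk = chunk
        self.lock_acquisitions = 0        # counted for the demo only
        self.local = threading.local()    # per-thread buffer bounds

    def _refill(self):
        # Slow path: grab a whole chunk from the shared heap under the lock.
        with self.lock:
            self.lock_acquisitions += 1
            start = self.next_free
            self.next_free += self.chunk
        self.local.ptr = start
        self.local.end = start + self.chunk

    def alloc(self, size=1):
        if size > self.chunk:
            raise ValueError("request larger than one local buffer")
        loc = self.local
        if getattr(loc, "ptr", None) is None or loc.ptr + size > loc.end:
            self._refill()
        addr = loc.ptr                    # fast path: no lock, just a bump
        loc.ptr += size
        return addr

def demo(n_threads=4, n_allocs=1000, chunk=64):
    heap = Heap(chunk)
    out = [[] for _ in range(n_threads)]

    def work(i):
        for _ in range(n_allocs):
            out[i].append(heap.alloc())

    threads = [threading.Thread(target=work, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    addrs = [a for per_thread in out for a in per_thread]
    return addrs, heap.lock_acquisitions
```

With the defaults, demo() hands out 4000 distinct addresses while
touching the mutex only when a thread's buffer runs dry.
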
> Concurrent and parallel GCs come into play only when a GC is
> happening. In a non-concurrent collector, all threads must be stopped
> during a GC (or an increment of GC work). In a non-parallel collector,
> the marking/copying/sweeping can only happen in a single thread.
> Neither of these issues are going to really affect concurrency much on
> a 2-4 core system, unless you spend excessive amounts of time in GC.
I think I can agree here, but GC is very much a black art: It's
definitely difficult to say much about it until you have much
experience in the area.
Regards -- Markus