Garbage collection and optimization
| Rayiner Hashem 2007-11-06, 7:19 pm |
How do GCs with thread-local, unsynchronized, bump-pointer style
allocation (e.g. Sun's JVM and its TLABs) interact with code
optimizers? Specifically, how do they handle the case where one thread
is in the middle of constructing an object when another thread causes a
collection to start? The object may be in an inconsistent state, so
the GC can't scan it, but it also may contain the last remaining
pointers to some objects that need to be kept alive. Consider the
following structure:
typedef struct _cons {
    struct _cons* car;
    struct _cons* cdr;
} cons;
The allocation routine might look like:
cons* alloc_cons(cons* car, cons* cdr) {
    if((alloc_ptr + sizeof(cons)) > alloc_limit)
        make_more_space();
    cons* obj = (cons*)alloc_ptr;
    obj->car = car; /* one */
    obj->cdr = cdr; /* two */
    alloc_ptr += sizeof(cons); /* three */
    return obj;
}
If the GC interrupts before 'one' or after 'three', we're fine. 'car'
and 'cdr' will remain live since their values will be used later.
However, what happens when the GC interrupts between 'one' and 'two'?
The GC won't see an incomplete object, since the allocation pointer
hasn't been bumped yet, which is fine, but it also won't trace through
'obj->car'. If the GC is semi-conservative and 'car' is still in a
register, then we can count on the fact that 'obj->car' will remain
valid after the allocation (the object won't have moved), but since
'car' is technically dead at that point, we have no guarantees, right?
What solutions to this problem are used in practice?
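To make the window concrete, here is a minimal single-threaded sketch, not
real collector code. It assumes a toy nursery array 'heap' and a
hypothetical helper 'gc_would_trace' that models a collector which only
treats memory below 'alloc_ptr' as allocated. 'traced_mid_construction' is
also a made-up name: it replays the three steps by hand and samples the
collector's view between 'one' and 'two'.

```c
#include <assert.h>
#include <stddef.h>

typedef struct _cons {
    struct _cons* car;
    struct _cons* cdr;
} cons;

#define HEAP_SLOTS 8

static cons heap[HEAP_SLOTS];     /* toy nursery (assumption for this sketch) */
static cons* alloc_ptr = heap;    /* next free slot */

/* Hypothetical collector view: only slots below alloc_ptr count as
   allocated, so only those would be traced. Pass heap pointers only. */
static int gc_would_trace(const cons* p) {
    return p >= heap && p < alloc_ptr;
}

/* Replays alloc_cons step by step and reports whether a collection
   arriving between 'one' and 'two' would trace the half-built object. */
static int traced_mid_construction(cons* car, cons* cdr) {
    assert(alloc_ptr < heap + HEAP_SLOTS); /* real code would make more space */
    cons* obj = alloc_ptr;
    obj->car = car;                        /* one */
    int seen = gc_would_trace(obj);        /* simulated GC interrupt */
    obj->cdr = cdr;                        /* two */
    alloc_ptr += 1;                        /* three */
    return seen;
}
```

On a fresh nursery, traced_mid_construction returns 0: at the simulated
interrupt the object still lies at alloc_ptr, outside the traced range, so
the pointer just stored in obj->car is invisible to this model of the
collector.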
| Paolo Bonzini 2007-11-08, 7:14 pm |
> cons* alloc_cons(cons* car, cons* cdr) {
> if((alloc_ptr + sizeof(cons)) > alloc_limit)
> make_more_space();
>
> cons* obj = (cons*)alloc_ptr;
> obj->car = car; /* one */
> obj->cdr = cdr; /* two */
> alloc_ptr += sizeof(cons); /* three */
> return obj;
> }
>
> If the GC interrupts before 'one' or after 'three', we're fine. 'car'
> and 'cdr' will remain live since their values will be used later.
> However, what happens when the GC interrupts between 'one' and 'two'?
>
> What solutions to this problem are used in practice?
You have a special root set for objects being constructed. Either the
GC keeps those objects alive but does not scan them, or you make sure
the object's fields already hold "null" at the moment it is allocated,
so that the GC can scan it safely. In both cases you "bump the pointer"
before you start to fill the object fields.
In the former case, all three of "obj", "car", "cdr" will have to be
in the special root set. In the latter, only "obj" will have to be
put there.
So it will be something like:
/* caller must have put car and cdr on root set, telling GC to scan
them */
/* do the following two atomically! */
cons* obj = (cons*)((alloc_ptr += sizeof(cons)) - sizeof(cons));
/* put obj on root set; it will be kept alive but not scanned */
obj->car = car; /* one */
obj->cdr = cdr; /* two */
or
/* caller must have put car and cdr on root set */
/* GC makes sure that memory at alloc_ptr is filled with nulls */
/* do the following two atomically! */
cons* obj = (cons*)((alloc_ptr += sizeof(cons)) - sizeof(cons));
/* put obj on root set; it can be scanned, unset fields are null */
obj->car = car; /* one */
obj->cdr = cdr; /* two */
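As a rough illustration of the second variant, here is a minimal
single-threaded sketch. Everything beyond the cons type and alloc_ptr is an
assumption made for the example: the 'nursery' array stands in for
pre-nulled allocation space, and 'roots', 'nroots', 'push_root' and
'pop_roots' are hypothetical names for the root set. A real runtime would
keep alloc_ptr per thread and would have to make the bump and the root push
atomic with respect to the collector, which this sketch does not attempt.

```c
#include <assert.h>
#include <stddef.h>

typedef struct _cons {
    struct _cons* car;
    struct _cons* cdr;
} cons;

#define NURSERY_SLOTS 16
#define MAX_ROOTS     16

/* Static storage starts out with null pointers, standing in for
   "GC makes sure that memory at alloc_ptr is filled with nulls". */
static cons nursery[NURSERY_SLOTS];
static cons* alloc_ptr = nursery;

/* Hypothetical root set: a plain stack of object pointers. */
static cons* roots[MAX_ROOTS];
static size_t nroots = 0;

static void push_root(cons* p) {
    assert(nroots < MAX_ROOTS);
    roots[nroots++] = p;
}

static void pop_roots(size_t n) {
    assert(n <= nroots);
    nroots -= n;
}

/* Caller is assumed to have car and cdr on the root set already. */
static cons* alloc_cons(cons* car, cons* cdr) {
    assert(alloc_ptr < nursery + NURSERY_SLOTS); /* real code would collect */
    cons* obj = alloc_ptr++;  /* bump first: obj is visible, fields are null */
    push_root(obj);           /* must be atomic with the bump in a real runtime */
    obj->car = car;           /* one */
    obj->cdr = cdr;           /* two */
    return obj;
}
```

With this ordering a collection arriving at any point after the bump sees an
object whose fields are either null or valid pointers, so scanning it is
always safe. The caller drops the entry with pop_roots once the new object
is reachable from somewhere else.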