Home > Archive > Compilers > September 2004 > Garbage collection
You are viewing an archived Text-only version of the thread.
To view this thread in it's original format and/or if you want to reply to
this thread please [click here]
| Author |
Garbage collection
|
|
| William McCabe 2004-07-28, 9:08 pm |
| I am in the process of writing my own programming language and I would
like to use garbage collection as a means for automatic memory
management. I have decided that I want to implement a two-stage
generational copying collector.
I know how a copying and generational collector works but Im not sure
how they work when put together. In my mind, generational is similar
to coping in being that they both seperate the heap into two
semi-spaces, but use totaly different methods to collect garbage. Can
someone shield some light on this subject for me.
Thanks in advance.
[You might want to ask this on the GC list that spun off from
comp.compilers some time ago. -John]
| |
| Nick Maclaren 2004-08-04, 3:57 am |
| wmccabe@hotmail.com (William McCabe) writes:
|> I am in the process of writing my own programming language and I would
|> like to use garbage collection as a means for automatic memory
|> management. I have decided that I want to implement a two-stage
|> generational copying collector.
|>
|> I know how a copying and generational collector works but Im not sure
|> how they work when put together. In my mind, generational is similar
|> to coping in being that they both seperate the heap into two
|> semi-spaces, but use totaly different methods to collect garbage. Can
|> someone shield some light on this subject for me.
|>
|> [You might want to ask this on the GC list that spun off from
|> comp.compilers some time ago. -John]
You might also want to start off by deciding what properties you
want your language to have (e.g. its target area), and therefore
what properties you want your garbage collector to have. No
single garbage collector will be appropriate (or even work) for
all requirements.
Regards,
Nick Maclaren.
| |
| Sebastian 2004-08-04, 3:57 am |
| William McCabe wrote:
> I know how a copying and generational collector works but Im not sure
> how they work when put together. [snip] Can
> someone shield some light on this subject for me.
You can read the following 2-part article from MSDN. It quite nicely
explains how .Net garbage collector works -- it's also (partly*) copying &
generational collector at one (it even has 2 generations).
http://msdn.microsoft.com/msdnmag/issues/1100/GCI/
http://msdn.microsoft.com/msdnmag/issues/1200/GCI2/
*) it's copying GC for not too large objects (AFAIR less than 16KB), larger
objects are allocated from yet another heap and are unmoveable (as moving
large obiects to compact memory is too slow)
rgds
Sebastian Kaliszewski
--
"Never underestimate the power of human stupidity" -- from Notebooks of L.L.
| |
| Michael Tiomkin 2004-08-05, 3:59 pm |
| wmccabe@hotmail.com (William McCabe) wrote
> I am in the process of writing my own programming language and I would
> like to use garbage collection as a means for automatic memory
> management. I have decided that I want to implement a two-stage
> generational copying collector.
>
> I know how a copying and generational collector works but Im not sure
> how they work when put together. In my mind, generational is similar
> to coping in being that they both seperate the heap into two
> semi-spaces, but use totaly different methods to collect garbage. Can
> someone shed some light on this subject for me.
You can try to check what is done in Java, they use 4 generations
with different GC methods for different groups for one of their last
GCs:
http://java.sun.com/docs/hotspot/gc1.4.2/index.html
http://www-106.ibm.com/developerwor...rary/j-jtp11253
Notice that there is a problem with references from the "old"
objects to the "new" objects, these references need special treatment.
BTW, I'm not sure that GC is a part of a compiler, it's usually a
part of runtime support of a language.
[It's runtime, but it's often useful to tweak the compiler to generate
code that makes GC easier. -John]
| |
| Basile Starynkevitch [news] 2004-08-05, 3:59 pm |
| Le 28-07-2004, William McCabe <wmccabe@hotmail.com> a écrit_:
> I am in the process of writing my own programming language and I would
> like to use garbage collection as a means for automatic memory
> management. I have decided that I want to implement a two-stage
> generational copying collector.
>
> I know how a copying and generational collector works [...]
You need to read Lins&Jones' book on Garbage Collection.
If you want a generational copying GC callable from C (using some
specific coding style and calling conventions) you should consider
using my Qish garbage collector (opensource, LGPL):
http://starynkevitch.net/Basile/qishintro.html
I'll be delighted to offer help in using it if needed.
--
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basile<at>starynkevitch<dot>net
aliases: basile<at>tunes<dot>org = bstarynk<at>nerim<dot>net
8, rue de la Faïencerie, 92340 Bourg La Reine, France
| |
| Nick Roberts 2004-08-09, 3:56 am |
| On 4 Aug 2004 02:45:19 -0400, Sebastian <sk@z.pl> wrote:
> [Microsoft garbage collector]
> *) it's copying GC for not too large objects (AFAIR less than 16KB),
> larger objects are allocated from yet another heap and are unmoveable
> (as moving large obiects to compact memory is too slow)
Sadly, I feel the need to add a caveat to this (helpfully meant)
advice, that, whilst it is indeed likely to be enlightening to read
articles published by Microsoft, it is also necessary to bear in mind
that past evidence suggests Microsoft is not exactly the fount of all
computing wisdom. I'm afraid I often find that their documentation is
enlightening in suggesting to me precisely how /not/ to do it.
I may add that I am not a knee-jerk Microsoft knocker. I feel there is
genuine evidence of their widespread technical inexpertise.
The above technique (taken on its own, as I am not myself familiar
with Microsoft technologies) seems to be evidence in my case. I find
it hard to believe that the designers just gave up on moving large
blocks! I wonder if actually the idea is to allocate blocks greater
than a certain size page-aligned (in a different heap), so that they
can be moved by page table manipulation?
Anyway, my advice is simply to be cautious about Microsoft's
technology documents, and I suppose the same advice applies to other
companies.
--
Nick Roberts
| |
|
| "Nick Roberts" <nick.roberts@acm.org> schreibt:
>I wonder if actually the idea is to allocate blocks greater than a
>certain size page-aligned (in a different heap), so that they can be
>moved by page table manipulation?
The Intel (segmented) memory management technology imposes some
penalties on certain operations. Even if the idea of segment registers
was appropriate when moving from 16 bit code and 64 KB memory limits
to bigger memory (8086), but in later hardware generations (80386)
segment register manipulation and the maintenance of the related
(task, segment...) tables turned out to be too time expensive. This
was the time when the page-based flat memory model was "invented", as
a faster memory management model.
There may exist similar reasons for using different GC allocation
models, perhaps with regards to new I64 (dis)advantages, but I admit
that a general "too slow" argument is a too vague explication.
All in all I agree with your guess about page table manipulation for
fast block moves. Why move memory around byte by byte, when moving
bigger entitites (pages) costs about the same time, per entity?
DoDi
| |
| glen herrmannsfeldt 2004-08-11, 8:58 pm |
| VBDis wrote:
(snip)
> The Intel (segmented) memory management technology imposes some
> penalties on certain operations. Even if the idea of segment registers
> was appropriate when moving from 16 bit code and 64 KB memory limits
> to bigger memory (8086), but in later hardware generations (80386)
> segment register manipulation and the maintenance of the related
> (task, segment...) tables turned out to be too time expensive. This
> was the time when the page-based flat memory model was "invented", as
> a faster memory management model.
A segment descriptor cache could speed up such segment load
operations. I don't know which, if any, processors had one.
I have been told that some did, but have never seen it in any
Intel or AMD literature.
(snip)
> All in all I agree with your guess about page table manipulation for
> fast block moves. Why move memory around byte by byte, when moving
> bigger entitites (pages) costs about the same time, per entity?
As I understand it, it is common on many systems for the memory
allocation system to fill page tables with pointers to one zero filled
page, and then allocate a real page with the first write operation.
There is a story of someone testing the cache memory characteristics
of a machine with C code like:
int *a,*b,i;
a=malloc(100000000);
b=malloc(100000000);
for(i=0;i<100000000;i++) a[i]=b[i];
It would seem that this would require moving enough data to completely
flush the cache, but it might not.
-- glen
| |
| Nick Roberts 2004-08-13, 8:57 pm |
| On 11 Aug 2004 12:58:25 -0400, glen herrmannsfeldt <gah@ugcs.caltech.edu>
wrote:
> A segment descriptor cache could speed up such segment load
> operations. I don't know which, if any, processors had one.
> I have been told that some did, but have never seen it in any
> Intel or AMD literature.
As I understand it, they all have such a cache, except for certain
early steppings of the original Pentium. When it was discovered that
the omission of the cache had a disasterous effects on the execution
speed of some legacy (mostly 16-bit) software, Intel were quick to put
the cache back into their later models.
The Intel models have a cache of 96 extended descriptors, and the
AMD models have 128 I think.
To be honest, I'm not quite sure how the discussion got onto
segmentation in this thread, but never mind :-)
>
> As I understand it, it is common on many systems for the memory
> allocation system to fill page tables with pointers to one zero
> filled page, and then allocate a real page with the first write
> operation.
I don't understand why it is necessary for the PTEs to be set to
point to any frame. Why not just set the P bit to 0?
> There is a story of someone testing the cache memory
> characteristics of a machine with C code like:
>
> int *a,*b,i;
> a=malloc(100000000);
> b=malloc(100000000);
> for(i=0;i<100000000;i++) a[i]=b[i];
>
> It would seem that this would require moving enough data to
> completely flush the cache, but it might not.
An interesting test, but not really a fair test of a machine's
speed, since it is somewhat unrepresentative of real workloads.
--
Nick Roberts
| |
| Tomasz Zielonka 2004-08-14, 4:00 pm |
| Sebastian wrote:
> You can read the following 2-part article from MSDN. It quite nicely
> explains how .Net garbage collector works -- it's also (partly*) copying &
> generational collector at one (it even has 2 generations).
If it had only 1 generation, it wouldn't be generational.
Best regards,
Tom
--
..signature: Too many levels of symbolic links
| |
| glen herrmannsfeldt 2004-08-16, 3:57 am |
| Nick Roberts wrote:
> On 11 Aug 2004 12:58:25 -0400, glen herrmannsfeldt <gah@ugcs.caltech.edu>
> wrote:
[color=darkred]
> As I understand it, they all have such a cache, except for certain
> early steppings of the original Pentium. When it was discovered that
> the omission of the cache had a disasterous effects on the execution
> speed of some legacy (mostly 16-bit) software, Intel were quick to put
> the cache back into their later models.
> The Intel models have a cache of 96 extended descriptors, and the
> AMD models have 128 I think.
They don't seem to advertise it much.
> To be honest, I'm not quite sure how the discussion got onto
> segmentation in this thread, but never mind :-)
[color=darkred]
[color=darkred]
> I don't understand why it is necessary for the PTEs to be set to
> point to any frame. Why not just set the P bit to 0?
So that they will read as zero. You might as well fill
a page with zero, as with any other value. (I once knew
a system that initialized to X'81', though.) It is then
copy on write to any page.
[color=darkred]
[color=darkred]
[color=darkred]
> An interesting test, but not really a fair test of a machine's
> speed, since it is somewhat unrepresentative of real workloads.
You mean moving 1e8 bytes isn't fair, or copying the same
page many times, inside the cache?
-- glen
| |
| Antti-Juhani Kaijanaho 2004-08-16, 3:57 am |
| Nick Roberts <nick.roberts@acm.org> kirjoitti 13.08.2004:
> On 11 Aug 2004 12:58:25 -0400, glen herrmannsfeldt <gah@ugcs.caltech.edu>
> wrote:
>
> I don't understand why it is necessary for the PTEs to be set to
> point to any frame. Why not just set the P bit to 0?
Presumably because when the P bit is 0, even reading the page triggers a
page fault; if all zero-filled allocated pages are physically shared,
read operations do not trigger a page fault, and only when a write
operation is attempted is the operating system called.
I'm not sure how useful that is; presumably most pages are allocated in
order to write to them, not in order to read zeros from them.
I am also unsure how this relates to compilers any more :)
--
Antti-Juhani Kaijanaho http://antti-juhani.kaijanaho.info/
[There still seems to be a fair amount of C code that likes to dereference
random pointers and expects to find zeros, but I agree that this has
wandered away from compilers. -John]
| |
| Sebastian 2004-08-23, 4:02 pm |
| Nick Roberts wrote:
> On 4 Aug 2004 02:45:19 -0400, Sebastian <sk@z.pl> wrote:
>
>
> Sadly, I feel the need to add a caveat to this (helpfully meant)
> advice, that, whilst it is indeed likely to be enlightening to read
> articles published by Microsoft, it is also necessary to bear in mind
> that past evidence suggests Microsoft is not exactly the fount of all
> computing wisdom. I'm afraid I often find that their documentation is
> enlightening in suggesting to me precisely how /not/ to do it.
There are many different things made by MS from terribly (technically) ugly
& bad to nice and well thought stuff.
> I may add that I am not a knee-jerk Microsoft knocker. I feel there is
> genuine evidence of their widespread technical inexpertise.
I'm far (very far) from loving MS, but there is evidence to the contrary as
well. Just to stay wihtin the topic of this ng, for example their newset
compiler suite seems to be a good one.
> The above technique (taken on its own, as I am not myself familiar
> with Microsoft technologies) seems to be evidence in my case. I find
> it hard to believe that the designers just gave up on moving large
> blocks!
Why it is a bad iadea? I find it perfectly reasonable. Those objects of
course come from another heap, so they don't stand in a way of compaction
of smaller stuff.
The main advantage of compacting collector is IMHO extremely fast allocation
-- but in case of large objects (>=16KB in this case) the more complex
allocation from non compactable heap is neglibile in comparison to object
initiaslisation cost. And then moving such large object every time it's
generation goes through collection is simply huge and unneded cost.
Then add the fact that such objects (in typical imperative language code a
CIL typically runs) are typically not too frequently created (by the sheer
size of them, there must be to too many of them) and often are not short
lived. Hence they survive many many collections. Then add to the fact that
when application with pure-copying GC allocates enough memory that swapping
tricks in you're. Each not moved large object is a few pages not swapped.
Moving them is a waste of time, wipes caches, etc.
> I wonder if actually the idea is to allocate blocks greater
> than a certain size page-aligned (in a different heap), so that they
> can be moved by page table manipulation?
Why to move those objects at all?
>
> Anyway, my advice is simply to be cautious about Microsoft's
> technology documents, and I suppose the same advice applies to other
> companies.
Well, as evidence shows that whole .Net thing seems to behave OK. So maybe
those guys are not wrong after all.
BTW, if you'd like to read some quite strong anti-compaction advocacy try
this:
http://www.hpl.hp.com/personal/Hans...complexity.html
rgds
Sebastian Kaliszewski
--
"Never underestimate the power of human stupidity" -- from Notebooks of L.L.
| |
| Nick Roberts 2004-08-23, 4:02 pm |
| On 15 Aug 2004 22:16:47 -0400, glen herrmannsfeldt <gah@ugcs.caltech.edu>
wrote:
>
>
>
>
> You mean moving 1e8 bytes isn't fair, or copying the same
> page many times, inside the cache?
I didn't mean copying the same page many times, since I didn't
realise that was what the above code would do (it presumably would
while reading the b array, under the regime you mention).
I simply meant that copying a very large (100 MB?) linear chunk of
memory is not a fair test of a machine's speed, because real
applications will very rarely do this (I'm sure it will be done
occasionally, but not very often).
I understand that one measure made by suites designed to
realistically test out the performance of a machine is of a very
large number of small block moves, say 50 bytes each, spread around
an assumed working set of 10 to 20 MB.
--
Nick Roberts
| |
| Sebastian 2004-08-23, 4:02 pm |
| Tomasz Zielonka wrote:
> Sebastian wrote:
>
> If it had only 1 generation, it wouldn't be generational.
Well, it could have more. And, AFAIU, OP was also asking about 2... So it's
quite well matching.
rgds
Sebastian Kaliszewski
| |
| Nick Roberts 2004-09-03, 4:03 pm |
| On 23 Aug 2004 12:07:39 -0400, Sebastian <sk@z.pl> wrote:
>
> Why it is a bad idea? I find it perfectly reasonable. Those objects
> of course come from another heap, so they don't stand in a way of
> compaction of smaller stuff.
Yes, but if you don't move large blocks at all, are you not in danger
of ending up with a lot of (large) unfilled holes?
I should point out that there are two kinds of problem with holes: (a)
running out of actual memory; (b) running out of (logical) address
space. I think it is (b) that is the problem with large holes.
> The main advantage of compacting collector is IMHO extremely fast
> allocation
Hmmm. Well, that may be an advantage, but isn't the /main/ advantage
of a compacting collector the fact that it removes holes? Again, the
problem with these holes may be running out of address space, rather
than running out of actual memory, but it could be both.
> -- but in case of large objects (>=16KB in this case) the more
> complex allocation from non compactable heap is neglibile in
> comparison to object initialisation cost. And then moving such large
> object every time it's generation goes through collection is simply
> huge and unneded cost.
Well, yes, the cost would be huge if you used REP MOVSD on the data
itself. I'm suggesting that if you allocate large blocks page aligned,
you can move them quite quickly by REP MOVSDing in the page tables
instead.
>
> Well, as evidence shows that whole .Net thing seems to behave OK. So
> maybe those guys are not wrong after all.
I haven't had a lot of experience with .NET, but when I have run one
or two .NET programs (on my 2.4 GHz Celeron 512 MiB Windows XP) they
seem to have run surprisingly slowly. But that's just a bare
observation; I don't know /why/ they were so slow.
> BTW, if you'd like to read some quite strong anti-compaction
> advocacy try this:
>
> http://www.hpl.hp.com/personal/Hans...complexity.html
But the criticisms made in this paper (of collection) are based on a
set of assumptions that Microsoft could not make for all the software
running under Windows. They could not simply assume that the only
software requiring garbage collection would be "single-threaded with
moderately long-lived objects".
I know for a fact that they specifically considered the needs of
languages such as LISP, Prolog, and Smalltalk when designing the
original Win16 API. Ironically, the provisions made by Win16 tended
not to be used, because typical LISP, Prolog, Smalltalk, and similar
systems were generally ported from (flat) Unix C, compelling their
own internal provisions to be written. Come 32-bit (flat) Windows,
these internal provisions were sufficient.
--
Nick Roberts
| |
| Sebastian 2004-09-08, 3:57 am |
| Nick Roberts wrote:
> On 23 Aug 2004 12:07:39 -0400, Sebastian <sk@z.pl> wrote:
>
> Yes, but if you don't move large blocks at all, are you not in danger
> of ending up with a lot of (large) unfilled holes?
Well, this holes can be probably filled with somewhat smaller objects. In
typical situations i doubdt that fragmentation is worse than 2x. Copying
collector requires 2x overhead.
>
> I should point out that there are two kinds of problem with holes: (a)
> running out of actual memory; (b) running out of (logical) address
> space. I think it is (b) that is the problem with large holes.
Hmm, unless your system has virtual memory allmost equal in size to logical
address space for applications, then this is prpbably barely a problem. You
can't allocate more memory than swap space + physical mem allows, so even
significant fragmentation in case of such large objects is just
fragmentation of address space. When size of allocatable memory approaches
the size of logical address space compacting is no good either, as you have
no space to copy. Without compacting you can design your app not to cause
too much fragmentation (or at least hope that fragmentation won't be as bad
as 2x) but with compacting you've lost. Compacting system runs out of
memory when half of the address space is filled.
>
>
> Hmmm. Well, that may be an advantage, but isn't the /main/ advantage
> of a compacting collector the fact that it removes holes? Again, the
> problem with these holes may be running out of address space, rather
> than running out of actual memory, but it could be both.
Well, IMHO the hole removal advantage is overestimated. As Hans Boehm points
in the linked article, properly designed non compacting allocator rarely
exhibits fragmentation worse than 2x, while compacting keeps the
fragmentation just at or slightly above (alignment) 2x.
But I have recently profiled C++ application doing a lot of allocations of
rather small objects (the app is an interpreter). To my surpirise the most
significant factor was an allocation.
>
> Well, yes, the cost would be huge if you used REP MOVSD on the data
> itself. I'm suggesting that if you allocate large blocks page aligned,
> you can move them quite quickly by REP MOVSDing in the page tables
> instead.
That requires OS support, I don't know if NT kernel allows for page
movement. As .Net works also on older OS'es (incl. non NT 98 & Millenium)
modifying kernel might not be an option, I guess.
Then unless you have ~2GB of actual virtual memory on 32bit system, logical
address space framgentation is barely an issue and all non allocated pages
are just returned to system.
In case of 64bit enviroments this is not a problem at all.
[snip]
>
> But the criticisms made in this paper (of collection) are based on a
> set of assumptions that Microsoft could not make for all the software
> running under Windows. They could not simply assume that the only
> software requiring garbage collection would be "single-threaded with
> moderately long-lived objects".
Well, but MS didn't take the road describet in the articcle, as they do use
compaction for small objects.
rgds
Sebastian Kaliszewski
| |
| Chris Noonan 2004-09-13, 4:00 pm |
| Sebastian wrote:
> Nick Roberts wrote:
[snip]
>
> Hmm, unless your system has virtual memory allmost equal in size to logical
> address space for applications, then this is prpbably barely a problem. You
> can't allocate more memory than swap space + physical mem allows, so even
> significant fragmentation in case of such large objects is just
> fragmentation of address space. When size of allocatable memory approaches
> the size of logical address space compacting is no good either, as you have
> no space to copy. Without compacting you can design your app not to cause
> too much fragmentation (or at least hope that fragmentation won't be as bad
> as 2x) but with compacting you've lost. Compacting system runs out of
> memory when half of the address space is filled.
>
>
> Well, IMHO the hole removal advantage is overestimated. As Hans Boehm points
> in the linked article, properly designed non compacting allocator rarely
> exhibits fragmentation worse than 2x, while compacting keeps the
> fragmentation just at or slightly above (alignment) 2x.
Interesting analysis regarding address range and paging file
memory. But you omit to say that holes between allocations consume
RAM, which is a far more valuable resource than the other two memory
types. The 2x 'fragmentation' of the copying collector does not
consume RAM.
Chris
|
|
|
|
|