For Programmers: Free Programming Magazines  


Home > Archive > Compilers > June 2006 > data allocation in interpreters









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author data allocation in interpreters
weltraum@astrocat.de

2006-05-30, 7:11 pm

Hi,

What are common techniques for data allocation (strings, high-level
data structures) in interpreters? I could think about:

- malloc / free with trying to free unused data as soon as possible.
- same as above with some custom implementation of malloc / free (which
one?)
- full blown grabage collector.
- somthing hybrid (how?)

What do you think, how is it implemented in the famous interpreters?


Bye
Chris
Pascal Bourguignon

2006-05-30, 10:03 pm

weltraum@astrocat.de writes:
> What are common techniques for data allocation (strings, high-level
> data structures) in interpreters? I could think about :
>
> - malloc / free with trying to free unused data as soon as possible.
> - same as above with some custom implementation of malloc / free (which
> one?)
> - full blown grabage collector.
> - somthing hybrid (how?)
>
> What do you think, how is it implemented in the famous interpreters?


Depends on the programming language.

If you consider an interpreter for C, you'll probably won't have a
garbage collecetor included, since the purpose of C is to compute
addresses and use malloc/free (or is it? cf BoehmGC).

But in most other programming languages, there's a garbage collector.

--
__Pascal Bourguignon__ http://www.informatimago.com/
Hans Aberg

2006-06-03, 7:05 pm

Pascal Bourguignon <pjb@informatimago.com> wrote:

> weltraum@astrocat.de writes:
[color=darkred]
[color=darkred]
> Depends on the programming language.


As you say, it depends on the programming language used, but also on the
ambition level of the_interpreter implementation. If one is a singly
implementer, it might be suitable with a programming language like C++
plus the use of a reference count - C++ does not currently support
the_implementations of any other GC (the Boehm GC goes in a a layer
between the compiler and the platform), but the next revision might. If
using Java/Objective C, one can get this cleanup via the language.

> But in most other programming languages, there's a garbage collector.


For a more optimized interpreter, C is the better choice, and an
interpreter like Hugs <http://haskell.org/hugs/> uses a two-space copying
GC. I think this is typical choice of GC for many interpreters, at
least_before more_advanced forms of GC are considered.

--
Hans Aberg
George Peter Staplin

2006-06-03, 7:05 pm

weltraum@astrocat.de toggled some bits and produced:
> Hi,
>
> What are common techniques for data allocation (strings, high-level
> data structures) in interpreters? I could think about :
>
> - malloc / free with trying to free unused data as soon as possible.
> - same as above with some custom implementation of malloc / free (which
> one?)
> - full blown grabage collector.
> - somthing hybrid (how?)
>
> What do you think, how is it implemented in the famous interpreters?


I assume "full blown" GC doesn't include reference counting. A lot of
implementations in C use structures/objects with a reference count
member. Whenever the int ->refCount drops to 0 or below the object is
generally recycled. By recycled I mean the object is available for
reuse, and is generally added to the head of a free-list of an object
pool. The problem with that approach generally is that it requires
more manual management, but if you automatically generate the C it can
be easier. Also, it can't handle circular structures without very
clever methods. A paper reference for circular structures handled
with a clever reference counting algorithm is in Knuth's volume 1 from
what I recall (in the garbage collection section).

Tcl uses reference counting in the C API.

Python from what I've heard uses reference counting, and some automatic
GC. It's probably a "hybrid" architecture.

Java has a variety of "full blown" GC methods that can be tuned:
http://java.sun.com/docs/hotspot/gc5.0/gc_tuning_5.html

Garbage collection is actually a very old technique in the history of
computing. Dijkstra's concurrent collector is my favorite, because it
doesn't result in long pauses, and it's elegant.

Also, there are some people that claim that the Boehm collector for C is
good. I think it's very dangerous, because if you have a large
allocation, and some integer that happens to have the same value as a
pointer's address the object may not be freed. Thus it's called a
"conservative collector." IMO it's just a bad idea, but it works for
simple programs I suppose, and if you only retain a few small objects
now and then.

You asked about string management. AFAIK that's usually done with
another allocator than the object-pool allocator (something like
malloc), and a pointer is associated with the object. When the object
is released to the free-list, the string is usually freed.

Have fun :)

-George
[I gather that the Boehm collector has worked in some rather large programs,
and the number of false pointers is usually quite low. -John]
Gregg Townsend

2006-06-17, 7:05 pm

<weltraum@astrocat.de> wrote:
> What are common techniques for data allocation (strings, high-level
> data structures) in interpreters?


For one set of answers see
The Implementation of the Icon Programming Language
Ralph E. Griswold and Madge T. Griswold
Princeton University Press, 1986

It's now out of print, but a PDF version can be downloaded from
http://www.cs.arizona.edu/icon/ibsale.htm

---------------------------------------------------------------------------
Gregg Townsend Staff Scientist The University of Arizona
gmt@cs.arizona.edu Computer Science Tucson, Arizona, USA

Sponsored Links







Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive

Copyright 2008 codecomments.com