For Programmers: Free Programming Magazines  


Home > Archive > Scheme > March 2004 > Re: VM's what are they used for ?









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author Re: VM's what are they used for ?
Joe Marshall

2004-03-27, 12:24 am

PaulMatthews@yahoo.com (Paul Matthews) writes:

> Most scheme implementations run on some form of virtual machine.
>
> What advantage is this over evaluating the s-expressions directly.


There is not much hardware that can execute s-expressions directly.


--
~jrm
Tom Lord

2004-03-27, 12:24 am


Paul Matthews:
[color=darkred]

Joe Marshall[color=darkred]
> There is not much hardware that can execute s-expressions directly.


One could write a book on such trade-offs and there are probably
several already.

The original question is phrased misleadingly. An interpreter
for s-expressions _is_ a VM.

I think the question is probably intended to ask about the design
space of VMs:

1) Why is compiling code to an array better than interpreting
code stored in a list structure?

2) Why is a more compact representation for code, such as bytecode,
better than a less compact representation, such as list structure?

3) Why should the instruction set of a VM be something lower level
than Scheme source? Why bother compiling Scheme at all for
an interpreter rather than just directly executing some
representation (s-exps or otherwise) of the source?

Some of the many answers:

1) Code in an array is likely to be more compact (e.g., no space taken
up by CDR values). It is also likely to interact well with CPU
caches (when I fetch instruction N, all instructions N ... N+K are
fetched from main memory and kept in the cache. Hopefully when I
then fetch instruction N+1, that fetch goes faster.) It's also
likely to generate less memory traffic (no need to read all those
CDR pointers).

The trade-off for those benefits is that, in an interpreter at least,
I have to spend time translating code to array form before I can
execute it --- increasing "latency" (the delay between reading an
expression and evaluating it) and the complexity of the interpreter.

2) Why more compact (bytecode) rather than less (s-exps)?

Same reasons and trade-offs: better cache behavior, less memory
traffic.

There's another reason too: the size of the interpreted code. You
can get bytecode to be half the size of s-exp code without trying
-- you can probably do better than that by another 2x or 4x without
trying very hard.

In situations where a small process size is desirable, bytecode has a
big advantage.

Also, depending on what kind of GC you use, bytecode may give your GC
considerably less work to do (e.g., no need to trace the code
s-exps).

3) Why lower-level instrution sets rather than just
R5RS-as-instruction-set?

Even a very simplistic translation to a lower-level instruction set
is likely to make an interpreter much faster. You could think of
this as "moving code outside the loop" or "partial evaluation":

Consider just the simple case of an expression which is nothing other
than a reference to a given variable. Without translation to
lower-level instructions, every time you evaluate that expression
you have to look up the variable by name. But in a typical Scheme
implementation, that lookup is going to produce the same result every
time (not the same value -- it's just going to go through
identical steps to figure out where the value is stored). So why
not do that lookup (or most of it) just once, up front, rather than
every time you evaluate the expression? But if you do it up front,
and store the result somewhere, then you've essentially translated
the expression to a lower-level instruction.

You can figure out by analogy how translation would also speed up
something like an IF expression --- a translator figures out just
once that a given expression is an IF and then avoids that work
during interpretation.


But one nice thing about interpreting source s-expressions directly is
low latency: you can READ or CONS a new expression and begin evaluating
it immediately -- no need to run it through a compiler.

SCM and Guile take a hybrid approach. They interpret source
s-expressions directly -- but rewrite those expressions "on
the fly" into a lower-level machine language. You can notice
this by loading a program and calling it in a way that involves
the evaluation of many different expressions within the program
-- then calling it the same way a second time but without
re-loading it first. The second run will be faster than the
first because during the first run, the code was being incrementally
compiled to lower-level code. During the second run, the lower-level
code is run directly. Even the low-level code
is stored as s-exps, though, and at least in the past, it was fairly
clear that this was a performance bottleneck for these interpreters.

Other hybrid approaches are probably possible too.

A direct interpreter is far from insane. For one thing, it's likely to
be a much smaller, simpler implementation than many other approaches.
If it's "fast enough" for your purposes, simplicity and small
implementation size may be reason enough to write a direct interpreter.

-t
William D Clinger

2004-03-27, 12:24 am

PaulMatthews@yahoo.com (Paul Matthews) wrote:
> Most scheme implementations run on some form of virtual machine.
>
> What advantage is this over evaluating the s-expressions directly.


In any language that has a macro facility, you don't want to waste
time re-expanding macros. Thus most implementations of Scheme will
expand macros before interpretation, or will cache the results of
macro expansion done on-the-fly.

Beyond that, the tradeoffs are many and interesting.

Some of the fastest Scheme interpreters are themselves written in
Scheme. Some of the slowest are VM interpreters written in C. The
details do matter.

Will
Sponsored Links







Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive

Copyright 2008 codecomments.com