For Programmers: Free Programming Magazines  


Home > Archive > Compilers > May 2005 > What is byte-code ?









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author What is byte-code ?
KC

2005-03-04, 8:58 pm

Hi,

What's the official definition of byte-code ? Any documentation and
book related to byte-code and byte-code compiler ? Does byte-code
just a simple version of CPU's opcode ?

Thanks
KC
[Here in computer-land, there's no official definition of anything.
I'd say a byte code is a langauge encoded a byte or two at a time and
intended to be interpreted. -John]

Anton Ertl

2005-03-06, 3:58 am

"KC" <kcc1967@gmail.com> writes:
>What's the official definition of byte-code ?


There is none (well, you can try Wikipedia or FOLDOC for inofficial
definitions).

The term byte-code is typically used for a special case of a virtual
machine, where the virtual machine instructions and their operands are
encoded as bytes, or as groups of bytes (e.g., JVM byte-code,
Smalltalk byte-code, Elisp byte-code).

However, I have also seen the term used for virtual machines that
don't have this property (e.g., the Ocaml "byte-code", where the
instructions are encoded as machine words); that usage seems to just
use "byte-code" in place of "virtual machine" (maybe because there are
also other meanings of "virtual machine").

As for "virtual machine", in this context it means a flat, sequential
intermediate representation of the programs, consisting of virtual
machine instructions, with some similarities to real-machine code.

>Any documentation and
>book related to byte-code and byte-code compiler ?


@Book{krasner83,
title = "Smalltalk-80: Bits of History, Words of Advice",
publisher = "Addison-Wesley",
year = 1983,
editor = "Glen Krasner"
}

You can also look at some papers published at, e.g., IVME'04 and
follow the references.

- anton
--
M. Anton Ertl
anton@mips.complang.tuwien.ac.at
http://www.complang.tuwien.ac.at/anton/home.html
Laurence Finston

2005-03-06, 3:58 am

On Fri, 4 Mar 2005, KC wrote:

> Any documentation and book related to byte-code and byte-code
> compiler ?


I've never run across a book dedicated to this particular topic. If
you want to read up on it, I suggest you read the CWEB version of
Donald Knuth's `dvitype' program, which converts files in TeX's `dvi'
format to human readable form. Other possibilities might be GNU
Common LISP's and/or GNU Emacs LISP's routines for byte compilation.
`dvitype' can be found on the CTAN archive at http://www.tug.org
http://www.dante.de and elsewhere. The GNU packages are available
from http://www.gnu.org

Laurence Finston
http://www.gnu.org/software/3dldf/LDF.html

Nathan Moore

2005-03-06, 3:58 am

Byte code is interpreted one byte at a time, so that endianness is not
an issue.

KC wrote:
> What's the official definition of byte-code ? Any documentation and
> book related to byte-code and byte-code compiler ? Does byte-code
> just a simple version of CPU's opcode ?

Nick Maclaren

2005-03-08, 3:59 am

Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
>"KC" <kcc1967@gmail.com> writes:
>
>There is none (well, you can try Wikipedia or FOLDOC for inofficial
>definitions).
>
>The term byte-code is typically used for a special case of a virtual
>machine, where the virtual machine instructions and their operands are
>encoded as bytes, or as groups of bytes (e.g., JVM byte-code,
>Smalltalk byte-code, Elisp byte-code).


And compilers based on the interpretation of compiled "byte streams"
predate the use of the term by many decades. The first was before my
time, and could well have been either LISP or SNOBOL - but equally
well could have been something else of that era. Not after the early
1960s, anyway.

>However, I have also seen the term used for virtual machines that
>don't have this property (e.g., the Ocaml "byte-code", where the
>instructions are encoded as machine words); that usage seems to just
>use "byte-code" in place of "virtual machine" (maybe because there
>are also other meanings of "virtual machine").


Yes. It is most reasonably used in the sense of "virtual machine code."

>As for "virtual machine", in this context it means a flat, sequential
>intermediate representation of the programs, consisting of virtual
>machine instructions, with some similarities to real-machine code.


Nah. I will give you most of that, but NOT "sequential". Parallel
ones have existed, but I now forget good references. Some of the
first "automatic parallelisation" was based on compiling a functional
program into independent threads (as many as possible), and having a
set of thread servers consume them, blocked only by the dependency
graph. Dead easy, but it delivered negligible parallelism when faced
with practical programs. I am 95% certain that some of those used a
"byte code" compiler/interpreter implementation.

Regards,
Nick Maclaren.
Randy

2005-03-08, 3:59 am

KC wrote:
> What's the official definition of byte-code ? Any documentation and
> book related to byte-code and byte-code compiler ? Does byte-code
> just a simple version of CPU's opcode ?


The whatis.com definition seems to be as good as any, IMHO:

http://searchsmb.techtarget.com/sDe...i211722,00.html

Oddly, the web definitions that I've seen mention only Java and not
Microsoft's equivalent bytecode 'standard': MSIL (the basis for .NET).

As I understand it, the goals of bytecode are:

1) Bytecode executables can run on any kind of CPU, within virtual
machines. (I suspect that bytecodes are as close to an architecture
neutral distribution format for executables as we're likely to see).

2) Bytecode executables are small and thus easy to distribute over the
net/web (smaller than equivalent binary executables, since bytecodes
lack much of the O/S-specific and library-specific linkage that are
present in native binaries).

3) Bytecode *should* be more portable across versions of O/Ses and
libraries than native executables, since resolution of the
executable's dependencies is left to runtime 'recompilation' and
linkage. In practice, however, this has not come to pass. (Java has
been too much a moving target. I know too little about .NET to pass
judgment on it, however.)

4) Bytecodes are more amenable to security oversight than are native
executables (since the virtual machine can perform checks that native
code does not yet support on commercial O/Ses or CPUs).

5) Final translation of bytecode into executable can be very fast and
can often occur concurrently with execution. In addition, most
bytecode programs are expected to be web-oriented, thus having a large
GUI component, thus requiring less in terms of execution speed (making
the added overhead of the VM less onerous).

Randy
--
Randy Crawford http://www.ruf.rice.edu/~rand rand AT rice DOT edu

Brian Inglis

2005-03-08, 4:02 pm

On 4 Mar 2005 14:12:40 -0500 in comp.compilers, "KC"
<kcc1967@gmail.com> wrote:

>What's the official definition of byte-code ? Any documentation and
>book related to byte-code and byte-code compiler ? Does byte-code
>just a simple version of CPU's opcode ?


>[Here in computer-land, there's no official definition of anything.
>I'd say a byte code is a langauge encoded a byte or two at a time and
>intended to be interpreted. -John]


It's an oldish idea; Google for: Pcode OR "P-code" -postal

--
Thanks. Take care, Brian Inglis Calgary, Alberta, Canada

Brian.Inglis@CSi.com (Brian[dot]Inglis{at}SystematicSW[dot]a
b[dot]ca)
fake address use address above to reply
Torben Ęgidius Mogensen

2005-03-08, 4:02 pm

nmm1@cus.cam.ac.uk (Nick Maclaren) writes:

> Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:


>
> Nah. I will give you most of that, but NOT "sequential". Parallel
> ones have existed, but I now forget good references.


I'm sure Anton Ertl meant that the code has a sequential layout
(syntax), not that the semantics are necessarily sequential. Of
course, anything stored in a file has a sequential layout, but I think
Anton referred to the code being a linear sequence of instructions
rather than a syntax tree.

Torben
Torben Ęgidius Mogensen

2005-03-08, 4:02 pm

Randy <joe@burgershack.com> writes:

> 5) Final translation of bytecode into executable can be very fast and
> can often occur concurrently with execution. In addition, most
> bytecode programs are expected to be web-oriented, thus having a large
> GUI component, thus requiring less in terms of execution speed (making
> the added overhead of the VM less onerous).


Bytecode predates GUI's and the web by several decades. Additionally,
bytecodes have traditionally been interpreted rather than translated,
it is only in the last couple of decades that bytecodes designed for
compilation or JIT's have emerged. And, arguably, JVM bytecode was
designed for interpretation and only later JIT'ed or compiled when
Java was used in contexts for which it was not originally designed.

One reason for using byte-formatted code for interpretation is that
decoding is a table lookup: You use the byte to index into a jump
table that points to code that executes the bytecode instruction,
possibly taking parameters from the subsequent bytes, updating the PC
as required.

Many bytecodes are interpreted even these days, as this makes it
easier to port the language to other systems. Such interpreted
langauges are often used for non-GUI tasks that are not time critical
or where the problems are i/o-bound, so raw execution speed is of
little importance. Perl and Python are good examples of languages
that traditionally use interpreted bytecodes.

Torben
Carl Cerecke

2005-03-09, 4:00 am

Torben Ęgidius Mogensen wrote:

> bytecodes have traditionally been interpreted rather than translated,
> it is only in the last couple of decades that bytecodes designed for
> compilation or JIT's have emerged.


What is the difference between bytecodes designed for (JIT) compilation
and bytecodes designed for interpreting? Your comment sounds like there
are specific design decisions regarding bytecode structure that
facilitate compilation. Or, rather, that some design decisions can make
compilation more difficult/inefficient.

Would you shed some light on what aspects of bytecodes facilitate/hinder
compilation (JIT or otherwise)? (I'm not trying to start an argument, I
just don't know enough about bytecodes to answer my question)

Cheers,
Carl.
[I'd want my JIT code to have stronger typeing so I could figure out at
code generation time what the datatypes of the operands all were. -John]
Torben Ęgidius Mogensen

2005-03-15, 4:04 am

Carl Cerecke <cdc@maxnet.co.nz> writes:

> Torben Ęgidius Mogensen wrote:
>
>
> What is the difference between bytecodes designed for (JIT) compilation
> and bytecodes designed for interpreting? Your comment sounds like there
> are specific design decisions regarding bytecode structure that
> facilitate compilation. Or, rather, that some design decisions can make
> compilation more difficult/inefficient.
>
> Would you shed some light on what aspects of bytecodes facilitate/hinder
> compilation (JIT or otherwise)? (I'm not trying to start an argument, I
> just don't know enough about bytecodes to answer my question)
>
> Cheers,
> Carl.
> [I'd want my JIT code to have stronger typeing so I could figure out at
> code generation time what the datatypes of the operands all were. -John]


I agree with John on types. But there are other considerations:

- Bytecodes designed for interpretation needs low decoding overhead,
so you would tend to have a lot of "work" done per instruction.
With compiling, the decoding overhead is only paid once, so you can
have relatively higher decode cost and, hence, less work per
instruction. The compiler can then combine several small
instructions into single machine-code instructions.

- Bytecodes designed for compilation would tend to use registers
rather than a stack. An interpreter would have difficulty using
registers for all but a few user variables, as it can't indirectly
address registers. Hence, you often see stack-based bytecode for
interpretation. A compiler has no such problem, as it can compile
into direct register accesses.

- Bytecodes for interpretation often have specialized parameter-less
instructions for common cases, such as adding 1 to a variable or
setting it to 0, in order to reduce decoding overhead. With
compilation, the saving in decode matters little, so there is no
incentive to have such specialised instruction.

Torben
Chris Dollin

2005-03-15, 4:04 am

Nathan Moore wrote:

> Byte code is interpreted one byte at a time, so that endianness is not
> an issue.


Depending on why you mean by `byte` and `bytecode` that may not be true ...

The most recent VM I wrote had instructions that were 16 bits
wide. You could say it had 16-bit bytes. (As it happens, the opcode
part was 5 bits wide, with 3 other bits that were either an operand
specification or more opcode - so you *could* say it was an
8-bit-byte-code, because the opcodes are 8 bits wide and the operand
another 8.)

(I called them `shortcodes`.)

(Yes, branch instructions demanded some attention. There were two, one
for forward, one for backward. There's a certain schoolboy pleasure in
having a GOB instruction.)

--
Chris "electric hedgehog" Dollin
Xous - Jose R. Negreira

2005-03-18, 3:59 am

>>What's the official definition of byte-code ? Any documentation and[color=darkred]

This is my first post, I hope it'll be useful!

Well...official definition, I don't know if it exists, this intends to
be an approach:

go to answers.com, and enter 'byte-code' or instead:
http://www.answers.com/byte-code

and about p-code (what Brian said), Java Virtual Machine was inspired in
old "p-code systems", more info:
http://www.answers.com/p-code

greets,

--
Jose R. "Xous" Negreira
[ *xous*at*xouslab_dot_com* ]
XousLAB - http://www.xouslab.com
iptableslinux - http://www.iptableslinux.com
Nathan Moore

2005-03-25, 3:59 am

Chris Dollin wrote:
> Depending on why you mean by `byte` and `bytecode` that may not be true ...
>
> The most recent VM I wrote had instructions that were 16 bits
> wide. You could say it had 16-bit bytes. (As it happens, the opcode
> part was 5 bits wide, with 3 other bits that were either an operand
> specification or more opcode - so you *could* say it was an
> 8-bit-byte-code, because the opcodes are 8 bits wide and the operand
> another 8.)
>


I would not consider the interpreter's instruction set to fit the
definition of "byte code". I don't believe that just because it
is interpreted it is bytecode. (I had posted another more elaborate
definition that was culled -- don't know why). Your VM's instruction
set is very much like that of many hardware CPU's (as I'm sure you
know), but when done in software, you incure the expense of separating
out the 5 bit operation code from the rest of the 16 bits, before
decoding (ie jumping to the code that does the operation or decodes the
rest of the opcode if it is one of those cases). This separation
of components, which you most likely do with a MASK | INSN is not
expensive, but does have to be done for every instruction, which may
not be the case for a VM whose instructions are addressable by the
host machine. Plus I'll bet that you take another hit on the codes
that use the 3 bits left from the first 8 bits as an additional portion
of the opcode, b/c you have to decode that as well as the first 5 bits.
In addition, in the instruction decode code you will have the result of
INSN | MASK be all high valued bits for some hosts. This means that
your "jump to" code (or switch/case code generated by the compiler) will
have to scale this value in some way before using it to find the correct
place to jump, which is an additional cost. (same is true of the other
3 bits when used as opcode on the hosts where the above is not true).
You also have to worry about endianness and portability.

This really doesn't have much to do with the definition of bytecode, but
why bytecode works well in VM's and particularly portable VMs.
I just don't think that just because something is a VM the code that it
eats can be defined as bytecode. This definition would make all machine
languages and infact all bit-streams be included in the set of bytecode.

Nathan Moore
Anton Ertl

2005-04-01, 4:00 am

Carl Cerecke <cdc@maxnet.co.nz> writes:
>Torben Ęgidius Mogensen wrote:
>
>
>What is the difference between bytecodes designed for (JIT) compilation
>and bytecodes designed for interpreting? Your comment sounds like there
>are specific design decisions regarding bytecode structure that
>facilitate compilation. Or, rather, that some design decisions can make
>compilation more difficult/inefficient.


Well, one example is static stack depth knowledge in the JVM. The
interpreter does not profit from it, but the compiler does: it can
allocate all stack items to registers or wherever it wants to. While
the stack depth is statically known in the JVM in general, there are a
few hurdles that could have been avoided if the JVM had been designed
with that in mind (IIRC subroutines are particularly nasty); I think
that the JVM was originally designed to be interpreted, and the static
features were due to it also being designed to be verifyable (for
certain safety properties).

>[I'd want my JIT code to have stronger typeing so I could figure out at
>code generation time what the datatypes of the operands all were. -John]


I assume that you mean that the types should be statically known, and
that the JIT should track types and resolve any overloading in the VM
instructions at JIT time and compile it to native code without typeq
checks.

While such a VM would be definitely interpreter-unfriendly, it's not
very JIT-friendly, either. Whatever types and overloading can be
resolved statically in your language should already be resolved by the
front-end (which typically has to deal with types anyway and perform
any static type checking), resulting in typed operations in the VM
(like JVMs iadd fadd etc.). Such a VM does not need to deal with
types (except the remaining dynamic types of the language) at either
interpreter or JIT time, making both the interpreter faster and the
JIT smaller and faster-compiling (if it is not burdened with a
verifier).

- anton
--
M. Anton Ertl
anton@mips.complang.tuwien.ac.at
http://www.complang.tuwien.ac.at/anton/home.html

Anton Ertl

2005-04-01, 4:00 am

torbenm@diku.dk (=?iso-8859-1?q?Torben_=C6gidius_Mogensen?=) writes:
> - Bytecodes designed for interpretation needs low decoding overhead,
> so you would tend to have a lot of "work" done per instruction.


My impression is that VMs designed with this idea in mind have many
special-purpose VM instructions and their interpreters tend to be slow
on general-purpose code. I have the following explanations for that:

- Since the idea is that the individual VM instructions do so much
work that the interpretive overhead does not matter much, the
implementer does not put as much effort into reducing the overhead per
instruction (after all, he needs to optimize all those important
special-purpose instructions, too).

- Some of the issues in interpreter efficiency also work out
differently: E.g., the fastest interpreters typically put all the code
for all VM instructions in one big C function, which is not considered
good style in general. In particular, if the VM consists of 200
simple VM instructions, each with 1-3 lines of code, and the speedup
is large, one is usually willing to follow this practice; but if the
VM consists of 500 mostly special-purpose VM instructions, each with
20-100 lines of code, and the speedup is small on many
(special-purpose) benchmarks, the implementor will probably not put
them in a single function.

- The special-purpose VM instructions may put some burden on the VM
that has to be paid by other instructions as well, e.g., state that
has to be maintained across function calls; or register pressure in
one place that causes spilling in other places.

- The VM designer may neglect the small general-purpose VM
instructions; they may be a special case of a more powerful (but
slower) VM instruction, and the VM designer may not want to add
another VM instruction just for this special case.

While the idea of doing more work per instruction has its merits, the
way it should be done is to first make a simple VM with a fast
interpreter, then see what sequences are executed often, and add VM
instructions for them.

> - Bytecodes designed for compilation would tend to use registers
> rather than a stack. An interpreter would have difficulty using
> registers for all but a few user variables, as it can't indirectly
> address registers. Hence, you often see stack-based bytecode for
> interpretation. A compiler has no such problem, as it can compile
> into direct register accesses.


Yes, an interpreter has to use memory for indexable registers.
However, as David Gregg and his students have found, using a
"register" VM (actually they used a variant of the JVM with directly
encoded locals) can reduce the number of VM instructions executed by
so much that every interpreter with a significant
VM-instruction-dispatch overhead is faster with such a "register" VM
than with a naive stack VM.

Of course, I still believe in stack VMs, for the following reasons:

- Dynamic superinstructions (aka selective inlining) eliminate the
differences in dispatch overhead between these two methods (there is
only one dispatch per VM basic block).

- Even without dynamic superinstructions, one can find sequences of
frequently occuring VM instructions and combine them into static VM
superinstructions; I believe that applying this optimization to both
stack VM and "register" VM will eliminate many differences between
them (e.g., if the sequence "iload iload iadd istore" is combined into
a superinstruction, the superinstruction is actually an instruction of
the "register" VM), and lets the stack VM have a similar low number of
dispatches, while still having less decoding overhead in those cases
where the stack can be used to advantage (e.g., non-trivial
expressions). There are automatic tools for introducing static
superinstructions (Vmgen), so sticking with the stack VM gives you
most of the benefit of the register VM without having to pay the
complexity of dealing with a register VM.


As for compilers, why would a VM designed for them use registers? In
order to use the CPU's registers effectively, the compiler has to
reallocate registers anyway; for that a stack VM is just as easy as a
register VM; maybe it's easier, since with stack items you learn about
their death during a forward pass through the code, whereas with most
register-based representations (certainly a typical register VM) you
need a backwards pass to determine the death of a register.

You could use an intermediate representation designed for register
allocation, e.g., with an infinite number of pseudo-registers and only
one live range per pseudo-register and with explicit register death
marks; but such an intermediate representation would typically no
longer be considered a VM.

- anton
--
M. Anton Ertl
anton@mips.complang.tuwien.ac.at
http://www.complang.tuwien.ac.at/anton/home.html

Chris Dollin

2005-04-01, 4:00 am

Nathan Moore wrote:

> Chris Dollin wrote:


(snip)

[color=darkred]
> I would not consider the interpreter's instruction set to fit the
> definition of "byte code". I don't believe that just because it
> is interpreted it is bytecode.


I agree.

> Plus I'll bet that you take another hit on the codes
> that use the 3 bits left from the first 8 bits as an additional portion
> of the opcode, b/c you have to decode that as well as the first 5 bits.


No; it's a single 256-way switch. (Earlier interpreters did work the way
you describe. It's horrible; one massive switch is bad enough, but one
large switch nested inside the massive switch is too much for me to keep
track of. I never got round to seeing if using an array of function pointers
had a sufficiently small performance penalty. Maybe the next time ...)

> In addition, in the instruction decode code you will have the result of
> INSN | MASK be all high valued bits for some hosts. This means that
> your "jump to" code (or switch/case code generated by the compiler) will
> have to scale this value in some way before using it to find the correct
> place to jump, which is an additional cost.


Easy fix: in the 16-bit opcode word, put the 5:3 at the low-order end.
You still need a mask, of course. Or - if you can - store the instructions
as I = (A, 5:3, B), where A::B are the opcode bits and the width of B is
such that the masked I is already suitable scaled. (You pay in the operand
access, but since I already needed a shift, it's just a little extra shift,
and on a beautiful machine [1], you can exploit rotates. But likely not
at the C level.)

> You also have to worry about endianness and portability.


Always.

[1] The ARM.

--
Chris "electric hedgehog" Dollin

Nathan Moore

2005-04-06, 12:44 pm

Chris Dollin wrote:
> Nathan Moore wrote:


>
>
> No; it's a single 256-way switch. (Earlier interpreters did work the way
> you describe. It's horrible; one massive switch is bad enough, but one
> large switch nested inside the massive switch is too much for me to keep
> track of. I never got round to seeing if using an array of function pointers
> had a sufficiently small performance penalty. Maybe the next time ...)


So what, you have several cases w/ the same code for all of the
instructions that only use the first 5 bits to determine the operation
and the other 3 of the octet as extra operands when the first 5 bits match?
With binary numbers so bits are obvious:

case 0101 0000:
case 0101 0001:
case 0101 0010:
case 0101 0011:
....
case 0101 0111:
do_some_junk();
break;

or with a gcc extention:

case 0101 0000 ... 0101 0111:
do_some_junk();
break;

That would be faster. Seems kinda dirty.

Rather than the array of f() * , if you use gcc, you could implement it
with an array of labels if you use gcc or another compiler that supports it.

void * LABEL[256] = { ADD, SUBTRACT, ... };
....
goto LABEL[instruction];

ADD:
/* add code */
/* update instruction pointer to reflect that this one has been done */
goto LABEL{instruction];
SUBTRACT:

This will save you about 2 jumps per instructoin since the jump tables
that are usually spit out by compilers are:

jump ip+(constant*case)
jump CODE_THAT_ACTUALLY_DOES_CASE_0
jump CODE_THAT_ACTUALLY_DOES_CASE_1
....

and because you won't have to jump back to the initial jump to begin the
next instruction. If you were to do this, I would suggest that you make
an automated process of building the C source file that has the
implementation of the instructions in it from one or more files that
contain the per-instruction implementations and make it so that the
method of jump table can be changed for different compilers.

>
>
> Easy fix: in the 16-bit opcode word, put the 5:3 at the low-order end.
> You still need a mask, of course. Or - if you can - store the instructions
> as I = (A, 5:3, B), where A::B are the opcode bits and the width of B is
> such that the masked I is already suitable scaled. (You pay in the operand
> access, but since I already needed a shift, it's just a little extra shift,
> and on a beautiful machine [1], you can exploit rotates. But likely not
> at the C level.)


Took a while to figure out what you said there. My points were to get
rid of those shifts, but if you could configure the VM to take advantage
of prescaling via the shifts (or rotates) and the MASK, you have
overcome some other issues. I think that there may be some hidden
grimlins here though. I believe that most archs have at least one
rotate instruction -- even something as ugly as x86 (since forever).

>
>
>
>
> Always.


My point here was to do something that would compile correctly on any
machine and work effeciently/quick without having to be configured
slecial for that machine. Other areas have to worry about endianness
and other protability issues, but the more areas that you can make them
non-issues, the better.

Nathan Moore

John Slimick

2005-04-11, 3:58 am

Nathan Moore wrote:

I've been following this thread and I find that there are assumptions
about "byte." In my career bythes were six bits until April 1964 when
they became 8 bits (IBM 360). Now with the advent of UNICODE it seems
that bytes are becoming 16 bits, at least conceptually. (And then
there were the 5 bit Baudot "bytes"). I think everyone is talking
about "octets," so why can't we use them?

john slimick
slimick@pitt.edu

> My point here was to do something that would compile correctly on any
> machine and work effeciently/quick without having to be configured
> slecial for that machine. Other areas have to worry about endianness
> and other protability issues, but the more areas that you can make them
> non-issues, the better.
>
> Nathan Moore

[Probably because the bytes in byte code don't have to be 8 bits. either.
-John]


Chris Dollin

2005-04-11, 3:58 am

Nathan Moore wrote:

> Chris Dollin wrote:
>
> So what, you have several cases w/ the same code for all of the
> instructions that only use the first 5 bits to determine the operation
> and the other 3 of the octet as extra operands when the first 5 bits
> match?


Yes. It seemed the most straightforward solution. (As it happens, the
case where the 3-bit field is 7 is also often a special case: 0-6
specifies a register, 7 specifies a pull/push of the operand stack.)

> Rather than the array of f() * , if you use gcc,


I didn't want to commit to gcc (especially since one of my targets
was the ARM).

> and because you won't have to jump back to the initial jump to begin the
> next instruction. If you were to do this, I would suggest that you make
> an automated process of building the C source file that has the
> implementation of the instructions in it from one or more files that
> contain the per-instruction implementations and make it so that the
> method of jump table can be changed for different compilers.


Indeed. My friend Steve Leach has done some VM investigation where the
same code fragments for the instructions) are stitched together in
different ways for comparison purposes: switch cases, functions, and
GCC label targets. If I remember rightly, label targets were fastest
on x86 machines, but functions were fastest on Power PC. Were I to
be doing another round of VM exercises, I'd play the same kind of
trick.

--
Chris "electric hedgehog" Dollin

Anton Ertl

2005-04-11, 3:58 am

Nathan Moore <nathan.moore@sdc.cox.net> writes:
>Rather than the array of f() * , if you use gcc, you could implement it
>with an array of labels if you use gcc or another compiler that supports it.
>
>void * LABEL[256] = { ADD, SUBTRACT, ... };
>...
>goto LABEL[instruction];
>
>ADD:
> /* add code */
> /* update instruction pointer to reflect that this one has been done */
> goto LABEL{instruction];
>SUBTRACT:
>
>This will save you about 2 jumps per instructoin since the jump tables
>that are usually spit out by compilers are:
>
>jump ip+(constant*case)
>jump CODE_THAT_ACTUALLY_DOES_CASE_0
>jump CODE_THAT_ACTUALLY_DOES_CASE_1
>...


I have never seen that kind of code generated. Which compiler does it
that way? Big, dense switch statements have been translated into a
range check, an array access (for the target address), and an indirect
jump to the target in the code I looked at (usually by gcc).

By using the labels-as-values extension you at least get rid of the
range check (in theory, with a switch over an unsigned char, and all
256 possible values covered by cases, the compiler could get rid of
that by itself, but I have not yet seen a compiler that does this).

>and because you won't have to jump back to the initial jump to begin the
>next instruction.


The main benefit here is not the elimination of the direct jump, but
the increased prediction accuracy of the BTB for the indirect jump,
resulting in speedups by up to a factor of 2:

http://www.jilp.org/vol5/v5paper12.pdf

Unfortunately, since gcc-3.2 gcc tends to combine all indirect jumps
into one, and put a jump to this common indirect jump where the
indirect jumps should be. It looks like this will be fixed in
gcc-4.0. See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15242

>If you were to do this, I would suggest that you make
>an automated process of building the C source file that has the
>implementation of the instructions in it from one or more files that
>contain the per-instruction implementations and make it so that the
>method of jump table can be changed for different compilers.


Sounds like what Vmgen (http://www.complang.tuwien.ac.at/anton/vmgen/)
is doing; however, Vmgen also does lots of other helpful things. OTOH,
it currently has no support for dealing with bit-fields for opcodes
and arguments.

- anton
--
M. Anton Ertl
anton@mips.complang.tuwien.ac.at
http://www.complang.tuwien.ac.at/anton/home.html
Nathan Moore

2005-04-16, 3:58 pm

>>Rather than the array of f() * , if you use gcc,
>
>
> I didn't want to commit to gcc (especially since one of my targets
> was the ARM).
>
>
>
>
> Indeed. My friend Steve Leach has done some VM investigation where the
> same code fragments for the instructions) are stitched together in
> different ways for comparison purposes: switch cases, functions, and
> GCC label targets. If I remember rightly, label targets were fastest
> on x86 machines, but functions were fastest on Power PC. Were I to
> be doing another round of VM exercises, I'd play the same kind of
> trick.


One other thing that you could do if you were to create a tool to take
your implementations of the instructions and make them into the
different types of jump tables (*F(), label[], switch/case) would be
to use that same instruction implementation code to make a simple jit
compiler, with a little fiddling w/ the variables used, so that the
system's C compiler could be fed the output from the jit/vm and then
load the code from it (possibly after running the system's linker over
it). This should be really simple.

Nathan
glen herrmannsfeldt

2005-04-16, 3:58 pm

John Slimick wrote:


> I've been following this thread and I find that there are assumptions
> about "byte." In my career bythes were six bits until April 1964 when
> they became 8 bits (IBM 360).


In the pre-360 books that I have seen the six bit unit seems to be
called BCD. That is, the code before EBCDIC was called BCDIC, but it
seems that IBM called them BCD, even though it would seem the term
should be used for numeric data.

The IBM Fortran manual calls them "storage locations", I remember
being why everyone was calling them bytes, when the book
didn't have that word anywhere in it.

> Now with the advent of UNICODE it seems that bytes are becoming 16
> bits, at least conceptually. (And then there were the 5 bit Baudot
> "bytes"). I think everyone is talking about "octets," so why can't
> we use them?


It seems that in C, and the PDP-10 bytes can have a variety of sizes,
but to some people and some machines they are always eight bits.


In the Fortran 66 days the only portable way to do anything with
character data was with one character per Fortran storage unit (the
size of an INTEGER or REAL variable). That is, A1 format. Word
addressed machines were popular enough that you couldn't do much else.
[color=darkred]
> [Probably because the bytes in byte code don't have to be 8 bits. either.
> -John]


In that case, what are the bytes in byte code?

I suppose I don't like the name that much anyway, so I can't
really disagree with the meaning.

-- glen
Ralph Corderoy

2005-05-10, 4:01 am

Hi Anton,

Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
> Nathan Moore <nathan.moore@sdc.cox.net> writes:
>
> I have never seen that kind of code generated. Which compiler does it
> that way?


I've seen the Norcroft and gcc C compilers targetted at ARM generate
such code.

ldrb r1, [r11, #0]
cmp r1, #7
addls pc, pc, r1, lsl #2 ;
b case_8
b case_0
b case_1
b case_2
b case_3
b case_4
b case_5
b case_6

.case_7
ldr r1, [r11, #&028]
mov r0, r11

With a 32-bit word, pc is already two words ahead of the current
instruction due to pipe-lining. This is the idiomatic jump table in ARM
and assembly programmers use it too.

> Big, dense switch statements have been translated into a range check,
> an array access (for the target address), and an indirect jump to the
> target in the code I looked at (usually by gcc).


The assembler above only works when the relative jump destination can be
encoded along with the b instruction in the 32-bit word.

Cheers,

--
Ralph Corderoy. http://inputplus.co.uk/ralph/ http://troff.org/
glen herrmannsfeldt

2005-05-14, 7:14 pm

Anton Ertl wrote:

(snip regarding switch statements or statement label arrays)

[color=darkred]
[color=darkred]
> I have never seen that kind of code generated. Which compiler does it
> that way? Big, dense switch statements have been translated into a
> range check, an array access (for the target address), and an indirect
> jump to the target in the code I looked at (usually by gcc).


IBM S/360 (S/370, ESA/390, etc.) compilers generate that for computed
GOTO in Fortran, and switch in languages with that.

S/360 has pretty much only one memory addressing mode with is
12 bit constant plus register plus register.

The exit codes (return codes) of all IBM supplied S/360 programs are
multiples of four, convenient for use in indexed branch instructions.
(Hopefully after bounds testing.)

-- glen
Sponsored Links







Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive

Copyright 2008 codecomments.com