For Programmers: Free Programming Magazines  


Home > Archive > Scheme > September 2006 > global variables, no lexical address









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author global variables, no lexical address
Marlene Miller

2006-09-03, 7:00 pm

My evaluator uses lexical addresses instead of identifiers to represent
variable references in expressions. The environment has only values instead
of bindings of variables to values. However, the evaluator has a few
'global' variables which have no lexical address. By global variables, I
mean the initial environment has some constants which are outside the scope
of lambda and let expressions.

When an evaluator uses lexical addresses, is it acceptable/normal for the
environment representation to contain global variables?

(context: EoPL 3.23 - 3.26)

Marlene


Marlene Miller

2006-09-03, 7:00 pm


"Marlene Miller" wrote...
> My evaluator uses lexical addresses instead of identifiers to represent
> variable references in expressions. The environment has only values

instead
> of bindings of variables to values. However, the evaluator has a few
> 'global' variables which have no lexical address. By global variables, I
> mean the initial environment has some constants which are outside the

scope
> of lambda and let expressions.
>
> When an evaluator uses lexical addresses, is it acceptable/normal for the
> environment representation to contain global variables?
>
> (context: EoPL 3.23 - 3.26)


On further thought I might be wrong. Maybe the globals variables do have
lexical addresses. Is that right?

I have an idea. The lexical calculator that transforms the abstract syntax
tree of an expression by replacing variables references with lexical
addresses must /somehow know the global variables/. If it doesn't find an
identifier in the lambda and let declarations, it increments the lexical
address depth and looks in the list of global variables. The environment
contains (only) the values of the global variables.


int2k@gmx.net

2006-09-04, 4:01 am

"Marlene Miller" <marlenemiller@worldnet.att.net> writes:

I'm not quite sure I understand everything you say because I don't
have the book =BBEssentials of Programming Languages=AB, but I'll try
anyway. If I understand you correctly, you're trying to implement
some sort of compiler and one task is to replace references to
variables with the address of the variable, right?

> "Marlene Miller" wrote...

Yes. The built-in functions and variables of the defined language
have to pre-exist in the initial global environment if they cannot be
defined in the language itself. In Scheme it would be things like
car, #f, and so on. They are different from the other objects that
you create in the defined language because they are written in the
*defining* language, i. e. the language that you write your
interpreter in or, if you're writing a compiler, the target language
of the compiler.
[color=darkred]
>
> On further thought I might be wrong. Maybe the globals variables do
> have lexical addresses. Is that right?


OK, maybe you were not talking about built-in functions after all. If
you don't want special treatment for references to global variables,
you would have to transform you programs like this:

(define foo 2)
(define bar (list foo 3))
(cons 1 bar)

=3D=3D> (let* ((foo 2)
(bar (list foo 3)))
(cons 1 bar))

or maybe

((lambda (foo bar)
(begin
(set! foo 2)
(set! bar (list foo 3))
(cons 1 bar)))
#undefined #undefined)

The second version is somewhat more elegant, because memory for a new
lexical environment, i. e. a new stack frame, is only allocated when a
function is called. If you wanted to compile the original program
directly, you'd have to create the first environment (the global
environment) explicitly. But if you transform the program into one
big function application, you can handle the start of the program just
like any other function call.

> I have an idea. The lexical calculator that transforms the abstract
> syntax tree of an expression by replacing variables references with
> lexical addresses must /somehow know the global variables/. If it
> doesn't find an identifier in the lambda and let declarations, it
> increments the lexical address depth and looks in the list of global
> variables. The environment contains (only) the values of the global
> variables.


I don't know what you mean by =BBincrements the lexical address depth=AB,
but the rest sounds correct. If a name isn't defined in the current
lexical environment, it could be a reference to a pre-existing
function/variable, or a global function/variable. So instead of
replacing the variable name with something like (lexical-ref 5) you
would replace it with (predefined-ref 3) or (global-ref 17). If your
language has closures, a symbol can also be a reference to a free
variable, i. e. a variable defined in the lexical environment of an
outer lambda:

(define adder (lambda (free)
(lambda (bound) (+ bound _free_))))

((adder 1) 2) ; =3D> 3

If you transform your program into one big function application, like
I described above, all references to global variables would become
references to free variables.

Now, back to addresses: all variables references can be transformed
into addresses, i. e. indices into some vector. But depending on
where the variable =BBlives=AB, you have several different vectors,
namely
the current lexical environment (this is the current stack frame), the
vector of free variables, maybe the vector of global variables, and
the (global) vector of pre-defined functions and variables. The
vectors are allocated at different times during program execution: The
vector of pre-defined objects is allocated before the program is
started, as is the vector of global variables. A new vector of free
variables is allocated each time a function is created, i. e. each
time you encounter (lambda (...) ...). Lexical environments are
created each time a function is applied.

Wolfram

Marlene Miller

2006-09-04, 7:01 pm

Wolfram wrote...

>I'm not quite sure I understand everything you say because I don't have the

book »Essentials of Programming Languages«, but I'll try anyway. If I
understand you correctly, you're trying to implement some sort of compiler
and one task is to replace references to variables with the address of the
variable, right?

Yes.

In this book, a lexical address is a depth and position describing the place
in the text of the variable declaration. The variable reference in the
abstract syntax tree is replaced with a depth and position. The environment
has only values, no variables. The lexical address is used to locate the
value in the environment.

The purpose of the evaluators in this book is, I assume, to
describe/understand the semantics of languages. If the purpose were to get
the value of an expression, we could use any coding mechanism. My question
is about the correct way to think about the semantics of the language.

environment representation to contain global variables?
[color=darkred]
>Yes. The built-in functions and variables of the defined language have to

pre-exist in the initial global environment if they cannot be defined in the
language itself.

It concerns me that for variables declared by lambda and let expressions,
only the value of the binding is in the environment, whereas for predefined
variables (which have no declaration and hence no depth and position
relative to a variable reference), the whole binding - the identifier and
the value - must be in the environment (in order to lookup the predefined
variable).

lexical addresses. Is that right?
[color=darkred]
>OK, maybe you were not talking about built-in functions after all. If you

don't want special treatment for references to global variables, you would
have to transform you programs like this:

I meant built-in (or predefined) variables. I used the wrong word - globals.
I don't want special treatment for predefined variables. But it occurred to
me, that is a mechanical issue not semantics.

>I don't know what you mean by »increments the lexical address depth«,


In this book, depth is the scope nesting level from the variable reference
to the variable declaration. Since predefined variables do not have
declarations, they do not have depth. I suppose we could stretch the
definition of depth and say the depth of a variable reference to a prefined
variable is one more than the last enclosing scope. I am not sure whether
that is coding mechanics or semantics.

>but the rest sounds correct. If a name isn't defined in the current

lexical environment, it could be a reference to a pre-existing
function/variable, or a global function/variable. So instead of replacing
the variable name with something like (lexical-ref 5) you would replace it
with (predefined-ref 3) or (global-ref 17).

Ah! This could be the answer I am looking for. Tag the values in the
environment. That might be better than pretending the predefined variables
have a lexical address. Thank you for the suggestion.

>Now, back to addresses: all variables references can be transformed into

addresses, i. e. indices into some vector. But depending on where the
variable »lives«, you have several different vectors, namely the current
lexical environment (this is the current stack frame), the vector of free
variables, maybe the vector of global variables, and the (global) vector of
pre-defined functions and variables. The vectors are allocated at different
times during program execution: The vector of pre-defined objects is
allocated before the program is started, as is the vector of global
variables. A new vector of free variables is allocated each time a function
is created, i. e. each time you encounter (lambda (...) ...). Lexical
environments are created each time a function is applied.

This is a little confusing to me because I don't see the notion of nested
scopes. I like EoPL. Maybe I need to read more about lexical addresses in
another book. Do you have any recommendations?

- - - - - - - - - -

Given that lexical addresses (depth and position) can replace variable
references, should we say a variable reference refers to a place in the text
where a value first appears (instead of saying a variable reference refers
to a value)?

Marlene


Jens Axel Søgaard

2006-09-04, 7:01 pm

Marlene Miller skrev:

> On further thought I might be wrong. Maybe the globals variables do have
> lexical addresses. Is that right?


Maybe you can use the idea from the appendix of R5RS?

If P is a program in which all variables are defined before being
referenced or assigned, then the meaning of P is

P == ((lambda (I*) P') <undefined> ...)

where I* is the sequence of variables defined in P, P' is the sequence
of expressions obtained by replacing every definition in P by an
assignment, <undefined> is an expression that evaluates to undefined,
and is the semantic function that assigns meaning to expressions.

After this transformation, all global variables are gone.

--
Jens Axel Søgaard

int2k@gmx.net

2006-09-05, 4:01 am

Hi!

"Marlene Miller" <marlenemiller@worldnet.att.net> writes:

> Wolfram wrote...
>
>
> Yes.
>
> In this book, a lexical address is a depth and position describing the pl=

ace
> in the text of the variable declaration. The variable reference in the
> abstract syntax tree is replaced with a depth and position. The environme=

nt
> has only values, no variables. The lexical address is used to locate the
> value in the environment.


I see. You want transform a program like

(let ((a 1)
(b 2))
(let ((a 3)
(c 4)
(d 5))
(list a b c d)))

into something like

(let ((a 1)
(b 2))
(let ((a 3)
(c 4)
(d 5))
(list (ref 0 0) (ref 1 1) (ref 0 1) (ref 0 2))))

I was thinking about =BBflattened=AB environments. That is, my version
would have looked like this:

(let ((a 1)
(b 2))
(let ((a 3)
(c 4)
(d 5))
(list (ref 2) (ref 1) (ref 3) (ref 4))))

Semantically, these are equivalent, but performance-wise, flat
environments win because you only have to allocate one per function
call. In the nested version, you'd have to allocate one where you
store the values of the parameters and one for each let within the
body of the function. Allocations are expensive, but it mostly
doesn't matter how much memory you allocate, just how often. Thus, an
interpreter using flat environments will be much faster than an
interpreter using nested environments, all other thing being equal.

I'm working on my own Lisp interpreter, and one major source of its
slowness is that looking up values in the environment takes too much
time. It's about 40% or so of the total execution time of a program.
For every variable reference and assignment, the interpreter has to
figure out where to find the value in memory. Of course this is a
colossal waste of runtime, because this could be figured out at
compilation time, and only once, using a transformation like the one
you're working on.

> The purpose of the evaluators in this book is, I assume, to
> describe/understand the semantics of languages. If the purpose were
> to get the value of an expression, we could use any coding
> mechanism. My question is about the correct way to think about the
> semantics of the language.


IMO, it is. It's not necessarily the best way from an implementation
point of view, but it's true to the semantics of the language: when
you look up a variable, you start by looking at the last binding of
the innermost let (or lambda) and work your way outwards. The
transformation of variable references into addresses tells you
directly at which position in which let/lambda you will find the
value. Instead of having to use linear search (N comparisons of the
variable name), you get look-up in constant time.

>
>
> It concerns me that for variables declared by lambda and let expressions,
> only the value of the binding is in the environment, whereas for predefin=

ed
> variables (which have no declaration and hence no depth and position
> relative to a variable reference), the whole binding - the identifier and
> the value - must be in the environment (in order to look-up the predefined
> variable).


You need to store them in another environment, separate form the
bindings you can create in the defined language. This is not really a
problem because you're traversing the whole program anyway, which
means you *know* which variables are defined and which aren't. If you
encounter a name which hasn't been defined anywhere, it must be a
reference to a predefined function/variable. (That, or it's a bug in
the program.) You'd replace these names with something else than the
names that refer to objects defined in the program. The program from
above might become:

(let ((a 1)
(b 2))
(let ((a 3)
(c 4)
(d 5))
((builtin-ref 4) (ref 1 0) (ref 0 1) (ref 1 1) (ref 1 2))))

Another possibility would be to start with a non-empty environment,
one that contains the values of, say, car, cdr, append, list etc. It
would be like pretending that the program looked like this:

(let ((car car-of-the-defining-language)
(cdr cad-of-the-defining-language)
...)
... ; the actual program
)

One way or the other, you have to use some trick to get these things
into the defined language.

>
>
> I meant built-in (or predefined) variables. I used the wrong word - globa=

ls.
> I don't want special treatment for predefined variables. But it occurred =

to
> me, that is a mechanical issue not semantics.


Yes. A reference to a builtin should behave exactly like a reference
to a user-defined object.

>
> In this book, depth is the scope nesting level from the variable reference
> to the variable declaration. Since predefined variables do not have
> declarations, they do not have depth. I suppose we could stretch the
> definition of depth and say the depth of a variable reference to a predef=

ined
> variable is one more than the last enclosing scope. I am not sure whether
> that is coding mechanics or semantics.
>
>
> Ah! This could be the answer I am looking for. Tag the values in the
> environment. That might be better than pretending the predefined variables
> have a lexical address. Thank you for the suggestion.
>
>
> This is a little confusing to me because I don't see the notion of nested
> scopes.


Nested scopes are just an illusion that allows you to give the same
name to two different variables. I. e.

(let ((a 1)
(b 2))
(let ((a (+ a b))
(b (- a b)))
(list a b)))

is just like saying

(let* ((v1 1)
(v2 2)
(v3 (+ v1 v2))
(v4 (- v1 v2)))
(list v3 v4))

or

(begin
(define stack (make-vector 4))
(vector-set! stack 0 1)
(vector-set! stack 1 2)
(vector-set! stack 2 (+ (vector-ref stack 0) (vector-ref stack 1)))
(vector-set! stack 3 (- (vector-ref stack 0) (vector-ref stack 1)))
(list (vector-ref stack 2) (vector-ref stack 4)))

> I like EoPL. Maybe I need to read more about lexical addresses in
> another book. Do you have any recommendations?


I don't know about lexical addresses, but if you *really* want to know
how the Lisp family of languages works, you might want to take a look
at Christian Queinnec's =BBLisp in Small Pieces.=AB [1] It describes
several interpreters for a Scheme-like language (The implementation
language is also Scheme.) plus a byte-code compiler, a Scheme to C
compiler and some other stuff (e. g. an object system and a macro
system).

> - - - - - - - - - -
>
> Given that lexical addresses (depth and position) can replace variable
> references, should we say a variable reference refers to a place in the t=

ext
> where a value first appears (instead of saying a variable reference refers
> to a value)?


Actually, I'd say a variable reference is a reference to an address in
memory. In fact, variable names are just mnemonics for addresses in
memory. You can see that in my last example, which uses a vector.


Cheers
Wolfram


PS: The quoted parts of your messages are broken. Instead of

--8<---------------cut here---------------start------------->8---
> some words followed by a newline
> and another line

--8<---------------cut here---------------end--------------->8---

you have

--8<---------------cut here---------------start------------->8---
>some words followed by a newline

and another line
--8<---------------cut here---------------end--------------->8---

Could you please try to fix that in your newsreader?

Footnotes:
[1] See <http://www-spi.lip6.fr/~queinnec/WWW/LiSP.html> and

<http://www.amazon.com/Lisp-Small-Pi...p/0521545668/r=
ef=3Dpd_lpo_k2_dp_k2a_2_txt/002-4523961-5858447?ie=3DUTF8>

Sponsored Links







Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive

Copyright 2008 codecomments.com