Home > Archive > Scheme > May 2004 > Re: Bottom in R5RS DS
You are viewing an archived Text-only version of the thread.
To view this thread in it's original format and/or if you want to reply to
this thread please [click here]
| Author |
Re: Bottom in R5RS DS
|
|
| Mario S. Mommer 2004-05-12, 9:12 pm |
|
Lauri Alanko <la@iki.fi> writes:
> I don't think this supports the argument that scientific terminology
> is so rigorous that it could not support two different notions of
> "type"...
Indeed. Besides, what is the point of ignoring reality? Cardelli
himself acknowledges that his terminology is not completely standard,
due to the "inherent inconsistency of standard terminology arising
from different sources".
| |
| Mario S. Mommer 2004-05-12, 9:12 pm |
|
Mario S. Mommer <m_mommer@yahoo.com> writes:
> Cardelli himself acknowledges that his terminology is not completely
> standard, due to the "inherent inconsistency of standard terminology
> arising from different sources".
Forgot the citation:
@incollection{ cardelli97type,
author = "Luca Cardelli",
title = "Type Systems",
booktitle = "The Computer Science and Engineering Handbook",
publisher = "CRC Press",
address = "Boca Raton, FL",
editor = "Allen B. Tucker",
year = "1997",
url = "citeseer.ist.psu.edu/cardelli97type.html" }
| |
| Matthias Felleisen 2004-05-12, 9:12 pm |
| Lauri Alanko wrote:
> In article <fb74251e.0405051035.15628f92@posting.google.com>,
> William D Clinger <cesuraSPAM@verizon.net> wrote:
>
>
>
> All right, thanks. Our department's library doesn't have Stoy's book,
> so I haven't yet had the chance to read it.
>
> Stoy's book is kinda old, and Schmidt's is a fairly brief introductory
> textbook. What would be a good contemporary source about DS?
Denotational semantics in Stoy's sense has gradually disappeared in favor of
operational semantics (SOS, reduction semantics, and other ad hoc frameworks).
The reasons are many, but you have put your finger in a wound with your remark
below.
You should probably look for a modern text book on semantics like Winskel or
possibly Mitchell. But honestly, I'd study operational semantics instead. It's
more useful.
>
>
> Right. I was being imprecise. Of course the semantics of _programs_
> doesn't change. What I meant was that a lifted domain seems like a
> "sloppy" model for the values of a strict language.
>
> Intuitively, if you have a Scheme value, then you bloody well _do_
> have a value, not an unterminating computation, and I think that the
> domain should reflect this. Having all domains include a bottom
> element, but then operating on them purely strictly, is like using
> real numbers for integer arithmetic: you get the right result, but the
> choice of an unnecessarily powerful tool only complicates and misleads.
This is where denotational semantics began to unravel. The story has two major
aspects. First, denotational domains contain some much junk that the extensional
equivalence [approximation] relation on domain elements is sound with respect to
the observational equivalence [approximation] relation but utterly incomplete
and misleading for many cases when you want to reason about transformations.
This is called the full abstraction problem, and it started with "embarrassing"
top elements (Scott's and Plotkin's phrase) and moved on to typed languages and
deterministic paralllelism.
Second, the most important thing you wish to prove about a programming language
semantics is a "well specifiedness" theorem. This is also known as a type
soundness theorem though the name is misleading. For years people tried to use
denotational semantics for this purpose, until they discovered a major flaw in a
major proof (Damas, ML + references). Then they realized that for this kind of
proof you want to reason about "whole programs" and for that purpose an
operational semantics is enough. Nowadays most people use a proof technique for
this purposes that is based on a Lambda-calculus style semantics (reduction
semantics) and that uses a progress and a preservation lemma.
It is naturally ironic that one of the oldest languages that celebrates the
lambda calculus uses a denotational semantics to specify some kernel rather than
the original Lambda calculus style semantics. [Denotational semantics was
inspired by work modeling languages with Lambda calculus, but that's not an excuse.]
>
>
> I think my problem is with the "might as well be". It's not clear
> which operations _have_ to deal with bottom.
That's correct and refers to Problem 1 above. The Scheme semantics uses an
encoding trick (continuation passing style) to achieve more than one purpose:
access to the continuation and making functions CBV. The trick is unnecessary
for both purposes and only obscures the second point. You will need to calculate
through a small example, say Lambda plus numbers with addition and division plus
booleans with if and similar things to see that it actually works out with or
without strictness for function spaces. [One indication is Plotkin's result that
lambda with beta eta and lambda with beta-value eta-value are the same on cps
terms yet prove more than lambda with beta-value for the call-by-value source
language. To complete the reasoning power you need an additional set of axioms,
dubbed the A-axioms. And this is where the A-normal form compilation comes from
but now we're somewhere else]
-- Matthias
| |
| Lauri Alanko 2004-05-12, 9:12 pm |
| In article <c7dcn0$912$1@camelot.ccs.neu.edu>,
Matthias Felleisen <matthias@ccs.neu.edu> wrote:
> You should probably look for a modern text book on semantics like
> Winskel or possibly Mitchell. But honestly, I'd study operational
> semantics instead. It's more useful.
Well yes. I do think I have a basic grasp on operational semantics.
It's _easy_ compared to DS. :)
I do believe that op sem is more useful nowadays. That seems to be one
point that everyone agrees on. So I'm not interested in DS as a tool
for reasoning about programs.
My motivation for learning about DS is twofold. Firstly, I am (still)
working on my MS thesis about typed reflection, and all the classic
papers on reflection from the eighties have a very denotational flavor
to them. Even Queinnec's paper on first-class environments from as
late as 1996 provides denotational definitions for all the operations.
If I am to discuss these papers and possibly adapt and simplify some
of the definitions for expositionary purposes (eg. by removing
stores), I'd better understand what I'm doing.
The second reason, more general, is that I feel it's very important to
understand something about the relationship of a program to the world
of mathematical objects. It's not enough just to say that "this
program terminates on all inputs of type int and returns a result of
type int", but also, "and its result really _is_ the factorial of the
argument". To my understanding, DS is the tool for bridging this gap
between the syntactic and the semantic worlds.
Although purely syntactic methods (operational semantics) are very
practical for most purposes, and I find them much easier to
understand, they still don't provide answers to everything. The
_purpose_ of a programming language is to express certain _meanings_.
Brian Cantwell Smith's reflection papers make a very strong point
about this: _first_ comes the intended meaning (in the language
designer's mind), _then_ the formal system intended to express it.
In most cases, we just have to rely on our intuition to tell us
whether the formalization is correct or not. But when the intended
meaning is mathematical, then at least we have a tool to use.
Thanks for your reply.
Lauri Alanko
la@iki.fi
| |
| Lauri Alanko 2004-05-12, 9:12 pm |
| In article <c7dcn0$912$1@camelot.ccs.neu.edu>,
Matthias Felleisen <matthias@ccs.neu.edu> wrote:
> It is naturally ironic that one of the oldest languages that
> celebrates the lambda calculus uses a denotational semantics to
> specify some kernel rather than the original Lambda calculus style
> semantics.
I wouldn't mind seeing operational semantics for Scheme in R6RS. :)
Lauri
| |
| Matthias Felleisen 2004-05-12, 9:12 pm |
| Lauri Alanko wrote:
> In article <c7bkvo$dvq$1@oravannahka.helsinki.fi>,
> Lauri Alanko <la@iki.fi> wrote:
>
>
> I don't think this supports the argument that scientific terminology
> is so rigorous that it could not support two different notions of
> "type"...
Indeed, the scientific terminology accepts 100's of notions of type systems and
they are all type systems -- based on the principle that a type is a syntactic
abstraction barrier.
The *un*scientific terminology of newsgroups and people without knowledge and/or
respect for the need of a precise language conflates two different notions:
>types< and >type tags<.
Different thread.
-- Matthias
| |
| Matthias Felleisen 2004-05-12, 9:12 pm |
| Lauri Alanko wrote:
> In article <c7dcn0$912$1@camelot.ccs.neu.edu>,
> Matthias Felleisen <matthias@ccs.neu.edu> wrote:
>
>
>
> I wouldn't mind seeing operational semantics for Scheme in R6RS. :)
Robby Findler and Jacob Matthews are working on it. -- Matthias "-)
| |
| Mario S. Mommer 2004-05-12, 9:12 pm |
| Matthias Felleisen <matthias@ccs.neu.edu> writes:
> Indeed, the scientific terminology accepts 100's of notions of type
> systems and they are all type systems -- based on the principle that a
> type is a syntactic abstraction barrier.
>
> The *un*scientific terminology of newsgroups and people without
> knowledge and/or respect for the need of a precise language conflates
> two different notions: >types< and >type tags<.
This is getting fun, really.
So what does a >type tag< do? Specify the font in which the value
should be printed, or does it do something more fancy?
| |
| Matthias Blume 2004-05-12, 9:12 pm |
| Mario S. Mommer <m_mommer@yahoo.com> writes:
> Matthias Felleisen <matthias@ccs.neu.edu> writes:
[color=darkred]
> So what does a >type tag< do? Specify the font in which the value
> should be printed, or does it do something more fancy?
It enables the efficient implementation of certain runtime predicates.
It does /not/ provide a syntactic abstraction barrier.
M.B.
| |
| Bradd W. Szonye 2004-05-12, 9:12 pm |
| Matthias Felleisen <matthias@ccs.neu.edu> wrote:[color=darkred]
> Indeed, the scientific terminology accepts 100's of notions of type
> systems and they are all type systems -- based on the principle that a
> type is a syntactic abstraction barrier.
>
> The *un*scientific terminology of newsgroups and people without
> knowledge and/or respect for the need of a precise language conflates
> two different notions:
You missed an important group: The formal terminology of standards
committees (including the RRRS authors) also uses the word "type" to
refer to classes of values in general. These people typically require
precise language, and they often work in scientific contexts, yet they
obviously do not find it an "abuse of language" to call a non-syntactic
type a "type."
--
Bradd W. Szonye
http://www.szonye.com/bradd
| |
| Mario S. Mommer 2004-05-12, 9:12 pm |
|
Matthias Blume <find@my.address.elsewhere> writes:
> Mario S. Mommer <m_mommer@yahoo.com> writes:
>
> It enables the efficient implementation of certain runtime predicates.
BZZZT! Wrong!
It indicates the type of the value.
> It does /not/ provide a syntactic abstraction barrier.
So what?
| |
| Lauri Alanko 2004-05-12, 9:12 pm |
| In article <fz7jvpeaib.fsf@germany.igpm.rwth-aachen.de>,
Mario S. Mommer <m_mommer@yahoo.com> wrote:
>
> BZZZT! Wrong!
>
> It indicates the type of the value.
Are you claiming that (given your definition of a type), "value x has
a type t" is not a runtime predicate, and tags do not make its
implementation efficient?
Lauri Alanko
la@iki.fi
| |
| Mario S. Mommer 2004-05-12, 9:12 pm |
|
Lauri Alanko <la@iki.fi> writes:
> In article <fz7jvpeaib.fsf@germany.igpm.rwth-aachen.de>,
> Mario S. Mommer <m_mommer@yahoo.com> wrote:
>
> Are you claiming that (given your definition of a type), "value x has
> a type t" is not a runtime predicate, and tags do not make its
> implementation efficient?
Ok, granted. You caught me.
It still indicates the type of the value :)
| |
| Matthias Blume 2004-05-12, 9:12 pm |
| Mario S. Mommer <m_mommer@yahoo.com> writes:
> Matthias Blume <find@my.address.elsewhere> writes:
>
> BZZZT! Wrong!
It is certainly not wrong, or are you saying that tags are not used
for implementing those predicates called PAIR? or SYMBOL?, etc.,
efficiently?
> It indicates the type of the value.
So what this thing it "indicates"?
M.B.
| |
| Matthias Blume 2004-05-12, 9:12 pm |
| Mario S. Mommer <m_mommer@yahoo.com> writes:
> Lauri Alanko <la@iki.fi> writes:
>
> Ok, granted. You caught me.
>
> It still indicates the type of the value :)
I think it indicates the /smell/ of a value. So maybe we should call it
"smell tag". (Of course, I don't mean "smell" in the sense of the
sensation that we get from the chemical receptors in our noses. And
no, I won't tell you what I mean instead, as this seems to be par for
the course.)
Matthias B.
| |
| Matthias Felleisen 2004-05-12, 9:12 pm |
| Bradd W. Szonye wrote:
> Matthias Felleisen <matthias@ccs.neu.edu> wrote:
>
>
> You missed an important group: The formal terminology of standards
> committees (including the RRRS authors) also uses the word "type" to
> refer to classes of values in general. These people typically require
> precise language, and they often work in scientific contexts, yet they
> obviously do not find it an "abuse of language" to call a non-syntactic
> type a "type."
That's , isn't it? These people just propagate and play fast and loose with
words.
-- Matthias
| |
| Matthias Felleisen 2004-05-12, 9:12 pm |
| Lauri Alanko wrote:
> In article <c7dcn0$912$1@camelot.ccs.neu.edu>,
> Matthias Felleisen <matthias@ccs.neu.edu> wrote:
>
>
>
> Well yes. I do think I have a basic grasp on operational semantics.
> It's _easy_ compared to DS. :)
>
> I do believe that op sem is more useful nowadays. That seems to be one
> point that everyone agrees on. So I'm not interested in DS as a tool
> for reasoning about programs.
>
> My motivation for learning about DS is twofold. Firstly, I am (still)
> working on my MS thesis about typed reflection, and all the classic
> papers on reflection from the eighties have a very denotational flavor
> to them.
> The second reason, more general, is that I feel it's very important to
> understand something about the relationship of a program to the world
> of mathematical objects.
>
> Although purely syntactic methods (operational semantics) are very
> practical for most purposes, and I find them much easier to
> understand, they still don't provide answers to everything.
>
> In most cases, we just have to rely on our intuition to tell us
> whether the formalization is correct or not. But when the intended
> meaning is mathematical, then at least we have a tool to use.
Your motivation for studying DS is good. I teach some DS for just that reason in
my grad seminar. However, I'd propose that you take different route, esp
concerning "second". Read Plotkin's paper on full abstraction for LCF and then
try to get a handle on Scott's Data Types, Plotkin's Pomega, and Scott's Info
Systems. Follow up with Kanneganti-Cartwright from ICALP early 90s. This
sequence of papers is a study of universal domains and universal languages that
have just the right properties. Each paper cuts junk from the previous approach,
though I left you with some gaps. No text book deals with this idea properly,
yet it is at the heart of your quest.
Good luck -- Matthias
P.S. Oh, and you will see why Scott started with "types as tags" (eh, types as
retracts) and if you watch closely, you will understand why this theory suggests
flaws in this view.
| |
| William D Clinger 2004-05-12, 9:12 pm |
| Lauri Alanko <la@iki.fi> wrote:
> I wouldn't mind seeing operational semantics for Scheme in R6RS. :)
The R5RS cites:
William D Clinger. Proper tail recursion and space efficiency.
In the Proceedings of the 1998 ACM Conference on Programming
Language Design and Implementation, June 1998, pages 174-185.
ftp://ftp.ccs.neu.edu/pub/people/will/tail.ps.gz
This paper contains an operational semantics for the core of Scheme.
I have proved, but not published, a certain theorem that establishes
the consistency of this operational semantics with the denotational
semantics of the R5RS. Would you prefer this operational semantics
to the R5RS semantics?
Will
| |
| Matthias Felleisen 2004-05-12, 9:12 pm |
| William D Clinger wrote:
> Lauri Alanko <la@iki.fi> wrote:
>
>
>
> The R5RS cites:
>
> William D Clinger. Proper tail recursion and space efficiency.
> In the Proceedings of the 1998 ACM Conference on Programming
> Language Design and Implementation, June 1998, pages 174-185.
> ftp://ftp.ccs.neu.edu/pub/people/will/tail.ps.gz
>
> This paper contains an operational semantics for the core of Scheme.
> I have proved, but not published, a certain theorem that establishes
> the consistency of this operational semantics with the denotational
> semantics of the R5RS. Would you prefer this operational semantics
> to the R5RS semantics?
Since the semantics in your paper is a "derivative" of the semantics that I used
in my dissertatin (to construct other things and prove theorems), I can speak
from my experience that (1) you don't cover a small fraction of what is needed
and (2) it needs an executable version (easy) and (3) it is too low-level. You
might as well stick with denotational semantics.
-- Matthias
| |
| Ray Dillinger 2004-05-12, 9:12 pm |
| Matthias Felleisen wrote:
> Bradd W. Szonye wrote:
>
[color=darkred]
> That's , isn't it? These people just propagate and play fast and
> loose with words.
Until now, you were in danger of being taken seriously.
It is abundantly clear what the word "type" means when
used about dynamic types. The fact that some theorists
use it to refer to syntactic (static) types is irrelevant
to that use. The language of the standards group is very
precise and unambiguous; they explain the rules that their
types follow, which serves as an empirical definition of
what the word means in that context.
Bear
| |
| Matthias Blume 2004-05-12, 9:12 pm |
| Ray Dillinger <bear@sonic.net> writes:
> It is abundantly clear what the word "type" means when
> used about dynamic types.
Strange thing then that nobody seems to be able to give a clear
definition of this meaning, even though it supposedly is "abundantly
clear"...
| |
| Ray Dillinger 2004-05-12, 9:12 pm |
| Matthias Blume wrote:
> Ray Dillinger <bear@sonic.net> writes:
>
>
>
>
> Strange thing then that nobody seems to be able to give a clear
> definition of this meaning, even though it supposedly is "abundantly
> clear"...
The definition is empirical, as I said. If you're looking
for a strictly declarative definition, you won't find it.
Bear
| |
| Bradd W. Szonye 2004-05-12, 9:12 pm |
| > Ray Dillinger <bear@sonic.net> writes:
Matthias Blume wrote:[color=darkred]
> Strange thing then that nobody seems to be able to give a clear
> definition of this meaning, even though it supposedly is "abundantly
> clear"...
I gave a clear definition right from the beginning.
1. A number of people or things having in common traits or
characteristics that distinguish them as a group or class.
2. The general character or structure held in common by a number of
people or things considered as a group or class.
-- The American Heritage Dictionary of the English Language, 4th ed
These two definitions cover all common usage of the word "type" in
computer langages. Sometimes we refer to the set of things, and
sometimes we refer to the common character or structure, but all usage
fits one of those two definitions.
For example, the "disjoint types" of R5RS refer to ten sets of values:
boolean, symbol, char, vector, procedure, pair, number, string,
input-port, and output-port, all of which are disjoint except for the
last two (which overlap only with each other). In practice, we test
membership in those sets to determine the general character of a value,
specifically the interfaces and abstractions it supports.
How is this definition unsatisfactory?
--
Bradd W. Szonye
http://www.szonye.com/bradd
| |
| Bradd W. Szonye 2004-05-12, 9:12 pm |
| Ray Dillinger wrote:
> It is abundantly clear what the word "type" means when used about
> dynamic types. The fact that some theorists use it to refer to
> syntactic (static) types is irrelevant to that use.
I think part of the difficulty here stems from confusion over the
difference between "syntactic" and "static" analysis. Syntax is a set of
rules for combining language elements into grammatical statements, and
syntactic analysis is a method of divining the syntactic type of those
language elements from context. Historically, programming languages have
often aimed at making syntax amenable to static analysis, but there are
plenty of counterexamples. For example, polymorphic types often require
dynamic analysis (e.g., C++'s "runtime type information").
There are two ways to describe Scheme's type system:
1. Scheme's "types" are syntactic elements, and the language defines how
one may combine those elements into grammatical expressions. The
expression (CAR 42) is ungrammatical.
2. Scheme's "types" are sets of values with common characteristics, but
they are not syntactic elements. Scheme's syntactic elements are
operator position, operand position, etc. The expression (CAR 42) is
grammatical but meaningless.
Under the first interpretation, Scheme's types are syntactic but not
generally amenable to static analysis. Because they are syntactic, it's
not an "abuse of language" to call them types. However, because all
values belong to the same static type, syntactic analysis requires
dynamic checking.
--
Bradd W. Szonye
http://www.szonye.com/bradd
| |
| Matthias Blume 2004-05-12, 9:12 pm |
| "Bradd W. Szonye" <bradd+news@szonye.com> writes:
>
> Matthias Blume wrote:
>
> I gave a clear definition right from the beginning.
>
> 1. A number of people or things having in common traits or
> characteristics that distinguish them as a group or class.
> 2. The general character or structure held in common by a number of
> people or things considered as a group or class.
> -- The American Heritage Dictionary of the English Language, 4th ed
You call /that/ a "clear" definition?!? AFAIC, this is not a
definition at all, at least not in mathematics (and, by extension, not
in PL).
> For example, the "disjoint types" of R5RS refer to ten sets of values:
> boolean, symbol, char, vector, procedure, pair, number, string,
> input-port, and output-port, all of which are disjoint except for the
> last two (which overlap only with each other). In practice, we test
> membership in those sets to determine the general character of a value,
> specifically the interfaces and abstractions it supports.
>
> How is this definition unsatisfactory?
How is this definition different from that of "set"? If it isn't, why
do we need to abuse the word "type" for it?
| |
| Bradd W. Szonye 2004-05-12, 9:12 pm |
| Matthias Blume wrote:
"Bradd W. Szonye" writes:[color=darkred]
[color=darkred]
> You call /that/ a "clear" definition?!?
Yes. In natural language, a "type" is merely a set whose members have
common traits or structure. For example, the set of integers is a type
because integers have common mathematical properties, and the set of
pairs is a type because pairs have a common structure. However, the
arbitrary set { 1 Q { foo } -i } is not a type, because its members have
no common traits.
> AFAIC, this is not a definition at all, at least not in mathematics
> (and, by extension, not in PL).
Your extension does not follow. There's more to programming languages
than just mathematics.
[color=darkred]
> How is this definition different from that of "set"?
Elements of arbitrary sets do not have common traits. Types do.
> If it isn't, why do we need to abuse the word "type" for it?
You people keep using this word "abuse," yet you're the ones using a
term that is clearly at odds with the established meanings of the term.
I find it especially bizarre that you call it "abuse of language" to use
the word in its non-mathematical sense, and even more bizarre that you
insist upon qualifying the word (as "dynamic type") when referring to
the natural-language definition.
--
Bradd W. Szonye
http://www.szonye.com/bradd
| |
| Ray Dillinger 2004-05-12, 9:12 pm |
| Bradd W. Szonye wrote:
> Ray Dillinger wrote:
>
>
<schnyppe>
> There are two ways to describe Scheme's type system:
>
> 1. Scheme's "types" are syntactic elements, and the language defines how
> one may combine those elements into grammatical expressions. The
> expression (CAR 42) is ungrammatical.
>
> 2. Scheme's "types" are sets of values with common characteristics, but
> they are not syntactic elements. Scheme's syntactic elements are
> operator position, operand position, etc. The expression (CAR 42) is
> grammatical but meaningless.
> Under the first interpretation, Scheme's types are syntactic but not
> generally amenable to static analysis. Because they are syntactic, it's
> not an "abuse of language" to call them types. However, because all
> values belong to the same static type, syntactic analysis requires
> dynamic checking.
I find the second interpretation more meaningful. Scheme's
types are operational/empirical/dynamic, not syntactic/static.
If the syntax of a language can be analyzed statically, using
an algorithm known to terminate in finite time for any finite
program, to resolve all type information for any program in
that language, then I say that the language has static typing,
because we are then talking about static analysis (of the
syntax). If you want to call these "syntactic" types, I've
no objection, because we're also talking about (static)
analysis of the syntax.
If the type information of any program in a language cannot
be resolved by an algorithm known to terminate in finite
time for any finite program, then I do not say that that
language has static types, because the type information
cannot be resolved by static analysis. It could still be
said to have syntactic types, because a program is its
syntax, and any analysis, including running it, is an
analysis of its syntax - but that is using the word
'syntax' in a meaningless way (a way which excludes
absolutely no analysis that can be done) so I don't prefer
it.
Scheme variables have types that can be resolved, in general,
only by runtime or "dynamic" analysis: This is not analysis
of the syntax (for a 'meaningful' use of the word 'syntax'),
this is analysis of the process that syntax generates. Since
the process is not generally guaranteed to terminate in finite
time (cf. halting problem), the analysis is also not guaranteed
to terminate in finite time. This is clearly a dynamic rather
than a static analysis. Therefore I say that scheme has
"dynamic" types, because resolving them requires dynamic
analysis, or analysis of the process rather than the syntax.
The only slippery notion, then, is this: What Is A Type?
The definition, alas, is not "crisp"; here is a technique
or criterion to determine to what degree some set is a
member of the set of types, relative to some particular
language and program; I'd just count the number of
contexts in that program within which substitution of
*any* value not belonging to the set for a value
belonging to the set would cause a runtime error, and
divide it by the number of contexts within which values
of that set occur. The resulting quotient would be the
(fuzzy) degree of membership of that set in the set of
types, relative to that language and program.
But the same "fuzzy" degree of membership test could be
carried out for static type systems as well. So, like
most people who write about dynamically typed languages,
I guess I normally use the word "type" to denote a set of
values. On closer inspection though, membership of a set
in the set of types itself is fuzzy; some things are
types to a greater degree than others.
Bear
| |
| Bradd W. Szonye 2004-05-12, 9:12 pm |
| > Bradd W. Szonye wrote:
Ray Dillinger wrote:[color=darkred]
> I find the second interpretation more meaningful. Scheme's types are
> operational/empirical/dynamic, not syntactic/static.
They aren't static types, but they're still syntactic. Remember,
"syntactic" refers to grammar, not to a particular method for parsing
programs. Languages with static type systems have grammars that are
amenable to static parsing (i.e., parsing is decidable). But syntactic
analysis is still "syntactic" even if parsing isn't decidable.
> ... If the type information of any program in a language cannot be
> resolved by an algorithm known to terminate in finite time for any
> finite program, then I do not say that that language has static types,
> because the type information cannot be resolved by static analysis.
Agreed.
> It could still be said to have syntactic types, because a program is
> its syntax, and any analysis, including running it, is an analysis of
> its syntax - but that is using the word 'syntax' in a meaningless way
> (a way which excludes absolutely no analysis that can be done) so I
> don't prefer it.
But it's not meaningless -- it just delves into the hairy world of
non-context-free grammars with undecidable parsing.
--
Bradd W. Szonye
http://www.szonye.com/bradd
| |
| Ray Dillinger 2004-05-12, 9:13 pm |
| Bradd W. Szonye wrote:
>
>
> Agreed.
>
>
>
>
> But it's not meaningless -- it just delves into the hairy world of
> non-context-free grammars with undecidable parsing.
I do not think that "type" as applied to data in a programming
language is a meaningful concept as applied to context-sensitive
languages with undecidable parsing. Such languages are not
programming languages.
Indeed, if the parse of a program is undecidable, then the
program itself is of indeterminate semantics; this is
usually considered to indicate a deep design flaw in the
programming language. I am aware of no reason why such
a programming language would be interesting or useful, and
I am aware of no such programming languages (barring buggy
compilers where the parse depends on a race condition or
something) which exist.
Since such languages (AFAIK) do not exist, there would seem
to be no examples on which to base observations about the
properties of their type systems, in order to contrast them
with "syntactic" type systems.
To put it a different way: Unless there is some example of
a type system that is not "syntactic" (in whatever sense
you're using the word "syntactic" - a sense which, clearly,
I still do not understand) then there is no point in
describing *any* type system as syntactic, because this
does not distinguish it from any other type system.
Bear
| |
| Joe Marshall 2004-05-12, 9:13 pm |
| Ray Dillinger <bear@sonic.net> writes:
> I do not think that "type" as applied to data in a programming
> language is a meaningful concept as applied to context-sensitive
> languages with undecidable parsing. Such languages are not
> programming languages.
>
> Indeed, if the parse of a program is undecidable, then the
> program itself is of indeterminate semantics; this is
> usually considered to indicate a deep design flaw in the
> programming language. I am aware of no reason why such
> a programming language would be interesting or useful, and
> I am aware of no such programming languages (barring buggy
> compilers where the parse depends on a race condition or
> something) which exist.
REBOL uses context-sensitive parsing where the context depends on the
runtime values of the identifiers. It would be trivial to write a
REBOL program that parses differently depending on whether, say,
Goldbach's conjecture was true or false.
Many people seem to find this interesting and useful. I consider it
deeply flawed, but amusing.
> Since such languages (AFAIK) do not exist, there would seem
> to be no examples on which to base observations about the
> properties of their type systems, in order to contrast them
> with "syntactic" type systems.
REBOL is dynamically typed. (I don't care if you don't like the phrase
`dynamically typed'. I am confident that *everyone* here understands
exactly what I mean by that.)
> To put it a different way: Unless there is some example of
> a type system that is not "syntactic" (in whatever sense
> you're using the word "syntactic" - a sense which, clearly,
> I still do not understand) then there is no point in
> describing *any* type system as syntactic, because this
> does not distinguish it from any other type system.
I believe the idea here is that `syntactic' means that the types are
associated with fragments of program text rather than being associated
with sets of program values.
| |
| Matthias Blume 2004-05-12, 9:13 pm |
| Joe Marshall <jrm@ccs.neu.edu> writes:
> I believe the idea here is that `syntactic' means that the types are
> associated with fragments of program text rather than being associated
> with sets of program values.
No. "Syntactic" refers to the fact that a static type system
"reasons" about types by manipulating syntactic terms -- namely
expressions in both term- and type languages. In contrast, there is
the "semantic" approach where reasoning about types makes use of a
semantic model (i.e., often, but not always, an interpretation of
types as something like sets of values).
As someone else has already said, the syntactic approach has the
advantage that whatever you can prove will hold in all models. The
semantic approach, on the other hand, has the potential advantage of
being able to use specific properties of the particular model.
Matthias
| |
| Joe Marshall 2004-05-12, 9:13 pm |
| Matthias Blume <find@my.address.elsewhere> writes:
> Joe Marshall <jrm@ccs.neu.edu> writes:
>
>
> No. "Syntactic" refers to the fact that a static type system
> "reasons" about types by manipulating syntactic terms -- namely
> expressions in both term- and type languages. In contrast, there is
> the "semantic" approach where reasoning about types makes use of a
> semantic model (i.e., often, but not always, an interpretation of
> types as something like sets of values).
Ok. So consider a hypothetical syntactic type system where `number'
and `integer' are types, but there is no rule that says that if an
expression is well-typed as an `integer' then the same expression is
well-typed as `number'. This system would give type errors when you
tried to pass an `integer' to someone expecting a `number'. The
system could not reason that integers *are* numbers because as far as
it knows `number' is an arbitrary label that has nothing to do with
that other arbitrary label `integer'.
A semantic approach would have a model in which all integers were
numbers, but only some numbers are integers (and it may know other
things like positive integers are greater than negative integers and
the sums of two positive integers is positive, etc. etc.)
Is this correct?
| |
| Bradd W. Szonye 2004-05-12, 9:13 pm |
| Matthias Blume wrote:
> "Syntactic" refers to the fact that a static type system "reasons"
> about types by manipulating syntactic terms -- namely expressions in
> both term- and type languages. In contrast, there is the "semantic"
> approach where reasoning about types makes use of a semantic model
> (i.e., often, but not always, an interpretation of types as something
> like sets of values).
Could you please clarify something for me? In /Types and Programming
Languages,/ Pierce uses a BNF-like notation to express type and
evaluation rules:
premise ... premise ...
----------- or -----------
lhs --> rhs term
This looks to me like BNF optimized for unrestricted grammars. Instead
of writing the full context in the lhs, one can encode parts of it as
"premises," which check relationships between contextual types, terms,
and values.
For example, Pierce implements the "downcast" operator with a type rule
and an evaluation rule:
type(context, t1, S)
------------------------- (T-DOWNCAST)
type(context, t1 as T, T)
type(null, v1, T)
----------------- (E-DOWNCAST)
v1 as T --> v1
The type rule expresses downcasting in the type language, and the
evaluation rule dynamically checks the cast so that the program cannot
proceed unless the cast is correct. (Pierce notes that programming
languages usually include another rule to handle bad casts, so that
evaluation doesn't get stuck.)
I think T-DOWNCAST and E-DOWNCAST are both syntactic rules (for
unrestricted grammars), but I'm not sure. If so, then his "type rules"
and "dynamic checks" are both syntactic methods for ensuring program
safety via typing.
If I understand Pierce correctly, the difference between a type rule and
a dynamic check is not one of syntax vs semantics or terms vs sets.
Instead, the difference is one of tractability: The type rules form a
system that that permits /tractable/ static analysis. In contrast,
dynamic checks generally require full evaluation, which isn't tractable
because of the halting problem.
Am I on the right track here?
--
Bradd W. Szonye
http://www.szonye.com/bradd
| |
| Matthias Blume 2004-05-12, 9:13 pm |
| Joe Marshall <jrm@ccs.neu.edu> writes:
> Matthias Blume <find@my.address.elsewhere> writes:
>
>
> Ok. So consider a hypothetical syntactic type system where `number'
> and `integer' are types, but there is no rule that says that if an
> expression is well-typed as an `integer' then the same expression is
> well-typed as `number'. This system would give type errors when you
> tried to pass an `integer' to someone expecting a `number'. The
> system could not reason that integers *are* numbers because as far as
> it knows `number' is an arbitrary label that has nothing to do with
> that other arbitrary label `integer'.
>
> A semantic approach would have a model in which all integers were
> numbers, but only some numbers are integers (and it may know other
> things like positive integers are greater than negative integers and
> the sums of two positive integers is positive, etc. etc.)
>
> Is this correct?
Yes, exactly.
| |
| Bradd W. Szonye 2004-05-12, 9:13 pm |
| Ray Dillinger wrote:
Bradd wrote:[color=darkred]
[color=darkred]
> I do not think that "type" as applied to data in a programming
> language is a meaningful concept as applied to context-sensitive
> languages with undecidable parsing. Such languages are not programming
> languages.
According to Church's Thesis, all models of computation -- Turing
machines, mu-recursive functions, unrestricted grammars, etc. -- have
equivalent computing power. Because of that, we're free to specify a
Scheme interpreter as a state machine, as a system of lambda
expressions, or as an unrestricted grammar. Pierce takes advantage of
this in /Types and Programming Languages/; he uses a BNF-like notation
for evaluation rules.
However, just as you can't guarantee or even predict whether a Scheme
machine will halt when given an arbitrary Scheme program, you can't
guarantee or predict whether the equivalent grammar will halt.
> Since such languages (AFAIK) do not exist ....
They do exist, at least hypothetically; every Scheme interpreter can be
modeled as an unrestricted grammar. Unless, of course, somebody has
disproved Church's Thesis since my undergrad days.
--
Bradd W. Szonye
http://www.szonye.com/bradd
| |
| Ray Dillinger 2004-05-12, 9:13 pm |
| The fact that you can write an undecidable grammar for
a language does not make it an undecidable language.
Every decidable language has many undecidable grammars
and at least one decidable grammar.
Programming languages are useful only if their parses
are decidable; therefore I assert that, although
examples of context-sensitive programming languages
exist, none may have truly undecidable parsing.
Bear
| |
| Joe Marshall 2004-05-12, 9:13 pm |
| Ray Dillinger <bear@sonic.net> writes:
> Programming languages are useful only if their parses
> are decidable; therefore I assert that, although
> examples of context-sensitive programming languages
> exist, none may have truly undecidable parsing.
Obviously if the parse never returns then you'll have problems running
the program, but if parsing is interleaved with evaluation then you
could have a program whose continuation parse was undecidable. Another
alternative is for the parser to return all potential parses so that
the evaluator can decide which one to use. If the evaluator's choice
is predicated on something nondecidable, then the correct parse is
undecidable.
Of course this is a rather silly thing to do...
BEGIN {
eval (time % 2 ? 'sub foo ();' : 'sub foo (@);');
}
foo / 25 ; # / ; die "Does this die?"
But there's no accounting for taste.
| |
| Bradd W. Szonye 2004-05-12, 9:13 pm |
| Ray Dillinger writes:
Joe Marshall wrote:[color=darkred]
> Obviously if the parse never returns then you'll have problems running
> the program, but if parsing is interleaved with evaluation then you
> could have a program whose continuation parse was undecidable.
And, if I understand Matthias F. correctly, that's effectively how a
dynamic type-test works. The compiler postpones some of the syntactic
manipulation to run-time. (In practice, it calculates the possible
reductions in advance and leaves behind enough "type residue" for the
program to test typing relations and branch.) If a syntactic decision
depends on a term that doesn't terminate, the "parse" doesn't terminate
either.
--
Bradd W. Szonye
http://www.szonye.com/bradd
| |
| Ray Dillinger 2004-05-12, 9:13 pm |
| Bradd W. Szonye wrote:
> And, if I understand Matthias F. correctly, that's effectively how a
> dynamic type-test works. The compiler postpones some of the syntactic
> manipulation to run-time. (In practice, it calculates the possible
> reductions in advance and leaves behind enough "type residue" for the
> program to test typing relations and branch.) If a syntactic decision
> depends on a term that doesn't terminate, the "parse" doesn't terminate
> either.
Please explain why you consider this to be a "manipulation of
the syntax" rather than an "examination of the process".
Please explain what could possibly be done to a program which is
not, in this trivial sense, a "manipulation of the syntax."
Bear
| |
| Bradd W. Szonye 2004-05-12, 9:13 pm |
| Bradd wrote:
Ray Dillinger wrote:[color=darkred]
> Please explain why you consider this to be a "manipulation of the
> syntax" rather than an "examination of the process".
I don't. I consider the two models *equivalent*, because of Church's
Thesis, which roughly states that you can always model a state machine
as a grammar (or as mu-recursive functions, etc.).
> Please explain what could possibly be done to a program which is not,
> in this trivial sense, a "manipulation of the syntax."
Nothing -- which is the whole point of Church's thesis. All models of
computation are equivalent.
--
Bradd W. Szonye
http://www.szonye.com/bradd
| |
| Bradd W. Szonye 2004-05-12, 9:13 pm |
| Ray Dillinger wrote:
Bradd W. Szonye wrote:[color=darkred]
> I don't. I consider the two models *equivalent*, because of Church's
> Thesis, which roughly states that you can always model a state machine
> as a grammar (or as mu-recursive functions, etc.).
[color=darkred]
> Nothing -- which is the whole point of Church's thesis. All models of
> computation are equivalent.
By the way, this is why "tractable" is an important part of Pierce's
definition. Because of Church's Thesis, you can always express type
classification as a syntactic method, just by using the syntactic model
of computation. However, the tractability requirement discourages use of
run-time methods, because they're vulnerable to the halting problem.
--
Bradd W. Szonye
http://www.szonye.com/bradd
| |
| Ray Dillinger 2004-05-12, 10:31 pm |
| Bradd W. Szonye wrote:
>
> Ray Dillinger wrote:
>
>
> Nothing -- which is the whole point of Church's thesis. All models of
> computation are equivalent.
AAARGH! What you are saying then, is that the word "syntactic"
which folks have been throwing around as a prerequisite for
something to be a type system, means exactly NOTHING!
If absolutely everything that can be done to a program is a
manipulation of its syntax, then 'syntactic' is being used
in such a trivial sense as to be completely meaningless.
Why was it an important word to say in this context? Why
do we not describe *ALL* features of *EVERY* programming
language as "a manipulation of its syntax?"
Bear
| |
| William D Clinger 2004-05-13, 5:32 pm |
| Ray Dillinger <bear@sonic.net> wrote:
>
> AAARGH! What you are saying then, is that the word "syntactic"
> which folks have been throwing around as a prerequisite for
> something to be a type system, means exactly NOTHING!
>
> If absolutely everything that can be done to a program is a
> manipulation of its syntax, then 'syntactic' is being used
> in such a trivial sense as to be completely meaningless.
>
> Why was it an important word to say in this context? Why
> do we not describe *ALL* features of *EVERY* programming
> language as "a manipulation of its syntax?"
>
> Bear
IMO, what people mean when they say types are "syntactic" is that
types determine which programs are syntactically correct. We do
not even discuss programs that are not syntactically correct: we
just regard them as meaningless, and discard them.
IMO this is entirely consistent with the usage of types in logic
and mathematics. I hope to write further about this soon.
Will
| |
| Bradd W. Szonye 2004-05-13, 7:32 pm |
| William D Clinger wrote:
> IMO, what people mean when they say types are "syntactic" is that
> types determine which programs are syntactically correct. We do not
> even discuss programs that are not syntactically correct: we just
> regard them as meaningless, and discard them.
OK, that helps a lot. This relates to what I said about models of
computation. Grammars and automata may be isomorphic, but they have
different strengths and weaknesses as models. It's similar to the time
and frequency domains in signal processing: They're isomorphic via
Fourier transform, but they're best suited for different tasks.
Grammars are especially good at specifying combinations -- whether
they're valid, what they reduce to, etc. Automata are better at
specifying states -- what's the current situation? what to do next? I
don't think it's any coincidence that we normally use grammars to define
languages but then transform the programs into automata before running
them.
Grammars make it easier to define and check a language's rules for
combining terms. However, even with grammars, it's tough to verify
arbitrary languages and programs. If I understand correctly, type
systems aim to simplify this by categorizing terms. Instead of dealing
with complicated combination and verification rules for specific terms,
you can simplify things to deal with whole classes of terms.
Hm, that description reminds me of the functional abstraction, but
instead of abstracting the combination of domain values into results
(computation), it abstracts the combination of language terms into
expressions. To put it another way:
Functional Model Grammatical Model
value : set : function :: term : type : type rule
Is this a reasonable description and analogy?
One handy aspect of the grammatical model is that you can often use
static methods to prove things about programs, rather than using dynamic
methods to check assumptions. Indeed, the type folks emphasize that very
strongly, encouraging type systems that are amenable to static analysis
and discouraging typed constructs that require dynamic checking. It
seems like there's some disagreement among type advocates on that issue.
For example, Pierce presents a dynamic type-test rule that can't be
reduced until run time; I get the impression that the type folks are
split over whether that kind of dynamic rule is really "syntactic,"
whether it's part of the "type system."
> IMO this is entirely consistent with the usage of types in logic
> and mathematics. I hope to write further about this soon.
Please do!
--
Bradd W. Szonye
http://www.szonye.com/bradd
| |
| Ray Dillinger 2004-05-14, 1:32 pm |
| William D Clinger wrote:
> Ray Dillinger <bear@sonic.net> wrote:
>
> IMO, what people mean when they say types are "syntactic" is that
> types determine which programs are syntactically correct. We do
> not even discuss programs that are not syntactically correct: we
> just regard them as meaningless, and discard them.
>
> IMO this is entirely consistent with the usage of types in logic
> and mathematics. I hope to write further about this soon.
Okay -- that makes more sense. But to me the standard terminology
of "static" vs. "dynamic" types makes more sense; with static
types, correctness can be determined with a static (bounded-time)
analysis, and with dynamic types, there are programs for which
type correctness can be determined only with dynamic (runs
for as long as the program) analysis.
Part of the disconnect is that I regard syntactic analysis as
inherently referring to something that happens during initial
parsing; Errors that arise at runtime are semantic, not syntactic.
If you are extending the notion of "syntax error" to cover all
type errors including those that can arise at runtime, then you
can describe any type system as syntactic - but that seems to
me like 'if you extend the notion of "cow" to cover horses then
you can describe most of the large animals on your ranch as
bovine...'
Bear
| |
| Anton van Straaten 2004-05-14, 2:32 pm |
| Ray Dillinger wrote:
> Part of the disconnect is that I regard syntactic analysis as
> inherently referring to something that happens during initial
> parsing; Errors that arise at runtime are semantic, not syntactic.
>
> If you are extending the notion of "syntax error" to cover all
> type errors including those that can arise at runtime, then you
> can describe any type system as syntactic -
That's exactly why the question arises about whether e.g. the variants of a
union type are really types in their own right, in the syntactic sense. The
clearly syntactic aspect is that with a static type system, terms can be
statically classified as belonging to a particular union type, but not
necessarily to a particular variant within that type.
> but that seems to
> me like 'if you extend the notion of "cow" to cover horses then
> you can describe most of the large animals on your ranch as
> bovine...'
If you change "cow" to "syntactic type", then that is indeed the discussion
we've been having, at least in some other parts of these threads - extending
the notion of syntactic type to describe the things we call types in a
Scheme program. Confusing this discussion is the fact that there are senses
in which types in Scheme can be considered syntactic, but they're limited
and involve constraints.
Anton
| |
| Bradd W. Szonye 2004-05-14, 2:32 pm |
| Ray Dillinger wrote:
That's what I figured. (It was my initial reaction too, based on the
usual compiler construction idea that "syntax" stops at parsing.)
[color=darkred]
Anton van Straaten wrote:[color=darkred]
> That's exactly why the question arises about whether e.g. the variants
> of a union type are really types in their own right, in the syntactic
> sense. The clearly syntactic aspect is that with a static type
> system, terms can be statically classified as belonging to a
> particular union type, but not necessarily to a particular variant
> within that type.
I'm still not entirely clear on this, but I currently believe that
"syntactic" refers to use of the grammar model of computation, and not
to static/dynamic concerns. While type advocates clearly prefer type
rules that permit tractable static analysis, there are enough
counterexamples that I don't think it's a requirement.
It looks to me like type rules are the grammar model's major
"abstraction" feature, just as functions are to the lambda calculus. See
my followup to Will Clinger for more details.
[color=darkred]
> If you change "cow" to "syntactic type", then that is indeed the
> discussion we've been having, at least in some other parts of these
> threads - extending the notion of syntactic type to describe the
> things we call types in a Scheme program.
I don't see that as an extension. Instead, it looks to me like type
advocates voluntarily limit themselves to statically-tractable type
rules whenever possible, because they're better suited to the goals of
static analysis & proof. In the cases where that isn't possible, though,
at least some of them still call it a "type system," even though it
isn't entirely static.
--
Bradd W. Szonye
http://www.szonye.com/bradd
| |
| Anton van Straaten 2004-05-14, 6:32 pm |
| Bradd W. Szonye wrote:
> Anton van Straaten wrote:
>
> I'm still not entirely clear on this, but I currently believe that
> "syntactic" refers to use of the grammar model of computation,
> and not to static/dynamic concerns.
I agree with Will's definition: "types determine which programs are
syntactically correct. We do not even discuss programs that are not
syntactically correct: we just regard them as meaningless, and discard
them."
As Will observed, this is exactly what happens in mathematical logic:
formulae which don't match type rules are rejected as being meaningless.
This is fundamentally static - if you can't do it statically, then you can't
prove anything, and the whole point of types in this context would be moot.
These logical type systems are part of what Pierce describe as "a much
broader field of study in logic, mathematics, and philosophy".
Pierce also identifies two branches of study of type systems within computer
science: those that deal with typed lambda calculi, and those that deal with
more traditional programming languages. However, one thing that all three
of these notions of type system share is an emphasis on static knowledge.
I'll underscore that with another quote from Cardelli: "Types express static
knowledge about programs, whereas terms (statements, expressions, and other
program fragments) express the algorithmic behavior."
Pierce's intended meaning of "syntactic" can be seen in the following quote:
"Another important element in the above definition [of type system] is its
emphasis on classification of terms - syntactic phrases - according to the
properties of the values that they will compute when executed." His next
sentence is one I've quoted before, about a type system calculating a static
approximation to the runtime behaviors of terms.
> While type advocates clearly prefer type
> rules that permit tractable static analysis, there are enough
> counterexamples that I don't think it's a requirement.
What counterexamples? Are you just talking about things like the ascription
rules, which as previously observed, exist within a statically-typed
context?
>
>
> I don't see that as an extension. Instead, it looks to me like type
> advocates voluntarily limit themselves to statically-tractable type
> rules whenever possible, because they're better suited to the goals of
> static analysis & proof. In the cases where that isn't possible, though,
> at least some of them still call it a "type system," even though it
> isn't entirely static.
For example? Are you talking about something other than a static type
system with a dynamic component? Again, all the literature says that if you
take away the static part, you're left without a type system, and are
untyped. This is considered the case precisely because you no longer have
any useful static knowledge. The static knowledge is explicitly a
requirement, *in the accepted definitions*.
Let me repeat that I believe I understand where you're coming from re
connecting the dynamic notion of type to the static/syntactic notion. I'm
just saying that if you want to do that in a way that matches the syntactic
definition of types and type systems, then it has to involve static
knowledge. When you start doing things in programs that undermine static
knowledge, you're moving away from the syntactic definitions.
Whether the syntactic definitions are overly constraining or narrow is a
separate discussion. That was the point I was getting at with the case
where dynamic type errors, instead of terminating a program, are used to
change control flow in some other way. You'd need to account for issues
like this if you want to draw a stronger connection between dynamic and
static types than the syntactic definitions currently permit.
Anton
| |
| William D Clinger 2004-05-14, 8:32 pm |
| I wrote:
> IMO, what people mean when they say types are "syntactic" is that
> types determine which programs are syntactically correct. We do
> not even discuss programs that are not syntactically correct: we
> just regard them as meaningless, and discard them.
Ray Dillinger <bear@sonic.net> wrote:
> Okay -- that makes more sense. But to me the standard terminology
> of "static" vs. "dynamic" types makes more sense;
To me too. I wasn't advocating the "syntactic" terminology, just
trying to explain what I think it means.
> Part of the disconnect is that I regard syntactic analysis as
> inherently referring to something that happens during initial
> parsing; Errors that arise at runtime are semantic, not syntactic.
This syntactic versus semantic business is culture-dependent. In
the culture of compiler writers, everything that happens after
context-free parsing is called "semantic", even though it happens
at compile time. Thus all undefined variable and type errors are
"semantic" in that culture.
> If you are extending the notion of "syntax error" to cover all
> type errors including those that can arise at runtime, then you
> can describe any type system as syntactic - but that seems to
> me like 'if you extend the notion of "cow" to cover horses then
> you can describe most of the large animals on your ranch as
> bovine...'
Most of the people who regard types as "syntactic" would deny that
type errors can occur at runtime. In fact, they carefully define
the notion of type error to exclude all errors that are checked at
runtime. Thus a NullPointerException in Java is not a type error.
As I have said elsewhere, this is consistent with the way types
are used in logic and mathematics.
Will
| |
| Bradd W. Szonye 2004-05-14, 9:31 pm |
| Bradd wrote:
Anton van Straaten wrote:[color=darkred]
> I agree with Will's definition: "types determine which programs are
> syntactically correct. We do not even discuss programs that are not
> syntactically correct: we just regard them as meaningless, and discard
> them."
I agree with it too, but it isn't the whole story. Types also influence
evaluation. I've already mentioned downcasting and dynamic type-testing
(T&PL, p. 195) where evaluation rules depend on a value's type. Pierce
also describes overloading on p. 340:
/Ad-hoc polymorphism,/ by contrast, allows a polymorphic value to
exhibit different behaviors when "viewed" at different types. The
most common example of ad-hoc polymorphism is /overloading,/ which
associates a single function symbol with many implementations; the
compiler (or the run-time system, depending on whether overloading
resolution is /static/ or /dynamic/) chooses an appropriate
implementation for each application of the function, based on the
types of the arguments.
For example, C++ permits static overloading.
Pierce goes on to note another variety of overloading that's especially
relevant to this discussion:
A generalization of function overloading forms the basis for
/multi-method dispatch/ in languages such as CLOS .... This
mechanism has been formalized in the lambda-& calculus of Castagna,
Ghelli, and Longo ....
Don't you find it a bit odd that Pierce uses CLOS as an example of
polymorhpism within the context of type systems, after declaring earlier
that type systems are about static analysis?
He concludes his "Varieties of Polymorphism" with subtype polymorphism:
The /subtype polymorphism/ of Chapter 15 gives a single term many
types using the rule of subsumption, allowing us to selectively
"forget" information about the term's behavior.
And as Chapter 15 shows, you can also recover that information, but you
must use run-time typechecking to ensure safety.
There's an awful lot of dynamic checking in that list. It's why I
brought up polymorphism early in this discussion: Polymorphism often
relies on type information to guide evaluation, not just to check syntax
or to prove soundness. Furthermore, several kinds of polymorphism
require /run-time/ type information to guide evaluation. That's one of
the reasons I question this emphasis on type systems as a static method
or as a method for proving soundness.
Even in the proof context, the emphasis on tractable static analysis
often takes a backseat to necessity. For example, Pierce concedes
(T&PL p. 7) that
Static elimination of array-bounds checking is a long-standing goal
for type system designers. In principle, the necessary mechanisms
... are well understood, but packaging them in a form that balances
expressive power, predictability, and tractability of typechecking,
and complexity of program annotations remains a significant
challenge.
More generally, when he explains the importance of efficiency
(T&PL p. 4), he concedes
However, exactly what counts as efficient is a matter of debate.
Even widely used type systems like that of ML ... may exhibit huge
typechecking times in pathological cases .... There are even
languages with typechecking or type reconstruction problems that are
undecidable, but for which algorithms are available that halt
quickly "in most cases of practical interest" (e.g., Pierce and
Turner, 2000 ....)
In my opinion, all these examples greatly weaken the definitions. Yes,
type systems are all about tractable static proofing, except for the
systems that aren't generally tractable, and except for the systems that
are about polymorphism rather than proofing!
Yet Pierce brushes off "untyped" languages like Scheme, despite the fact
that Scheme statically associates types with many terms before it
"forgets" them, despite the fact that macro pattern matching essentially
works like static overloading, and despite his acceptance of undecidable
typecheckers and run-time typecheckers in so-called "type systems."
It looks to me like the type theorists simply stop considering languages
as soon as they spot a ubiquitous supertype, that they brand it /a
priori/ "untyped" and move on. Now, I realize that such languages are
"uninteresting" if you're interested in proving soundness, and Pierce
notes the difficulty of adding a non-vestigial type system to them, so I
understand why there's little interest in exploring them further.
However, for somebody like me who has a strong interest in polymorphism
and partial program analysis, it seems far too hasty to label such
languages "untyped."
[color=darkred]
> What counterexamples?
Pierce's examples of undecidable typecheckers (T&PL, p. 4), overloading
and multi-method dispatch (pp. 340-341), and of course downcasting rules
for subtype polymorphism (Ch. 15).
[color=darkred]
[color=darkred]
> For example? Are you talking about something other than a static type
> system with a dynamic component? Again, all the literature says that
> if you take away the static part, you're left without a type system,
> and are untyped. This is considered the case precisely because you no
> longer have any useful static knowledge.
Maybe not globally, but Scheme has enough of a type system to permit
local analysis. If I were solely interested in proving whole-program
soundness, that might not be enough, but I'm also interested in local
type analysis (e.g., to minimize conversions between wrapped and
unwrapped subtypes) and dynamic polymorphism. Pierce seems to regard
both of those as parts of the type system in so-called "typed"
languages. So what are they in an "untyped" language? Chopped liver?
> The static knowledge is explicitly a requirement, *in the accepted
> definitions*.
But the type gurus admit exceptions to that requirement all the time --
unless it's an "untyped" language, upon which they disavow them. That
doesn't wash with me.
> Whether the syntactic definitions are overly constraining or narrow is
> a separate discussion.
The syntactic definition is fine. Even the "tractable" part is fine, so
long as you regard it as a goal and not an absolute requirement. It's
the part where they make exceptions for some languages but not others
that bothers me.
> That was the point I was getting at with the case where dynamic type
> errors, instead of terminating a program, are used to change control
> flow in some other way. You'd need to account for issues like this if
> you want to draw a stronger connection between dynamic and static
> types than the syntactic definitions currently permit.
There's already a strong connection between the two. However, the type
gurus seem to disavow it whenever it threatens to bless the type systems
in "untyped" languages, and that double standard bothers me. Maybe I'm
just not getting something.
--
Bradd W. Szonye
http://www.szonye.com/bradd
| |
| Bradd W. Szonye 2004-05-14, 11:32 pm |
| William D Clinger wrote:
> This syntactic versus semantic business is culture-dependent. In
> the culture of compiler writers, everything that happens after
> context-free parsing is called "semantic", even though it happens
> at compile time. Thus all undefined variable and type errors are
> "semantic" in that culture.
For example, the Dragon Book [Aho, Sethi, Ullman, 1986] states, "The
semantic analysis phase checks the source program for semantic errors
and gathers type information for the subsequent code-generation phase
.... An important component of semantic analysis is type checking."
I think much of the outrage against Matthias Felleisen's original claim
arose from this culture clash. I definitely read it with the compiler-
construction mindset, where "syntax" almost always means "context-free
parsing."
By the way, the Dragon Book also offers a definition of "type system":
A /type system/ is a collection of rules for assigning type
expressions to the various parts of a program. A type checker
implements a type system.
While the authors consider type checking a part of semantic analysis,
they specify and apply type rules in a "syntax-directed manner." If I
remember correctly -- it's been a few years -- the Dragon Book's
"syntax-directed" analysis is very similar to or the same as what Pierce
and Cardelli would call a "tractable syntactic method."
Next, the book describes "Static and Dynamic Checking of Types":
Checking done by a compiler is said to be static, while checking
down when the target program runs is termed dynamic. In principle,
any check can be done dynamically, if the target code carries the
type of an element along with the value of that element.
A /sound/ type system eliminates the need for dynamic checking for
type errors because it allows us to determine statically that these
errors cannot occur when the target program runs.
This is, of course, the primary goal for type systems, according to
Pierce, Cardelli, and others, although compiler writers have other major
uses for type systems (like optimization).
That is, if a sound type system assigns a type other than ["wrong"]
to a program part, then type errors cannot occur when the target
code for the program part is run. A language is /strongly typed/ if
its compiler can guarantee that the programs it accepts will execute
without type errors.
I.e., "well-typed programs do not go wrong"!
In practice, some checks can be done only dynamically [e.g., array
bounds checking in languages like Pascal that include the bounds in
the type].
Pierce notes this too, adding that there's active research in the area.
Overall, it looks to me as though Aho, Sethi, and Ullman use exactly the
same /concepts/ as Pierce and Cardelli, but they use slightly different
terminology (e.g., type systems use syntax-directed methods for semantic
analysis rather than "tractable syntactic methods") and the emphasis on
proving soundness is much reduced. The Dragon Book is /the/ book on
compiler construction; it was my compilers textbook at the University of
Michigan, and just about every systems software engineer I know owns a
copy. (I work in a systems software lab in California.)
That's a big part of why it seemed so bizarre when Matthias mentioned
"abuse of language"; to folks brought up in the compiler-construction
culture, it looked like he was abusing the terminology, not us. Now I
find it even more strange -- the type theorists and the compiler writers
use conflicting terminology, even though the latter group is probably
the primary consumer of the former group's work.
--
Bradd W. Szonye
http://www.szonye.com/bradd
| |
| Anton van Straaten 2004-05-15, 12:32 am |
| Bradd W. Szonye wrote:
>
> The syntactic definition is fine. Even the "tractable" part is fine, so
> long as you regard it as a goal and not an absolute requirement. It's
> the part where they make exceptions for some languages but not others
> that bothers me.
>
>
> There's already a strong connection between the two. However, the type
> gurus seem to disavow it whenever it threatens to bless the type systems
> in "untyped" languages, and that double standard bothers me. Maybe I'm
> just not getting something.
I don't see a double standard. I'll take a stab at explaining why. I'll
paraphrase Matthias F's simplification of Pierce's definition of type
system:
"a syntactic method for classifying terms (syntactic phrases) according
to the kinds of values they compute."
The need for proofs follows from this definition. If you remove proofs from
the picture entirely, you no longer have classification of terms. Dynamic
checks at runtime are not classifying terms, they're merely checking a
term's safety for a specific set of values. I believe this applies to all
the dynamic checking cases you mentioned. Nothing Scheme does with dynamic
checking, for example, is in any danger of "blessing" its type system.
The need for tractability follows from the need for proofs. A proof that
isn't tractable is of little use. Sure, you can design type systems that
don't support tractable proofs, but they don't help you.
If you doubt that proofs are the important point here, it might help to look
at the roots of these systems, in mathematical logic. These go back at
least to Russell in 1908, when logical type theory was created in order to
"protect logic from contradictions", by rejecting formulae that could lead
to contradictions. Today's programming language type systems are not that
different in their goals - they even use a descendant of Frege's inference
notation, for type judgements. Phil Wadler has some great papers about this
at http://homepages.inf.ed.ac.uk/wadle...cs/history.html . The proof
connection is strengthened by the Curry-Howard correspondence, which maps
logical proofs to progams in typed lambda calculi.
So, syntactic type theory is all about proofs, and the only point that needs
to be addressed is the dynamic checking issue. In static type systems,
dynamic checks arise when as a result of the (static) proofs, areas of the
program are identified for which proofs of safety aren't possible within the
type system. In these places, runtime checks are inserted into the program
to ensure safety.
These checks aren't proofs, they're tests. This can be considered an
application of the type system to the running program - it's plugging a gap
in the type system's proof capabilities. If you increase the size of this
gap to the point where you're not able to prove anything, and are doing
nothing *but* plugging the gap by ensuring that type errors are caught as
soon as they occur at runtime, then you're dealing with a fundamentally
different kind of system. The distinction between these two kinds of
systems is described in the literature as the difference between typed
systems and untyped systems.
The dynamic checking than is done in an untyped system is identical (afaict)
to the dynamic checking that's done in a typed system, but the lack of any
static checking (proofs) in the untyped case means that none of the criteria
needed to be called a type system have been met. This isn't a double
standard, it's just a simple consequence of the definition.
The definition isn't arbitrary, either: these type systems have very strong
formal descriptions, which leave little room for fudging the meaning of
things. You can't draw meaningful parallels between runtime types and
syntactic types if the type system doesn't support those parallels, unless
you develop your own new theory to encompass them both.
If you take the strongest syntactic terminological position on all this,
then everything Pierce says about type checking at runtime is "slightly
improper" (Cardelli). From that strict position, types are statically
associated with terms, and anything that doesn't fall into this category is
not a type. Will Clinger's recent posts have talked about this. The nice
thing about this position is that it's completely unambiguous: it eliminates
any fuzziness by simply defining away any areas that are ambiguous. The
dynamic checks are not dealing with "types", they're doing tests on values,
etc.
Keeping that strict position in mind, it's easier to keep track of the
confusion that ensues once you admit the notion of a runtime type, even if
it's just in the sense that Pierce does. The reason for banning the word
"type" in this context is precisely to avoid this confusion. If you do use
the word "type" in this context, then you have to be aware that it's not the
same thing as the syntactic types about which proofs were done before
runtime. It may be a semantic model of those types, or something, but it's
still a different thing, which is used specifically to do things which the
type system couldn't do statically. So the runtime entities that you're
comparing Scheme's types to aren't actually types, in the strict syntactic
sense.
> Now, I realize that such languages are
> "uninteresting" if you're interested in proving soundness, and Pierce
> notes the difficulty of adding a non-vestigial type system to them,
> so I understand why there's little interest in exploring them further.
Right. So the point is that if you're interested in the "type systems" of
dynamically-typed languages, you're not going to find good off-the-shelf ana
lyses of them in the syntactic type literature. But I don't think it helps
to try to force the syntactic definitions to fit dynamically-typed
languages. You're really talking about developing an entirely new theory.
Anton
| |
| Anton van Straaten 2004-05-15, 12:32 am |
| Bradd W. Szonye wrote:
> The Dragon Book is /the/ book on compiler construction;
Except that it's old enough that the academic scene has changed pretty
dramatically since then.
I bet it doesn't mention compiling with continuations...
Anton
| |
| Bradd W. Szonye 2004-05-15, 3:32 pm |
| Bradd W. Szonye wrote:
Anton van Straaten wrote:[color=darkred]
> Except that it's old enough that the academic scene has changed pretty
> dramatically since then.
The engineering scene hasn't, though. Partly, that's because mainstream
software engineering mostly uses languages like C, C++, Perl, and Java,
which rely heavily on dynamic checking when they bother with type safety
at all. (In C it's even worse, because it relies on the programmer to
insert the dynamic checks.) I don't see that changing any time soon,
either. Perl likes to embrace new models, but it's so ad hoc that
there's probably no hope of it ever having a useful static type system.
I won't comment on Java, because I don't have enough experience with the
language to judge it fairly, and the little contact I've had with the
Java community has been with the minimally-skilled GUI-hacking segment.
C and C++ could possibly take advantage of that academic scene,
especially C++ which (like Perl) has been eager to embrace new stuff.
However, it'd be very slow going. Legacy compatibility is a big deal in
the C/C++ world, and the ISO standards process reinforces that. It takes
about 5 years to get a new standard version rolling, another 5 years to
publish the standard, and another 5 years to see widespread adoption. In
2004, mainstream developers are still reluctant to use C99 features
because support for them is so spotty.
Faster change is possible if the programming community mandated change,
but I just don't see that happening either. The "software crisis" of
low-quality code is decades old now, and experienced programmers have
seen so many automatic and process-oriented "solutions" that they view
new safety techniques with great suspicion, especially when they have a
"holy grail" or "silver bullet" aspect. Furthermore, good engineers see
safety as just one component of a larger benefits + risks + costs
equation, and great engineers realize that software is just one
component of the overall system. Bjarne Stroustrup described this
attitude well in 1994 (/The Design and Evolution of C++,/ p. 93):
I saw, and still see, type checking as a practical tool rather than
a goal in itself. It is essential to realize that eliminating every
type violation in a program doesn't imply that the resulting program
is correct or even that the resulting program cannot crash because
an object was used in a way that was inconsistent with its
definition. For example, a stray electric pulse may cause a critical
memory bit to change its value in a way that is impossible according
to the language definition. Equating type insecurities with program
crashes and program crashes with catastrophic failures such as
airplane crashes, telephone system breakdowns, and nuclear power
station meltdowns is irresponsible and misleading.
People who make statements to that effect fail to appreciate that
the reliability of a system depends on all of its parts. Ascribing
an error to a particular part of the total system is simply pin-
pointing the error. We try to design life-critical systems so that a
single error or even many errors will not lead to a "crash." The
responsibility for the integrity of the system is in the people who
produce the system and not in any one part of the system. In
particular, type safety is not a substitute for testing even though
it can be a great help in getting a system ready for testing.
Blaming programming language features for a specific system failure,
even a purely software one, is confusing the issue ....
While type theory may have advanced greatly in the 10 years since then,
these engineering principles haven't changed. Mathematical proofs cannot
replace testing and monitoring, because computer programs are not just
abstract mathematical constructs; they're also physical machines. Of
course, there's some value in proving the soundness of a program,
because it eliminates one point of failure. However, a good engineer
won't use them unless the benefit outweighs the cost. Practically
speaking, automatic proof techniques must offer significantly more
safety, power, or convenience than what you already get from the
necessary testing and monitoring, or it's not worth the cost to an
engineer. In some domains, like life-critical systems, you can usually
justify even small safety improvements with high costs. However, the
feature- and schedule-driven world of mainstream software engineering
won't accept the costs and constraints of static proof methods until
they get a lot cheaper (or the cost of not having them demonstrably
becomes much worse).
In the meantime, mainstream engineers /will/ take advantage of type
systems in other ways. They encourage abstraction and internal
documentation, which makes code review easier and fosters reliability in
a more general sense. And of course even "small proofs" enable static
optimizations. Pierce notes that the "first type systems in computer
science ... were introduced to improve the efficiency of numerical
calculations," and optimization has been a significant application for
type systems ever since, perhaps the primary application.
Ironically, using type systems to prove optimization safety compromises
overall system safety (in Stroustrup's sense). For example, Pierce
explains that "further efficiency improvements are gained by eliminating
many of the dynamic checks that would be needed to guarantee safety (by
proving statically that they will always be satisfied)." But eliminating
dynamic checks -- a kind of system monitoring -- makes the overall
system more vulnerable to environmental failures like memory errors. In
other words, type systems enable optimization of the abstract machine by
sacrificing some robustness of the physical machine.
> I bet [the Dragon Book] doesn't mention compiling with continuations...
Correct!
/Compiler Design/ by Reinhard Wilhelm and Dieter Maurer (a 1995 survey
of compilation techniques for several programming models) does mention
"continuation addresses" in the context of functional and logic
programming languages, but only in the limited sense of continuations as
a generalization of return addresses, not as a more general model of
control flow.
That book also describes types as a "semantic property" rather than a
syntactic one, and it gives examples of static and dynamic types. If
anything, it's even more at odds with type theory than the Dragon Book.
(By the way, I wouldn't particularly recommend this book except as a
broad introduction to different programming paradigms and their
compilation issues.)
--
Bradd W. Szonye
http://www.szonye.com/bradd
|
|
|
|
|