Home > Archive > Scheme > October 2004 > Type checking in Scheme?
You are viewing an archived Text-only version of the thread.
To view this thread in it's original format and/or if you want to reply to
this thread please [click here]
| Author |
Type checking in Scheme?
|
|
| \(define Author 'Me\) 2004-09-26, 3:56 am |
| Hi I have two questions.
1. How do I go about doing type checking in Scheme?....say I have a
procedure called 'Add' that
is like so:
(Add this that). Well we want it to add this to that BUT ONLY if
they have the same units, ie
if this is '20 dollars' and that is '5 hours' , it's an error since we're
adding two different types of things (dollars to hours).
2. Is there a simple way for a procedureX to know whether the arguments
it is receiving are created by procedureY?
That is to say, if procedureY had not created the arguments x is receiving,
procedureX would simply not execute and return error.
| |
| Ray Dillinger 2004-09-26, 3:56 am |
| (define Author 'Me) wrote:
> Hi I have two questions.
>
> 1. How do I go about doing type checking in Scheme?....say I have a
> procedure called 'Add' that
> is like so:
> (Add this that). Well we want it to add this to that BUT ONLY if
> they have the same units, ie
> if this is '20 dollars' and that is '5 hours' , it's an error since we're
> adding two different types of things (dollars to hours).
Easy to do, I guess, if you have some standard notation
for affixing units to numbers. If for example, you are
using (number . unit) pairs, you can define Add this way:
(define Add
(lambda (this that)
(and (pair? this)
(pair? that)
(eq? (cdr this) (cdr that))
(number? (car this))
(number? (car that))
(cons (+ (car this) (car that)) (cdr this)))))
(Add '(2 . feet) '(3 . feet)) => (5 . feet)
(Add '(2 . feet) '(10 . pounds)) => #f
I hope this isn't homework...
> 2. Is there a simple way for a procedureX to know whether the arguments
> it is receiving are created by procedureY?
> That is to say, if procedureY had not created the arguments x is receiving,
> procedureX would simply not execute and return error.
There is not an easy way to do this. There is an insane way,
but it involves define-syntax, dynamic-wind and call/cc.
Bear
| |
| \(Define\) 2004-09-26, 9:07 am |
|
"Ray Dillinger" <bear@sonic.net> wrote in message
news:A2s5d.15147$54.229892@typhoon.sonic.net...[color=darkred]
> (define Author 'Me) wrote:
we're[color=darkred]
>
> Easy to do, I guess, if you have some standard notation
> for affixing units to numbers. If for example, you are
> using (number . unit) pairs, you can define Add this way:
>
> (define Add
> (lambda (this that)
> (and (pair? this)
> (pair? that)
> (eq? (cdr this) (cdr that))
> (number? (car this))
> (number? (car that))
> (cons (+ (car this) (car that)) (cdr this)))))
>
> (Add '(2 . feet) '(3 . feet)) => (5 . feet)
> (Add '(2 . feet) '(10 . pounds)) => #f
>
> I hope this isn't homework...
>
arguments[color=darkred]
receiving,[color=darkred]
Let's say you wanted ADD to not even execute until you were sure that this
and that were BOTH
legitimate arguments created by procedure CREATE-UNITS or something like
that. ie to say
you can worry about units mismatch only after you're sure you're *getting*
unit-based-arguments (eg. 2 seconds, 9.8m/seconds^2)
in the first place (as compared to junkstrings).
So CREATE-UNITS can handle that, ie create legitimate units ready to
process. But is there a way for ADD to know whether its arguments
are legitimate ie from CREATE-UNITS or not?
| |
| Jesper Louis Andersen 2004-09-26, 9:07 am |
| "\(define Author 'Me\)" <mail@null.com> wrote:
> 1. How do I go about doing type checking in Scheme?....
The simplest way, as already said, is to add a type tag to the value
you are passing and check this type-tag when entering the function.
You can even generalize the check to a separate function and call
this to ensure the arguments are correct.
The problem with the approach is, of course, that you end up typing
more than you normally would. The lack of types in Scheme opens for
some extra possibilities and hopefully also allows for constructing
the code faster.
But if you want types, I think it is far better to go with a language
which supports them from the ground up. It is also my bias that I
prefer statically typed languages.
--
j.
| |
| Ray Dillinger 2004-09-26, 3:57 pm |
| (Define) wrote:
> Let's say you wanted ADD to not even execute until you were sure that this
> and that were BOTH
> legitimate arguments created by procedure CREATE-UNITS or something like
> that. ie to say
> you can worry about units mismatch only after you're sure you're *getting*
> unit-based-arguments (eg. 2 seconds, 9.8m/seconds^2)
> in the first place (as compared to junkstrings).
> So CREATE-UNITS can handle that, ie create legitimate units ready to
> process. But is there a way for ADD to know whether its arguments
> are legitimate ie from CREATE-UNITS or not?
>
>
You just want a user-defined disjoint aggregation type.
It's been done for you as SRFI-9; find it at http://srfi.schemers.org.
It's a library (usually implemented in C, but scheme versions involving
the "insane" method I referred to do exist) that creates records
disjoint from all R5RS scheme types.
Personally, I think it's a bad idea, but if you come from a bondage-
and-discipline tradition like statically typed languages, it probably
seems like a good idea, at least at first. And as long as you
continue to feel like it's a good idea, and you don't need to write
codewalkers or build generic functions that operate on *all* data,
you should probably keep using it.
Disjoint record types are one of the constructs that seems like
a good idea at first, and do help somewhat with the simplest kinds
of abstraction, but eventually start getting in your way because
they constrain higher-level abstractions.
Bear
| |
| Marcin 'Qrczak' Kowalczyk 2004-09-26, 3:57 pm |
| Ray Dillinger <bear@sonic.net> writes:
> Disjoint record types are one of the constructs that seems like
> a good idea at first, and do help somewhat with the simplest kinds
> of abstraction, but eventually start getting in your way because
> they constrain higher-level abstractions.
I disagree. What do you propose instead? Lists starting with a symbol?
Here are their di vantages:
- Errors in putting a wrong kind of a record aren't detected when
fields are extracted, unless you are lucky and the given data is
a shorter list than the expected record type or not a list at all.
Similarly, putting a record when a list is expected doesn't lead
to an immediate error but data is misprocessed.
- There is no compile-time nor runtime checking for the number of
arguments when the "record" value is constructed.
- There are no warnings for misspellings of record types, nor - when
there is an evolving family of record types encoding various cases -
no warnings for using obsolete record types whose definitions would
have been deleted if they were explicit.
- The code to extract a field doesn't use field name, it's not clear
what caddr means in this context.
- You can't grep code for extraction of a particular field, caddr is
too ambiguous.
- If you represent two-argument records as conses rather than lists
with two-elements, code using them breaks when a third field is
added later. Even if you use lists, code using them breaks when
a field is removed - not only code using that particular field,
but code using later fields as well.
--
__("< Marcin Kowalczyk
\__/ qrczak@knm.org.pl
^^ http://qrnik.knm.org.pl/~qrczak/
| |
| Matthias Blume 2004-09-26, 8:57 pm |
| Ray Dillinger <bear@sonic.net> writes:
> Disjoint record types are one of the constructs that seems like
> a good idea at first, and do help somewhat with the simplest kinds
> of abstraction, but eventually start getting in your way because
> they constrain higher-level abstractions.
*BS*
| |
| Ray Dillinger 2004-09-27, 3:57 am |
| Matthias Blume wrote:
> Ray Dillinger <bear@sonic.net> writes:
>
>
>
>
> *BS*
I acknowledge that there's room to disagree, and if you
won't ever need to write a code-walker or a data-walker,
you can ignore my opinion. I'm not going to pursue an
argument about it.
Bear
| |
| Ray Dillinger 2004-09-27, 3:57 am |
| Marcin 'Qrczak' Kowalczyk wrote:
> Ray Dillinger <bear@sonic.net> writes:
>
>
>
>
> I disagree. What do you propose instead? Lists starting with a symbol?
> Here are their di vantages:
....Elided...
Hey, let's not have an argument. I told him in so many
words that as long as he continues to think they're a good
idea, he should keep using them.
I don't see where that leaves us anything to argue about.
Bear
| |
| \(Define\) 2004-09-27, 4:02 pm |
|
"Ray Dillinger" <bear@sonic.net> wrote in message
news:1RB5d.15175$54.231038@typhoon.sonic.net...[color=darkred]
> (Define) wrote:
>
this[color=darkred]
*getting*[color=darkred]
>
I'm not sure if it would work but can a creating procedure's id be embedded
into the argument? In other words
say we have Define Salary (create-salary 60000 dollars)......now apart from
concatenating 60000 with dollars, create-salary also adds a 'Createdyby' tag
to each salary it creates
Then we have (define IsSalary? S) such that issalary checks the cddr of S
for the 'createdby' tag.
Finally, (Add Bonus Salary) wouldn't even execute until it checks with
IsSalary? to make sure both bonus and salary were created by CREATE-SALARY.
| |
| David Rush 2004-09-27, 4:02 pm |
| Matthias Blume <find@my.address.elsewhere> writes:
> Ray Dillinger <bear@sonic.net> writes:
>
>
> *BS*
*Harsh*. "BS-ish", if you please.
I really wish people would get over the data-representation issue. All
you really need to know is the functional interface. And if you need
to write a code or data-walker, you need to keep a registry of
canonical representation transformers. This is not a Big Deal. I've
implemented several lightweight persistence interfaces, and am finally
doing a full-featured one based entirely on this principle.
david rush
--
The truth may be out there, but lies are inside your head.
-- Terry Pratchett, in _Hogfather_
| |
| Ray Dillinger 2004-09-27, 4:02 pm |
| David Rush wrote:
> All
> you really need to know is the functional interface. And if you need
> to write a code or data-walker, you need to keep a registry of
> canonical representation transformers. This is not a Big Deal. I've
> implemented several lightweight persistence interfaces, and am finally
> doing a full-featured one based entirely on this principle.
Interesting!
My best idea for doing a code or data-walker in the presence
of user-defined types was that the programmer simply needs
some universal accessors and operators. I'd been thinking of
using integer calls as universal accessors, based on an idea
I got from Paul Graham's Arc papers. For example, (0 foo)
can be read as (car foo) or (vector-ref foo 0) or (string-ref
foo 0) or (array-ref foo 0 0 0) for a 3-dimensional array, or
whatever other "get me the first element" operation is
appropriate, including the specially-named methods for accessing
record and object fields. It would be easy to do at the level
of language implementation, (get the nth lisp value in the
object) but much harder to do within the constraints of scheme.
In scheme, since you can't call a number or give a number
procedure semantics portably, this was going to become
(%accessor% 0 foo) and (%setter%! 0 foo) where %accessor%
and %setter%! are universal accessing and setting functions,
named with % marks because they explicitly break type
opacity in the case of records.
But %accessor% and %setter%! are hard to implement efficiently
in scheme, and have to keep a database of accessor functions
and type predicates and (in the case of %setter%!) set type
constraints, and the database has to be updated and managed at
every twist and turn, and a lot of record libraries don't even
expose getters and setters for all of their fields. I
eventually concluded that if you think you might *ever* need
code walkers, it's best to just avoid aggregate types.
Have you found an efficient, reliable way to implement this?
Bear
| |
| David Rush 2004-09-28, 9:04 am |
| Ray Dillinger <bear@sonic.net> writes:
> David Rush wrote:
>
<chop>[color=darkred]
> But %accessor% and %setter%! are hard to implement efficiently
> in scheme, and have to keep a database of accessor functions
> and type predicates and (in the case of %setter%!) set type
> constraints, and the database has to be updated and managed at
> every twist and turn, and a lot of record libraries don't even
> expose getters and setters for all of their fields.
<chop>
> Have you found an efficient, reliable way to implement this?
For certain values of 'efficient' and 'reliable' perhaps. I suspect
that my constraints will not be pleasing to you. I generally do not
use the REPL as an inbuilt command interface, so the type structure of
my programs is probably rather more static than in that environment. I
also make heavy use of functors in my code which has two stylistic
side-effects. I write very little code at top-level. I also routinely
hand sets of functions off to various susbsystems. So yes, I have an
efficient and reliable way to do this within my coding style.
I should also point out that the generic[1] library version of this
persistence methodology is still a work in progress, my architecture
may change if it doesn't work out. I believe that this method could
easily be incorporated into the macrology used to implement SRFI-9,
although I haven't bothered to do it yet. It would certainly be more
graceful if your style of programming uses primarily first-order
abstractions to have it rolled into the record type declaration,
anyway.
I do think the problem gets quite a bit more difficult in a
traditional Lisp/REPL oriented environment. The versioning issues
associated with structure re-definition can be quite frustrating from
a user view. The implementor's pov appears to be difficult as well, or
so my p ing under the hood at S48 internals has led me to believe.
david rush
1 - previous implementations have had a certain amount of
application-specific knowledge built into them. This is actually a
classic refactoring case study...
--
The Torah is written in black fire inscribed upon white fire - fire
mixed with fire, hewn out of fire and given from fire
-- 3rd Century Palestinian Merkavah Haggadah
| |
| Ray Dillinger 2004-09-28, 4:13 pm |
| All told, I think it's much easier to build universal
accessors and setters at the implementation level than
at the program code level.
I don't know how many implementations work this way, but
for me all the aggregation types - arrays, vectors,
pairs, string segments, etc. - are the same underlying
binary type. I can find the boundaries of the soft
pointers array, the boundaries of the lisp values array,
and the boundaries of the aux-data area without knowing
or caring what exact *kind* of aggregation type it is.
So it's easy, in the guts of the implementation, to write
a get-the-nth-value or a set-the-nth-value function; but
much harder at the program code level where you have to
laboriously keep track of mappings to type-specific
accessors to do it.
So I'm making two extensions: nonnegative integers have
call semantics as a universal accessor, and if the first
argument of set! is a nonnegative integer, it acts as a
universal slot setter where (set! n foo bar) will set the
nth slot of the aggregate-type foo to the value bar.
This means I don't have to keep track of a registry of
accessors and setters, and I can write code and data
walkers as needed even if user-defined record types are
allowed.
But this isn't, really, a scheme compiler any more.
Between mu functions, a dangerous defmacro that can
capture local variables, and lots and lots of different
basic constants (numbers, strings, characters) having
call semantics like functions, it's definitely turning
into something else.
Bear
| |
| David Rush 2004-09-29, 5:00 am |
| Ray Dillinger <bear@sonic.net> writes:
> But this isn't, really, a scheme compiler any more.
> Between mu functions, a dangerous defmacro that can
> capture local variables, and lots and lots of different
> basic constants (numbers, strings, characters) having
> call semantics like functions, it's definitely turning
> into something else.
Well it doesn't sound like RnRS for any `n', but IIRC you still have:
lexical scope
full TCO
s-expression syntax
first-class & anonymous functions
which would definitely put it into the Scheme family, yes? Is your
language design still a Lisp-1?
david rush
--
As I've gained more experience with Perl it strikes me that it resembles
Lisp in many ways, albeit Lisp as channeled by an awk script on acid.
-- Tim Moore (on comp.lang.lisp)
| |
| Ray Dillinger 2004-09-29, 8:09 pm |
| David Rush wrote:
> Ray Dillinger <bear@sonic.net> writes:
>
>
>
> Well it doesn't sound like RnRS for any `n', but IIRC you still have:
>
> lexical scope
> full TCO
> s-expression syntax
> first-class & anonymous functions
>
> which would definitely put it into the Scheme family, yes? Is your
> language design still a Lisp-1?
>
scheme "family", using the term loosely. I tossed out scheme
macrology because I thought it was wrong. I'm going with a
case-sensitive design because it was much less of a pain-in-
the-butt than trying to get unicode case-folding working
correctly. And several other blatant "this is not scheme"
standard violations and extensions including inexact booleans
(which you can get, for example, from comparing inexact
numbers).
Uh... as for it being a lisp-1, I don't do the common-lisp
thing where namespace is separated by callable vs. noncallable
values, but variables can have any number of named attributes.
And the attributes can have any number of named attributes,
ad nauseam. This is what I use instead of records, objects,
and module namespaces.
So, a compiled module, when imported, simply becomes a top-level
variable in the current module that has various slots, just like
a "record" defined in the current module. Things defined in
the imported module, you can refer to as module:thing, the same
way you refer to record:slot for records in the current module.
I guess "attributes" have been a part of lots of lisps, including
several schemes, and are the traditional "lispy" way to do what
is done in statically-typed languages with record or object types;
But by implementing them with hash tables, I can make them a lot
more efficient and use them for the module system as well.
Anyway, here's a fairly obvious example of how it's used
to aggregate a bunch of values (the way you expect to use
records).
(define in-circle? (lambda (point cir)
"takes two arguments, a point and a circle.
returns true if the point is in the circle."
(< (+ (sqr (- point:x cir:center:x))
(sqr (- point:y cir:center:y)))
(sqr (cir:radius))))
"lambda" in this instance returns a function value; and
the function value also has a named attribute named docstring.
So after this definition, in-circle? has a function value, and
in-circle:docstring is set to the string explaining what the
function does. The function itself accesses attributes in
its arguments, which will be defined if its arguments are
in fact a point and a circle (assumed to have definitions
elsewhere in the same module).
If any of the slots are undefined, a reference to them
produces an error token. Mathematical operations involving
an error token produce a NaN (and an error message at the
console if linked with the REPL). A less-than check on
a number and a NaN, or on two NaN's, will produce a NaB
(and an error message at the console if linked with the
REPL).
So this function returns true or false on valid arguments,
or it produces a NaB (Not a Boolean) if you hand it two
variables that don't have these slots defined.
It doesn't really care whether these are simple records
or imported modules that have these variables bound at
their toplevel; it's all the same representation.
This is also one of the reasons I'm not doing scheme
macrology; macros and mu functions are first-class,
runtime objects that can be imported from another
module, stored in a variable, or returned from a
function.
Bear
| |
| Marcin 'Qrczak' Kowalczyk 2004-09-29, 8:09 pm |
| Ray Dillinger <bear@sonic.net> writes:
> And several other blatant "this is not scheme" standard violations
> and extensions including inexact booleans (which you can get, for
> example, from comparing inexact numbers).
How does 'if' behave for inexact booleans?
> If any of the slots are undefined, a reference to them
> produces an error token. Mathematical operations involving
> an error token produce a NaN (and an error message at the
> console if linked with the REPL). A less-than check on
> a number and a NaN, or on two NaN's, will produce a NaB
> (and an error message at the console if linked with the
> REPL).
I would suppose that this design leads to harder finding of bugs,
where the result of a complicated function is NaN/NaB/etc. and
you don't know which input was bad - compared to signalling the
problem as soon as it's detected.
--
__("< Marcin Kowalczyk
\__/ qrczak@knm.org.pl
^^ http://qrnik.knm.org.pl/~qrczak/
| |
| Ray Dillinger 2004-09-29, 8:09 pm |
| Marcin 'Qrczak' Kowalczyk wrote:
> Ray Dillinger <bear@sonic.net> writes:
>
>
>
>
> How does 'if' behave for inexact booleans?
At the moment, the same way it behaves for exact ones.
However, when I am done with it, it should probably
make the if expression return inexact results - which
implies inexact characters, strings, etc. Erf. Hadn't
thought of that. Definitely should have.
[color=darkred]
> I would suppose that this design leads to harder finding of bugs,
> where the result of a complicated function is NaN/NaB/etc. and
> you don't know which input was bad - compared to signalling the
> problem as soon as it's detected.
>(in-circle? 2 4)
#NaV errortoken produced in 2:x : line 425
#NaV errortoken produced in 4:center:x : line 425
#NaN produced in (- #NaV #NaV) : line 425
#NaV errortoken produced in 2:y : line 426
#NaV errortoken produced in 4:center:y : line 426
#NaN produced in (- #NaV #NaV) : line 426
#NaN produced in (+ #NaN #NaN) : line 425
#NaV errortoken produced in 4:radius : line 427
#NaN produced in (sqr #NaV) : line 427
#NaB produced in (< #NaN #NaN) : line 425
#NaB returned from in-circle? : line 423
#NaB produced in (in-circle? 2 4) : current prompt
=> #NaB
>(%show-lines% 423 427)
(define in-circle? (lambda (point cir)
"takes two arguments, a point and a circle.
returns true if the point is in the circle."
(< (+ (sqr (- point:x cir:center:x))
(sqr (- point:y cir:center:y)))
(sqr (cir:radius))))
=> '()
I'm not having a problem finding bugs... and if you
don't consider this to be signalling the "error" as
soon as detected, you're wrong.
Besides, it might not *be* an error. NaB is a value that
has semantics, and correct code can try something, get
a NaB, detect it, then try to do it a different way. Of
course, if it's correct code, it needs an explicit way
to suppress debug messages when run under the REPL, and
I haven't built that yet.
Bear
| |
|
|
| Marcin 'Qrczak' Kowalczyk 2004-09-30, 11:37 am |
| Ray Dillinger <bear@sonic.net> writes:
>
>
> I just wanted to point out that pretending I said something
> else, and then pointing out the di vantages of what you
> pretend I said, is considered very poor form.
Ok, I'm sorry.
So what do you propose instead?
--
__("< Marcin Kowalczyk
\__/ qrczak@knm.org.pl
^^ http://qrnik.knm.org.pl/~qrczak/
| |
| Richard C. Cobbe 2004-09-30, 11:37 am |
| Ray Dillinger <bear@sonic.net> writes:
> Marcin 'Qrczak' Kowalczyk wrote:
>
> At the moment, the same way it behaves for exact ones.
> However, when I am done with it, it should probably
> make the if expression return inexact results - which
> implies inexact characters, strings, etc. Erf. Hadn't
> thought of that. Definitely should have.
What about the following?
(if (expr-which-evals-to-inexact-boolean)
"foo"
42)
Do you return an inexact-string-or-number? Or do you constrain the two
arms of an if to have the same type?
Or, even better, what about this?
(if (expr-which-evals-to-inexact-boolean)
"foo")
I really don't think inexact control flow is a very good idea. What
happens if your termination condition in a recursive function produces an
inexact boolean? Do you keep looping until it's exact? (Laziness might
help here, but I don't see how you can apply it in all cases.)
Richard
| |
| Richard C. Cobbe 2004-09-30, 4:08 pm |
| cobbe@ccs.neu.edu (Richard C. Cobbe) writes:
> Ray Dillinger <bear@sonic.net> writes:
>
>
> What about the following?
>
> (if (expr-which-evals-to-inexact-boolean)
> "foo"
> 42)
>
> Do you return an inexact-string-or-number? Or do you constrain the two
> arms of an if to have the same type?
>
> Or, even better, what about this?
>
> (if (expr-which-evals-to-inexact-boolean)
> "foo")
And even more betterer (or something like that),
(if (expr-which-evals-to-inexact-boolean)
(begin (display "in first arm")
(newline)
3)
4)
I can see this evaluating to an inexact integer, but how do you "inexactly"
print something?
Richard
| |
| Joe Marshall 2004-09-30, 4:08 pm |
| cobbe@ccs.neu.edu (Richard C. Cobbe) writes:
> And even more betterer (or something like that),
>
> (if (expr-which-evals-to-inexact-boolean)
> (begin (display "in first arm")
> (newline)
> 3)
> 4)
>
> I can see this evaluating to an inexact integer, but how do you "inexactly"
> print something?
Like this:
``And even more better, what abooleastern Universitely"
print something like thating to an inexact arm")''
| |
| Ray Dillinger 2004-09-30, 4:08 pm |
| Marcin 'Qrczak' Kowalczyk wrote:
> Ray Dillinger <bear@sonic.net> writes:
>
>
>
>
> Ok, I'm sorry.
>
> So what do you propose instead?
>
An aggregation type that plays nice with reflection.
That means:
A) it's not fully disjoint. There's a *SINGLE* predicate
for it among the disjoint types, probably named
(record? datum) or (object? datum). Beyond that,
there should be some more-specific type predicates
that can be defined by the same library that defines a
particular record type, and also some way to ask it,
"what exact kind of record are you?" without having to
know in advance about every possible record type somebody
might ever define or maintain a list of predicates.
B) It's transparent. code in another module must not
need to know all the field names of a particular record
in order to iterate, without errors, through the values
stored in those fields. And doing so should be simple.
Beyond that, there should definitely be named accessors
corresponding to the field names that can be defined by
the same library that defines a particular record type.
The basic idea is that code which has to know about the
semantics of the particular record type can be written
using the knowledge: with named accessors, named constructors
of known arity, etc, that will cause people to know when
they're getting it wrong. But code that cannot know about
the semantics of the record type (for example, a routine from
a generic library that is supposed to comb arbitrary data of
any structure looking for references to a particular object)
can still be written using universal accessors. So, "disjoint
types" as commonly implemented, simply don't meet these
requirements.
The rest isn't *as* important, but I personally don't like
_having_ to have type declarations of any kind; In my
universe, declare-record-type, and all operations specialized
to particular types of record, is probably just function
wrappers and syntactic sugar over universal record operations
that can create a record without having to decide what record
subtype it is, add or delete any named slot to or from any
record, interrogate any record to find out what its field names
are, and arbitrarily mutate fields, including a field that
gives the record subtype.
And these universal operations should be available to the user,
in case he wants to implement a record or object library (or
even a single record or object type) that works differently
from the "sugaring" provided. This is of course a complete
rejection of even the last vestiges of disjointness; it makes
all records be literally the same type, and the differences
between record types just a matter of syntactic sugar, a field
in each record that stores its subtype name, and a family of
function wrappers to give each subtype its own named and
type-checked constructors and accessors.
In general, I come down on the "reflection" side of the debate
between proponents of reflection/expressiveness and proponents
of encapsulation/protection. I figure programmers are supposed
to be smart people; we should rely on them to know when to not
use reflection, and trust them to know exactly how much
encapsulation and abstraction they need.
Bear
| |
| Marcin 'Qrczak' Kowalczyk 2004-09-30, 4:08 pm |
| Ray Dillinger <bear@sonic.net> writes:
> An aggregation type that plays nice with reflection.
> That means:
[...]
If I'm not mistaken, the system I designed for my language fits your
requirements. You have:
- An implicit supertype of record types, which allows to distinguish
them from other objects.
- Given a record object and a field name as a symbol, you can get
the field from the object.
- Given a record object, you can get the list of its field names.
- Given a record object, you can get the constructor function which
makes an object of the same type from field values given in order.
- Given a named function, you can get its name. Record constructors
are named functions.
This infrastructure is provided by the language. The following
operations, which are in general dispatched on types of arguments, are
defined in terms of the above on the record supertype, so they are
available for all record types unless overridden to behave differently
for a particular type:
- Equality of record values.
- Hashing of record values.
- Conversion of a record value to a string.
- Given a record value and a list of "field name, field value" pairs,
a non-destrucively updated version which differs from the original
in these fields.
In fact you can also define the core record functions for some other
type, which then mimics a record but can be implemented differently.
Except that getting a named field from an object is the behavior of
the object itself, you can't change it afterwards, so this would have
to be anticipated from the beginning.
This system does use "disjoint record types", that's why I disagreed
with your claim that it's a bad idea. Disjoint record types *are*
compatible with generic code which doesn't have to know particular
record types of objects it works with.
* * *
This system is designed for immutable records. Objects with mutable
fields exist too, but they don't have these introspective capabilities
because it's not clear what exactly and in what form should be
provided. It makes no sense to ask about fields of an arbitrary object
- it might behave as if it had an infinite number of fields, yet be
implemented completely differently; introspection shouldn't break
invariants which are preserved by normal ways of construction, etc.
So I left this out.
This would have to be different when ported to Scheme. Scheme
programmers expect all compound objects to be mutable, so records
should be mutable too. Equality is designed differently. Generic
functions are only available in some CLOS variants, etc. I don't
have a complete design localized to Scheme in mind.
* * *
Scheme doesn't use type objects. The closest thing is type predicates.
Type objects are used for the things that type predicates are used
for, but also for specialization of generic functions, and for making
dictionaries indexed by types. In my language type objects are *not*
used for making new objects of that type, unlike in Python, Smalltalk,
and Ruby.
Perhaps it would be less intrusive for Scheme to not add type objects
as separate entities, but to promote type predicates to type objects,
which would have other uses apart from checking type membership of an
object (in a similar way as Python has once promoted some conversion
functions to types, while they continue to work like constructors).
This means that for a type predicate:
- you can obtain the most specific type predicate of an object,
- you can use a type predicate as a supertype of another type,
- you can check whether one type predicate implies another,
- they are used as specializers in some CLOS port,
- they are appropriate as dictionary keys, i.e. hashable,
- you can obtain a printed name of the type from it.
This list corresponds to what you can do with a type object in my
language. I'm not familiar with Scheme CLOS ports, so I don't know
whether they use some other type objects or they reuse predicates as
types.
* * *
What is reasonable, but I happen to not provide it (at least yet):
- Given a type predicate, the ability to query properties of records
of that type (the list of fields, the constructor function).
- The ability to create new record types completely dynamically,
where the list of field names is not known at compile time (except
by implementing them manually and specializing relevant functions).
* * *
Complementary to records are singletons, i.e. types with a single
inhabitant. In fact both records and singletons came as a translation
of ML/Haskell-style algebraic types to the dynamically typed world.
For singletons you have:
- An implicit supertype of singleton types.
- Given a singleton object, its printed name.
--
__("< Marcin Kowalczyk
\__/ qrczak@knm.org.pl
^^ http://qrnik.knm.org.pl/~qrczak/
| |
| Ray Dillinger 2004-10-01, 3:56 am |
| Marcin 'Qrczak' Kowalczyk wrote:
> If I'm not mistaken, the system I designed for my language fits your
> requirements. You have:
>
> - An implicit supertype of record types, which allows to distinguish
> them from other objects. <clip>
And then contradicted himself by saying....
> This system does use "disjoint record types", that's why I disagreed
> with your claim that it's a bad idea. Disjoint record types *are*
> compatible with generic code which doesn't have to know particular
> record types of objects it works with.
Dude, these are not disjoint. Or perhaps we're just disagreeing
on what "disjoint" means while violently agreeing on what
properties are desirable in an aggregate-type system.
The fact that you have a supertype of which all instances of any
record are members makes them non-disjoint. You can still publish
a complete list of disjoint types, which is going to include
numbers, characters, strings, records .... and with more or less
push and pull, people can write code that iterates over unknown
objects including objects of any user-defined type that may come
along.
The people who want every aggregation they dream up and name, to
be a new _disjoint_ type, make systems in which reflective code
cannot be written.
> This system is designed for immutable records. Objects with mutable
> fields exist too, but they don't have these introspective capabilities
> because it's not clear what exactly and in what form should be
> provided. It makes no sense to ask about fields of an arbitrary object
> - it might behave as if it had an infinite number of fields, yet be
> implemented completely differently; introspection shouldn't break
> invariants which are preserved by normal ways of construction, etc.
> So I left this out.
Immutable records are fine with me, but you better provide
a copy constructor that allows "copies" to be created with
specified differences.
> This would have to be different when ported to Scheme. Scheme
> programmers expect all compound objects to be mutable, so records
> should be mutable too. Equality is designed differently. Generic
> functions are only available in some CLOS variants, etc. I don't
> have a complete design localized to Scheme in mind.
I dunno; scheme supports functional programming so well that
I find mutation is usually fairly optional. I wouldn't call
it a must-have for a scheme record library. Functional
programming (ie, the style to use with immutable records) is
actually a little harder in Common Lisp because of the split
namespace.
> Scheme doesn't use type objects. The closest thing is type predicates.
> Type objects are used for the things that type predicates are used
> for, but also for specialization of generic functions, and for making
> dictionaries indexed by types. In my language type objects are *not*
> used for making new objects of that type, unlike in Python, Smalltalk,
> and Ruby.
and SRFI-9.
Bear
| |
| Marcin 'Qrczak' Kowalczyk 2004-10-01, 3:56 am |
| Ray Dillinger <bear@sonic.net> writes:
>
> And then contradicted himself by saying....
>
>
> Dude, these are not disjoint. Or perhaps we're just disagreeing
> on what "disjoint" means while violently agreeing on what
> properties are desirable in an aggregate-type system.
We must disagree what "disjoint" means. Because for me it's not
contradictory at all.
Disjoint in this context means that particular record types don't
share values with other types (vectors, lists etc.). They also don't
share values with each other, but this is obvious.
This is in contrast to e.g. Erlang, where records are expressed as
tuples having the record type name (symbol) as the first element.
They reuse a preexisting type for representation: a record is also a
tuple, and you can't reliably distinguish whether a tuple was meant
to be a record or it only happens to contain some symbol at the
beginning.
Also e.g. in Perl strings are not disjoint from numbers; in fact
numbers are a subset of strings. But in Python they are disjoint:
no value is a number and string at the same time.
> The fact that you have a supertype of which all instances of any
> record are members makes them non-disjoint. You can still publish
> a complete list of disjoint types, which is going to include
> numbers, characters, strings, records .... and with more or less
> push and pull, people can write code that iterates over unknown
> objects including objects of any user-defined type that may come
> along.
Well, I didn't say that records are the only user-defined types.
Code can still meet an object of a type it knows nothing about,
e.g. ncurses window handle. It can obtain its type object, see its
name, see that it doesn't have any interesting supertypes, but it
can't perform any useful operation on the window since it doesn't
know which operations make sense for it.
> Immutable records are fine with me, but you better provide
> a copy constructor that allows "copies" to be created with
> specified differences.
As I said:
| - Given a record value and a list of "field name, field value" pairs,
| a non-destrucively updated version which differs from the original
| in these fields.
--
__("< Marcin Kowalczyk
\__/ qrczak@knm.org.pl
^^ http://qrnik.knm.org.pl/~qrczak/
| |
| Ray Dillinger 2004-10-01, 3:58 pm |
| Marcin 'Qrczak' Kowalczyk wrote:
> Ray Dillinger <bear@sonic.net> writes:
>
>
>
> We must disagree what "disjoint" means. Because for me it's not
> contradictory at all.
>
> Disjoint in this context means that particular record types don't
> share values with other types (vectors, lists etc.). They also don't
> share values with each other, but this is obvious.
Many of the systems I've seen use it to invoke the specter
of scheme's short list of "disjoint types." In standard
scheme, you can write reflective code by including a case
for each possible type that might be found in data. When
the set is arbitrarily extensible, it becomes impossible
to do this in a modular way because you can't write code
that *knows* what the possible types are.
By making "record" a type that you can do something useful
with, even if you don't know in advance the particular name
of the record type or the particular names of its fields, you
make a new disjoint type "record" -- but the fact that I don't
have to know what exact record type something is in order to
process it means that the disjoint type, in the R5RS sense,
is "records", not every possible user-definable subtype of
record. Basically, I'm defining disjointness in terms of
operations -- individual record types may be disjoint in terms
of values, but there is a (complete) set of useful operations
which apply to all of them, so they are not disjoint in terms
of operations.
>
>
> Well, I didn't say that records are the only user-defined types.
>
> Code can still meet an object of a type it knows nothing about,
> e.g. ncurses window handle. It can obtain its type object, see its
> name, see that it doesn't have any interesting supertypes, but it
> can't perform any useful operation on the window since it doesn't
> know which operations make sense for it.
Why is that disjoint from "record"? It's an aggregate with
several numbers, a few functions (which are going to be wrappers
for FFI functions), an array of characters, an array of colors,
and a port (which is going to be an FFI wrapper for a hardware
port), right? I'd use a record to represent it.
Bear
| |
| Marcin 'Qrczak' Kowalczyk 2004-10-01, 3:58 pm |
| Ray Dillinger <bear@sonic.net> writes:
> Many of the systems I've seen use it to invoke the specter
> of scheme's short list of "disjoint types." In standard
> scheme, you can write reflective code by including a case
> for each possible type that might be found in data. When
> the set is arbitrarily extensible, it becomes impossible
> to do this in a modular way because you can't write code
> that *knows* what the possible types are.
What is the practical difference for such generic code between an
object of an unknown type and a procedure? You can't apply a procedure
if you don't know what arguments it expects. You can't p inside its
contents. You can only pass it around, compare it by eq? or eqv? with
other objects, maybe try to write it somewhere, with no guarantee that
the effect will be readable or otherwise useful. The same you could do
with an opaque object of a non-standard type you know nothing about.
>
> Why is that disjoint from "record"?
It's a wrapper for an opaque C pointer with a finalizer. You can't do
anything useful with it if you aren't prepared for handling ncursers
window handles in particular. There are no interesting generic
operations which would make sense for it, except perhaps those that
you can rely on anyway without knowing the type (eqv?). So I see no
gain in pretending that the type is known.
--
__("< Marcin Kowalczyk
\__/ qrczak@knm.org.pl
^^ http://qrnik.knm.org.pl/~qrczak/
| |
| Ray Dillinger 2004-10-01, 3:58 pm |
| Marcin 'Qrczak' Kowalczyk wrote:
> Ray Dillinger <bear@sonic.net> writes:
>
>
>
>
> What is the practical difference for such generic code between an
> object of an unknown type and a procedure? You can't apply a procedure
> if you don't know what arguments it expects.
Most introspection functions on procedures are extensions in
scheme, but in principle, a scheme implementor should be able
to provide functions like (arity fn) which returns the number
of arguments, (argtype fn 0) which returns a list of acceptable
types for argument 0 of fn, (docstring fn) which returns the
docstring for the function, (side-effecting? fn) which indicates
whether it has any side effects, (side-effected? fn) which
indicates whether its behavior is inflenced by anything outside
its arguments, (stateful? fn) which indicates whether it has
internal state that could cause its behavior to differ on
the same arguments from one call to the next, etc... In
fact, if you're writing a compiler, you get most of these "for
free" since you'll figure them out anyway in the course of doing
routine optimizations. Hell, how about having an 'unlambda'
function that takes a procedure and returns a list (starting
with the symbol 'lambda') which, if given to 'eval', would
produce an eqv procedure?
And if the implementor was kind enough to guarantee a signal
return in case of error (rather than crashing and halting) you
can indeed call an unknown function and be confident of not
crashing. If it's pure-functional (not stateful, side-effecting,
or side-effected), you can even call it and be confident that
you're not going to cause any changed behavior by doing so.
My point though, is that you *know* what you can do with
procedures. It can be more, or less, in any given scheme
implementation, but if you know the implementation, at least
you can write libraries within its capabilities that will work
modularly with any scheme code.
But the thing with records is they're too simple to present the
kind of implementation difficulties that good introspection on
procedures runs into. They're just aggregations, for crying out
loud! There's no excuse for having them tucked under themselves
in one top-level disjoint type each, where they're unknowable and
unusable to a modular library written outside the current program.
Bear
| |
| Nic Ferrier 2004-10-01, 3:58 pm |
| Ray Dillinger <bear@sonic.net> writes:
> My point though, is that you *know* what you can do with
> procedures. It can be more, or less, in any given scheme
> implementation, but if you know the implementation, at least
> you can write libraries within its capabilities that will work
> modularly with any scheme code.
>
> But the thing with records is they're too simple to present the
> kind of implementation difficulties that good introspection on
> procedures runs into. They're just aggregations, for crying out
> loud! There's no excuse for having them tucked under themselves
> in one top-level disjoint type each, where they're unknowable and
> unusable to a modular library written outside the current program.
I agree... I've just been struggling with building a simple OO system
that resulted in disjoint types into my baby Scheme implementation. It
seemed to me, after some analysis (long dog walks in the rain) that
systems relying on the idea of polymorphism just didn't work.
I think a LISP with polymorphism would be great. ISLISP seemed
promising to me but I couldn't find a usable specification (the PDF
was too wierd and the PS was only readable in one direction).
But even ISLISP seemed too big. A LISP with the minimalist spirit of
Scheme but that was polymorphic would be good. Maybe someone should
tell Paul Graham.
Nic
| |
| Bradd W. Szonye 2004-10-21, 8:57 pm |
| Marcin 'Qrczak' Kowalczyk wrote:
Ray Dillinger <bear@sonic.net> wrote:[color=darkred]
> At the moment, the same way it behaves for exact ones.
> However, when I am done with it, it should probably
> make the if expression return inexact results - which
> implies inexact characters, strings, etc. Erf. Hadn't
> thought of that. Definitely should have.
That reminds me of variable tainting in perl (a mechanism for tracking
untrusted data).
--
Bradd W. Szonye
http://www.szonye.com/bradd
|
|
|
|
|