Home > Archive > Prolog > March 2005 > Documenting Large Prolog Systems
You are viewing an archived Text-only version of the thread.
To view this thread in it's original format and/or if you want to reply to
this thread please [click here]
| Author |
Documenting Large Prolog Systems
|
|
| David Poole 2005-02-28, 3:58 am |
| Hi All,
What experiences do people have documenting large Prolog systems?
We are looking for something more than predicate or module-level
documentation; a way for a new user to understand the structure of a
large Prolog system. Something like UML, but for relations not just
describing objects. Has anyone tried documenting a Prolog program in
UML? Our intuition is that this is not a good place to start, but I
thought I would ask about what are good places to start. Any experience
you have would be appreciated.
The Prolog program we are using is part of a larger system that is being
documented in UML. I'd appreciate any feedback from anyone who has tried
to document large Prolog systems.
Thanks,
David
p.s. I asked about this in the SWIPL mailing list, but with no answer.
--
David Poole, Office: +1 (604) 822-6254
Professor, Fax: +1 (604) 822-5485
Department of Computer Science, poole@cs.ubc.ca
University of British Columbia, http://www.cs.ubc.ca/spider/poole
Vancouver, B.C., Canada V6T 1Z4 ftp://ftp.cs.ubc.ca/ftp/local/poole
| |
|
| On Sun, 27 Feb 2005 21:50:17 -0800, David Poole <poole@cs.ubc.ca>
wrote:
>Hi All,
>What experiences do people have documenting large Prolog systems?
>
>We are looking for something more than predicate or module-level
>documentation; a way for a new user to understand the structure of a
>large Prolog system. Something like UML, but for relations not just
>describing objects. Has anyone tried documenting a Prolog program in
>UML? Our intuition is that this is not a good place to start, but I
>thought I would ask about what are good places to start. Any experience
>you have would be appreciated.
>
>The Prolog program we are using is part of a larger system that is being
>documented in UML. I'd appreciate any feedback from anyone who has tried
>to document large Prolog systems.
I am using Z, strictly speaking, some subset of Object-Z. Prolog
system is encapsulated as Java classes, therefore UML can be used on
the client side.
Maybe Z is not the ideal tool, but good enough...
A.L.
| |
| Gregory Bourassa 2005-03-01, 3:58 am |
| On Sun, 27 Feb 2005 21:50:17 -0800, David Poole wrote:
David,
I have documented large Prolog systems, and it is not too hard. That said,
any of the off-the-shelf documentation tools like UML will be tortured if you
try to use them -- they have no representation for backtracking, failure,
unification, and multiple solutions from backtracking.
Your best bet is to assume an "Encyclopoedia of predicates" approach and
document the main predicates -- not the helpers which are only there to
support interesting predicates. For each predicate or relation you should
document:
Name
Arity
Known flow patterns (bound on call, free on call for each
argument)...e.g. (var, var, nonvar) (var,var,var) (novar,var,nonvar), or
(free,free.bound) etc., or (i,i,o)...whatever notation you like as long as
you do it.
A comment for each flow pattern
Some comments on the expected terms which can be passed as arguments
-- this is very important to maintainers. Let's be practical, rarely can
every argument can be of every functor type and still work!
Whether or not failure can happen
What is returned on multiple entries due to backtracking
Whether or not the predicate can backtrack (deterministic or
nondeterministic)
Whether or not you can expect any system-level exceptions and what
they might be
What data (facts) if any are expected to exist in the environment at
run-time for the predicate to succeed
....and probably other things, but they escape my memory just now -- the above
constitutes a good start
If you can, partition the system into "packages". There is usually a set of
list-handling and other predicates which are just "utilities" used
everywhere, then there will be other modules or packages which you can
recognise, which do things like file access, IP communications,
pretty-printing, graphics, and finally the various parts of the core
behaviour of your system.
Let the rest of the project team know that the Prolog subsystem is
meta-relational and meta-object-oriented, so they will have to deal with
your documentation as something more than they think they see -- they will
think they are seeing something very like a C-language API. Don't stand
for such complacency.
Regards.
Gregory Bourassa
>Hi All,
>What experiences do people have documenting large Prolog systems?
>
>We are looking for something more than predicate or module-level
>documentation; a way for a new user to understand the structure of a
>large Prolog system. Something like UML, but for relations not just
>describing objects. Has anyone tried documenting a Prolog program in
>UML? Our intuition is that this is not a good place to start, but I
>thought I would ask about what are good places to start. Any experience
>you have would be appreciated.
>
>The Prolog program we are using is part of a larger system that is being
>documented in UML. I'd appreciate any feedback from anyone who has tried
>to document large Prolog systems.
>
>Thanks,
>David
>
>p.s. I asked about this in the SWIPL mailing list, but with no answer.
>
>
>--
>David Poole, Office: +1 (604) 822-6254
>Professor, Fax: +1 (604) 822-5485
>Department of Computer Science, poole@cs.ubc.ca
>University of British Columbia, http://www.cs.ubc.ca/spider/poole
>Vancouver, B.C., Canada V6T 1Z4 ftp://ftp.cs.ubc.ca/ftp/local/poole
| |
| Brian Hulley 2005-03-01, 8:58 am |
| Gregory Bourassa wrote:
> support interesting predicates. For each predicate or relation you
should
> document:
> Name
> Arity
> Known flow patterns (bound on call, free on call for each
> argument)...e.g. (var, var, nonvar) (var,var,var) (novar,var,nonvar),
or
> (free,free.bound) etc., or (i,i,o)...whatever notation you like as
long as
> you do it.
This reminds me of a paper I came across a while ago (unfortunately I
don't remember the authors' names) where they automatically detected
the sets of argument mode assignments for each predicate to see if the
program was "simply well-moded" (to enable optimizations when
compiling). Ie you start from your application's top level predicate eg
go/0, and recursively traverse the rest of the program asserting the
modes at each stage. (Presumably you could do something similar working
bottom up for the case where you have an API rather than an app.)
> Some comments on the expected terms which can be passed as arguments
It would be interesting to know if anyone has automatically generated
this info (ie the type lattice)- on the face of it it seems possible
for some sub-systems of predicates (ie those which don't construct
atoms/functors from character codes or user input etc)
Cheers - Brian.
| |
| Zoltan Somogyi 2005-03-02, 3:58 am |
| "Gregory Bourassa" <bourassa_nospam@magma.ca> writes:
>For each predicate or relation you should
>document:
> Name
> Arity
> Known flow patterns (bound on call, free on call for each
>argument)...e.g. (var, var, nonvar) (var,var,var) (novar,var,nonvar), or
>(free,free.bound) etc., or (i,i,o)...whatever notation you like as long as
>you do it.
> Some comments on the expected terms which can be passed as arguments
>-- this is very important to maintainers. Let's be practical, rarely can
>every argument can be of every functor type and still work!
> Whether or not failure can happen
> Whether or not the predicate can backtrack (deterministic or
>nondeterministic)
Just for the record, the above pieces of information (which is a large part
of the list from the post I am replying to) are all present in Mercury's
type, mode and determinism declarations for a predicate.
We designed Mercury to have this information in declarations because
our experience, just as Gregory's, led us to believe that maintainers
are often lost without this information. However, in Mercury, the
information in these declarations is verified by the compiler, so
maintainers can rely on it. In Prolog, it is all to easy to update
the code without updating the documentation.
Zoltan Somogyi <zs@cs.mu.OZ.AU> http://www.cs.mu.oz.au/~zs/
Department of Computer Science and Software Engineering, Univ. of Melbourne
| |
| Zoltan Somogyi 2005-03-02, 3:58 am |
| "Brian Hulley" <brianh@metamilk.com> writes:
>This reminds me of a paper I came across a while ago (unfortunately I
>don't remember the authors' names) where they automatically detected
>the sets of argument mode assignments for each predicate to see if the
>program was "simply well-moded" (to enable optimizations when
>compiling). Ie you start from your application's top level predicate eg
>go/0, and recursively traverse the rest of the program asserting the
>modes at each stage. (Presumably you could do something similar working
>bottom up for the case where you have an API rather than an app.)
The standard name for this is mode inference, and there have been many
papers on it, mostly in the eighties. It never got anywhere much, for
two reasons.
First, the search space (the possible sets of predicate/mode pairs)
is too large to be searched exhaustively, and even the best heuristics
that people came up with don't cut it down sufficiently. The only
way to keep the process feasible was to introduce approximations.
As the program becomes larger, these approximations come to dominate
the output; which means that the larger the program, and thus the more
you need the tools, the less the tool tells you.
Second, if you apply inference to a real program, all you get is
information about how information actually *does* flow in the program;
it doesn't tell you how it *should* flow, or where the actual flow
differs from the desired flow. Even if the user does a visual diff
of the output of the inference against their expectations, these systems
don't really help in tracking down the sources of discrepancies.
The problems reinforce each other. The more approximations the inference
makes, the harder the job of doing the visual diff is, and the more bugs
exist in the program, the more approximations must be made.
Suppose you have a predicate whose mode should be (in,out), i.e. the first
argument is input (caller should supply a ground term) but the second is
output (caller should supply an fresh, unaliased variable). Suppose you
have a bug in which one clause out of several switches the roles of the two
arguments. Since a straight mode inference algorithm takes the code as gospel,
it cannot infer (in,out) as the mode; it must infer (in or out, in or out),
a mode that contains no meaningful information. This uncertainty propagates
to the predicate's callers and to their callees, and from there to the rest
of the program.
[color=darkred]
>It would be interesting to know if anyone has automatically generated
>this info (ie the type lattice)- on the face of it it seems possible
>for some sub-systems of predicates (ie those which don't construct
>atoms/functors from character codes or user input etc)
While type inference usually means inference of the types of predicates'
arguments, there is a form which tries to infer the definitions of types
themselves; I think you are asking for both. While the first is definitely
feasible and in fact is a routine part of many languages (e.g. Haskell,
and it is available though not recommended for Mercury), there are good
reasons for not doing the second. There are usually much fewer types than
predicates, so defining them explicitly is less work for the programmer,
and any mistakes by the inference of type definitions are magnified hugely
when you try to use its output in type inference of predicate arguments.
In most people's opinion, the cost/benefit ratio of type definition inference
is very far from favourable.
Zoltan Somogyi <zs@cs.mu.OZ.AU> http://www.cs.mu.oz.au/~zs/
Department of Computer Science and Software Engineering, Univ. of Melbourne
| |
| Brian Hulley 2005-03-02, 4:03 pm |
|
Zoltan Somogyi wrote:
> it cannot infer (in,out) as the mode; it must infer (in or out, in or
out),
I would have thought that the goal of mode inference would be to infer
a *set* of mode assignments for each predicate, thus the above bug
would return the set, {(in, out), (out,in)} which would at least tell
you that the predicate is being used in more than one way, which might
help a tool flag possible bugs since I would have expected most
predicates in a large program to only have one flow (except the very
general utility predicates like member/2 etc)
My perceived problem with systems which require flow declarations is
that they can break the flow of thought from programmer to predicate
(!) in the same way that the need to first decide how to reify a domain
into objects and methods prevents exploratory programming in object
oriented languages. Also, the concept of in/out seems somehow, to me at
least, not quite the same as the pure concept of unification thus I
would not want to have to think in terms of in/out during the
incarnation of some prolog-thought into a prolog predicate.
However such declarations are obviously very useful later on, for code
maintenance, generation of efficient code etc, hence the desire to
generate them automatically.
Of course the above is just my personal experience and no doubt others
find such declarations helpful right from the beginning - I'll have
another look at the Mercury website.
Cheers - Brian.
| |
| Lindsey Spratt 2005-03-02, 8:58 pm |
| I discussed the implementation of large system (SPARCL) by having an
abstraction hierarchy of subsystems and modules, where a subsystem was
composed of several modules. The implementation discussion was
organized by subsystem (subsystems can share modules). The discussion
of the implementation of a system was organized in part using module
dependency graphs. (The module dependency graphs were programmatically
generated.)
This discussion was chapter 6 of my PhD thesis which can be found at:
http://homepage.mac.com/lspratt/pap...ic%2C%20PDF.sit
The
The modules in my discussion are fairly large in many cases and for
documentation purposes are perhaps like the "packages" Gregory Bourassa
mentions. I think something like the encylopedic approach Gregory
advocates would be useful, but only in the context of documentation of
the architecture or higher levels of abstraction of the system (such as
the subsystem and module levels).
Regards,
Lindsey Spratt
http://homepage.mac.com/lspratt
On 2005-02-28 22:39:00 -0500, "Gregory Bourassa"
<bourassa_nospam@magma.ca> said:
[color=darkred]
> On Sun, 27 Feb 2005 21:50:17 -0800, David Poole wrote:
>
> David,
>
> I have documented large Prolog systems, and it is not too hard. That said,
> any of the off-the-shelf documentation tools like UML will be tortured if you
> try to use them -- they have no representation for backtracking, failure,
> unification, and multiple solutions from backtracking.
>
> Your best bet is to assume an "Encyclopoedia of predicates" approach and
> document the main predicates -- not the helpers which are only there to
> support interesting predicates. For each predicate or relation you should
> document:
> Name
> Arity
> Known flow patterns (bound on call, free on call for each
> argument)...e.g. (var, var, nonvar) (var,var,var) (novar,var,nonvar), or
> (free,free.bound) etc., or (i,i,o)...whatever notation you like as long as
> you do it.
> A comment for each flow pattern
> Some comments on the expected terms which can be passed as arguments
> -- this is very important to maintainers. Let's be practical, rarely can
> every argument can be of every functor type and still work!
> Whether or not failure can happen
> What is returned on multiple entries due to backtracking
> Whether or not the predicate can backtrack (deterministic or
> nondeterministic)
> Whether or not you can expect any system-level exceptions and what
> they might be
> What data (facts) if any are expected to exist in the environment at
> run-time for the predicate to succeed
>
> ...and probably other things, but they escape my memory just now -- the above
> constitutes a good start
>
> If you can, partition the system into "packages". There is usually a set of
> list-handling and other predicates which are just "utilities" used
> everywhere, then there will be other modules or packages which you can
> recognise, which do things like file access, IP communications,
> pretty-printing, graphics, and finally the various parts of the core
> behaviour of your system.
>
> Let the rest of the project team know that the Prolog subsystem is
> meta-relational and meta-object-oriented, so they will have to deal with
> your documentation as something more than they think they see -- they will
> think they are seeing something very like a C-language API. Don't stand
> for such complacency.
>
> Regards.
>
> Gregory Bourassa
>
>
| |
|
| On Wed, 2 Mar 2005 17:23:27 -0500, Lindsey Spratt
<spratt@alum.mit.edu> wrote:
>I discussed the implementation of large system (SPARCL) by having an
>abstraction hierarchy of subsystems and modules, where a subsystem was
>composed of several modules. The implementation discussion was
>organized by subsystem (subsystems can share modules). The discussion
>of the implementation of a system was organized in part using module
>dependency graphs. (The module dependency graphs were programmatically
>generated.)
>
>This discussion was chapter 6 of my PhD thesis which can be found at:
>http://homepage.mac.com/lspratt/pap...ic%2C%20PDF.sit
What is *.sit?...
A.L.
| |
| Zoltan Somogyi 2005-03-03, 4:00 am |
| I wrote:
> it cannot infer (in,out) as the mode; it must infer (in or out, in or out),
"Brian Hulley" <brianh@metamilk.com> writes:
>I would have thought that the goal of mode inference would be to infer
>a *set* of mode assignments for each predicate, thus the above bug
>would return the set, {(in, out), (out,in)} which would at least tell
>you that the predicate is being used in more than one way, which might
>help a tool flag possible bugs
Yes. But if the incorrect analysis of one predicate contanimates the
analysis of many other predicates, then the programmer is faced with
a whole slew of messages saying "this predicate has more than one mode,
and is therefore likely to be buggy" with no clue as to which of these
are cause and which are effect.
>since I would have expected most
>predicates in a large program to only have one flow (except the very
>general utility predicates like member/2 etc)
That is true. Our statistics on the Mercury system, taken at various
times over the years, have consistently said that the average number
of modes per predicate is in the 1.03 to 1.05 range. As you say, it
is mostly (although not exclusively) utility predicates in the library
that have more than one mode. Since some of these have more than two modes,
the fraction of predicates with more than one mode is probably around 2-3%.
Those statistics are for one program; other programs may yield different
numbers. However, I wouldn't expect the numbers to be *too* different.
>My perceived problem with systems which require flow declarations is
>that they can break the flow of thought from programmer to predicate
There is nothing preventing you from writing the code first and adding
the declarations later, as long as you add the declarations (at least
for the predicates exported from a module) before you compile the code.
>Also, the concept of in/out seems somehow, to me at
>least, not quite the same as the pure concept of unification
Unification is used in several ways in Prolog. Most of the time, it is
used purely for parameter passing, and for that, thinking in terms of in/out
is useful. More rarely, it is used for solving Herbrand constraints. When
we designed Mercury in 1993-94, we didn't know how to handle constraints
(of any kind) while keeping the properties we wanted (ability to handle I/O
in a declarative manner, reliability from checked declarations,
maintainability for the same reason, and speed), so we didn't include
constraints. The HAL project later explored the integration of constraints
into Mercury, first as a separate language (HAL) that compiles to Mercury,
and now as language extensions in Mercury itself.
The point is, parameter passing is fundamentally simple and should be fast.
Constraint solving isn't anywhere near that simple in general, and can't be
made that fast. Using the same mechanism for both may be intellectually
parsimonious, but it is not a sound engineering choice if performance
is a concern, which it is in programming languages.
>thus I
>would not want to have to think in terms of in/out during the
>incarnation of some prolog-thought into a prolog predicate.
This is where people differ, partly because of psychological differences
and partly because they tackle different tasks. If I had to code a problem
which can be modelled as Herbrand constraints, I wouldn't think in terms of
in/out either, but in pretty much all other cases, I would.
>However such declarations are obviously very useful later on, for code
>maintenance, generation of efficient code etc, hence the desire to
>generate them automatically.
The Mercury compiler can infer the types, modes and determinisms of
predicates that are local to their module; it will print out the inferred
declarations, and you can cut-and-paste them into the source file if you like.
We have found that it is usually better not to take advantage of this
capability, and not only because the mode inference quite weak (the other two
are much better). Several people who originally disliked the thought of
declarations have come around to accepting and appreciating them, as
they work with other people's code.
Zoltan Somogyi <zs@cs.mu.OZ.AU> http://www.cs.mu.oz.au/~zs/
Department of Computer Science and Software Engineering, Univ. of Melbourne
| |
| Lindsey Spratt 2005-03-03, 3:58 pm |
| On 2005-03-02 20:13:08 -0500, A.L.
<alewando_tego_nie@oddpost_tego_tez_nie.com> said:
> On Wed, 2 Mar 2005 17:23:27 -0500, Lindsey Spratt
> <spratt@alum.mit.edu> wrote:
>
[color=darkred]
> What is *.sit?...
The .sit is a "stuffit" file, the common compressed archive form on Mac
OS X. I've added a .zip file:
http://homepage.mac.com/lspratt/pap...1996-SPARCL.zip
Lindsey
Lindsey Spratt
http://homepage.mac.com/lspratt
| |
|
|
| Tom Breton 2005-03-04, 3:59 am |
| Zoltan Somogyi <zs@students.cs.mu.OZ.AU> writes:
> The HAL project later explored the integration of constraints
> into Mercury, first as a separate language (HAL) that compiles to Mercury,
> and now as language extensions in Mercury itself.
Can you elaborate on those language extensions?
My information must be out of date. Last I saw, the HAL docs
mentioned only extending Mercury with a few new node labels like "new"
and "old", and the Mercury docs didn't seem to mention HAL or
constraints.
--
Tom Breton, the calm-eyed visionary
| |
| Zoltan Somogyi 2005-03-04, 8:57 am |
| I wrote:
> The HAL project later explored the integration of constraints
> into Mercury, first as a separate language (HAL) that compiles to Mercury,
> and now as language extensions in Mercury itself.
Tom Breton <tehom@REMOVEpanNOSPAMix.com> writes:
>Can you elaborate on those language extensions?
The main one is that you can now declare a type to be a solver type.
A solver type has two faces. Everywhere in the program except its defining
module, it stands for a data structure (e.g. a float or a term) on which
constraints may be placed; its instantiation state is usually "any".
Inside the defining module, i.e. the module that implements the constraint
solver, its type will typically be different (e.g. it may be an integer
representing an index into a table) and its instantiation states will also
be different (typically ground). The solver writer must convert between
the two views when necessary, and is required to ensure that the operations
on the internal representation faithfully implement the semantics of the
public solver type.
Recent versions of the Mercury reference manual have more detail on solver
types. If you are interested in the design rationale, you'll have to search
the mercury-developer mailing list, available from the Mercury web site.
>My information must be out of date. Last I saw, the HAL docs
>mentioned only extending Mercury with a few new node labels like "new"
>and "old", and the Mercury docs didn't seem to mention HAL or
>constraints.
The Mercury web site didn't say much about HAL because HAL was being built
by a different group (mostly across town), and we figured that having stuff
about HAL on our web page was a double maintenance problem. So we just had
a link to the HAL page.
The Mercury docs don't trumpet constraints yet because that capability
is still being worked on, and will be for a while. We don't want people
to unknowingly start relying on features that may be changed in the near
future. Of course, we can always use beta testers, and any feedback would be
appreciated.
Zoltan Somogyi <zs@cs.mu.OZ.AU> http://www.cs.mu.oz.au/~zs/
Department of Computer Science and Software Engineering, Univ. of Melbourne
|
|
|
|
|