Home > Archive > Functional > June 2007 > More on Function Purity
You are viewing an archived Text-only version of the thread.
To view this thread in it's original format and/or if you want to reply to
this thread please [click here]
| Author |
More on Function Purity
|
|
|
| Hi guys,
I have a couple more questions on function purity, let's say as defined
roughly by languages such as Haskell or ML. I hope you don't mind the
C-style/imperative syntax.
Let's say we have 3 `kinds' of data:
- static:
*Fully* resolved at compile time, immutable/non-extant at runtime.
- const:
Evaluated at runtime on initialization, but immutable afterwards.
- variable:
Evaluated at runtime, mutable at all times.
Let's also say there is only one type, the arbitrary precision int, and
void. Now, say we have a program with the following 'globals':
static int S = 1 + 2 + 3; // resolves to 6.
const int C = GetTimeOfDay(); // runtime
variable int V = GetTimeOfDay(); // runtime
Such that:
void test() {
S = 42; // Error.
C = 42; // Error.
V = 42; // Ok.
print(S); // Ok.
print(C); // Ok.
print(V); // Ok.
}
So, I would assume that a function *cannot* be pure if it:
* Reads from V; and/or
* Writes to V.
I would also assume that a function *can* be pure if it:
* "Reads" from S.
(Writing to S is impossible).
My question is, would the function still be pure if it:
* Reads from C?
(Writing to C is impossible).
If you'll remember, C is a runtime, global, constant piece of data. It
is technically part of the `environment' (since it is at global scope),
but it will never change after it has been initialized (which is always
prior to any function reading from it).
So for instance, would this be correct:
int foo1() { return S; } // Pure.
int foo2() { return C; } // Pure?
int foo3() { return V; } // Not pure.
Now, let's use local data:
int foo4() { static int s = 1; return s; } // Pure.
int foo5() { const int c = 2; return c; } // Pure.
int foo6() { variable int v = 3; v = 4; return v; } // Pure.
Many thanks for your time!
Cheers,
-Al-
| |
| Jon Harrop 2007-05-17, 8:04 am |
| Al wrote:
> My question is, would the function still be pure if it:
>
> * Reads from C?
> (Writing to C is impossible).
I think it is pure.
let n=3
let add m = n+m
The "add" function is pure.
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/produc...journal/?usenet
| |
| Ville Oikarinen 2007-05-18, 8:04 am |
| On Wed, 16 May 2007, Al wrote:
> variable int V = GetTimeOfDay(); // runtime
> ...
> So, I would assume that a function *cannot* be pure if it:
>
> * Reads from V; and/or
The code as a whole is not pure, because it uses a mutable variable. But
it's the variable itself (and the code that writes to it) that's impure,
not the function. The function just reads the variable, ignoring the fact
that it *could* be written, too. (Actually, writing to V only once in the
code could justify the whole code to be called pure, but at least the
function is pure.)
If you make V immutable, the code becomes pure functional *without changes
to the function*; therefore the function cannot be impure.
(Although I find it somehow inconvenient to talk about pure functionality
using imperative code.)
I think the compile time/runtime evaluation question is not relevant here:
a program that contains only expressions evaluatable at compile time is
not useful: you don't need to publish the program; the result it evaluates
to is enough :) So it's quite OK to call it pure functional style even if
the evaluation result depends on the runtime environment.
- Ville Oikarinen
| |
| Ben Rudiak-Gould 2007-05-18, 7:05 pm |
| Al wrote:
> const int C = GetTimeOfDay(); // runtime
>
> int foo2() { return C; } // Pure?
This violates referential transparency in that if you substitute
GetTimeOfDay() for c in the body of foo2() then it will no longer return the
same value on every call. But we still get an interesting example if we
replace GetTimeOfDay() with GetCommandLine().
Haskell is designed on the principle that not only should everything have
the same value in every context within the program, everything should have
the same value in every external execution context, i.e. every time the
program is run. This is why it has System.getArgs :: IO [String] rather than
System.args :: [String]. There is Haskell software that usefully exploits
this property (WASH).
-- Ben
| |
| Joachim Durchholz 2007-05-18, 7:05 pm |
| Ville Oikarinen schrieb:
> On Wed, 16 May 2007, Al wrote:
>
>
> The code as a whole is not pure, because it uses a mutable variable. But
> it's the variable itself (and the code that writes to it) that's impure,
> not the function. The function just reads the variable, ignoring the fact
> that it *could* be written, too. (Actually, writing to V only once in the
> code could justify the whole code to be called pure, but at least the
> function is pure.)
>
> If you make V immutable, the code becomes pure functional *without changes
> to the function*; therefore the function cannot be impure.
Actually, the function still is impure.
If fn() accesses V, and V is updated asynchronously, then
2*fn()
is semantically different from
fn()+fn()
and this at least breaks the "equational reasoning" kind of purity (and
probably other definitions of purity, too).
In my book, purity is not a local property. It's a property of a
function and the context it's executing in: guarantees about
(im)mutability come into play, for example.
In this concrete case, V and any functions accessing it are impure,
simply because V may be updated at any time.
If the updates are synchronous, anything that makes use of purity
arguments (equational reasoning, compiler transforms, whatever) would
have to be confined so that it does not span an evaluation step that
updates V.
If the language syntactically or lexically identifies "islands of
purity" between updates, then you can do equational reasoning within
these islands, and the function is pure within them. If the language
does not, then you'll have to do some kind of whole-system analysis to
make sure that V isn't updated in the function or any functions called
from it. (This can be a challenge, e.g. if the function in question is a
HOF: executing a submitted function may update V, so you need dataflow
analysis to determine that none of the functions submitted as parameters
will ever update V - note that this kind of analysis is undecidable, so
assuming that a function is impure if it accesses an impure variable is
just a reasonable conservative approximation.)
> (Although I find it somehow inconvenient to talk about pure functionality
> using imperative code.)
In non-imperative code, it's difficult to have impure code in the first
place, so you don't get to write impure examples.
> I think the compile time/runtime evaluation question is not relevant here:
> a program that contains only expressions evaluatable at compile time is
> not useful: you don't need to publish the program; the result it evaluates
> to is enough :)
Compile-time evaluation cannot continue as soon as inputs are involved.
If the functions are pure, what you get after compile-time evaluation is
a function that needs an input for the next reduction step.
Regards,
Jo
| |
|
| Hi,
Ben Rudiak-Gould wrote:
> Al wrote:
>
> This violates referential transparency in that if you substitute
> GetTimeOfDay() for c in the body of foo2() then it will no longer return
> the same value on every call. But we still get an interesting example if
> we replace GetTimeOfDay() with GetCommandLine().
Hm... that's an interesting variation. So for an expression to be
referentially transparent, all evaluations of it need to yield the same
value, not only within a single execution of the binary, but across the
set of *all* its possible executions? Is this accurate?
However, it doesn't seem like this should be a requirement for a
function to be pure. For instance, I believe the compiler could memoize
all calls too foo2(), since it will always return the same value (per
execution). Is that not enough?
> Haskell is designed on the principle that not only should everything
> have the same value in every context within the program, everything
> should have the same value in every external execution context, i.e.
> every time the program is run. This is why it has System.getArgs :: IO
> [String] rather than System.args :: [String]. There is Haskell software
> that usefully exploits this property (WASH).
I see. Is this something that is more generally useful, though?
Thanks!
-Al-
| |
| Ville Oikarinen 2007-05-21, 8:04 am |
| On Fri, 18 May 2007, Joachim Durchholz wrote:
> Actually, the function still is impure.
> If fn() accesses V, and V is updated asynchronously, then
> 2*fn()
> is semantically different from
> fn()+fn()
> and this at least breaks the "equational reasoning" kind of purity (and
> probably other definitions of purity, too).
>
> In my book, purity is not a local property. It's a property of a
> function and the context it's executing in: guarantees about
> (im)mutability come into play, for example.
>
> In this concrete case, V and any functions accessing it are impure,
> simply because V may be updated at any time.
I believe we undestand this in the same way. I just left it unsaid that I
assume the immutable V gets initialized before the first call to the
function. (This is why I felt uncomfortable with the imperative code:
using the more functional let syntax we wouldn't be discussing this.)
And yes, either the code as a whole is pure functional or it's not.
However, I think it's useful to call those parts of the program pure that
don't need to be modified in order to make the program pure. It's not
their "fault" the program is impure.
>
> Compile-time evaluation cannot continue as soon as inputs are involved.
> If the functions are pure, what you get after compile-time evaluation is
> a function that needs an input for the next reduction step.
Yes.
- Ville Oikarinen
| |
| henryware@gmail.com 2007-05-24, 7:07 pm |
| On May 18, 5:21 pm, Joachim Durchholz <j...@durchholz.org> wrote:
> Ville Oikarinen schrieb:
>
>
>
>
>
>
>
>
> Actually, the function still is impure.
> If fn() accesses V, and V is updated asynchronously, then
> 2*fn()
> is semantically different from
> fn()+fn()
> and this at least breaks the "equational reasoning" kind of purity (and
> probably other definitions of purity, too).
>
> In my book, purity is not a local property. It's a property of a
> function and the context it's executing in: guarantees about
> (im)mutability come into play, for example.
>
If fn() was changed to fn(V) would it then be pure?
| |
| Joachim Durchholz 2007-05-25, 4:18 am |
| henryware@gmail.com schrieb:
> If fn() was changed to fn(V) would it then be pure?
According to the definitions I know: yes.
Regards,
Jo
| |
| Ingo Menger 2007-05-25, 8:05 am |
| On 18 Mai, 22:13, Ben Rudiak-Gould <br276delet...@cam.ac.uk> wrote:
> Haskell is designed on the principle that not only should everything have
> the same value in every context within the program, everything should have
> the same value in every external execution context, i.e. every time the
> program is run. This is why it has System.getArgs :: IO [String] rather than
> System.args :: [String]. There is Haskell software that usefully exploits
> this property (WASH).
I have always wondered why this is so. I could see no sin against
purity laws when main had a type like:
main :: [String] -> IO ()
| |
| Dirk Thierbach 2007-05-25, 10:05 pm |
| Ingo Menger <quetzalcotl@consultant.com> wrote:
> On 18 Mai, 22:13, Ben Rudiak-Gould <br276delet...@cam.ac.uk> wrote:
[color=darkred]
> I have always wondered why this is so. I could see no sin against
> purity laws when main had a type like:
> main :: [String] -> IO ()
Then by the same argument, one could include the environment in the
type of main. And maybe the process id. And the parent process id.
And so on, and so on.
Given that, I think it's better to put *everything* that's somehow
OS-dependent inside the IO monad. After all, it's one of the features
of a monad to hide such things. KISS.
- Dirk
| |
| Ingo Menger 2007-05-26, 8:04 am |
| On 25 Mai, 18:29, Dirk Thierbach <dthierb...@usenet.arcornews.de>
wrote:
> Ingo Menger <quetzalc...@consultant.com> wrote:
>
> Then by the same argument, one could include the environment in the
> type of main. And maybe the process id. And the parent process id.
> And so on, and so on.
Yes, sure. As long as this makes the program not really OS dependent,
I still see no sin. For example, the environment is just a [(String,
String)]. If some alien OS has none, this is fine, then env = [].
The process id is another matter. It does not have to be an Int. There
may be no process hirarchy, etc.
| |
| Joachim Durchholz 2007-05-26, 8:04 am |
| Ingo Menger schrieb:
> On 25 Mai, 18:29, Dirk Thierbach <dthierb...@usenet.arcornews.de>
> wrote:
>
> Yes, sure. As long as this makes the program not really OS dependent,
> I still see no sin.
But it does. Depending on OS, there might be different environment
variables. They might even change (imagine an OS that continuously
updates time-of-day, or last-died-pid in the environment).
Besides, it's unnecessary. There are (or should be) functions for
accessing the environment inside IO.
Regards,
Jo
| |
| Dirk Thierbach 2007-05-26, 10:06 pm |
| Ingo Menger <quetzalcotl@consultant.com> wrote:
> On 25 Mai, 18:29, Dirk Thierbach <dthierb...@usenet.arcornews.de>
[color=darkred]
> Yes, sure. As long as this makes the program not really OS dependent,
> I still see no sin.
The point is not so much that it's a "sin" (or leads to impurity), the
point is that it's somewhat ad-hoc: You treat some os-dependent stuff
special, by elevating it to a very visible position that every single
program has to live with, whereas you treat other (arguably equally
important stuff) differently and hide it inside the IO monad (so
you needn't be bothered about it unless you want to use it).
So, the cleaner and neutral position is to treat them all the same
way, and handle all of them inside the IO monad.
- Dirk
| |
|
|
|
|
| David Hopwood 2007-05-27, 10:13 pm |
| Dirk Thierbach wrote:
> Ingo Menger <quetzalcotl@consultant.com> wrote:
>
>
>
Note that this is quite different from "System.args :: [String]". The
latter is impure because it isn't a fixed function -- the return value
depends on something that is not given as an argument.
[color=darkred]
> Then by the same argument, one could include the environment in the
> type of main. And maybe the process id. And the parent process id.
> And so on, and so on.
main :: StartupContext -> IO ()
getArgs :: StartupContext -> [String]
getEnvVar :: StartupContext -> String -> String
getProcessID :: StartupContext -> ProcessID
getParentProcessID :: StartupContext -> ProcessID
etc.
I don't see the problem here. This design is both as pure, and as
convenient as the one that was chosen for Haskell.
--
David Hopwood <david.hopwood@industrial-designers.co.uk>
| |
| Marcin 'Qrczak' Kowalczyk 2007-05-27, 10:13 pm |
| Dnia 27-05-2007, N o godzinie 15:58 +0000, David Hopwood napisa=B3(a):
> main :: StartupContext -> IO ()
> getArgs :: StartupContext -> [String]
> getEnvVar :: StartupContext -> String -> String
> getProcessID :: StartupContext -> ProcessID
> getParentProcessID :: StartupContext -> ProcessID
> etc.
>=20
> I don't see the problem here. This design is both as pure, and as
> convenient as the one that was chosen for Haskell.
It's not as convenient because any code which wants to access these data
must have the argument of main propagated to it.
--=20
__("< Marcin Kowalczyk
\__/ qrczak@knm.org.pl
^^ http://qrnik.knm.org.pl/~qrczak/
| |
| Philippa Cowderoy 2007-05-27, 10:13 pm |
| On Sun, 27 May 2007, David Hopwood wrote:
>
> main :: StartupContext -> IO ()
> getArgs :: StartupContext -> [String]
> getEnvVar :: StartupContext -> String -> String
> getProcessID :: StartupContext -> ProcessID
> getParentProcessID :: StartupContext -> ProcessID
> etc.
>
> I don't see the problem here. This design is both as pure, and as
> convenient as the one that was chosen for Haskell.
>
However, it imposes additional requirements on the platform the program
runs on. Why not admit that we're getting the StartupContext from the
world via IO?
--
flippa@flippac.org
The task of the academic is not to scale great
intellectual mountains, but to flatten them.
| |
| Donn Cave 2007-05-29, 7:06 pm |
| In article <1180183078.332955.326200@w5g2000hsg.googlegroups.com>,
Ingo Menger <quetzalcotl@consultant.com> wrote:
> Yes, sure. As long as this makes the program not really OS dependent,
> I still see no sin. For example, the environment is just a [(String,
> String)]. If some alien OS has none, this is fine, then env = [].
[(String, String)] would sure be more useful, but (in case anyone
is interested and doesn't know this already), the environment as
rendered in the 3rd parameter of C's main(int, char **, char **)
is more like [String], where "=" separates the left from right sides.
> The process id is another matter. It does not have to be an Int. There
> may be no process hirarchy, etc.
Yes, and it does indeed require I/O, in the sense that in say
Berkeley UNIX you have to ask the kernel. And since its value
depends on fork(2), I suppose it isn't pure anyway.
I didn't get that this was really a serious proposal to change
the language, but the discussion has certainly led to some surprises.
- IO is Haskell's way of sequestering platform dependencies.
I thought IO was a fairly clear concept having to do with
execution of functional code in the impure context of an
actual computer. Not saying the concept is entirely clear
to me, but if it's about platform depedency then I am more
than I thought.
- getArgs is impure because its result depends on something
not given as an argument.
I think I can sort of see that. A static argument list like
program = "test"
progArgs = program:["-f", "config"]
wouldn't have the same dependency, inasmuch as the value of
progArgs is known at compile time, while the value of getArgs
may vary from one execution to the next. But I'm still
surprised that this is enough to make it impure. It still
seems to me like a functionally tractable value, because it's
stable. I'm discounting the fact that the argv storage can
be overwritten, on the assumption Haskell wouldn't support
that, so it's an immutable list of strings, possibly similar
to a constant list defined outside the present compilation
module.
Donn Cave, donn@u.washington.edu
| |
| Joachim Durchholz 2007-05-29, 7:06 pm |
| Donn Cave schrieb:
> - IO is Haskell's way of sequestering platform dependencies.
>
> I thought IO was a fairly clear concept having to do with
> execution of functional code in the impure context of an
> actual computer. Not saying the concept is entirely clear
> to me, but if it's about platform depedency then I am more
> than I thought.
This confusion isn't entirely your fault. One of the SPJ slides said "IO
was all too convenient for dumping everything into it that we didn't
fully understand". Command-line access may be one of these.
OTOH, I think that this calls more for breaking up IO into modules,
rather than allowing access to execution-context stuff from outside of IO.
> - getArgs is impure because its result depends on something
> not given as an argument.
I think that's the central argument.
Though I think it's quite weakened. Monads are a way to hide implicit
arguments, too. (I currently think that that's their very purpose: open
a sideband channel for communication between functions that isn't
explicit in the source code.)
> I think I can sort of see that. A static argument list like
> program = "test"
> progArgs = program:["-f", "config"]
>
> wouldn't have the same dependency, inasmuch as the value of
> progArgs is known at compile time, while the value of getArgs
> may vary from one execution to the next. But I'm still
> surprised that this is enough to make it impure. It still
> seems to me like a functionally tractable value, because it's
> stable.
I think that depends on the specifics of your personal definition of purity.
There are several, and it's exactly this kind of case where the
differences begin to have an actual effect.
> I'm discounting the fact that the argv storage can
> be overwritten, on the assumption Haskell wouldn't support
> that, so it's an immutable list of strings, possibly similar
> to a constant list defined outside the present compilation
> module.
I think you're not giving Haskell due respect here. Haskell has been
dubbed "the world's finest imperative language" (I don't buy the
"finest" bit, but I do buy the "imperative" bit).
Also, it should be possible to alter the argv storage from Haskell. It's
one of those obscure but perfectly valid techniques that are available
under Unixoid operating systems. (Whether implementing that is worth the
effort is an entirely different question, of course - but making argv
access non-IO would preclude this possibility from the outset.)
Regards,
Jo
| |
| Donn Cave 2007-05-29, 7:06 pm |
| In article <f3hv8s$7e3$1@online.de>,
Joachim Durchholz <jo@durchholz.org> wrote:
> Donn Cave schrieb:
>
> I think you're not giving Haskell due respect here. Haskell has been
> dubbed "the world's finest imperative language" (I don't buy the
> "finest" bit, but I do buy the "imperative" bit).
>
> Also, it should be possible to alter the argv storage from Haskell. It's
> one of those obscure but perfectly valid techniques that are available
> under Unixoid operating systems. (Whether implementing that is worth the
> effort is an entirely different question, of course - but making argv
> access non-IO would preclude this possibility from the outset.)
You completely lost me on the due respect comment. If you mean
I underestimate Haskell if I think it can't manage to overwrite argv,
that would be like thinking less of Andres Segovia because he probably
couldn't play guitar behind his back like some rock star can.
I think Haskell won't support it because no one in their right mind
will want it to, so it's fine to rule it out ahead of time.
But like I said, I don't think anyone was really advancing this
as a serious proposal, I'm only interested in it to the degree that
it exposes what may be a fault in my understanding of functional
programming, and just for that purpose I would prefer to assume an
immutable argv.
Donn Cave, donn@u.washington.edu
| |
| Joachim Durchholz 2007-05-29, 7:06 pm |
| Donn Cave schrieb:
> In article <f3hv8s$7e3$1@online.de>,
> Joachim Durchholz <jo@durchholz.org> wrote:
>
>
>
> You completely lost me on the due respect comment. If you mean
> I underestimate Haskell if I think it can't manage to overwrite argv,
> that would be like thinking less of Andres Segovia because he probably
> couldn't play guitar behind his back like some rock star can.
> I think Haskell won't support it because no one in their right mind
> will want it to, so it's fine to rule it out ahead of time.
No, there are valid uses for the ability to overwrite argv. (argc, too.)
For a bad example, programs that take passwords on the parameter line
usually want to erase that line as quickly as possible, since it's
possibly to spy the command line with tools like ps and top.
For a better example, some daemons give a short feedback about their
status on the command line. That can be very useful for monitoring them
in top.
Of course, that's really a side issue. The real question is whether
something that's known to be constant but comes from an external source
should be considered pure or not.
For that, I think it depends on your definition of "pure", and that will
most likely depend on what you want to do. For equational reasoning and
program transformation, I think this situation can be considered "pure";
I'd want to rethink any assumptions on that function's purity when
implementing the backend; and it's certainly impure when you do
theorems-for-free style reasoning.
Regards,
Jo
| |
| Dirk Thierbach 2007-05-30, 4:07 am |
| Donn Cave <donn@u.washington.edu> wrote:
> - IO is Haskell's way of sequestering platform dependencies.
>
> I thought IO was a fairly clear concept having to do with
> execution of functional code in the impure context of an
> actual computer.
The IO monad is, as the name suggests, basically for Input/Output.
As such, this is the part (and more or less the only part) of Haskell
that interacts with the operating system. Therefore, this is also the
part that abstracts away OS differences for different platforms.
Additionally, since every OS uses side effects, it allows (like other
monads) to mix pure functions with impure monadic operations, which
then in turn cause the side effects by calling the OS.
But that has nothing to do with "the impure context of an actual
computer" (if I understand correctly what you mean by that). A Haskell
programming can happily execute on an actual computer without ever
making use of the IO monad. Then of course it is restricted to be an
"algorithm" -- something that will deliver a value in the end, but is
not able to interact with anything in any way. But sometimes that's
enough.
And it's also possible to mix impure and pure functions without the
IO monad, for example with the ST monad, which just provides state
(i.e., variables in memory).
> Not saying the concept is entirely clear to me, but if it's about
> platform depedency then I am more than I thought.
Well, it's only indirectly about platform dependency. I think of IO
as a sort of library, like "stdio.h" in C.
> - getArgs is impure because its result depends on something
> not given as an argument.
Yes.
> It still seems to me like a functionally tractable value, because
> it's stable.
But it's not stable across different invocations of the program.
Purity expresses the ability to reason about a piece of code like a
black box, just looking at the result it delivers for some given
arguments. There's no hidden state to consider, neither in some global
variables, nor in operating system, nor on some harddisk, nor does it
depend on when and where you execute the code, who you are, and what
you did before.
You could in principle replace the pure code with a (rather big) table
that just maps the given arguments to the result listed in the table,
and the program won't change a bit. For getArgs, that would mean you
could replace it with a constant list of strings. But that is of course
not the idea if you're dealing with commandline arguments :-)
- Dirk
| |
| Donn Cave 2007-05-30, 10:06 pm |
| In article <20070530062610.972.1.NOFFLE@dthierbach.news.arcor.de>,
Dirk Thierbach <dthierbach@usenet.arcornews.de> wrote:
> Purity expresses the ability to reason about a piece of code like a
> black box, just looking at the result it delivers for some given
> arguments. There's no hidden state to consider, neither in some global
> variables, nor in operating system, nor on some harddisk, nor does it
> depend on when and where you execute the code, who you are, and what
> you did before.
Yes, that's how I interpreted the argument, but as I look around
a little it doesn't look to me like there's a real consensus on it.
For example, the ultimate authority of our time, Wikipedia:
Pure function
... if both these statements about the function hold.
1. The function always evaluates the same result value given
the same argument value(s). The function result value cannot
depend on any hidden information or state that may change as
program execution proceeds, nor can it depend on any external
input from IO devices.
2. Evaluation of the result does not cause any semantically
observable side effect or output, such as mutation of mutable
objects or output to IO devices.
Note "state that may change as program execution proceeds", which
explicitly excludes argv (minus the stupid tricks that I hope we
can ignore at least for the sake of discussion.)
....
On another tack, to see what GHC thinks about values defined
in a separate module, take these two modules
Spud.hs
module Spud (spud) where
spud = ["a", "b"]
main.hs
module Main (main) where
import Spud (spud)
main = print spud
I can modify the value of spud, and rebuild the program,
apparently without re-compiling main.hs. So it seems to
me that this is effectively "hidden state", at the time
of compilation of main.hs -- spud is treated as an unknown
value at compilation time. If that's true, then it doesn't
seem to me that argv is different in any really interesting
way - in both cases, the value is unknown at compilation time
but fixed by the time the program can examine it.
Donn Cave, donn@u.washington.edu
| |
| Dirk Thierbach 2007-05-31, 4:19 am |
| Donn Cave <donn@u.washington.edu> wrote:
> Dirk Thierbach <dthierbach@usenet.arcornews.de> wrote:
> Yes, that's how I interpreted the argument, but as I look around
> a little it doesn't look to me like there's a real consensus on it.
>
> For example, the ultimate authority of our time, Wikipedia:
>
> Pure function
> ... if both these statements about the function hold.
>
> 1. The function always evaluates the same result value given
> the same argument value(s). The function result value cannot
> depend on any hidden information or state that may change as
> program execution proceeds, nor can it depend on any external
> input from IO devices.
>
> 2. Evaluation of the result does not cause any semantically
> observable side effect or output, such as mutation of mutable
> objects or output to IO devices.
>
> Note "state that may change as program execution proceeds", which
> explicitly excludes argv
But "hidden information" includes it. And anyway, these are just
further explanations of the first sentence (which is the important
one).
I don't see how this differs from the explanation I gave. And anyway,
it's Wikipedia.
> On another tack, to see what GHC thinks about values defined
> in a separate module, take these two modules
>
> Spud.hs
> module Spud (spud) where
> spud = ["a", "b"]
> main.hs
> module Main (main) where
> import Spud (spud)
> main = print spud
> I can modify the value of spud,
No, you don't "modify" the "value" of spud. spud is just a name for
a term, nothing else.
> and rebuild the program, apparently without re-compiling main.hs.
> So it seems to me that this is effectively "hidden state",
Oh, come on. The program consists of all it's modules linked together,
and if the compiler is smart enough to figure out that it just
needs to re-link and not re-compile, then that's just a feature of
the compiler, no "hidden state".
If that discussion now descends into nitpicking over Wikipedia
phrasings, or inventing ridiculous examples (sorry), then I am
out of it.
- Dirk
| |
| Ingo Menger 2007-05-31, 4:19 am |
| On 30 Mai, 22:36, Dirk Thierbach <dthierb...@usenet.arcornews.de>
wrote:
> Donn Cave <d...@u.washington.edu> wrote:
> If that discussion now descends into nitpicking over Wikipedia
> phrasings, or inventing ridiculous examples (sorry), then I am
> out of it.
>
Sorry, but I feel Donn's example is legitimate.
One could contrive the example one step further to show that passing
the command line to main does not violate purity.
For this, we imagine what could be done if the Haskell runtime did not
include an IO action that accessed the comand line.
In this case one could write Main.hs like so:
module Main where
import Top (realmain)
main = let
args = [] :: [String]
in realmain args
and have a tool to run haskell programs, that, on invocation made a
copy of the Main.hs, replaced the args-Line with one that encoded the
arguments given on the tools commandline, compiled and linked the new
Main.hs and run the resulting executale.
Apart from program startup time one could not distinguish such a
scenario from one where a modified Runtime invoked realmain directly
with the current command line arguments.
Therefore, either all of them are pure or all of them are not pure.
I think they are, and I think like Donn that it does not matter how
and when the args::[String] is fixed - as long as this is done before
main is evaluated.
| |
| Dirk Thierbach 2007-05-31, 7:07 pm |
| Ingo Menger <quetzalcotl@consultant.com> wrote:
> On 30 Mai, 22:36, Dirk Thierbach <dthierb...@usenet.arcornews.de>
[color=darkred]
[color=darkred]
> Sorry, but I feel Donn's example is legitimate.
Oh, come on. The difference between compile-time changes and running
the finished program should be clear, shouldn't it?
What about if someone would start to argue "but those things you call
'constants' in the program are not really constants, I can change them
in the source code?"
> One could contrive the example one step further to show that passing
> the command line to main does not violate purity.
But changing main so it does take the command line (or anything else
from "outside") as ana rgument does indeed not violate purity. I have
said so twice now. And there's no need to invent complicate examples
for that.
> and I think like Donn that it does not matter how and when the
> args::[String] is fixed - as long as this is done before main is
> evaluated.
The point of purity is that it allows you to reason about the program
using the program itself (i.e., the source code with all modules,
or however you'd define it) alone. If a function is pure, you can safely
replace it with anything that respects the same argument/result
relationship. Also, you can move it around, delay it's execution, and
whatnot, and all this is completely safe. with getArgs, you cannot
do that -- once you try to replace it with it's result, you're stuck
with one single invocation with a fixed commandline. Hence, it's not
pure, as is everything else that depends on something that is defined
outside of the program (i.e., source code) itself.
If you don't like that, tough.
I'll say EOT now.
- Dirk
| |
| Donn Cave 2007-05-31, 7:07 pm |
| In article <20070531125441.1072.2.NOFFLE@dthierbach.news.arcor.de>,
Dirk Thierbach <dthierbach@usenet.arcornews.de> wrote:
> The point of purity is that it allows you to reason about the program
> using the program itself (i.e., the source code with all modules,
> or however you'd define it) alone. If a function is pure, you can safely
> replace it with anything that respects the same argument/result
> relationship. Also, you can move it around, delay it's execution, and
> whatnot, and all this is completely safe. with getArgs, you cannot
> do that -- once you try to replace it with it's result, you're stuck
> with one single invocation with a fixed commandline. Hence, it's not
> pure, as is everything else that depends on something that is defined
> outside of the program (i.e., source code) itself.
Well, that was really my point - you _can_ move it around, delay
its execution and whatnot.
The rest seems to come down to _when_ its value is decidable, even
granted that this moment is before main is evaluated. I could imagine
this might be interesting in principle, but to me it doesn't appear
to be interesting in practice.
Donn Cave, donn@u.washington.edu
| |
| Ivan Jager 2007-05-31, 7:07 pm |
| On 2007-05-31, Dirk Thierbach <dthierbach@usenet.arcornews.de> wrote:
> Ingo Menger <quetzalcotl@consultant.com> wrote:
>
>
>
> Oh, come on. The difference between compile-time changes and running
> the finished program should be clear, shouldn't it?
The difference between execution time changes and loading the finished
program should also be clear, shouldn't it?
The only reason I see for System.getArgs to be an IO [String] is that
someone may want to implement System.setArgs later on.
In this example, is the g function pure?
let f x = let g y = x+y in ...
Should getting x require the IO monad?
If a nested function can depend on the arguments to its parent, why
can't a function which is inside a program depend on the arguments to
the program?
Ivan
| |
| Dirk Thierbach 2007-06-01, 8:05 am |
| Ivan Jager <aij+nospam@andrew.cmu.edu> wrote:
> On 2007-05-31, Dirk Thierbach <dthierbach@usenet.arcornews.de> wrote:
[color=darkred]
> The difference between execution time changes and loading the finished
> program should also be clear, shouldn't it?
Where do execution time changes come in? The example was about seperate
compilation of modules, not execution time.
> The only reason I see for System.getArgs to be an IO [String] is that
> someone may want to implement System.setArgs later on.
*Sigh*
> In this example, is the g function pure?
> let f x = let g y = x+y in ...
Yes.
> If a nested function can depend on the arguments to its parent, why
> can't a function which is inside a program depend on the arguments to
> the program?
First, because it's not an argument (that would be pure, as mentioned
for the fourth time now). It's a function in itself, in case of
"getArgs" one without any arguments. Second, because it's an expression
whose value depends on something specifified *outside* of the program.
As user input is. Or reading a file.
There's no big difference between the user specifying arguments on the
command line, the program asking the user for input, or the program
reading values from a file. All of them are impure when done by
"executing" some function, even if the values read/input don't change
during one execution of the program. But they can possibly change between
different executions of the program, and that's what makes them
impure.
The test for purity is, very simply: Can I replace the expression in
question with a "black box" that just reflects the argument/result
relationship, and will the program behave exactly the same? For
a variant of getArgs outside the IO monad, that won't work. Hence, it's
impure. What is so difficult to understand about that?
- Dirk
| |
| Ville Oikarinen 2007-06-01, 8:05 am |
| On Fri, 1 Jun 2007, Dirk Thierbach wrote:
> There's no big difference between the user specifying arguments on the
> command line, the program asking the user for input, or the program
> reading values from a file. All of them are impure when done by
> "executing" some function, even if the values read/input don't change
> during one execution of the program. But they can possibly change between
> different executions of the program, and that's what makes them
> impure.
I said this before: if a pure program only depends on values that don't
change between invocations, then the whole program is just a big constant.
And there is no point in publishing such a program: just publish its
result (unless the result is bigger than the program that generates it).
So while you _can_ define purity like that, what practical use does such a
definition have? Most functions depend on values given at call-time.
- Ville Oikarinen
| |
| Ingo Menger 2007-06-01, 8:05 am |
| On 31 Mai, 14:54, Dirk Thierbach <dthierb...@usenet.arcornews.de>
wrote:
> Ingo Menger <quetzalc...@consultant.com> wrote:
> But changing main so it does take the command line (or anything else
> from "outside") as ana rgument does indeed not violate purity.
Ok.
> Hence, [getArgs] it's not
> pure, as is everything else that depends on something that is defined
> outside of the program (i.e., source code) itself.
But you couldn't write a haskell program that could detect what of the
following is true:
a) getArgs operates on a fixed list written out in the source code
somewhere.
b) getArgs operates on a list provided by the runtime system at
program startup.
So, your concept of purity is meaningless, since there are cases where
it's violation is not posibly observable.
| |
| Dirk Thierbach 2007-06-01, 8:05 am |
| Ingo Menger <quetzalcotl@consultant.com> wrote:
> But you couldn't write a haskell program that could detect what of the
> following is true:
> a) getArgs operates on a fixed list written out in the source code
> somewhere.
> b) getArgs operates on a list provided by the runtime system at
> program startup.
> So, your concept of purity is meaningless, since there are cases where
> it's violation is not posibly observable.
But the violation need not be observable by the program itself. Very
few properties of programs are actually observable by the program
itself, and in some cases (e.g. halting problem) the point is actually
that they cannot possibly be observable by the program itself.
The concept of purity is interesting for the compiler (because it
allows more transformations), and for the programmer reasoning about
his program (because it simplifies this reasoning).
- Dirk
| |
| Dirk Thierbach 2007-06-01, 8:05 am |
| Ville Oikarinen <ville@oikarinen.org> wrote:
> I said this before: if a pure program only depends on values that don't
> change between invocations, then the whole program is just a big constant.
Yes, exactly. That's actually what a lambda term is. Just one big
constant, when finally reduced to normal-form.
> And there is no point in publishing such a program: just publish its
> result (unless the result is bigger than the program that generates it).
But sometimes it's easier to write the program than to calculate the
result with pencil and paper. If I want to know the number of all
different solutions to the 8-queen-problem, then that's just a
constant. Nevertheless, I will write a program to calculate a
number. If I knew this number in advance, I wouldn't need the program :-)
> So while you _can_ define purity like that, what practical use does such a
> definition have?
See other post: It helps when reasoning about the program. If a
function is pure, it neither has nor depends on side-effects. So I can
move at around, delay its execution, mechanically transform it into
another function, prove things about it, etc. If it's impure, I cannot
do that: it would change the program. That's why in Haskell impure
stuff is "imprisoned" in a monad.
And if I nevertheless want to reason about some program that depends
on impure values, I do exactly what Ingo has proposed in the
beginning: I rewrite whatever the function of interest is to take
extra parameters which capture the "impure" dependencies. Then the
dependencies are made explicit, the function itself becomes pure, is
now easier to reason about, and can for example be tested separately
(without the need for actual I/O). The only thing that remains impure
is the call of the function inside the monad (or, in our hypothetical
case with getArgs, the evaluation of external information before the
start of the program).
That's a perfectly legal method, and I do that in fact quite
frequently myself.
- Dirk
| |
| Ingo Menger 2007-06-01, 8:05 am |
| On 1 Jun., 11:01, Dirk Thierbach <dthierb...@usenet.arcornews.de>
wrote:
> Ingo Menger <quetzalc...@consultant.com> wrote:
> The concept of purity is interesting for the compiler (because it
> allows more transformations), and for the programmer reasoning about
> his program (because it simplifies this reasoning).
I see the point for the compiler, who could, when getArgs = ["prog", "-
x"], replace expressions like head (head getArgs) with 'p'.
But for the human reasoner, the interesting thing would be that (head
(head getArgs) == head (head getArgs))
| |
| Tony Finch 2007-06-01, 8:05 am |
| Dirk Thierbach <dthierbach@usenet.arcornews.de> wrote:
>
>The point of purity is that it allows you to reason about the program
>using the program itself (i.e., the source code with all modules,
>or however you'd define it) alone. If a function is pure, you can safely
>replace it with anything that respects the same argument/result
>relationship. Also, you can move it around, delay it's execution, and
>whatnot, and all this is completely safe. with getArgs, you cannot
>do that -- once you try to replace it with it's result, you're stuck
>with one single invocation with a fixed commandline. Hence, it's not
>pure, as is everything else that depends on something that is defined
>outside of the program (i.e., source code) itself.
Your argument against getArgs is equivalent to arguing that closures
are not pure, because the closed function can refer to free variables
which can have different values each time the closure is evaluated.
runprog argv = unsafePerformIO main
where getArgs () = argv
main = do ...
Tony.
--
f.a.n.finch <dot@dotat.at> http://dotat.at/
LUNDY FASTNET IRISH SEA: SOUTHEAST BECOMING VARIABLE THEN SOUTH 3 OR 4.
MODERATE OCCASIONALLY ROUGH. SHOWERS. GOOD.
| |
| Dirk Thierbach 2007-06-01, 8:05 am |
| Ingo Menger <quetzalcotl@consultant.com> wrote:
> On 1 Jun., 11:01, Dirk Thierbach <dthierb...@usenet.arcornews.de>
> wrote:
[color=darkred]
[color=darkred]
> I see the point for the compiler, who could, when getArgs = ["prog", "-
> x"], replace expressions like head (head getArgs) with 'p'.
Yes, for example. Thank you. I was beginning to get desperate.
> But for the human reasoner, the interesting thing would be that (head
> (head getArgs) == head (head getArgs))
For the human reasoner, the interesting thing in this case is the same
as for the compiler, namely, that he cannot substitute an expression
which is not a variable by its final value, even though it has a
ground type, because the final value is not known before the program
actually is executed (in other words, it's impure, if it would be allowed
in this way).
- Dirk
| |
| Dirk Thierbach 2007-06-01, 8:05 am |
| Tony Finch <dot@dotat.at> wrote:
> Your argument against getArgs is equivalent to arguing that closures
> are not pure, because the closed function can refer to free variables
> which can have different values each time the closure is evaluated.
No. There's a big difference between free variables and expressions
with side effects. As soon as you declare "args" as a free variable
*and clearly say so*, for example by requiring it in the argument list
of main, everything is pure (as now stated multiple times). If you
treat "args" as a named expression of ground type, which has the side
effect of evaluating to a different value for different executions of
the program, it's impure.
Is the difference clear?
- Dirk
| |
| Tony Finch 2007-06-01, 10:07 pm |
| Dirk Thierbach <dthierbach@usenet.arcornews.de> wrote:
>
>No. There's a big difference between free variables and expressions
>with side effects. As soon as you declare "args" as a free variable
>*and clearly say so*, for example by requiring it in the argument list
>of main, everything is pure (as now stated multiple times). If you
>treat "args" as a named expression of ground type, which has the side
>effect of evaluating to a different value for different executions of
>the program, it's impure.
An argument isn't a free variable. I don't see why you insist that a
top-level free variable must be more fixed in value than a free variable
in a nested function.
Tony.
--
f.a.n.finch <dot@dotat.at> http://dotat.at/
NORTHEAST VIKING NORTH UTSIRE: VARIABLE 3 OR 4. SLIGHT OR MODERATE. OCCASIONAL
RAIN, FOG PATCHES. MODERATE OR GOOD, OCCASIONALLY VERY POOR.
| |
| Dirk Thierbach 2007-06-01, 10:07 pm |
| Tony Finch <dot@dotat.at> wrote:
> Dirk Thierbach <dthierbach@usenet.arcornews.de> wrote:
[color=darkred]
> An argument isn't a free variable.
In the body of the expression, it's a free variable. It becomes bound
when below a lambda abstraction that binds that variable.
> I don't see why you insist that a top-level free variable must be
> more fixed in value than a free variable in a nested function.
I don't do that. There must be a misunderstanding somewhere.
Besides, there are no top-level free variables allowed in
Haskell. Only closed terms can be executed, because any remaining free
variables would have no defined meaning.
In theory, one could lift that restriction and introduce a number of
top-level free variables that have a defined meaning (or equivalently,
silently put everything under the corresponding lambda abstractions).
But that would now be really ugly and ad-hoc (because it's not
explicit, for once), and make arbitrary parts of the OS interaction
quite different from the others. Not a good idea.
In case that is where the misunderstanding comes from, note that
"getArgs" is not a variable (whether free or not): It's a primitive that
is let-bound to this name.
- Dirk
| |
| David Hopwood 2007-06-01, 10:07 pm |
| Donn Cave wrote:
> Dirk Thierbach <dthierbach@usenet.arcornews.de> wrote:
>
>
> Yes, that's how I interpreted the argument, but as I look around
> a little it doesn't look to me like there's a real consensus on it.
>
> For example, the ultimate authority of our time, Wikipedia:
>
> Pure function
> ... if both these statements about the function hold.
>
> 1. The function always evaluates the same result value given
> the same argument value(s). The function result value cannot
> depend on any hidden information or state that may change as
> program execution proceeds, nor can it depend on any external
> input from IO devices.
>
> 2. Evaluation of the result does not cause any semantically
> observable side effect or output, such as mutation of mutable
> objects or output to IO devices.
Is a function that uses strictly internal state, e.g. via Haskell
runST [*], pure? According to Haskell's type system, it is, and I think
that is also the general concensus. However, it does not satisfy statement 2.
So, "semantically observable" should probably be qualified as something
like "semantically observable to the rest of the program".
[*]
<http://www.haskell.org/ghc/docs/lat...l-Monad-ST.html>
<http://citeseer.ist.psu.edu/63882.html>
--
David Hopwood <david.hopwood@industrial-designers.co.uk>
| |
| Dirk Thierbach 2007-06-02, 7:05 pm |
| David Hopwood <david.hopwood@industrial-designers.co.uk> wrote:
> Donn Cave wrote:
[color=darkred]
[color=darkred]
> Is a function that uses strictly internal state, e.g. via Haskell
> runST [*], pure? According to Haskell's type system, it is, and I
> think that is also the general concensus. However, it does not
> satisfy statement 2.
Well, it's Wikipedia :-)
> So, "semantically observable" should probably be qualified as something
> like "semantically observable to the rest of the program".
Or maybe even "semantically observable to the rest of the program, or
from outside of the program". Point (1) deals with functions which depend
on side-effects, while (2) deals with functions which cause side-effects.
In both cases, the side-effects can be "inside" or "outside" of the
program.
(After executing the monad via runST, the internal state is discarded,
and hence any possible changes to the internal state become completely
unobservable -- as if they had never happened. Hence, no side-effects
remain, and that's why it's considered pure.)
- Dirk
| |
| Matthias Blume 2007-06-02, 7:05 pm |
| Dirk Thierbach <dthierbach@usenet.arcornews.de> writes:
> David Hopwood <david.hopwood@industrial-designers.co.uk> wrote:
>
>
>
>
> Well, it's Wikipedia :-)
What do you mean? It /does/ satisfy statement 2. That's the whole
point of runST!
>
> Or maybe even "semantically observable to the rest of the program, or
> from outside of the program". Point (1) deals with functions which depend
> on side-effects, while (2) deals with functions which cause side-effects.
> In both cases, the side-effects can be "inside" or "outside" of the
> program.
>
> (After executing the monad via runST, the internal state is discarded,
> and hence any possible changes to the internal state become completely
> unobservable -- as if they had never happened. Hence, no side-effects
> remain, and that's why it's considered pure.)
Exactly. There are no observable effects that "survive" runST. For
example, it cannot be used to let a function "remember" things from
one invocation to the next.
| |
| Dirk Thierbach 2007-06-02, 7:05 pm |
| Matthias Blume <find@my.address.elsewhere> wrote:
> Dirk Thierbach <dthierbach@usenet.arcornews.de> writes:
[color=darkred]
[color=darkred]
[color=darkred]
[color=darkred]
> What do you mean? It /does/ satisfy statement 2. That's the whole
> point of runST!
The wording is ambigous: runST /does/ cause semantically observable
side effects. There are just only observable "inside" the monad,
not "outside". So, if you understand statement 2 to mean *any*
observable side effect, including those "inside", then runST /does not/
satisfy statement 2, though it's considered pure.
(Yes, that's just bean counting, and we all know what is meant, and
too much precision can really get on everyone's nerves, but you asked.)
- Dirk
|
|
|
|
|