Home > Archive > Prolog > September 2007 > understanding query order and performance
You are viewing an archived Text-only version of the thread.
To view this thread in it's original format and/or if you want to reply to
this thread please [click here]
| Author |
understanding query order and performance
|
|
| Chui Tey 2007-09-21, 7:08 pm |
| Take this rather simple query:
t(1).
t(2).
t(4).
listofTs_1([], 0).
listofTs_1([t(H)|T], Length):-
LengthMinusOne is Length - 1,
length(T, LengthMinusOne),
t(H),
listofTs_1(T, LengthMinusOne).
% returns t(1), t(1), ...], [t(2), t(1), ...], ...
?- time((listofTs_1(L, 10), fail)).
% 236,193 inferences, 0.08 CPU in 0.08 seconds (100% CPU, 3023270
Lips)
listofTs_1b([], 0).
listofTs_1b([t(H)|T], Length):-
t(H),
LengthMinusOne is Length - 1,
length(T, LengthMinusOne),
listofTs_1b(T, LengthMinusOne).
?- time((listofTs_1b(L, 10), fail)).
% 472,385 inferences, 0.14 CPU in 0.14 seconds (100% CPU, 3359182
Lips)
Shuffling the innocuous t(H) to the top of the list costs double the
number of inferences. After thinking a bit about it, it makes sense
why this is so - since it makes an extra inference that a length-check
would have discarded. Is it good idiomatic prolog to do all length
declaration upfront, or am I drawing too long a bow from one single
experiment?
| |
| Markus Triska 2007-09-21, 7:08 pm |
| Chui Tey <teyc@cognoware.com> writes:
> why this is so - since it makes an extra inference that a length-check
> would have discarded.
The actual reason is that the second version recomputes LengthMinusOne
way more often (each time t/1 backtracks). Instead of time/1, use
profile/1 to see that the second versions spends more time in is/2.
> Is it good idiomatic prolog to do all length declaration upfront, or
> am I drawing too long a bow from one single experiment?
What about this version:
n_ts(N, Ts) :- length(Ts, N), ts(Ts).
ts([]).
ts([t(T)|Ts]) :- t(T), ts(Ts).
Everything that *can* be handled with pattern matching *should* be
handled with pattern matching; there's no reason to drag the list
length with you. This definition can also be used in all directions:
%?- n_ts(N, Ts).
%@ N = 0,
%@ Ts = [];
%@ N = 1,
%@ Ts = [t(1)] ;
%@ ... etc.
On top of it, it's much faster. Making it *not* tail recursive makes
it faster still on enumeration. However, it uses more local stack when
the list is given.
--
comp.lang.prolog FAQ: http://www.logic.at/prolog/faq/
| |
| Markus Triska 2007-09-21, 7:08 pm |
|
> The actual reason is that the second version recomputes LengthMinusOne
> way more often (each time t/1 backtracks). Instead of time/1, use
> profile/1 to see that the second versions spends more time in is/2.
Sorry, the costly part is length/2; I also recommend the "seconds"
view of the profiler.
| |
| Markus Triska 2007-09-21, 7:08 pm |
|
> The actual reason is that the second version recomputes
Sorry, this explanation is flawed; the other part of the message holds.
| |
| bart demoen 2007-09-21, 7:08 pm |
| On Fri, 21 Sep 2007 08:17:12 -0700, Chui Tey wrote:
> Take this rather simple query:
>
> t(1).
> t(2).
> t(4).
>
> listofTs_1([], 0).
>
> listofTs_1([t(H)|T], Length):-
> LengthMinusOne is Length - 1,
> length(T, LengthMinusOne),
> t(H),
> listofTs_1(T, LengthMinusOne).
>
> % returns t(1), t(1), ...], [t(2), t(1), ...], ...
>
> ?- time((listofTs_1(L, 10), fail)).
> % 236,193 inferences, 0.08 CPU in 0.08 seconds (100% CPU, 3023270
> Lips)
>
> listofTs_1b([], 0).
>
> listofTs_1b([t(H)|T], Length):-
> t(H),
> LengthMinusOne is Length - 1,
> length(T, LengthMinusOne),
> listofTs_1b(T, LengthMinusOne).
>
> ?- time((listofTs_1b(L, 10), fail)).
> % 472,385 inferences, 0.14 CPU in 0.14 seconds (100% CPU, 3359182
> Lips)
>
> Shuffling the innocuous t(H) to the top of the list costs double the
> number of inferences. After thinking a bit about it, it makes sense
> why this is so - since it makes an extra inference that a length-check
> would have discarded. Is it good idiomatic prolog to do all length
> declaration upfront, or am I drawing too long a bow from one single
> experiment?
You might not believe this, but it is possible to deduce automatically the
complexity of a query from the Prolog code. In the first case, the
recurrence equation would be something like:
cost(N) = 1 + a*(N-1) + 3*cost(N-1)
where the 3 is related to the fact that t/1 has 3 facts.
In the second case, it would be
cost(N) = 1 + 3*(1+a*(N-1) + cost(N-1))
[just doing this without much checking, so be carefull]
You solve these equations and you should either
find the factor two you seem to measure [I don't in other Prolog systems]
and be satisfied
or
find something different and try to find out why ....
Maybe also do your experiment with 4 and 5 and 6 facts for t/1 ...
And to answer your last question explicitly: yes, you are.
Cheers
Bart Demoen
| |
| Chui Tey 2007-09-23, 4:27 am |
| Thank you all for your replies..
% listofTs_1
% 236,193 inferences, 0.11 CPU in 0.11 seconds (100% CPU, 2159479
Lips)
% listofTs_1b
% 472,385 inferences, 0.14 CPU in 0.14 seconds (100% CPU, 3359182
Lips)
% n_ts - MUCH faster
% 118,101 inferences, 0.02 CPU in 0.03 seconds (50% CPU, 7558464 Lips)
Further questions:
#1 How do I tell which operations are TAIL recursive?
#2 Please explain "... not tail recursive makes it faster still on
enumeration"
| |
| Markus Triska 2007-09-23, 4:27 am |
| Chui Tey <teyc@cognoware.com> writes:
> .. not tail recursive makes it faster still on enumeration
Try your query (which enumerates all solutions) with this version of
ts/1, which is not tail-recursive:
ts([]).
ts([t(T)|Ts]) :- ts(Ts), t(T).
--
comp.lang.prolog FAQ: http://www.logic.at/prolog/faq/
| |
| J. Burse 2007-09-23, 8:06 am |
| Markus Triska schrieb:
> Chui Tey <teyc@cognoware.com> writes:
>
>
> Try your query (which enumerates all solutions) with this version of
> ts/1, which is not tail-recursive:
>
> ts([]).
> ts([t(T)|Ts]) :- ts(Ts), t(T).
>
Not strictly tail recursive, but can
still be optimized, lets call it "last
call" optimization.
Tail recursive:
A :- B.
A :- C, A
Last call optimization:
A :- B.
A :- C, D.
Because D occurs in the last rule of
the rule set for A, no choice point
needs to be stored.
Last call optimization can be also applied to:
A :- C, !, D.
A :- B.
When C has been passed, no choice point
is needed, because the rule has a
cut.
Last call optimization can further be applied when indexing is used:
A(f1,..) :- C, D.
A(f2,..) :- B.
Here when a call A(X,..) occurs and X is bound,
then with a reduced rule set can be worked,
and also no choice point is needed.
Bye
| |
| Markus Triska 2007-09-23, 8:06 am |
| "J. Burse" <janburse@fastmail.fm> writes:
> Not strictly tail recursive, but can still be optimized
It's not tail recursive at all, since the point was that the version
that is NOT tail recursive can be much faster than the tail recursive
one; I recommend to actually measure it with your system, and to
reconsider which of the two versions, if any, is the "optimized" one:
t(a).
t(b).
t(c).
ts1([]).
ts1([t(T)|Ts]) :- t(T), ts1(Ts). % tail recursive
ts2([]).
ts2([t(T)|Ts]) :- ts2(Ts), t(T). % NOT tail recursive
With SWI-Prolog, I got:
%?- time((length(Ts,15),ts1(Ts),fail)).
%@ % 28,697,816 inferences, 6.12 CPU in 6.13 seconds (100% CPU)
%?- time((length(Ts,15),ts2(Ts),fail)).
%@ 7,174,472 inferences, 2.30 CPU in 2.31 seconds (100% CPU)
--
comp.lang.prolog FAQ: http://www.logic.at/prolog/faq/
| |
| J. Burse 2007-09-23, 7:10 pm |
| Markus Triska schrieb:
> t(a).
> t(b).
> t(c).
>
> ts1([]).
> ts1([t(T)|Ts]) :- t(T), ts1(Ts). % tail recursive
>
> ts2([]).
> ts2([t(T)|Ts]) :- ts2(Ts), t(T). % NOT tail recursive
>
> With SWI-Prolog, I got:
>
> %?- time((length(Ts,15),ts1(Ts),fail)).
> %@ % 28,697,816 inferences, 6.12 CPU in 6.13 seconds (100% CPU)
>
> %?- time((length(Ts,15),ts2(Ts),fail)).
> %@ 7,174,472 inferences, 2.30 CPU in 2.31 seconds (100% CPU)
>
ts1 will create variable displays all the time (each
solution of t) for the list, where as ts2 will
create the list only once. (already bmd@cs.kuleuven.be
observation?!)
Simple goal ordering problem. For example if you have:
(1) nondet, det. versus (2) det, nondet.
Now assume nondet generates n solutions and has cost d.
And det has cost c, independently of the result of det.
Then we have as a cost:
(1) d+n*c (2) c+d
For n>1 (1) is slower than (2). For n=1 they have the
same speed. For n<1 (2) is slower than (1). So best would
be dynamic goal reordering, based on runtime statistics.
BTW: SQL databases do this for their joins.
Bye
| |
| bart demoen 2007-09-24, 7:09 pm |
| [not a reply to what Markus wrote, but I am using his post for
partial quotes]
On Sun, 23 Sep 2007 15:25:26 +0200, Markus Triska wrote:
[...]
> the point was that the version
> that is NOT tail recursive can be much faster than the tail recursive
> one;
[...]
> t(a).
> t(b).
> t(c).
>
> ts1([]).
> ts1([_|Ts]) :- t(T), ts1(Ts). % tail recursive
>
> ts2([]).
> ts2([_|Ts]) :- ts2(Ts), t(T). % NOT tail recursive
>
> With SWI-Prolog, I got:
>
> %?- time((length(Ts,15),ts1(Ts),fail)).
> %@ % 28,697,816 inferences, 6.12 CPU in 6.13 seconds (100% CPU)
>
> %?- time((length(Ts,15),ts2(Ts),fail)).
> %@ 7,174,472 inferences, 2.30 CPU in 2.31 seconds (100% CPU)
It is worth "unfolding" the queries - not for length 15, but say 2.
length(Ts,2), ts1(Ts) then becomes:
Ts = [_,_], (Ts = [] ;
Ts = [_|Ts1], t(T1), (Ts1 = [] ;
Ts1 = [_|Ts2], t(T2), (Ts2 = [] ; ...))).
Trowing away what we know fails, we get:
Ts = [_,_], Ts = [_|Ts1], t(T1), Ts1 = [_|Ts2], t(T2), Ts2 = [].
For length(Ts,2), ts2(Ts), we get:
Ts = [_,_], Ts = [_|Ts1], Ts1 = [_|Ts2], Ts2 = [], t(T1), t(T2).
Now it is staring us in the face: in this case, the tail recursive version
mixes the unifications with the backtracking goals, while the non-tail
recursive version does all the unifications first, and the backtracking
goals at the end. In this way, the second version does the unifications
just once, while the other version does (some) unifications many, many
times.
Reasoning on performance in the presence of non-determinism is more
difficult than in the deterministic case.
Cheers
Bart Demoen
|
|
|
|
|