For Programmers: Free Programming Magazines  


Home > Archive > Prolog > May 2004 > hexdump









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author hexdump
Brian Raiter

2004-05-04, 2:59 pm

I'm trying to write the classic hexdump program in Prolog (GNU Prolog,
to be precise), as a part of figuring out how to use Prolog to do
traditionally imperative tasks. If you've never seen hexdump before,
it's an old standard for displaying the contents of binary
files. Here's a sample output from an imaginary 40-byte file.

00000000: 0400 0000 2F6C 6962 2F6C 642D 6C69 6E75 ..../lib/ld-linu
00000010: 782E 736F 2E32 0000 0400 0000 1000 0000 x.so.2..........
00000020: 0100 0000 474E 5500 ....GNU.

The basic idea is: 16 bytes per line, offset on the far left, byte
values grouped in pairs, and ASCII display on the far right with
non-graphic characters displayed as periods.

I did manage to get a (mostly) working Prolog hexdump program, but
it's so ugly I hesitate to post it here. I would encourage others to
post their attempts, however. I'd like to see what other people come
up with.

The problem I'm having is with the main loop. Here's how I wrote it:

hexdump(_, [], -1).
hexdump(Pos, Line, -1) :- displayline(Pos, Line).
hexdump(Pos, Line, Ch) :- length(Line, L), L == 16,
displayline(Pos, Line), NewPos is Pos + L,
hexdump(NewPos, [], Ch).
hexdump(Pos, Line, Ch) :- append(Line, [Ch], NewLine), get_byte(NewCh),
hexdump(Pos, NewLine, NewCh).

hexdump :- get_byte(Ch), hexdump(0, [], Ch).

(displayline/2 is where the actual output is done.) Now, this works
fine for small inputs. But for large files the program runs out of
stack space. I thought the tail-recursion would prevent that, but
apparently not. I've tried rearranging the clauses and scattering cuts
around to remove choice points, but even so I can't go more than about
64k before memory is exhausted dying.

Does GNU Prolog just not do tail-recursion optimization? Or is there a
better way to do this that doesn't rely on tail-recursion
optimizations to keep the stack from blowing up?

Side question: I'm currently opening the file passed on the cmdline
and using set_input/1 to make it the current input file. I'd like to
default to the standard user input if no filename is given on the
cmdline, but that file is opened in text mode. How do I get a binary
stream out of the standard user input?

b
Kral Stefan

2004-05-04, 2:59 pm

Brian,

Brian Raiter (blr@drizzle.com) wrote:
: [...]
: Does GNU Prolog just not do tail-recursion optimization?
: Or is there a better way to do this that doesn't rely on
: tail-recursion optimizations to keep the stack from blowing up?
GNU-Prolog does not have a garbage-collector yet --- one
way to keep memory requirements reasonably small is using
failure-driven loops. (Intermediate data allocated on the
heap is released as soon as failure is triggered.)

Regards,
Stefan.

--
Stefan Kral http://www.complang.tuwien.ac.at/skral/
bart demoen

2004-05-04, 2:59 pm


Brian Raiter wrote:
>
> I did manage to get a (mostly) working Prolog hexdump program, but
> it's so ugly I hesitate to post it here. I would encourage others to
> post their attempts, however. I'd like to see what other people come
> up with.


That's not really fair: you don't want to show us your disgustingly
shaped turnip, but you want to see ours !


> The problem I'm having is with the main loop. Here's how I wrote it:
>
> hexdump(_, [], -1).
> hexdump(Pos, Line, -1) :- displayline(Pos, Line).
> hexdump(Pos, Line, Ch) :- length(Line, L), L == 16,
> displayline(Pos, Line), NewPos is Pos + L,
> hexdump(NewPos, [], Ch).
> hexdump(Pos, Line, Ch) :- append(Line, [Ch], NewLine), get_byte(NewCh),
> hexdump(Pos, NewLine, NewCh).
>
> hexdump :- get_byte(Ch), hexdump(0, [], Ch).
>
> (displayline/2 is where the actual output is done.) Now, this works
> fine for small inputs. But for large files the program runs out of
> stack space. I thought the tail-recursion would prevent that, but
> apparently not.


That's because it has nothing to do with the stack that is affected by
tail recursion optimization. You get the error:

Fatal Error: global stack overflow

right ?

That's the "heap" with the dynamically allocated data, like lists.
And it is not a surprise that you get that error, since

1) GNU Prolog does not have a heap garbage collector (*)
2) you have naive reverse hidden in you program: the goal
append(Line,[Ch],NewLine)
is making you run out of heap much earlier than
necessary, in fact your program allocates memory
quadratic in the number of read chars, where it could
be linear (or less, but that's more tricky)

(*) I am partly to blame for that :-(

Here is one suggestion:
change the last clause into

hexdump(Pos, Line, Ch) :-
get0(NewCh),
hexdump(Pos, [Ch|Line], NewCh).

and just before doing displayline, reverse the Line - but don't do it
the nrev way - use the accumulator.

An extra suggestion:

write displayline as:

displayline(Pos,Line) :-
rev(Line,ToPrint),
<do the actual writing>,
fail.
displayline(_,_).

In this way, it does not leave behind garbage and you will gain another
factor of 2 (number of bytes you can treat).

Finally, you can go for something that runs in constant space -
something like:

hexdump :- hexdump(0).

hexdump(Pos) :-
hexdump(Pos,16,[]),
!.
hexdump(Pos) :-
NewPos is Pos + 16,
hexdump(NewPos).

hexdump(Pos,ToRead,Line) :-
(ToRead == 0 ->
displayline(Pos,Line),
fail
;
get0(Ch),
(Ch == -1 ->
displayline(Pos,Line)
;
LessToRead is ToRead - 1,
hexdump(Pos,LessToRead,[Ch|Line])
)
).


Well, now I did show you my turnip :-)


> Does GNU Prolog just not do tail-recursion optimization?


GNU Prolog does proper tail-recusion optimization.

I can't help you with your side-question.

Cheers

Bart Demoen

Brian Raiter

2004-05-04, 2:59 pm

> That's not really fair: you don't want to show us your disgustingly
> shaped turnip, but you want to see ours !


True, it isn't fair. I just wanted to invite people to show off their
programs, if they had a mind to.

> 1) GNU Prolog does not have a heap garbage collector (*)


OH. Gee, that would explain it, now wouldn't it?

> 2) you have naive reverse hidden in you program: the goal
> append(Line,[Ch],NewLine)
> is making you run out of heap much earlier than
> necessary, in fact your program allocates memory
> quadratic in the number of read chars, where it could
> be linear (or less, but that's more tricky)


Yep. With tail recursion and (mistakenly-presumed) garbage collection,
I figured that it wouldn't be an issue....

> Finally, you can go for something that runs in constant space -
> something like:


I see. The trick is to fail after displayline has succeeded, so the
extra objects get cleaned up and deallocated, right?

> Well, now I did show you my turnip :-)


Thank you for doing so. Tell you what -- I'll show my turnip, once I
actually get a turnip that works.

b
Brian Raiter

2004-05-12, 9:20 pm

Okay, as promised, here's my rendition of hexdump in GNU Prolog. Still
haven't figured out how to treat the standard input stream as binary.
The other big annoyance with this program is that I couldn't find a
way for format to output a hexadecimal number with leading zeros. (I
figured out how to output a hexadecimal number, and how to output a
decimal number with leading zeros, but not the two together.) So I had
to write my out routine to display hexadecimal values, too.

Comments on the code are of course welcome.

b
________________

% printhex(+Len, +N)
% printhex displays a number N on the current output stream as a
% hexadecimal value. At least Len digits are used; if the number
% requires fewer digits, the output is padded with zeros on the left.

computehex(0, 0, D, D).
computehex(L, 0, D, R) :- NewL is L - 1, computehex(NewL, 0, [0|D], R).
computehex(0, N, D, R) :- computehex(1, N, D, R).
computehex(L, N, D, R) :- NewL is L - 1, NewD is N rem 16, NewN is N // 16,
computehex(NewL, NewN, [NewD|D], R).

computehex(Len, N, Digits) :- computehex(Len, N, [], Digits).

printhex([]).
printhex([D|Digits]) :- D < 10, A is D + 48, put_code(A), printhex(Digits).
printhex([D|Digits]) :- D >= 10, A is D + 55, put_code(A), printhex(Digits).

printhex(Len, N) :- computehex(Len, N, Digits), printhex(Digits).

% hexline(+Bytes, +Len)
% hexline displays a list of bytes (in Bytes) as hexadecimal values on
% the current output stream. The bytes are output in pairs, with a space
% between each pair. If Bytes has fewer than Len elements, the output is
% padded with spaces on the right so that it uses as much space as a
% list of Len bytes.

hexline([], 0).
hexline([], N) :- (N mod 2 =:= 0 -> print(' ') ; true), print(' '),
N1 is N - 1, hexline([], N1).
hexline([B|Bytes], N) :- (N mod 2 =:= 0 -> print(' ') ; true), printhex(2, B),
N1 is N - 1, hexline(Bytes, N1).

% asciiline(+Bytes)
% asciiline displays a list of bytes (in Bytes) as characters on the
% current output stream. Non-printing characters have periods
% substituted.

asciiline([]).
asciiline([B|Bytes]) :- (B >= 32, B =< 126 ; B >= 160, B =< 255),
put_code(B), asciiline(Bytes).
asciiline([_|Bytes]) :- print('.'), asciiline(Bytes).

% displayline(+Pos, +RBytes)
% displayline displays a line of bytes in the classic hexdump format on
% the current output stream. RBytes is a list of the bytes to display in
% reverse order. Pos is a number which is used as the line's offset.

displayline(_, []).
displayline(Pos, RBytes) :- reverse(RBytes, Bytes),
printhex(8, Pos), print(':'),
hexline(Bytes, 16), print(' '),
asciiline(Bytes), nl, !.

%
% The main loop.
%

hexdump(Pos, 0, Line) :- displayline(Pos, Line), !, fail.
hexdump(Pos, Count, Line) :- get_byte(Ch),
(Ch == -1 -> displayline(Pos, Line)
; NewCount is Count - 1,
hexdump(Pos, NewCount, [Ch|Line])).

hexdump(Pos) :- hexdump(Pos, 16, []), !.
hexdump(Pos) :- NewPos is Pos + 16,
hexdump(NewPos).

hexdump :- hexdump(0).

%
% The top level.
%

parseargs :- argument_counter(Argc),
(Argc == 2 -> argument_value(1, InputFile),
open(InputFile, read, Input, [type(binary)]),
set_input(Input)
; true),
(Argc > 2 -> throw('Usage: hexdump [FILENAME]')
; true).

main :- parseargs, hexdump.

:- initialization(main).
Sponsored Links







Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive

Copyright 2008 codecomments.com