Detecting ascii nulls
Gerard C Blais

2004-07-25, 3:55 pm



I have input files with ascii nulls, i.e., hex 00, octal 000.

I'd like to, among other things, detect these nulls, count them, print
data about the offending row, and ultimately, delete the null
character.

/\000/ {nulls++
gsub(/\000/,"")
}

seems like it ought to be a start - and it is, using gawk, on my PC
under Windows.

Under Solaris, using /usr/xpg4/bin/awk, nulls are detected on EVERY
line, whether the lines have nulls, or not.

Any ideas?

Thanks!

Gerry

Ian Stirling

2004-07-25, 8:55 pm

Gerard C Blais <gerard.blais@mci.com> wrote:
>
>
> I have input files with ascii nulls, i.e., hex 00, octal 000.
>
> I'd like to, among other things, detect these nulls, count them, print
> data about the offending row, and ultimately, delete the null
> character.
>
> /\000/ {nulls++
> gsub(/\000/,"")
> }
>
> seems like it ought to be a start - and it is, using gawk, on my PC
> under windows.
>
> Under Solaris, using /usr/xpg4/bin/awk, nulls are detected on EVERY
> line, whether the lines have nulls, or not.


Some awks don't deal with \000 well. gawk does. For example, when awk is gawk,
find . -print0|awk 'BEGIN{RS="\000"}//'
will print the list of files one per line.

Charles Demas

2004-07-25, 8:55 pm

In article <ac08g0pj1iof1ro206qjersge6rguia323@4ax.com>,
Gerard C Blais <gerard.blais@mci.com> wrote:
>
>
>I have input files with ascii nulls, i.e., hex 00, octal 000.
>
>I'd like to, among other things, detect these nulls, count them, print
>data about the offending row, and ultimately, delete the null
>character.
>
>/\000/ {nulls++
> gsub(/\000/,"")
> }
>
>seems like it ought to be a start - and it is, using gawk, on my PC
>under windows.
>
>Under Solaris, using /usr/xpg4/bin/awk, nulls are detected on EVERY
>line, whether the lines have nulls, or not.
>
>Any ideas?


Use Perl

Awk and sed were meant to deal with text, and nulls are not generally
considered text.


Chuck Demas

--
Eat Healthy | _ _ | Nothing would be done at all,
Stay Fit | @ @ | If a man waited to do it so well,
Die Anyway | v | That no one could find fault with it.
demas@theworld.com | \___/ | http://world.std.com/~cpd

Kenny McCormack

2004-07-25, 8:55 pm

In article <ce17n9$hv$1@pcls3.std.com>,
Charles Demas <demas@TheWorld.com> wrote:
....
>Use Perl
>
>Awk and sed were meant to deal with text, and nulls are not generally
>considered text.


Use C (or, better yet, assembler). Unix text utilities (Awk, sed, Perl,
cut, join, etc) were meant to deal with text, and nulls are not generally
considered text.

Actually, both Gawk & TAWK (the only flavors of AWK anyone should ever use)
handle nulls just fine.

Patrick TJ McPhee

2004-07-26, 3:55 am

In article <ce17n9$hv$1@pcls3.std.com>,
Charles Demas <demas@TheWorld.com> wrote:

% Use Perl
%
% Awk and sed were meant to deal with text, and nulls are not generally
% considered text.

Whereas perl was not meant to deal with anything in particular,
so if it doesn't handle nulls well, it goes without notice.
--

Patrick TJ McPhee
East York Canada
ptjm@interlog.com

E. Rosten

2004-07-26, 8:55 am

> Actually, both Gawk & TAWK (the only flavors of AWK anyone should ever use)


Without wishing to get into a flame war about this: I also find mawk a
useful tool to have around. It's often somewhat faster than gawk. That
applies to the versions that were around when I installed the software
on this computer, anyhow.

-Ed


> handle nulls just fine.





--
(You can't go wrong with psycho-rats.) (er258)(@)(eng.cam)(.ac.uk)

/d{def}def/f{/Times findfont s scalefont setfont}d/s{10}d/r{roll}d f 5/m
{moveto}d -1 r 230 350 m 0 1 179{1 index show 88 rotate 4 mul 0 rmoveto}
for /s 15 d f pop 240 420 m 0 1 3 { 4 2 1 r sub -1 r show } for showpage

Stepan Kasal

2004-07-26, 8:55 am

Hello,

In article <4104DA32.8040409@my.sig>, E. Rosten wrote:
>
> I also find mawk a useful tool [...] It's [...] faster than gawk.


I agree, and I'm sure Arnold would agree too. If the speed of execution of
your code is critical, mawk can help.

Stepan

E. Rosten

2004-07-26, 3:55 pm

>>>Actually, both Gawk & TAWK (the only flavors of AWK anyone should ever use)
>
>
> I agree, and I'm sure Arnold would agree too. If the speed of execution of
> your code is critical, mawk can help.


I have a habit of using awk as a general quick & easy interpreted
language for all sorts of stuff (reasonably often including image
manipulation), so speed of execution starts to matter quite a lot.

But gawk offers enough neat features that I have it around as well.

-Ed




Gerard C Blais

2004-07-26, 3:55 pm

Thanks for all the suggestions.

I've found a mawk, and will try to get it installed on the Solaris
box.

Gerry

On Sun, 25 Jul 2004 14:58:53 -0400, Gerard C Blais
<gerard.blais@mci.com> wrote:

>
>
>I have input files with ascii nulls, i.e., hex 00, octal 000.
>
>I'd like to, among other things, detect these nulls, count them, print
>data about the offending row, and ultimately, delete the null
>character.
>
>/\000/ {nulls++
> gsub(/\000/,"")
> }
>
>seems like it ought to be a start - and it is, using gawk, on my PC
>under windows.
>
>Under Solaris, using /usr/xpg4/bin/awk, nulls are detected on EVERY
>line, whether the lines have nulls, or not.
>
>Any ideas?
>
>Thanks!
>
>Gerry


Harlan Grove

2004-07-28, 3:56 pm

"Kenny McCormack" <gazelle@yin.interaccess.com> wrote...
....
>Use C (or, better yet, assembler). Unix text utilities (Awk, sed, Perl,
>cut, join, etc) were meant to deal with text, and nulls are not generally
>considered text.
>
>Actually, both Gawk & TAWK (the only flavors of AWK anyone should ever use)
>handle nulls just fine.


It's been a long, long time since Perl was just a text utility. It handles
NULLs just fine.

That said, if the OP wanted to remove NULLs from files, wouldn't tr -d work
for that? If the OP wanted to count NULLs and report on their lines, the
file could be passed through od and a vanilla awk script used to keep track
of NULLs and newlines. All I'm trying to show is that the OP's tasks could
be performed with the standard Solaris POSIX tools.
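
A rough sketch of what that might look like, assuming an od that accepts
the POSIX options -A n, -v and -t u1 and a tr that accepts the octal
escape \000. The function names strip_nuls and count_nuls are made up for
illustration, and none of this has been checked on Solaris itself.

```shell
# Hypothetical helper: delete every NUL byte from a file, writing to stdout.
# Assumes tr accepts the POSIX octal escape \000.
strip_nuls() {
    tr -d '\000' < "$1"
}

# Hypothetical helper: report NUL counts per line and in total.
# od -An -v -tu1 dumps each byte as an unsigned decimal number, so awk
# only ever sees plain digits and never a raw NUL.
count_nuls() {
    od -An -v -tu1 "$1" | awk '
        BEGIN { line = 1 }
        {
            for (i = 1; i <= NF; i++) {
                if ($i == 0) {            # byte 0 is a NUL
                    nuls++; total++
                } else if ($i == 10) {    # byte 10 is a newline: line ends
                    if (nuls) print "line " line ": " nuls " NUL(s)"
                    line++; nuls = 0
                }
            }
        }
        END {
            # report NULs on a last line that has no trailing newline
            if (nuls) print "line " line ": " nuls " NUL(s)"
            print "total: " total + 0
        }'
}
```

Something like count_nuls data.txt followed by
strip_nuls data.txt > data.clean (data.txt being a placeholder name) would
then cover the counting, reporting and deleting. Since awk only sees the
decimal dump, its own handling of \000 should not matter here.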



Charles Demas

2004-07-28, 3:56 pm

In article <410727e8$1_1@127.0.0.1>, Harlan Grove <hrlngrv@aol.com> wrote:
>"Kenny McCormack" <gazelle@yin.interaccess.com> wrote...
>...
>
>It's been a long, long time since Perl was just a text utility. It handles
>NULLs just fine.
>
>That said, if the OP wanted to remove NULLs from files, wouldn't tr -d work
>for that?


I seem to recall that some versions of tr don't handle nulls, or maybe
it was that nulls were automatically deleted by that version of tr.



Chuck Demas
