Re: Pick on Perl - in Python
| Paddy McCarthy 2004-11-26, 3:56 am |
w_a_x_man@yahoo.com (William James) wrote in message news:<f8860640.0411232153.239861d8@posting.google.com>...
> I noticed something amusing at the Perl site
> http://www.tek-tips.com/viewthread....d=955815&page=1
>
> To convert a file like this
>
> header.jpg
> header.jpg
> footer.gif
> wrap.bmp
>
> to
>
> wrap.bmp|1
> header.jpg|2
> footer.gif|1
>
> the most highly decorated programmer at the site posted this:
>
> -----
> my %log;
> open(LOG, "<test.log") || die qq(Can't open "test.log" for input!\n);
> chomp, $log{$_}++ while <LOG>;
> close(LOG) || die qq(Can't close "test.log"!\n);
> open(LOG, ">test.log") || die qq(Can't open "test.log" for output!\n);
> writelog(%log);
> close(LOG) || die qq(Can't close "test.log"!\n);
>
> sub writelog {
> my %log = @_;
> while (my ($k, $v) = each %log) {
> print LOG join("|", ($k, $v)), "\n";
> }
> }
> -----
>
> Then a rebel awker posted this:
>
> -----
> { a[$0]++ }
> END { for ( k in a )
> print k "|" a[k]
> }
> -----
>
> I suggest that awkers that read this should regularly visit that site
> and seize every opportunity to make Perl code look hideous.
Hi fellow AWK folks,
I couldn't help thinking that the above was unfair ... for about a
microsecond.
But I then got thinking about how I would tackle this in Python.
Whoops, I should first explain that I was weaned on AWK, am forced to write
in Perl at work (Python forbidden, AWK tolerated), and write in Python and
AWK for leisure.
My Python solution follows:
### START FILE ###
''' count_uniq.py Counts the unique lines in a file.
Rather like `sort myfile|uniq -c` on unix,
(except the count is to the right).
Example:
pad@Yes-man:/tmp$ cat tmp.log
header.jpg
header.jpg
footer.gif
wrap.bmp
pad@Yes-man:/tmp$ python count_uniq.py tmp.log
footer.gif 1
header.jpg 2
wrap.bmp 1
pad@Yes-man:/tmp$
'''
import fileinput
# get all lines and strip the newline
lines = [ line.rstrip() for line in fileinput.input() ]
# use a set to generate unique lines then count
unique = [ name +" "+ str(lines.count(name)) for name in sorted(set(lines)) ]
print "\n".join(unique)
### END FILE ###
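For comparison with the AWK one-liner, here is a minimal sketch of the same count done in a single pass with a dictionary, much as `a[$0]++` does. The function names `count_lines` and `format_counts` are made up for illustration, and the `name|count` output format is taken from the original task rather than from the script above.

```python
def count_lines(lines):
    """Return a dict mapping each line (trailing whitespace stripped) to its count."""
    counts = {}
    for line in lines:
        name = line.rstrip()
        counts[name] = counts.get(name, 0) + 1
    return counts


def format_counts(counts):
    """Return sorted 'name|count' strings, one per unique line."""
    return [name + "|" + str(counts[name]) for name in sorted(counts)]


# Hypothetical sample input matching the file shown in the thread.
sample = ["header.jpg\n", "header.jpg\n", "footer.gif\n", "wrap.bmp\n"]
result = format_counts(count_lines(sample))
print("\n".join(result))
```

Unlike calling `lines.count()` once per unique name, this visits each input line only once.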
Paddy.
| William James 2004-11-29, 3:58 pm |
Python does seem preferable to Perl. However...
Some time ago, after I looked at a couple of tutorials, the first
two minutes spent toying with the language showed that it couldn't do
an absurdly simple thing.
Print "foobar\n" with 2 print commands.
In Awk:
printf "foo"; printf "bar\n"
or
ORS=""
print "foo"; print "bar\n"
This utterly trivial task is beyond Python's "print" command
(unless you fiddle with the inner mechanism).
print "foo",
print "bar"
produces "foo bar". There is no way to prevent "print" from
appending a newline or prepending a space (if it follows a print
statement terminated with ","). The implementors of the language
decided that since they didn't like to print without a newline
or space between strings, you shouldn't be allowed to do it.
Did I say "no way"? Change that to "no sane, simple way".
import sys
print "foo",
sys.stdout.softspace=0
print "bar"
This is really fun! I get to type a gigantic compound word just to
print a string! And I get to type "sys.stdout.softspace=0" every time,
yes, every single time I want to suppress the space!
Since the high and mighty implementors frown on us peons being able
to use a simple and fully functional "print", this is the standard way:
import sys
sys.stdout.write("foo")
print "bar"
That the following statement isn't obvious to most people shows that
the majority have been conditioned to slavish obedience:
In any scripting language one should be able to print "foobar" in chunks
without having to import anything and without having to use a complex
construction such as sys.stdout.write().
| Paddy McCarthy 2004-11-29, 3:58 pm |
w_a_x_man@yahoo.com (William James) wrote in message news:<f8860640.0411271446.30f4cce1@posting.google.com>...
> Python does seem preferable to Perl. However...
>
> Some time ago, after I looked at a couple of tutorials, the first
> two minutes spent toying with the language showed that it couldn't do
> an absurdly simple thing.
>
> Print "foobar\n" with 2 print commands.
>
> In Awk:
> printf "foo"; printf "bar\n"
> or
> ORS=""
> print "foo"; print "bar\n"
>
> This utterly trivial task is beyond Python's "print" command
> (unless you fiddle with the inner mechanism).
Yep,
Python's print works differently to statements of the same name in
other languages.
As you found out, it is best used in a line-oriented fashion.
`print "foo" "bar"` gives foobar followed by a newline, as does
`print "foo"+"bar"`.
Python tries to have a few key things available without importing
modules, but has a huge number of diverse (and documented) modules
available in every distribution.
I very rarely resort to using sys.stdout.write() in Python as it's
now natural for me to organise my Python programs to generate lines
for printing.
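As a rough sketch of the two styles discussed here: writing chunks with write(), which adds nothing between them, versus building the whole line first and printing it once. `io.StringIO` stands in for the terminal so the result can be inspected; the helper name `write_chunks` is hypothetical, and the code is written for a current Python 3 interpreter, whereas the thread uses Python 2 syntax.

```python
import io


def write_chunks(stream, chunks):
    """Write each chunk with its own write() call; no space or newline is added."""
    for chunk in chunks:
        stream.write(chunk)


# Style 1: emit the output piece by piece.
buf = io.StringIO()
write_chunks(buf, ["foo", "bar\n"])
captured = buf.getvalue()

# Style 2: compose the whole line, then print it in one go.
line = "".join(["foo", "bar"])
print(line)

# Adjacent string literals are joined by the parser, with no operator needed.
adjacent = "foo" "bar"
```

Passing `sys.stdout` instead of the `StringIO` buffer would send the same chunks to the terminal.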
Several years ago it was natural for me to sort AWK data by composing
lines for printing to a unix sort command then reading the result back
line by line into an array, using the line number as the key. It was
the AWK way of doing things then - there was no inbuilt sort, and the
Unix sort was usually very fast.
>
> print "foo",
> print "bar"
>
> produces "foo bar". There is no way to prevent "print" from
> appending a newline or prepending a space (if it follows a print
> statement terminated with ","). The implementors of the language
> decided that since they didn't like to print without a newline
> or space between strings, you shouldn't be allowed to do it.
>
> Did I say "no way"? Change that to "no sane, simple way".
>
> import sys
> print "foo",
> sys.stdout.softspace=0
> print "bar"
You might also use the following:
from sys import stdout
stdout.write("foo")
stdout.write("bar")
>
> This is really fun! I get to type a gigantic compound word just to
> print a string! And I get to type "sys.stdout.softspace=0" every time,
> yes, every single time I want to suppress the space!
>
> Since the high and mighty implementors frown on us peons being able
> to use a simple and fully functional "print", this is the standard way:
>
> import sys
> sys.stdout.write("foo")
> print "bar"
Ah, so you do know about the write method.
I have never found the Python language developers to be high and
mighty; they can spend ages on the newsgroups gently helping people
through their problems.
When writing AWK I write in the AWK way. When learning a new language,
as well as working out what it would be good for, I try to find the
style of how problems are solved in the language.
I found it hard to use trailing conditionals in Perl at first,
something like `print $foo unless $bar eq "OK";`. On first learning Perl
I thought that it was a hack that allowed people to apply band-aids
until the script worked.
Now I only think that it *can* be used as such.
>
> That the following statement isn't obvious to most people shows that
> the majority have been conditioned to slavish obedience:
> In any scripting language one should be able to print "foobar" in chunks
> without having to import anything and without having to use a complex
> construction such as sys.stdout.write().
If I was new to some programming language that had a print statement
then I would expect that statement to send output to my terminal. What
happens if I give it multiple arguments? Can it take multiple arguments?
Separators? Line terminators?
Those questions I would expect the language designers to have thought
carefully about, and I hope they would have a 'pleasing' solution, but
I wouldn't expect them to make the same decisions as used in another
programming language.
Given my own language choice I would have a place for AWK and Python.
There probably would also be a place for Perl - I love how `perl -p -i
-e` can do quick, in-place edits to a file, and have added that to my
toolbox.
I do like AWK, but if I look at the data and data manipulation that's
needed in a problem and I need a dictionary whose values are lists then
I would rather code it all in Python than have to use something like
the following in AWK:
data["car_bits"] = data["car_bits"] SUBSEP another_car_bit
I have also used:
data["car_bits",++bit_count["car_bits"]] = another_car_bit
In Python that would be something like:
data["car_bits"] = [] # an empty list
...
data["car_bits"].append(another_car_bit)
There are usually other gains throughout the program when dealing with
data like this.
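A small sketch of the dictionary-of-lists idea, assuming the car bits are plain strings. The helper name `add_bit` and the sample values are invented for illustration. Note that `append` adds one item, while `+=` on a list extends it with each element of the right-hand side, so a string would be split into characters.

```python
def add_bit(data, key, bit):
    """Append bit to the list stored under key, creating the list if needed."""
    data.setdefault(key, []).append(bit)


data = {}
add_bit(data, "car_bits", "wheel")
add_bit(data, "car_bits", "door")
print(data["car_bits"])

# += extends the list element by element, so a string is split up.
chars = []
chars += "ab"
print(chars)
```

Using `setdefault` avoids having to initialise each key with an empty list before the first append.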
I guess I don't have to tell people here what AWK is good for. It's
good for a lot!
Paddy.
P.S. My thanks to the maintainers of gawk that added the sort
functions. Nice one :-)