Code Comments

Multiple printf's: Not printing properly on same line.
Hi all,
I'm a bit of an awk newbie. I'm trying to use some conditional
statements to generate printf statements that print columns on a page.
As I read the printf info, it doesn't do a linefeed until you
explicitly put one in with '\n'.

Here's an example of what I'm trying to do on an input of email
addresses. I'm trying in this section to split off the username before
the @ sign. And if there are underscores (_) in the username, I find
out if they are first and middle initials and a last name...and I want
to print out in columns the email address, the first name or first
initial, the middle initial or last name, and the last name if there
are a first name/first initial and middle initial.

I'm trying to do:

BEGIN {FS="~";}
{

    v_email = ""
    v_name = ""
    v_first_name = ""
    v_first_initial = ""
    v_middle_initial = ""
    v_last_name = ""

    if ($10 != "" && /@/) {
        v_email = $10
    }
    else if ($11 != "" && /@/) {
        v_email = $11
    }
    else if ($12 != "" && /@/) {
        v_email = $12
    }

    split(v_email, v_email_parts, "@")

    v_name = v_email_parts[1]

    # First, test for underscore splitter

    if (match(v_name, /_/)) {

        split(v_name, v_name_parts, "_")

        printf("%s\t", v_email)

        if (length(v_name_parts[1]) == 1) {
            v_first_initial = v_name_parts[1]
            printf("%s\t", v_first_initial)
        }
        else {
            v_first_name = v_name_parts[1]
            printf("%s\t", v_first_name)
        }

        if (length(v_name_parts[2]) == 1) {
            v_middle_initial = v_name_parts[2]
            printf("%s\t", v_middle_initial)
        }
        else {
            v_last_name = v_name_parts[2]
            printf("%s\t", v_last_name)
        }

        if (length(v_name_parts[3]) > 0) {
            v_last_name = v_name_parts[3]
            printf("%s\t", v_last_name)
        }
        printf("\n")

    }
}

The file picks off the email from one of three columns, and it does
this perfectly. So, let's say the input in the v_email section is like

f_d_flintstone@bedrock.com
mick_jagger@stones.com
john_d_doe@dead.zone

I'd expect the output to be

f_d_flintstone@bedrock.com    f     d    flintstone
mick_jagger@stones.com        mick  jagger
john_d_doe@dead.zone          john  d    doe

But, this isn't the case...I get something like:
f_d_flind@bedrdck.cflintstone

This isn't a real example, since I don't want to publish real email
addresses here. But, it appears to be overwriting the first entry
(v_email) instead of tabbing over across the page till it hits the \n.

If I comment out each printf statement except for one, they all work
individually. It only blows up when they run all together. Can someone
give me a hint as to what's going wrong, or links to good info on this?
I can't find any good examples in the newsgroups or books so far.

This is just part of a program I'm writing, I'll be parsing for all
kinds of things in the name, but, this is the first section I'm
tackling.

Thanks in advance!!

Chilecayenne

cayenne
09-02-04 08:55 AM


Re: Multiple printf's: Not printing properly on same line.
On 1 Sep 2004 14:25:53 -0700 in comp.lang.awk, chilecayenne@yahoo.com
(cayenne) wrote:

>Hi all,
>I'm a bit of an awk newbie. I'm trying to use some conditional
>statements to generate printf statements that print columns on a page.
>As I read the printf info, it doesn't do a linefeed until you
>explicitly put one in with '\n'.

Correct.

>Here's an example of what I'm trying to do on an input of email
>addresses. I'm trying in this section to split off the username before
>the @ sign. And if there are underscores (_) in the username, I find
>out if they are first and middle initials and a last name...and I want
>to print out in columns the email address, the first name or first
>initial, the middle initial or last name, and the last name if there
>are a first name/first initial and middle initial.

...

>The file picks off the email from one of three columns, and it does
>this perfectly. So, let's say the input in the v_email section is like
>
>f_d_flintstone@bedrock.com
>mick_jagger@stones.com
>john_d_doe@dead.zone
>
>I'd expect the output to be
>
>f_d_flintstone@bedrock.com    f     d    flintstone
>mick_jagger@stones.com        mick  jagger
>john_d_doe@dead.zone          john  d    doe

Exactly what gawk produces.

>But, this isn't the case...I get something like:
>f_d_flind@bedrdck.cflintstone

Looks like your version of awk may have a problem. Need more info.
What command are you using to run awk? Which awk and version are you
using, under which shell and version, and OS and version?

--
Thanks. Take care, Brian Inglis 	Calgary, Alberta, Canada

Brian.Inglis@CSi.com    (Brian[dot]Inglis{at}SystematicSW[dot]ab[dot]ca)
fake address            use address above to reply

Brian Inglis
09-02-04 08:55 AM


Re: Multiple printf's: Not printing properly on same line.

cayenne wrote:
<snip>
> But, this isn't the case...I get something like:
> f_d_flind@bedrdck.cflintstone

The above code shouldn't produce that given the input you show. Have you
tried getting rid of some of the printfs to narrow it down to exactly
which printf(s) cause the problem?

Two possibilities are that your actual input file either:

a) contains control characters which could cause the output to look
jumbled, or
b) contains empty lines, or other lines which don't contain an "@". In
that case your initial tests for setting v_email would fail and you'd
fall into the "split" with v_email set to "", and I don't know what
would happen with the resultant invalid array accesses you do after that.

For "a", which I think is the most likely problem, you just need to
check your input. For "b", you should really structure your code as:

BEGIN{ ... }
/@/ { ... }

rather than just:

BEGIN{ ... }
{ ... }

to make sure you're only processing lines with an "@" symbol (presumably
email addresses).

An unrelated enhancement you might want to consider is to change this:

if (match(v_name,/_/)){

split(v_name,v_name_parts,"_")

to this:

num_parts = split(v_name,v_name_parts,"_")

if (num_parts > 1){

i.e. just check the value returned from split to see if there is an "_"
rather than having to call a separate "match" function first.

You also don't need to test for whether the last name is in the 2nd or
3rd position because you can just do:

v_last_name = v_name_parts[num_parts]

You should probably revisit the way you're assigning v_last_name anyway
since your current method would, given an input address of
"jim_bob_jones@whatever.com", set the first name to "jim" and the last
name to "jones" but completely ignore the "bob" (actually it would save
that as the last name then over-write it).

Hope that helps,

Ed.


Ed Morton
09-02-04 08:56 PM
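
Ed's suggestions above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the input is taken to be one email address per line (the real file keeps the address in field 10, 11 or 12 of a "~"-separated record), and the function name split_names is made up for this example.

```shell
# Hypothetical helper: reads one email address per line on stdin.
split_names() {
    awk '
    # Only lines containing "@" are processed; blank lines are skipped.
    /@/ {
        split($0, email_parts, "@")
        # split() returns the number of pieces, so no separate match() call is needed.
        num_parts = split(email_parts[1], name_parts, "_")
        if (num_parts > 1) {
            printf("%s", $0)
            # Print every piece in order; the last name is always name_parts[num_parts].
            for (i = 1; i <= num_parts; i++)
                printf("\t%s", name_parts[i])
            printf("\n")
        }
    }
    '
}

printf '%s\n' 'jim_bob_jones@whatever.com' | split_names
```

Looping over all the pieces keeps a middle name such as "bob" instead of overwriting it, which is the case Ed points out at the end of his post.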


Re: Multiple printf's: Not printing properly on same line.
Brian Inglis <Brian.Inglis@SystematicSW.Invalid> wrote in message
news:<behcj0tujq3v7f6gqr062m585833228prv@4ax.com>...
> On 1 Sep 2004 14:25:53 -0700 in comp.lang.awk, chilecayenne@yahoo.com
> (cayenne) wrote:
>
> Correct.
>
> ...
>
> Exactly what gawk produces.
>
> Looks like your version of awk may have a problem. Need more info.
> What command are you using to run awk? Which awk and version are you
> using, under which shell and version, and OS and version?


Hi Brian, thank you very much for your reply!! I'm running Gentoo
Linux, with the gentoo sources kernel: linux-2.4.20-gentoo-r5.

awk --version gives me:
GNU Awk 3.1.3
Copyright (C) 1989, 1991-2003 Free Software Foundation.

I'm a little new to the differences with awk, gawk, and nawk...but,
just to check things a little further I did a look in /bin to find
that awk is a link to gawk on my system:

/bin/awk -> gawk-3.1.3

I'm using the following to run my script:
cat white_pages.csv | awk -f phone1.awk | more

white_pages.csv is the file I'm picking the email addresses out of, and
phone1.awk is my script file. I'm just using more to scroll down the
results to look at them onscreen for now.

Thanks for any insight and suggestions you can give me! I really
like working with awk so far...but it is easy to stumble as you progress
to slightly more complex things.

CC

cayenne
09-02-04 08:56 PM


Re: Multiple printf's: Not printing properly on same line.

cayenne wrote:
<snip>
> I'm using the following to run my script:
> cat white_pages.csv | awk -f phone1.awk | more

This is commonly called "UUOC" (Useless Use Of Cat) since awk can take a
file name argument. Do this instead:

awk -f phone1.awk white_pages.csv | more

Regards,

Ed.


Ed Morton
09-02-04 08:56 PM


Re: Multiple printf's: Not printing properly on same line.
In article <2deb3d1.0409021126.2c6446fc@posting.google.com>,
chilecayenne@yahoo.com (cayenne) wrote:

> Ed Morton <morton@lsupcaemnt.com> wrote in message
> news:<DcydnafNxcKtu6rcRVn-jA@comcast.com>...
> <snip> 
>
> Thanks for the reply Ed.
> I've gone through and commented out all but one of the printf's...each
> one by itself works just fine.
>
> Yeah, I know I need to clean up the code, and had thought about the
> middle name vs. middle initial...this is just a first run through as I
> started to refine it...and got stuck with the printing problem at this
> early a stage.
>
> This file is a csv from MS Excel. I'll try to check for special
> characters...maybe run a dos2unix on it...But, like I said, if I just
> do one printf, it works...each one individually works...but, if I
> start to use 2 or more of them to spit things out in columns, it mixes
> them all into one line.
>
> I'll check on the special characters tho...
> Any other suggestions greatly appreciated!!
> :-)
>
> CC

Change the command line to

awk -f phone1.awk white_pages.csv | cat -vte | more

The 'cat -vte' will tell you if there are any invisible characters and
especially if there are <CR><LF> pairs by displaying ^M for the <CR>
values and $ for the <LF>

Bob Harris

Bob Harris
09-03-04 08:55 AM
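
As a small illustration of what Bob's 'cat -vte' check would reveal, the sketch below feeds awk a single made-up "~"-separated record whose last field ends in a carriage return, the way a DOS-style export might leave it. Whether white_pages.csv really contains such characters is not confirmed at this point in the thread; the sample data and the function names are assumptions for the example.

```shell
# One "~"-separated record whose last field ends in a carriage return
# (made-up sample data, not taken from the real file).
sample_record() { printf 'id~a_b@example.org\r\n'; }

# Two printf calls per record, in the style of the original script.
show_raw() {
    sample_record |
    awk -F'~' '{ printf("%s\t", $2); printf("%s\n", "a") }' |
    cat -vte
}

# Same, but a trailing CR is removed from the record before printing.
show_stripped() {
    sample_record |
    awk -F'~' '{ sub(/\r$/, ""); printf("%s\t", $2); printf("%s\n", "a") }' |
    cat -vte
}

show_raw        # a_b@example.org^M^Ia$
show_stripped   # a_b@example.org^Ia$
```

On a terminal, a CR in the middle of a line sends the cursor back to column one, so whatever is printed next lands on top of the email address. That would look much like the jumbled output described in the first post, and it would also explain why a single printf on its own looks fine.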


Re: Multiple printf's: Not printing properly on same line.
In article <2deb3d1.0409030642.1a10d901@posting.google.com>,
cayenne <chilecayenne@yahoo.com> wrote:
...
>ps. Just curious, I've seen mentioned in responses here and other
>forums where people get irritated about using cat 'too much'. Just
>curious as to why?

For some reason, newbies often write:

cat somefile | someutil ...

and this is unnecessary, and, in theory at least, wasteful.  I won't go
into the various details as to why it is unnecessary and wrong (STFW), but
I will take the opportunity to say that, many, many moons ago, I saw the
following in an MSDOS manual:

type file | more

and so, as with most things that are wrong in computing, it is all Bill
Gates's fault.  Note that the above is particularly bad in DOS, which
doesn't have any sort of multitasking and thus has only fake pipes.


Kenny McCormack
09-03-04 08:56 PM


Re: Multiple printf's: Not printing properly on same line.

cayenne wrote:
<snip>
> ps. Just curious, I've seen mentioned in responses here and other
> forums where people get irritated about using cat 'too much'. Just
> curious as to why?

It's not so much about getting irritated, it's more helping people learn
when they don't need to use it. The common mistake newcomers make is to use:

cat file | some_command

when "some_command" can take a file argument and so the above could be
written as:

some_command file

or it could just read redirected input as:

some_command < file

and save an external command (cat) and a pipe.

Let's say you have a kid who puts on their shoes, then takes them off
and puts on their socks then puts their shoes back on. Wouldn't you tell
them that they don't actually need to put on their shoes the first time?
After seeing several kids do this, wouldn't you get a tad irritated
and wonder who the heck is out there telling kids that that's the right
way to do things? It's kinda like that....

Ed.


Ed Morton
09-03-04 08:56 PM
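
The three forms Ed lists can be compared side by side. This is a minimal sketch using a throwaway temporary file and a trivial awk program that counts records; the variable names are made up for the example.

```shell
# Throwaway three-line sample file.
sample=$(mktemp)
printf '%s\n' alpha beta gamma > "$sample"

via_cat=$(cat "$sample" | awk 'END { print NR }')      # extra cat process plus a pipe
via_arg=$(awk 'END { print NR }' "$sample")            # awk opens the file itself
via_redirect=$(awk 'END { print NR }' < "$sample")     # the shell opens the file for awk

echo "$via_cat $via_arg $via_redirect"                 # 3 3 3
rm -f "$sample"
```

All three give the same count; the last two simply get there without starting cat.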


Re: Multiple printf's: Not printing properly on same line.
Kenny McCormack wrote:
 
> For some reason, newbies often write:
>
> 	cat somefile | someutil ...


I often do (although I'm not a newbie), since I think in terms of
pipelines and this way it goes nicely from left to right.

> and this is unnecessary, and, in theory at least, wasteful.  I won't go
> into the various details as to why it is unnecessary and wrong (STFW), but

On occasion, it can be better. Copying a large file from one disk to
another is often better achieved with:

cat file1 | cat > file2

(or better still dd), since you're reading and writing in parallel like
this.

-Ed





--
(You can't go wrong with psycho-rats.)       (er258)(@)(eng.cam)(.ac.uk)

/d{def}def/f{/Times findfont s scalefont setfont}d/s{10}d/r{roll}d f 5/m
{moveto}d -1 r 230 350 m 0 1 179{1 index show 88 rotate 4 mul 0 rmoveto}
for /s 15 d f pop 240 420 m 0 1 3 { 4 2 1 r sub -1 r show } for showpage


E. Rosten
09-03-04 08:56 PM


Re: Multiple printf's: Not printing properly on same line.
In article <q5idnS1ImonjMqXcRVn-rQ@comcast.com>,
Ed Morton  <morton@lsupcaemnt.com> wrote:
>
>
>E. Rosten wrote: 
>
>Then presumably writing it as:
>
>	cat somefile | cat | someutil
>
>is even better since it extends even further from left to right ;-).
>It's fine to think in terms of pipelines, but I can't imagine why you'd
>want to introduce commands gratuitously at the head or tail of a pipeline.

Indeed.  The standard Randal Schwartz answer to "but I like to see my data
go from left to right" is:

< file someutil

which works in any Bourne-ish shell.
 
>
>I've never come across that situation. When you say it's better - do you
>mean faster, or more reliable, or something else? Can you go into any
>more detail on why it's better as the benefits aren't intuitively obvious.

I don't think the claim holds water in any general sense.  It *might* be
true on some particular piece of hardware under some particular set of
conditions.


Kenny McCormack
09-03-04 08:56 PM
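
For readers who have not seen the leading-redirection form Kenny mentions, here is a short sketch. The temporary file and variable names are made up for the example; the awk program just numbers the lines it reads.

```shell
# Throwaway two-line sample file.
sample=$(mktemp)
printf 'one\ntwo\n' > "$sample"

# The redirection may come before the command name; the data still
# reads left to right, with no cat and no pipe.
leading=$(< "$sample" awk '{ print NR ": " $0 }')

# The pipeline form it replaces, for comparison.
piped=$(cat "$sample" | awk '{ print NR ": " $0 }')

[ "$leading" = "$piped" ] && echo "same output"
rm -f "$sample"
```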

