Author joining all lines
saylee

2006-06-24, 7:59 am

I need to join all the lines in a file as one single line, and then
split it to different lines whenever there is a ";" (semicolon)

please help..

Xicheng Jia

2006-06-24, 7:59 am

saylee wrote:
> I need to join all the lines in a file as one single line, and then
> split it to different lines whenever there is a ";" (semicolon)
>
> please help..


Your problem will be easier if you use other shell utilities instead of
awk, e.g.

tr -s "\n" " " < myfile.txt | sed 's/;\s*/;\n/g'

(GNU sed)
Xicheng
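For readers without GNU sed, the same idea can be sketched portably. This is a hedged variant, not Xicheng's exact command: it assumes that only spaces (not tabs) follow a semicolon, and the function name join_and_split and the sample input are made up for illustration.

```shell
# Hypothetical helper: join all lines, then break after every ";".
join_and_split() {
    # tr -s turns each newline into a space and squeezes repeated spaces.
    # The backslash-newline in the sed replacement is the POSIX way to
    # insert a line break; GNU sed also accepts \n there.
    tr -s '\n' ' ' | sed 's/; */;\
/g'
}

# Made-up sample input: three statements wrapped over three lines.
printf 'a = 1;\nb =\n2; c = 3;\n' | join_and_split
```

Each statement should come out on its own line, ending in its semicolon.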

quarkLore

2006-06-24, 7:59 am


Xicheng Jia wrote:
> saylee wrote:
Basically, you want the text to be separated by ";" rather than by the
newline character.

In that case, this script can be used (the [ ] around each record is for
debugging purposes):

awk 'BEGIN {RS=";"}// {printf("[%s]\n",$0);}' <file_name>

This will keep the newlines. If you also want to remove the newline
characters, then:

awk 'BEGIN {RS=";"}// {gsub("\n","");printf("[%s]\n",$0);}' <file_name>

gsub will substitute newline characters with nothing, thereby
deleting them.
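A variant of the script above, as a sketch only: it replaces each newline with a space instead of deleting it (so words on adjacent lines are not glued together), puts the semicolon back, and skips the empty record that the text after the last ";" would otherwise produce. The function name join_records and the sample input are assumptions for illustration.

```shell
# Hypothetical helper: one output line per ";"-terminated record.
join_records() {
    awk 'BEGIN { RS = ";" }
    {
        gsub(/\n/, " ")          # join wrapped lines with a space
        gsub(/^ +| +$/, "")      # trim spaces left at either end
        # Skip records that are empty after trimming, e.g. the lone
        # newline that follows the final ";".
        if ($0 != "") printf "%s;\n", $0
    }'
}

# Made-up sample input: three statements wrapped over three lines.
printf 'a = 1;\nb =\n2; c = 3;\n' | join_records
```

Note that this sketch appends ";" to every non-empty record, including any text after the last semicolon in the file.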

Loki Harfagr

2006-06-24, 7:59 am

On Thu, 22 Jun 2006 22:39:06 -0700, Xicheng Jia wrote:

> saylee wrote:
>
> Your problem will be easier if you use other shell utilities instead of
> awk,


Well, why? Especially for the kind of problem where awk is the best tool?

> e.g.
>
> tr -s "\n" " " < myfile.txt | sed 's/;\s*/;\n/g'



$ gawk -v RS=';' -v FS='\n' '$1=$1' yourfile.txt

Chris F.A. Johnson

2006-06-24, 7:59 am

On 2006-06-23, Loki Harfagr wrote:
> On Thu, 22 Jun 2006 22:39:06 -0700, Xicheng Jia wrote:
>
>
> Well, why? Especially for the kind of problem where awk is the best tool?
>
>
>
> $ gawk -v RS=';' -v FS='\n' '$1=$1' yourfile.txt


Or, to preserve the semicolons:

awk -v RS=';' -v FS='\n' '$1=$1 {printf "%s;\n", $0}' yourfile.txt


--
Chris F.A. Johnson, author <http://cfaj.freeshell.org>
Shell Scripting Recipes: A Problem-Solution Approach (2005, Apress)
===== My code in this post, if any, assumes the POSIX locale
===== and is released under the GNU General Public Licence

William Park

2006-06-24, 6:56 pm

Xicheng Jia <xicheng@gmail.com> wrote:
> saylee wrote:
>
> Your problem will be easier if you use other shell utilities instead of
> awk, e.g.
>
> tr -s "\n" " " < myfile.txt | sed 's/;\s*/;\n/g'


Good stuff. OP may want to experiment with one-liners, like
tr '\n;' ' \n'
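
To see what this one-liner does, it can be run on a small made-up sample. tr maps characters one-to-one, so each semicolon is replaced by a newline rather than kept, and every newline becomes a space. The wrapper name swap_delims is hypothetical.

```shell
# Hypothetical wrapper around the one-liner above.
swap_delims() {
    # First set maps onto second set: newline -> space, ";" -> newline.
    tr '\n;' ' \n'
}

# Made-up sample input: three statements wrapped over three lines.
printf 'a = 1;\nb =\n2; c = 3;\n' | swap_delims
```

On this input the semicolons are gone, the second and third output lines start with a space, and the output ends with a space instead of a newline.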

--
William Park <opengeometry@yahoo.ca>, Toronto, Canada
ThinFlash: Linux thin-client on USB key (flash) drive
http://home.eol.ca/~parkw/thinflash.html
BashDiff: Super Bash shell
http://freshmeat.net/projects/bashdiff/

Steffen Schuler

2006-07-01, 6:56 pm

Loki Harfagr wrote:
> On Thu, 22 Jun 2006 22:39:06 -0700, Xicheng Jia wrote:
>
> Well, why? Especially for the kind of problem where awk is the best tool?
>
> $ gawk -v RS=';' -v FS='\n' '$1=$1' yourfile.txt

Some strange behaviour:

schuler@paulus:bin$ echo -e "1;2\n3;\n4\n" | gawk -v RS=';' -v FS='\n' '$1=$1'
1
2 3
schuler@paulus:bin$ echo -e "1;2\n3;\n4\n" | gawk -v RS=';' '$1=$1'
1
2 3
4
schuler@paulus:bin$ gawk --version
GNU Awk 3.1.4
[...]

Is it a bug or a feature?
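
One way to investigate is to print what awk sees as the first field of each record. A minimal sketch, assuming the same input as above (written with printf, since echo -e is not portable); the function name show_first_fields is made up. With FS set to a newline, the last record begins with a newline, so its first field is empty, and an empty string counts as false when '$1=$1' is used as a pattern. With the default FS, leading whitespace is skipped and the first field is 4.

```shell
# Hypothetical helper: show the first field of every record.
show_first_fields() {
    # Same RS and FS as the first command above, but print $1 instead
    # of using the assignment as a pattern.
    awk -v RS=';' -v FS='\n' '{ printf "record %d: first field is [%s]\n", NR, $1 }'
}

# echo -e "1;2\n3;\n4\n" emits one extra final newline, reproduced here.
printf '1;2\n3;\n4\n\n' | show_first_fields
```

The third record should be reported with an empty first field, which would explain why it is not printed.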

Greetings from Munich

Steffen Schuler

Steffen Schuler

2006-07-01, 6:56 pm

Steffen Schuler wrote:
> Loki Harfagr wrote:
>
>
> some strange behaviour:
>
> schuler@paulus:bin$ echo -e "1;2\n3;\n4\n" | gawk -v RS=';' -v FS='\n' '$1=$1'
> 1
> 2 3
> schuler@paulus:bin$ echo -e "1;2\n3;\n4\n" | gawk -v RS=';' '$1=$1'
> 1
> 2 3
> 4
> schuler@paulus:bin$ gawk --version
> GNU Awk 3.1.4
> [...]
>
> Is it a bug or a feature?
>
> Greetings from Munich
>
> Steffen Schuler

Better to use:

schuler@paulus:bin$ echo -e "1;2\n3;\n4\n5 6" | awk -W posix -v FS=';' -v OFS=';\n' -v ORS=' ' '$1=$1'; echo
1;
2 3;
4 5 6
schuler@paulus:bin$

That is also POSIX-compatible.

Greetings from Munich

Steffen Schuler

Loki Harfagr

2006-07-02, 6:56 pm

On Sat, 01 Jul 2006 17:12:28 +0200, Steffen Schuler wrote:

> Steffen Schuler wrote:
> better use:
>
> schuler@paulus:bin$ echo -e "1;2\n3;\n4\n5 6" | awk -W posix -v FS=';' -v OFS=';\n' -v ORS=' ' '$1=$1'; echo
> 1;
> 2 3;
> 4 5 6
> schuler@paulus:bin$
>
> that's also POSIX compatible
>
> Greetings from Munich
>
> Steffen Schuler



Thanks for your remarks, this "feature" is quite strange in a way :-O)

$ echo -e "1;2\n3;\n4\n5 6" | awk -v RS=";" -v FS="\n" -v ORS=";\n" '$1=$1'
1;
2 3;

And now this feature vanishes if I use something a bit more "expressive"
than '$1=$1':
$ echo -e "1;2\n3;\n4\n5 6" | awk -v RS=";" -v FS="\n" -v ORS=";\n" '{$1=$1} 1'
1;
2 3;
4 5 6 ;

Or its closest correct POSIX equivalent:
$ echo -e "1;2\n3;\n4\n5 6" | awk -W posix -v RS=";" -v FS="\n" -v ORS=";\n" '{$1=$1;print}'
1;
2 3;
4 5 6 ;


So it seems this "feature" comes from the implicit print rule in the
terse version '$1=$1'. Hmm, I guess Chris F.A. is quite right to repeat
now and then that we should avoid trying to look smart with "special
syntax" !-)
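
The difference can be reproduced in isolation. In the terse form the assignment is the pattern, so the record is printed only when the assigned value is true; a first field that is empty or numerically zero makes it false. In the explicit form the assignment happens in an action and the always-true pattern 1 does the printing. The sketch below uses made-up input and the hypothetical function names terse and explicit.

```shell
# Terse form: the assignment is the pattern, so its value decides
# whether the record is printed.
terse()    { awk -v RS=';' -v FS='\n' -v ORS=';\n' '$1=$1'; }

# Explicit form: rebuild the record in an action, then always print.
explicit() { awk -v RS=';' -v FS='\n' -v ORS=';\n' '{ $1 = $1 } 1'; }

# Made-up input: the second record starts with 0, the third with an
# empty first field (it begins with a newline).
printf 'a;0\nb;\nc' | terse
printf 'a;0\nb;\nc' | explicit
```

The terse form should print only the first record, while the explicit form prints all three.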

Thanks, Steffen. I think I'll now try to remember this thread before
writing scripts; at the least it will remind me to extend the test
templates !-)

