Code Comments
Hi,

I have a problem with an awk script. I extracted some data from a database, and I know that there are 33 columns. The problem is that some of these columns contain carriage returns, which I have to eliminate. When awk "sees" one of those carriage returns, it behaves as if the next column were a new record.

I hope my description is clear enough; if not, feel free to ask for more details.

Greetings
chrishunnell@gmail.com wrote:
> Most of the columns are of fixed size but some are not. This is why I
> am not able to use the FIELDWIDTHS.
> This would be some possible input for the script:
>
> index£stuff£more
> \
> problem here\
> more problem\
> £end of line£
>
> and this should be the output:
> index£stuff£more problem here more problem£end of line£
>
> At the moment the sample seems to be 5 records, but actually it is only
> one record.
> Is it possible to concat the 5 records?
>
Should there be a backslash at the end of that first line? If so, then
this would work:
gawk -vRS="#$" -vORS="" '{gsub(/\\\n/," ")}1'
If not, then you'd need:
gawk -vRS="#$" -vORS="" '{gsub(/\\\n/," ");gsub(/\n^ /," ")}1'
The above assumes your real line always ends in a pound sign as your
sample input showed.
You need gawk in the above to use an RS with multiple characters. I
substituted a hash ("#") for the pound sign since I don't have that on
my keyboard.
Ed.
Ed, I want to thank you for solving my problem.

Greetings, Chris
chrishunnell@gmail.com wrote:
> after I analysed the script a few minutes i understood it and it is
> very simple - but one question arose:
> what does the 1 do?
>
It's a true condition which invokes the default action of printing $0.
Remove it and you'll see no output.
Ed.
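To see the effect of the trailing `1` in isolation, here is a minimal sketch using a made-up two-line input (not the poster's data). It should behave the same in any POSIX awk, since a pattern without an action defaults to printing the record.

```shell
# Sample input is hypothetical: two lines with one space each.
input=$(printf 'a b\nc d\n')

# Action plus the always-true pattern 1: the modified record is printed.
with_one=$(printf '%s\n' "$input" | awk '{ gsub(/ /, "-") } 1')

# Same action without the 1: the substitution happens, but nothing prints.
without_one=$(printf '%s\n' "$input" | awk '{ gsub(/ /, "-") }')

# Spelling out the default action gives the same result as the 1.
explicit=$(printf '%s\n' "$input" | awk '{ gsub(/ /, "-"); print }')

printf 'with 1:    [%s]\nwithout 1: [%s]\nexplicit:  [%s]\n' \
    "$with_one" "$without_one" "$explicit"
```

The first and third variants produce identical output, while the second produces none.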
chrishunnell@gmail.com wrote:
> after I analysed the script a few minutes i understood it and it is
> very simple
Then try this ;-) :
gawk 'BEGIN{RS="[\\\\]\n|\n"}{ORS=RT~/\\/?"":"\n"}1'
It's a more idiomatic solution for the general question of "how do I
join lines that end in backslashes?". It wouldn't work as-is for the
input sample you posted since your to-be-joined lines don't always end
in backslashes and you want to replace the backslash-newlines with spaces.
Ed.
Ed Morton wrote:
> Then try this ;-) :
>
> gawk 'BEGIN{RS="[\\\\]\n|\n"}{ORS=RT~/\\/?"":"\n"}1'
Make that:
gawk 'BEGIN{RS="\\\\\n|\n"}{ORS=RT~/\\/?"":"\n"}1'
Ed.
In article <6ZSdna1AZ9BkNuXfRVn-jg@comcast.com>,
Ed Morton <morton@lsupcaemnt.com> wrote:
...
>
>Make that:
>
>gawk 'BEGIN{RS="\\\\\n|\n"}{ORS=RT~/\\/?"":"\n"}1'
Change that to:
BEGIN{RS="\\\\\n|\n"}ORS=RT~/\\/?" ":"\n"
and you get the desired "backslashes changed into spaces" behavior, as well
as saving a few more keystrokes.
Most of the columns are of fixed size but some are not. This is why I
am not able to use the FIELDWIDTHS.
This would be some possible input for the script:

index£stuff£more
\
problem here\
more problem\
£end of line£

and this should be the output:
index£stuff£more problem here more problem£end of line£

At the moment the sample seems to be 5 records, but actually it is only
one record.
Is it possible to concat the 5 records?