Code Comments
Programming Forum and web based access to our favorite programming groups.Hi,
I have several tens of files containing SQL code with lines of the form:
INSERT INTO tablex (var1, var2, var3, var4, var5) VALUES (1,'03/20/08 00:00:
00','AM',4,NULL,'some text',NULL);
The time stamp string may consist of both date and time, or only date,
but the date is always MM/DD/YY. I'd like to reformat the date as
YYYY-MM-DD, so the date in the example above should be "2008-03-20".
There are 2 issues I'm struggling with: 1) years may be in the 20th or
21st century, so that every year starting with 0 should belong in the
current century, and the rest in the previous, and 2) time stamps may be
present in more than one variable (i.e. the var's above). gensub()
seemed the way to go:
awk --re-interval '{
sub(/ end,/, " ending,")
if ($0 ~ /[[:digit:]]{2}\/[[:digit:]]{2}\/[[:digit:]]{2}/) {
$0=gensub(/([[:digit:]]{2})\/([[:digit:]]{2})\/([[:digit:]]{2})/,
"\\3/\\2/\\1", "g")
}
print
}'
This of course only rearranges the date pieces, but doesn't add the
century information. The sub() is just to show that other operations
are needed before printing every line. Any suggestions on how to
proceed in this case?
Cheers,
--
Seb
Post Follow-up to this messageSeb wrote:
> Hi,
>
> I have several tens of files containing SQL code with lines of the form:
>
> INSERT INTO tablex (var1, var2, var3, var4, var5) VALUES (1,'03/20/08 00:0
0:00','AM',4,NULL,'some text',NULL);
>
> The time stamp string may consist of both date and time, or only date,
> but the date is always MM/DD/YY. I'd like to reformat the date as
> YYYY-MM-DD, so the date in the example above should be "2008-03-20".
> There are 2 issues I'm struggling with: 1) years may be in the 20th or
> 21st century, so that every year starting with 0 should belong in the
> current century, and the rest in the previous, and 2) time stamps may be
> present in more than one variable (i.e. the var's above). gensub()
> seemed the way to go:
>
> awk --re-interval '{
> sub(/ end,/, " ending,")
> if ($0 ~ /[[:digit:]]{2}\/[[:digit:]]{2}\/[[:digit:]]{2}/) {
> $0=gensub(/([[:digit:]]{2})\/([[:digit:]]{2})\/([[:digit:]]{2})/,
> "\\3/\\2/\\1", "g")
> }
> print
> }'
>
> This of course only rearranges the date pieces, but doesn't add the
> century information. The sub() is just to show that other operations
> are needed before printing every line. Any suggestions on how to
> proceed in this case?
You can write two gensub()s, where you replace the third [[:digit:]]{2}
subpattern by 0[[:digit:]] and [1-9][[:digit:]], resp., and add the
respective century information before the replacement \\3.
Janis
>
> Cheers,
>
Post Follow-up to this messageOn Wed, 26 Mar 2008 02:45:40 +0100,
Janis Papanagnou <Janis_Papanagnou@hotmail.com> wrote:
[...]
> You can write two gensub()s, where you replace the third
> [[:digit:]]{2} subpattern by 0[[:digit:]] and [1-9][[:digit:]], resp.,
> and add the respective century information before the replacement \\3.
Good idea, this seemed to work:
awk --re-interval '{
sub(/ end,/, " ending,")
if ($0 ~ /[[:digit:]]{2}\/[[:digit:]]{2}\/[[:digit:]]{2}/) {
$0=gensub(/([[:digit:]]{2})\/([[:digit:]]{2})\/([1-9])([[:digit:]])/,
"19\\3\\4-\\1-\\2", "g")
$0=gensub(/([[:digit:]]{2})\/([[:digit:]]{2})\/0([[:digit:]])/,
"200\\3-\\1-\\2", "g")
}
print
}'
The regexps are a bit nasty though. Thanks!
--
Seb
Post Follow-up to this message
On 3/25/2008 9:03 PM, Seb wrote:
> On Wed, 26 Mar 2008 02:45:40 +0100,
> Janis Papanagnou <Janis_Papanagnou@hotmail.com> wrote:
>
> [...]
>
>
>
>
> Good idea, this seemed to work:
>
>
> awk --re-interval '{
> sub(/ end,/, " ending,")
> if ($0 ~ /[[:digit:]]{2}\/[[:digit:]]{2}\/[[:digit:]]{2}/) {
> $0=gensub(/([[:digit:]]{2})\/([[:digit:]]{2})\/([1-9])([[:digit:]])/,
> "19\\3\\4-\\1-\\2", "g")
> $0=gensub(/([[:digit:]]{2})\/([[:digit:]]{2})\/0([[:digit:]])/,
> "200\\3-\\1-\\2", "g")
> }
> print
> }'
>
> The regexps are a bit nasty though. Thanks!
I don't think you need the test, just do the gensub() and the pattern will
either match or it won't. Maybe you could simplify it a bit too, e.g.:
awk --re-interval '{
sub(/ end,/, " ending,")
$0=gensub(/([[:digit:]]{2})\/([[:digit:]]{2})\/([[:digit:]]{2})/,
"19\\3\\4-\\1-\\2", "g")
$0=gensub(/19(0[[:digit:]]-)/,"20\\1", "g")
print
}' file
Ed.
Post Follow-up to this messagePowered by vBulletin
Copyright 2000-2006 Jelsoft Enterprises Limited.