Comparing 2 dates
| Hermann Peifer 2007-02-10, 6:57 pm |
| Hi All,
I have 2 dates in YYYYMMDD format and would like to know how many days
are between them. A Google search and the GAWK manual tell me that
mktime(datespec) is a good choice.
I hence reformatted my dates so that they match datespec, i.e.:
YYYY MM DD HH MM SS
Then I calculated the difference by using mktime():
BEGIN{FS = OFS = "\t"}
{
print $1,$2,(mktime($1)-mktime($2))/86400
}
All ~6000 results look fine, apart from the 2nd one:
1993 03 30 00 00 00 1993 03 31 00 00 00 -1
1993 03 28 00 00 00 1993 03 29 00 00 00 -0.958333
1993 04 07 00 00 00 1993 04 08 00 00 00 -1
2000 01 01 00 00 00 2001 01 01 00 00 00 -366
2000 01 01 00 00 00 2002 01 01 00 00 00 -731
2000 12 20 00 00 00 2005 01 12 00 00 00 -1484
....
Any idea why the 2nd result is not -1?
Thanks, Hermann
| |
| Kenny McCormack 2007-02-10, 6:57 pm |
| In article <45CE2E1C.20601@gmx.net>, Hermann Peifer <peifer@gmx.net> wrote:
>Hi All,
>
>I have 2 dates in YYYYMMDD format and would like to know how many days
>are between them. A Google search and the GAWK manual tell me that
>mktime(datespec) is a good choice.
>
>I hence reformatted my dates so that they match datespec, i.e.:
>YYYY MM DD HH MM SS
>
>Then I calculated the difference by using mktime():
>
>BEGIN{FS = OFS = "\t"}
>{
>print $1,$2,(mktime($1)-mktime($2))/86400
>}
>
>All ~6000 results look fine, apart from the 2nd one:
>
>1993 03 30 00 00 00 1993 03 31 00 00 00 -1
>1993 03 28 00 00 00 1993 03 29 00 00 00 -0.958333
>1993 04 07 00 00 00 1993 04 08 00 00 00 -1
>2000 01 01 00 00 00 2001 01 01 00 00 00 -366
>2000 01 01 00 00 00 2002 01 01 00 00 00 -731
>2000 12 20 00 00 00 2005 01 12 00 00 00 -1484
>...
>
>Any idea why the 2nd result is not: -1
Hmmm. I get the expected results.
Must be something whack about your GAWK version.
| |
| Hermann Peifer 2007-02-10, 6:57 pm |
| Kenny McCormack wrote:
> In article <45CE2E1C.20601@gmx.net>, Hermann Peifer <peifer@gmx.net> wrote:
>
> Hmmm. I get the expected results.
> Must be something whack about your GAWK version.
>
I have identical results with 2 different GAWKs:
$ cat /proc/version
Linux version 2.6.10-1.771_FC2smp (bhcompile@porky.build.redhat.com)
(gcc version 3.3.3 20040412 (Red Hat Linux 3.3.3-7)) #1 SMP Mon Mar 28
01:10:51 EST 2005
$ gawk --version
GNU Awk 3.1.3
and
$ cat /proc/version
CYGWIN_NT-5.1 1.5.24(0.156/4/2) 2007-01-31 10:57
$ gawk --version
GNU Awk 3.1.5
Hermann
| |
| Steffen Schuler 2007-02-10, 6:57 pm |
| Hermann Peifer wrote:
> Hi All,
>
> I have 2 dates in YYYYMMDD format and would like to know how many days
> are between them. A Google search and the GAWK manual tell me that
> mktime(datespec) is a good choice.
>
> I hence reformatted my dates so that they match datespec, i.e.:
> YYYY MM DD HH MM SS
>
> Then I calculated the difference by using mktime():
>
> BEGIN{FS = OFS = "\t"}
> {
> print $1,$2,(mktime($1)-mktime($2))/86400
> }
>
> All ~6000 results look fine, apart from the 2nd one:
>
> 1993 03 30 00 00 00 1993 03 31 00 00 00 -1
> 1993 03 28 00 00 00 1993 03 29 00 00 00 -0.958333
> 1993 04 07 00 00 00 1993 04 08 00 00 00 -1
> 2000 01 01 00 00 00 2001 01 01 00 00 00 -366
> 2000 01 01 00 00 00 2002 01 01 00 00 00 -731
> 2000 12 20 00 00 00 2005 01 12 00 00 00 -1484
> ...
>
> Any idea why the 2nd result is not: -1
>
> ?
>
> Thanks, Hermann
Hi Hermann,
I assume you are in the Europe/Berlin timezone. On 28.03.1993 the clocks
were set forward to daylight saving time, according to
www.zeitumstellung.de/zeitumstellun...sch-archiv.html
This means that day lost an hour. If you look at the timestamps that
gawk's mktime() returns, their difference is exactly one hour (3600
seconds) short of a full day: 82800 / 86400 = 0.958333.
Best Regards,
Steffen Schuler
| |
| Hermann Peifer 2007-02-10, 6:57 pm |
| Steffen Schuler wrote:
> Hermann Peifer wrote:
>
> Hi Hermann,
>
> I assume you are in the Europe/Berlin timezone. On 28.03.1993 the clocks
> were set forward to daylight saving time, according to
> www.zeitumstellung.de/zeitumstellun...sch-archiv.html
> This means that day lost an hour. If you look at the timestamps that
> gawk's mktime() returns, their difference is exactly one hour (3600
> seconds) short of a full day: 82800 / 86400 = 0.958333.
>
> Best Regards,
>
> Steffen Schuler
Thanks for the hint. This must be it.
I do indeed belong to the timezone Europe/Copenhagen. The data and the
dates do however come from 34 different countries in Europe. From
Iceland down to Turkey. Dates and timestamps are expected to refer to
UTC. Now I just have to convince gawk to ignore daylight saving times.
Thanks again, Hermann
| |
| Don Stokes 2007-02-10, 6:57 pm |
| In article <45CE4013.7040005@gmx.net>, Hermann Peifer <peifer@gmx.net> wrote:
>I do indeed belong to the timezone Europe/Copenhagen. The data and the
>dates do however come from 34 different countries in Europe. From
>Iceland down to Turkey. Dates and timestamps are expected to refer to
>UTC. Now I just have to convince gawk to ignore daylight saving times.
mktime() is always going to use what it thinks is local time, so
daylight saving will always get you.
If you're just looking for differences in days, you can ignore the time
zone and deal with DST offsets by rounding, i.e.:
days = int((mktime(lastdate) - mktime(firstdate)) / 86400 + .5)
If you're not dealing with whole days, or doing something more complicated,
you can run the script with the environment variable "TZ" set to "UTC",
which will mean the "local time" used by mktime() & friends will be
interpreted as UTC with no daylight saving.
-- don
| |
| Steffen Schuler 2007-02-11, 3:57 am |
| Hermann Peifer wrote:
> Steffen Schuler wrote:
>
> Thanks for the hint. This must be it.
>
> I do indeed belong to the timezone Europe/Copenhagen. The data and the
> dates do however come from 34 different countries in Europe. From
> Iceland down to Turkey. Dates and timestamps are expected to refer to
> UTC. Now I just have to convince gawk to ignore daylight saving times.
>
> Thanks again, Hermann
Hi Hermann,
Try this:
$ TZ=Etc/UTC gawk 'BEGIN {print (mktime("1993 03 28 00 00 00") - mktime("1993 03 29 00 00 00"))/86400}'
-1
With the TZ environment variable you can make your application (here
GAWK) use UTC. See tzset(3). UTC has no daylight saving time.
Best Regards,
Steffen Schuler
| |
| Hermann Peifer 2007-02-11, 3:57 am |
| Don Stokes wrote:
> In article <45CE4013.7040005@gmx.net>, Hermann Peifer <peifer@gmx.net> wrote:
>
> mktime() is always going to use what it thinks is local time, so
> daylight saving will always get you.
I just came to the same conclusion after trying out what the GAWK manual
suggests about mktime() and DST flags. I should have first read your reply.
> If you're just looking for differences in days, you can ignore the time
> zone and deal with DST offsets by rounding, i.e.:
>
> days = int((mktime(lastdate) - mktime(firstdate)) / 86400 + .5)
Great. I will just copy & paste this line into my script.
> If you're not dealing with whole days, or doing something more complicated,
> you can run the script with the environment variable "TZ" set to "UTC",
> which will mean the "local time" used by mktime() & friends will be
> interpreted as UTC with no daylight saving.
I hardly do anything complicated, in particular not with awk, where I am
at novice level. In any case, thanks for the hint, which I will keep in
mind, as one never knows...
Hermann
| |
| Hermann Peifer 2007-02-11, 3:57 am |
| Steffen Schuler wrote:
> Hermann Peifer wrote:
>
> Hi Hermann,
>
> Try that:
>
> $ TZ=Etc/UTC gawk 'BEGIN {print (mktime("1993 03 28 00 00 00") - mktime("1993 03 29 00 00 00"))/86400}'
> -1
>
Works fine. Thanks again. Hermann