calculate the difference between timestamps
Michael Jaritz

2007-04-21, 6:56 pm

Hello,
this is my input:

event;TimeA ;TimeB
-----;-----------------------------;----------------------------
foo ;19-APR-07 12.13.54.916000 AM ;19-APR-07 12.13.55.108000 AM
bar ;19-APR-07 02.41.10.869000 PM ;19-APR-07 02.41.10.895000 PM
                           ^^^                           ^^^
                            `------- always "000" --------´

I need the difference between TimeA and TimeB in milliseconds.

This is my idea:

BEGIN {
ts1 = "19-APR-07 12.13.54.916000 AM"
ts2 = "19-APR-07 12.13.55.108000 AM"
diff = change_timestamp(ts2) - change_timestamp(ts1)
print ts2
print ts1
print "diff: "diff" msec"
print ""
ts1 = "31-DEC-06 11.59.59.999000 PM"
ts2 = "01-JAN-07 12.00.00.001000 AM"
diff = change_timestamp(ts2) - change_timestamp(ts1)
print ts2
print ts1
print "diff: "diff" msec"
}

function change_timestamp(ts) {
month = gensub( /^...(...).+$/, "\\1", "g", ts )
month = numeric_month(month)
hour = gensub( /^.+ (..).+$/, "\\1", "g", ts )
milliseconds = gensub( /^.+(...)... ..$/, "\\1", "g", ts )
appendix = gensub( /^.+(..)$/, "\\1", "g", ts )
if( appendix == "PM" )
hour = gensub( /^0/, "", 1, hour ) + 12
if( appendix == "AM" )
hour = gensub( /12/, "00", 1, hour )
datespec = gensub( /^(..)-...-(..) ...(..).(..)....... ..$/,
"20\\2 | \\1 | \\3 \\4", "g", ts )
sub( /\|/, month, datespec )
sub( /\|/, hour, datespec )
seconds_since_1970 = mktime( datespec )
msec_since_1970 = (seconds_since_1970 * 1000) + milliseconds
return msec_since_1970
}

function numeric_month(str) {
switch (str) {
case "JAN":
ret = "01"; break
case "FEB":
ret = "02"; break
case "MAR":
ret = "03"; break
case "APR":
ret = "04"; break
case "MAY":
ret = "05"; break
case "JUN":
ret = "06"; break
case "JUL":
ret = "07"; break
case "AUG":
ret = "08"; break
case "SEP":
ret = "09"; break
case "OCT":
ret = "10"; break
case "NOV":
ret = "11"; break
case "DEC":
ret = "12"; break
default:
ret = "00"
}
return ret
}

I think it works, but it's not elegant.
Are there easier, maybe faster possibilities?

Michael
Bernd Nawothnig

2007-04-22, 7:56 am

On 2007-04-21, Michael Jaritz wrote:

> event;TimeA ;TimeB
> -----;-----------------------------;----------------------------
> foo ;19-APR-07 12.13.54.916000 AM ;19-APR-07 12.13.55.108000 AM
> bar ;19-APR-07 02.41.10.869000 PM ;19-APR-07 02.41.10.895000 PM
> ^^^ ^^^
> `----- always "000" ------´


> I need the difference between TimeA and TimeB in milliseconds.


> This is my idea:


> BEGIN {

numeric_month["JAN"] = "01"
numeric_month["FEB"] = "02"
numeric_month["MAR"] = "03"
numeric_month["APR"] = "04"
numeric_month["MAY"] = "05"
numeric_month["JUN"] = "06"
numeric_month["JUL"] = "07"
numeric_month["AUG"] = "08"
numeric_month["SEP"] = "09"
numeric_month["OCT"] = "10"
numeric_month["NOV"] = "11"
numeric_month["DEC"] = "12"
> ts1 = "19-APR-07 12.13.54.916000 AM"
> ts2 = "19-APR-07 12.13.55.108000 AM"
> diff = change_timestamp(ts2) - change_timestamp(ts1)
> print ts2
> print ts1
> print "diff: "diff" msec"
> print ""
> ts1 = "31-DEC-06 11.59.59.999000 PM"
> ts2 = "01-JAN-07 12.00.00.001000 AM"
> diff = change_timestamp(ts2) - change_timestamp(ts1)
> print ts2
> print ts1
> print "diff: "diff" msec"
> }


> function change_timestamp(ts) {
> month = gensub( /^...(...).+$/, "\\1", "g", ts )


instead of

> month = numeric_month(month)


use the array:

month = numeric_month[month]

> hour = gensub( /^.+ (..).+$/, "\\1", "g", ts )
> milliseconds = gensub( /^.+(...)... ..$/, "\\1", "g", ts )
> appendix = gensub( /^.+(..)$/, "\\1", "g", ts )
> if( appendix == "PM" )
> hour = gensub( /^0/, "", 1, hour ) + 12
> if( appendix == "AM" )
> hour = gensub( /12/, "00", 1, hour )
> datespec = gensub( /^(..)-...-(..) ...(..).(..)....... ..$/,
> "20\\2 | \\1 | \\3 \\4", "g", ts )
> sub( /\|/, month, datespec )
> sub( /\|/, hour, datespec )
> seconds_since_1970 = mktime( datespec )
> msec_since_1970 = (seconds_since_1970 * 1000) + milliseconds
> return msec_since_1970
> }


> function numeric_month(str) {
> [...]


The function is now no longer needed.

> I think it works, but its not elegant.
> Are there easier, maybe faster possibilities?


The array (hash) should be faster than your function. The default
value for an unknown month is now the empty string instead of "00", but
it seems you don't really need that.



Bernd

--
Those who desire to give up freedom in order to gain security
will not have, nor do they deserve, either one.
[T. Jefferson or B. Franklin, unsure]
Kenny McCormack

2007-04-22, 6:57 pm

In article <656ceab3b40ca67402dee93c57b62807@mj.zielgra.de>,
Michael Jaritz <ewiglich@abwesend.de> wrote:
>Hello,
>this is my input:
>
>event;TimeA ;TimeB
>-----;-----------------------------;----------------------------
>foo ;19-APR-07 12.13.54.916000 AM ;19-APR-07 12.13.55.108000 AM
>bar ;19-APR-07 02.41.10.869000 PM ;19-APR-07 02.41.10.895000 PM
> ^^^ ^^^
> `----- always "000" ------´
>
>I need the difference between TimeA and TimeB in milliseconds.


1) Use gawk
2) Do whatever conversions are necessary to convert your text into the
form required by mktime().
3) Use mktime()
4) Add in the milliseconds, subtract the results.

Vassilis

2007-04-22, 6:57 pm


Michael Jaritz wrote:
> I think it works, but its not elegant.
> Are there easier, maybe faster possibilities?
>
> Michael

I would, following my instinct, use less of gensub and more of substr
(in case you really care about efficiency). Also, I would use the
proposed month/array solution:

BEGIN {
numeric_month["JAN"] = "01"
numeric_month["FEB"] = "02"
numeric_month["MAR"] = "03"
numeric_month["APR"] = "04"
numeric_month["MAY"] = "05"
numeric_month["JUN"] = "06"
numeric_month["JUL"] = "07"
numeric_month["AUG"] = "08"
numeric_month["SEP"] = "09"
numeric_month["OCT"] = "10"
numeric_month["NOV"] = "11"
numeric_month["DEC"] = "12"

ts1 = "19-APR-07 12.13.54.916000 AM"
ts2 = "19-APR-07 12.13.55.108000 AM"
ds1 = datespec(ts1)
ds2 = datespec(ts2)
print 1000 * (ds2 - ds1)
}

function datespec(date, mon, pm, hour, ms) {
mon = numeric_month[substr(date, 4, 3)]
pm = (substr(date, length(date) - 1) == "PM") ? 12 : 0
hour = substr(date, 11, 2) + pm
ms = "0." substr(date, 20, 6)
date = gensub(/^(..)-(...)-(..) (..)\.(..)\.(..)\.(...)000 (..)$/,
"20\\3 @ \\1 @ \\5 \\6", "", date)
sub(/@/, mon, date)
sub(/@/, hour, date)
return mktime(date) + ms
}

I really liked the sub trick. Don't forget local function arguments.

Another solution -- only you can tell whether it's feasible -- is to
change the source of the input so that it produces mktime() format.

Yet another would be to extract just the seconds and milliseconds and
subtract those directly, since I guess from the input that the time
intervals are short. Of course, you fall back to the main algorithm in
case the times are on different days.
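
A minimal sketch of that shortcut, assuming both timestamps use the format from the original post. The helper name msec_of_day is invented for illustration, and the same-day check simply compares the first nine characters (the date part). No mktime() is involved, so any awk should do.

```sh
diff_ms=$(awk '
# Milliseconds since midnight for "DD-MON-YY HH.MM.SS.uuuuuu AM|PM".
# msec_of_day() is a hypothetical helper name.
function msec_of_day(ts,    a, hour) {
    split(ts, a, "[ .-]")
    hour = a[4] % 12                  # 12 AM -> 0
    if (a[8] == "PM") hour += 12      # 12 PM -> 12, 02 PM -> 14
    return ((hour * 60 + a[5]) * 60 + a[6]) * 1000 + int(a[7] / 1000)
}
BEGIN {
    t1 = "19-APR-07 02.41.10.869000 PM"
    t2 = "19-APR-07 02.41.10.895000 PM"
    # The shortcut is only valid when both stamps carry the same date.
    if (substr(t1, 1, 9) == substr(t2, 1, 9))
        print msec_of_day(t2) - msec_of_day(t1)
    else
        print "different days: fall back to mktime()"
}')
echo "$diff_ms"
```

For the "bar" line of the sample input this should give 26 msec.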

Vassilis

Thomas Weidenfeller

2007-04-23, 3:56 am

Michael Jaritz wrote:
> BEGIN {


monthnum["JAN"] = 1
monthnum["FEB"] = 2
monthnum["MAR"] = 3
monthnum["APR"] = 4
monthnum["MAY"] = 5
monthnum["JUN"] = 6
monthnum["JUL"] = 7
monthnum["AUG"] = 8
monthnum["SEP"] = 9
monthnum["OCT"] = 10
monthnum["NOV"] = 11
monthnum["DEC"] = 12


> ts1 = "19-APR-07 12.13.54.916000 AM"
> ts2 = "19-APR-07 12.13.55.108000 AM"
> diff = change_timestamp(ts2) - change_timestamp(ts1)
> print ts2
> print ts1
> print "diff: "diff" msec"
> print ""
> ts1 = "31-DEC-06 11.59.59.999000 PM"
> ts2 = "01-JAN-07 12.00.00.001000 AM"
> diff = change_timestamp(ts2) - change_timestamp(ts1)
> print ts2
> print ts1
> print "diff: "diff" msec"
> }
>
> function change_timestamp(ts) {


split(ts, a, "[ . -]")
if(a[8] == "PM") {
a[4] += 12
} else if(a[4] == 12) {
a[4] = 0
}
return mktime(a[3] + 2000 " " monthnum[a[2]] " " a[1] " " a[4]
" " a[5] " " a[6]) * 1000 + a[7] / 1000


> }



/Thomas
Ed Morton

2007-04-23, 7:57 am

Michael Jaritz wrote:
> Hello,
> this is my input:
>
> event;TimeA ;TimeB
> -----;-----------------------------;----------------------------
> foo ;19-APR-07 12.13.54.916000 AM ;19-APR-07 12.13.55.108000 AM
> bar ;19-APR-07 02.41.10.869000 PM ;19-APR-07 02.41.10.895000 PM
> ^^^ ^^^
> `----- always "000" ------´
>
> I need the difference between TimeA and TimeB in milliseconds.
>
> This is my idea:
>
> BEGIN {
> ts1 = "19-APR-07 12.13.54.916000 AM"
> ts2 = "19-APR-07 12.13.55.108000 AM"
> diff = change_timestamp(ts2) - change_timestamp(ts1)
> print ts2
> print ts1
> print "diff: "diff" msec"
> print ""
> ts1 = "31-DEC-06 11.59.59.999000 PM"
> ts2 = "01-JAN-07 12.00.00.001000 AM"
> diff = change_timestamp(ts2) - change_timestamp(ts1)
> print ts2
> print ts1
> print "diff: "diff" msec"
> }
>
> function change_timestamp(ts) {
> month = gensub( /^...(...).+$/, "\\1", "g", ts )
> month = numeric_month(month)
> hour = gensub( /^.+ (..).+$/, "\\1", "g", ts )
> milliseconds = gensub( /^.+(...)... ..$/, "\\1", "g", ts )
> appendix = gensub( /^.+(..)$/, "\\1", "g", ts )
> if( appendix == "PM" )
> hour = gensub( /^0/, "", 1, hour ) + 12
> if( appendix == "AM" )
> hour = gensub( /12/, "00", 1, hour )
> datespec = gensub( /^(..)-...-(..) ...(..).(..)....... ..$/,
> "20\\2 | \\1 | \\3 \\4", "g", ts )
> sub( /\|/, month, datespec )
> sub( /\|/, hour, datespec )
> seconds_since_1970 = mktime( datespec )
> msec_since_1970 = (seconds_since_1970 * 1000) + milliseconds
> return msec_since_1970
> }
>
> function numeric_month(str) {
> switch (str) {
> case "JAN":
> ret = "01"; break
> case "FEB":
> ret = "02"; break
> case "MAR":
> ret = "03"; break
> case "APR":
> ret = "04"; break
> case "MAY":
> ret = "05"; break
> case "JUN":
> ret = "06"; break
> case "JUL":
> ret = "07"; break
> case "AUG":
> ret = "08"; break
> case "SEP":
> ret = "09"; break
> case "OCT":
> ret = "10"; break
> case "NOV":
> ret = "11"; break
> case "DEC":
> ret = "12"; break
> default:
> ret = "00"
> }
> return ret
> }
>
> I think it works, but its not elegant.
> Are there easier, maybe faster possibilities?
>
> Michael


This will print the number of seconds between 2 date/time values given
in some non-standard format:

function cvttime(t, a) {
split(t,a,"[/:]")
match("JanFebMarAprMayJunJulAugSepOctNovDec",a[2])
a[2] = sprintf("%02d",(RSTART+2)/3)
return( mktime(a[3]" "a[2]" "a[1]" "a[4]" "a[5]" "a[6]) )
}
BEGIN{
t1="01/Dec/2005:00:04:42"
t2="01/Dec/2005:17:14:12"
print cvttime(t2) - cvttime(t1)
}

Ed.
Kenny McCormack

2007-04-23, 7:57 am

In article <kfednQZd_Mv7PLHbnZ2dnUVZ_uXinZ2d@comcast.com>,
Ed Morton <morton@lsupcaemnt.com> wrote:
....
>This will print the number of seconds between 2 date/time values given
>in some non-standard format:
>
>function cvttime(t, a) {
> split(t,a,"[/:]")
> match("JanFebMarAprMayJunJulAugSepOctNovDec",a[2])
> a[2] = sprintf("%02d",(RSTART+2)/3)
> return( mktime(a[3]" "a[2]" "a[1]" "a[4]" "a[5]" "a[6]) )
>}
>BEGIN{
>t1="01/Dec/2005:00:04:42"
>t2="01/Dec/2005:17:14:12"
>print cvttime(t2) - cvttime(t1)
>}
>
> Ed.


Exactly! As I mentioned in an earlier post, using mktime() is clearly
the way to go. There's a reason the function was added to the language,
and I think the reason was specifically so that it would no longer be
necessary to write pages and pages of hack code (as seen in other posts
in this thread) to, yet again, re-invent an old wheel.

P.S. Again, as I mentioned in my earlier post, adding the milliseconds
functionality to the above is trivial and is left as an exercise...

Thomas Weidenfeller

2007-04-23, 9:56 pm

Kenny McCormack wrote:
> Exactly! As I mentioned in an earlier post, using mktime() is clearly
> the way to go.


Since you now repeated the statement, a small hint. You missed the
following line in the original code:

> seconds_since_1970 = mktime( datespec )


Most of the OP's code was a hack to assemble datespec.

/Thomas

Kenny McCormack

2007-04-23, 9:56 pm

In article <f0idho$q4f$1@news.al.sw.ericsson.se>,
Thomas Weidenfeller <nobody@ericsson.invalid> wrote:
>Kenny McCormack wrote:
>
>Since you now repeated the statement, a small hint. You missed the
>following line in the original code:
>
>
>Most of the OP's code was a hack to assemble datespec.


I see. Very interesting. Yes, you are absolutely right; when I saw all
that hack code, I assumed they were solving the problem directly (*), not
doing the sensible thing of letting mktime() do the heavy lifting. I
further assumed (incorrectly, given that they were also using the
GAWK-specific gensub() function) that the usual Usenet "Don't use good
tools - gotta stay 'standard', blah, blah, blah" was in play.

(*) Something that happens all too often in these groups.

Anyway, as Ed has shown, you don't need pages and pages of code to do
it. Hence my assumption that the pages and pages of code was an attempt
to solve it directly.

Michael Jaritz

2007-04-23, 9:56 pm

Thomas Weidenfeller wrote:
>Michael Jaritz wrote:
> monthnum["JAN"] = 1
> monthnum["FEB"] = 2
> monthnum["MAR"] = 3
> monthnum["APR"] = 4
> monthnum["MAY"] = 5
> monthnum["JUN"] = 6
> monthnum["JUL"] = 7
> monthnum["AUG"] = 8
> monthnum["SEP"] = 9
> monthnum["OCT"] = 10
> monthnum["NOV"] = 11
> monthnum["DEC"] = 12
> split(ts, a, "[ . -]")
> if(a[8] == "PM") {
> a[4] += 12
> } else if(a[4] == 12) {
> a[4] = 0
> }
> return mktime(a[3] + 2000 " " monthnum[a[2]] " " a[1] " " a[4]
>" " a[5] " " a[6]) * 1000 + a[7] / 1000

Thanks for this short and fast solution. At first I wondered about the
character class as field separator, but I read the manual again and the
penny has dropped.

Michael
Michael Jaritz

2007-04-23, 9:56 pm

Vassilis wrote:

>I would, following my instict, use less of gensub and more of substr
>(in case you really care about efficiency). Also, I would use the
>proposed month/array solution:
>
>BEGIN {
> numeric_month["JAN"] = "01"
> numeric_month["FEB"] = "02"
> numeric_month["MAR"] = "03"
> numeric_month["APR"] = "04"
> numeric_month["MAY"] = "05"
> numeric_month["JUN"] = "06"
> numeric_month["JUL"] = "07"
> numeric_month["AUG"] = "08"
> numeric_month["SEP"] = "09"
> numeric_month["OCT"] = "10"
> numeric_month["NOV"] = "11"
> numeric_month["DEC"] = "12"
>
> ts1 = "19-APR-07 12.13.54.916000 AM"
> ts2 = "19-APR-07 12.13.55.108000 AM"
> ds1 = datespec(ts1)
> ds2 = datespec(ts2)
> print 1000 * (ds2 - ds1)
>}
>
>function datespec(date, mon, pm, hour, ms) {

                         ^^^^^^^^^^^^^^^^^
> mon = numeric_month[substr(date, 4, 3)]
> pm = (substr(date, length(date) - 1) == "PM") ? 12 : 0
> hour = substr(date, 11, 2) + pm
> ms = "0." substr(date, 20, 6)
> date = gensub(/^(..)-(...)-(..) (..)\.(..)\.(..)\.(...)000 (..)$/,
>"20\\3 @ \\1 @ \\5 \\6", "", date)
> sub(/@/, mon, date)
> sub(/@/, hour, date)
> return mktime(date) + ms
>}
>
>I really liked the sub trick. Don't forget local function arguments.


Local function arguments = the marked vars above?
Are they important for functionality, or are they only there so that you
and I don't lose track of things?

>Another solution -- you only can attest if it's feasible -- is to
>change the source of input to produce mktime() format.


I have no bearing on the inputformat.

>Yet another, would be to find seconds and milliseconds and perform the
>desired subtraction, since I guess from the input that the time
>runlength is small. Of course, you fall back to the main algorithm in
>case the times are in different days.


This is a good idea. I will think about it, thanks.

Michael
Vassilis

2007-04-23, 9:56 pm


Michael Jaritz wrote:
> Vassilis wrote:
>
> ^^^^^^^^^^^^^^^^^^
>
> local function arguments = the marked vars above?
> Is it important to functionality? Or is it for me & you to don't loose
> the overview?


Yup, those. They are important, though in a small script it would
probably make no difference.
In awk every variable is global unless it appears in the function's
parameter list, so declaring extra parameters is the way to get local
variables. It is good practice, since you avoid accidentally sharing a
name between a global and a local variable, which is a rather
disturbing condition.
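
A small sketch of the difference, with made-up function names (sum_leaky and sum_local are not from the thread). The first function uses i without declaring it, so it overwrites the caller's i. The second lists i as an extra parameter, which keeps it local.

```sh
out=$(awk '
# i is NOT in the parameter list here, so the loop uses the global i.
function sum_leaky(n,    total) {
    for (i = 1; i <= n; i++) total += i
    return total
}
# The extra parameter i makes the loop counter local to the function.
function sum_local(n,    total, i) {
    for (i = 1; i <= n; i++) total += i
    return total
}
BEGIN {
    i = 100; s = sum_leaky(3); printf "%d %d\n", s, i   # global i is clobbered
    i = 100; s = sum_local(3); printf "%d %d\n", s, i   # global i is untouched
}')
echo "$out"
```

The first line of output should read "6 4" (the caller's i was changed by the loop), the second "6 100". The extra spaces before the local names in the parameter list are only a convention to separate real arguments from locals.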

>
>
> I have no bearing on the inputformat.
>
>
> This is a good idea. I will think about it, thanks.
>
> Michael


A character class (bracket expression) is a regexp notation that matches
a single character found (or, in the case of [^...], not found) in the
given set: [a-zA-Z] matches any single letter, upper or lower case. In
Thomas' script, [ . -] matches a space, a dot or a minus. Do read the
manual or google for more info.
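
To see what such a separator does to one of the timestamps, here is a short sketch. It uses "[ .-]", which describes the same set of characters as Thomas' "[ . -]", and joins the resulting pieces with "|" purely for display.

```sh
out=$(awk 'BEGIN {
    # "[ .-]" is a bracket expression: any ONE of space, dot or hyphen.
    # The hyphen comes last so that it is taken literally, not as a range.
    n = split("19-APR-07 12.13.54.916000 AM", a, "[ .-]")
    for (i = 1; i <= n; i++)
        printf "%s%s", a[i], (i < n ? "|" : "\n")
}')
echo "$out"
```

This should print 19|APR|07|12|13|54|916000|AM, i.e. eight fields, which is why Thomas' code finds the hour in a[4], the microseconds in a[7] and AM/PM in a[8].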

Having seen a few solutions/suggestions, if I were you, I'd follow Ed
Morton's script, with minor adjustments.
Ed's (almost?) always right ;-)

Vassilis
