Questions on SEEK
| Robert Heller 2006-09-01, 7:02 pm |
| At Fri, 01 Sep 2006 17:12:17 GMT "Tom Conner" <tconner@olopha.net> wrote:
>
> Two questions regarding SEEK.
>
> 1. Why does seek crash if the file size is smaller than the number of bytes
> being seeked? For example:
>
> seek ./testFile -200 end
>
> crashes if the file is less than 200 bytes in size. Expected behavior would
> be to just return whatever bytes are in the file.
I'm presuming seek eventually calls fseek(3), which in turn calls
lseek(2):
From man 2 lseek:
ERRORS
       EBADF  fildes is not an open file descriptor.
       ESPIPE fildes is associated with a pipe, socket, or FIFO.
       EINVAL whence is not one of SEEK_SET, SEEK_CUR, SEEK_END, or the
              resulting file offset would be negative.
note------------------------------------------^
       EOVERFLOW
              The resulting file offset cannot be represented in an off_t.
% exec touch /tmp/empty.file
% set fp [open /tmp/empty.file r]
file3
% seek $fp -200 end
error during seek on "file3": invalid argument
% seek $fp 0 end
% seek $fp -1 end
error during seek on "file3": invalid argument
"invalid argument" == EINVAL
I would expect that 'catch' could be your friend:
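proc safeseek {fp offset {origin start}} {
    global errorInfo errorCode
    # Try the requested seek; catch traps the error instead of aborting.
    if {[catch {seek $fp $offset $origin} message]} {
        if {[regexp {: invalid argument$} $message]} {
            # EINVAL (e.g. negative resulting offset): fall back to the start.
            seek $fp 0 start
        } else {
            # Any other failure is re-raised with its original context.
            error $message $errorInfo $errorCode
        }
    }
}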
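A variant of the same idea, as a minimal sketch: rather than pattern-matching the message text, it inspects errorCode, which Tcl normally sets to a list starting with POSIX and the errno name when a channel operation fails. The name safeseek2 is hypothetical, and the assumption that a failed seek reports POSIX EINVAL is worth checking on your own Tcl version.

```tcl
# Hypothetical helper: seek, but fall back to the start of the file when
# the requested position is rejected with EINVAL.
proc safeseek2 {fp offset {origin start}} {
    if {[catch {seek $fp $offset $origin} message]} {
        # Save the error state before running any further commands.
        set info $::errorInfo
        set code $::errorCode
        if {[lindex $code 0] eq "POSIX" && [lindex $code 1] eq "EINVAL"} {
            seek $fp 0 start
        } else {
            # Not the case we handle: re-raise with the original context.
            error $message $info $code
        }
    }
}
```

Checking the errno name avoids depending on the exact wording of the message, which could differ between platforms.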
>
> 2. Is there a performance hit when seeking large files? In other words,
> does the OS know where the end of the file is, or does it need to go through
> each line until it finds the end.
>
>
>
--
Robert Heller -- 978-544-6933
Deepwoods Software -- Linux Installation and Administration
http://www.deepsoft.com/ -- Web Hosting, with CGI and Database
heller@deepsoft.com -- Contract Programming: C/C++, Tcl/Tk
| |
| Tom Conner 2006-09-01, 7:02 pm |
|
"Robert Heller" <heller@deepsoft.com> wrote in message
news:bd14c$44f877ba$404a99a1$23499@news.news-service.com...
> At Fri, 01 Sep 2006 17:12:17 GMT "Tom Conner" <tconner@olopha.net> wrote:
>
In response to another answer, yes my example is wrong. In the actual code
I am using a channelId.
>
> I'm presuming seek eventually calls fseek(3), which in turn calls
> lseek(2):
>
> From man 2 lseek:
>
> ERRORS
> EBADF fildes is not an open file descriptor.
>
> ESPIPE fildes is associated with a pipe, socket, or FIFO.
>
> EINVAL whence is not one of SEEK_SET, SEEK_CUR, SEEK_END, or the
> resulting file offset would be negative.
> note------------------------------------------^
I wonder why returning an error was chosen instead of just returning a
pointer to the beginning of the file if the reverse offset is larger than
the file size.
>
> EOVERFLOW
> The resulting file offset cannot be represented in an off_t.
>
> % exec touch /tmp/empty.file
> % set fp [open /tmp/empty.file r]
> file3
> % seek $fp -200 end
> error during seek on "file3": invalid argument
> % seek $fp 0 end
> % seek $fp -1 end
> error during seek on "file3": invalid argument
>
> "invalid argument" == EINVAL
>
> I would expect that 'catch' could be your friend:
>
I used [file size] instead to determine whether to read from the beginning,
or the end, of the file.
Thanks for the answers.
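The [file size] approach might look something like the following sketch. The proc name seekFromEnd and its arguments are hypothetical; the actual code was not posted.

```tcl
# Hypothetical helper: position $chan at most $nbytes before the end of
# the file at $path, or at the start if the file is shorter than that.
proc seekFromEnd {chan path nbytes} {
    if {[file size $path] > $nbytes} {
        # Enough data: a negative offset from the end is valid.
        seek $chan -$nbytes end
    } else {
        # File is too small: read everything from the beginning.
        seek $chan 0 start
    }
}
```

With a channel opened on ./testFile, calling `seekFromEnd $chan ./testFile 200` and then `read $chan` would return the last 200 bytes, or the whole file if it is smaller.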
| |
| Donal K. Fellows 2006-09-01, 7:02 pm |
| Tom Conner wrote:
> 1. Why does seek crash if the file size is smaller than the number of bytes
> being seeked?
When you say "crash", do you mean throw an error? The [seek] command is
supposed to throw an error when the underlying syscall returns some error
condition (or if the arguments are malformed, but that doesn't apply here).
If that's what's happening, that's just how things are; write your code to
cope. :-)
If the whole tclsh executable is crashing, that's bad.
> 2. Is there a performance hit when seeking large files? In other words,
> does the OS know where the end of the file is, or does it need to go through
> each line until it finds the end.
I've never heard of an OS that didn't implement the lseek() syscall
efficiently (on non-serial media; tapes are a whole 'nother thing).
Internally, disks are a random access[*] collection of sectors, each
notionally holding 512 bytes, and seeking is easy and simple. Indeed,
it can be much quicker than spooling through, since if you've got a few
tens of gigs of data, going straight to the bit you really want is a
great short cut.
The only disadvantage of [seek] is that it works entirely at the byte
level, and so is far more suited to record-oriented data than to text.
Combining seeking with line-oriented input is not simple, especially if
you allow unbounded line lengths...
Donal.
[* Actually things are much more complex than this. But it's a good
model. ]
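To illustrate why mixing byte-level seeking with line-oriented input takes some care, here is a rough sketch of reading the trailing lines of a text file. The proc name tailLines is hypothetical, and it assumes a single-byte encoding and lines shorter than the window; with longer lines the result may simply be empty.

```tcl
# Hypothetical helper: return the complete lines found in roughly the
# last $maxBytes bytes of the file at $path.
proc tailLines {path maxBytes} {
    set chan [open $path r]
    if {[file size $path] > $maxBytes} {
        seek $chan -$maxBytes end
        # The seek most likely landed mid-line, so discard everything up
        # to the next newline. (If it happened to land exactly on a line
        # start, one whole line is discarded instead.)
        gets $chan
    }
    # Read the remainder and split it into lines.
    set lines [split [read -nonewline $chan] \n]
    close $chan
    return $lines
}
```

The seek itself knows nothing about line boundaries, which is why the first partial line has to be thrown away by hand.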
| |
| Darren New 2006-09-01, 7:02 pm |
| Donal K. Fellows wrote:
> I've never heard of an OS that didn't implement the lseek() syscall
> efficiently (on non-serial media; tapes are a whole 'nother thing).
The Atari 800 OS springs to mind. Each sector was chained to the next in
that sector's data, like a singly linked list, so seeking essentially
required reading thru the file. Appending worked by writing a new chain
of sectors, and then when you closed the file it read thru the first
file to the end and then adjusted the pointer. And since each read was
accompanied by a "beep" to let you know, it could get pretty annoying.
--
Darren New / San Diego, CA, USA (PST)
This octopus isn't tasty. Too many
tentacles, not enough chops.
| |
| Tom Conner 2006-09-01, 7:02 pm |
|
"Donal K. Fellows" <donal.k.fellows@man.ac.uk> wrote in message
news:1157143609.444436.162680@b28g2000cwb.googlegroups.com...
> Tom Conner wrote:
>
> When you say "crash" do you mean throw an error?
Yes. Poor semantics on my part. Sorry.
>
> I've never heard of an OS that didn't implement the lseek() syscall
> efficiently (on non-serial media; tapes are a whole 'nother thing).
> Internally, disks are a random access[*] collection of sectors, each
> notionally holding 512 bytes, and seeking is easy and simple. Indeed,
> it can be much quicker than spooling through, since if you've got a
> few tens of gigs of data, going straight to the bit you really want
> is a great short cut.
>
I wasn't sure how an OS (Linux in this case) dealt with files, but, thanks
to everyone, I now have a good understanding. Testing seek on a "large"
(500K) file didn't register any CPU increase (as seen by top), or time
increase as compared to a "normal" (in this environment) size file of around
100 bytes.
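A test along these lines can be sketched with Tcl's built-in [time] command. The scratch path and the 500000-byte size are illustrative assumptions, not the values from the actual test.

```tcl
# Build a scratch file of about 500K (hypothetical path).
set path /tmp/seek-timing.dat
set out [open $path w]
puts -nonewline $out [string repeat x 500000]
close $out

# Time 1000 seeks to the end and 1000 seeks back to the start.
set chan [open $path r]
set toEnd [time {seek $chan 0 end} 1000]
set toStart [time {seek $chan 0 start} 1000]
close $chan
file delete $path

puts "seek to end:   $toEnd"
puts "seek to start: $toStart"
```

If seeking had to walk through the file, the first figure would be expected to grow with the file size; repeating the run with a much smaller file gives a point of comparison.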