Code Comments
Programming Forum and web-based access to our favorite programming groups.

Is there an easy / elegant way to read from a file, then process the stream with a set of filters and pipe back to the same file? For example, let's say I have a file t containing

B
C
D

and I want to add an A at the top to get

A
B
C
D

I'd like to do:

( echo "A" ; cat t ) > t

but that spews "cat: t: input file is output file".

This is a trivial toy example but I find myself running into this sort of situation every so often. I can always work around it by using a temp file, but I'm just wondering if I'm missing an obvious metaphor...

--
Rahul

Rahul wrote:
> I'd like to do:
> ( echo "A" ; cat t ) > t
>
> but that spews "cat: t: input file is output file"
>
> This is a trivial toy example but I find myself running into this sort
> of situation every so often. I can always work-around by using a temp
> file but I'm just wondering if I'm missing an obvious metaphor...

There is a recent post where a similar question was asked (subject: "shell script replacing original file"). See the discussion there. In short, from what I understand, it's generally unsafe since programs in a pipeline are executed in parallel, and you don't have any guarantee that the file will be completely read before the redirection starts writing to it (especially if it's a large file), and thus what you want can't be done (without using a temporary file), but read that thread for the details.

--
All the commands are tested with bash and GNU tools, so they may use nonstandard features. I try to mention when something is nonstandard (if I'm aware of that), but I may miss something. Corrections are welcome.
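
To make the data loss concrete, here is a minimal sketch of the simplest case: a single command (not a pipeline) that reads from and redirects to the same file. It uses tr instead of cat so that no "input file is output file" check gets in the way. The scratch directory and the variable names dir and size are illustrative assumptions, and mktemp -d is widely available but not POSIX.

```shell
# Work in a scratch directory so no real file is touched.
dir=$(mktemp -d)
printf 'B\nC\nD\n' > "$dir/t"

# The shell performs both redirections before tr is started.
# "> t" opens the file for writing and truncates it, so tr finds
# nothing left to read and writes nothing.
tr 'a-z' 'A-Z' < "$dir/t" > "$dir/t"

# Count the bytes that survived.
size=$(wc -c < "$dir/t" | tr -d ' ')
echo "bytes left in t: $size"
```

In this simple case the file should end up empty, because the truncation happens before the command reads anything. With a pipeline the timing is less predictable, which is the concern raised above.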

On Tue, 01 Apr 2008 21:40:20 +0200, pk wrote:
> Rahul wrote:
>
> There is a recent post where a similar question was asked (subject:
> "shell script replacing original file"). See the discussion there. In
> short, from what I understand, it's generally unsafe since programs in a
> pipeline are executed in parallel, and you don't have any guarantee that
> the file will be completely read before the redirection starts writing
> to it (especially if it's a large file), and thus what you want can't be
> done (without using a temporary file), but read that thread for the
> details.

It can be done; the usual reason why it isn't done is that if the
operation is interrupted you are not in a good position to recover.

exec < t # give a new name (file descriptor 0 in this case) to t
rm t # remove the old name
{ echo A ; cat ; } > t # generate the new content.

Rahul <nospam@nospam.invalid> wrote:
> Is there an easy / elegant way to read from a file, then process the
> stream with a set of filters and pipe back to the same file? For
> example:

That's what sponge is for. It's part of moreutils <http://kitenet.net/~joey/code/moreutils.html>

Florian
--
<http://www.florian-diesch.de/>
** Hi! I'm a signature virus! Copy me into your signature, please! **
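
For the toy example from the original question, a sponge version might look like the sketch below. It assumes sponge from moreutils may not be installed, so it checks first; the scratch directory and the variable name sponge_result are illustrative assumptions.

```shell
# Work in a scratch directory so no real file is touched.
dir=$(mktemp -d)
printf 'B\nC\nD\n' > "$dir/t"

if command -v sponge >/dev/null 2>&1; then
    # sponge soaks up all of its input before opening the output file,
    # so cat has finished reading t by the time t is rewritten.
    { echo A; cat "$dir/t"; } | sponge "$dir/t"
    sponge_result=$(cat "$dir/t")
else
    # Assumption: sponge is an optional extra on most systems.
    echo "sponge not found; install moreutils to try this" >&2
    sponge_result=skipped
fi
```

Note that the shell redirection "> t" does not appear anywhere: sponge is given the file name as an argument and opens it itself, only after its input has ended.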

Icarus Sparry wrote:
> On Tue, 01 Apr 2008 21:40:20 +0200, pk wrote:
>
> It can be done, the usual reason why it isn't done is that if the
> operation is interrupted you are not in a good position to recover.
It's also bad if you attempt that while the file system is full.
>
> exec < t # give a new name (file descriptor 0 in this case) to t
> rm t # remove the old name
> { echo A ; cat ; } > t # generate the new content.
Janis

Icarus Sparry wrote:
> It can be done, the usual reason why it isn't done is that if the
> operation is interrupted you are not in a good position to recover.
>
> exec < t # give a new name (file descriptor 0 in this case) to t
Try this on the command line.
> rm t # remove the old name
> { echo A ; cat ; } > t # generate the new content.
Ah sure. But then it's not a single command or pipeline. And it seems to me
that this is just another (more disguised) way to use a temporary file,
isn't it? If the "exec < t" is executed in the same command group or
pipeline as the other commands, the file is deleted and the script hangs.

pk wrote:
> Icarus Sparry wrote:
>
> Try this on the command line.

Ok, I think now I understand that better. For this problem, it seems it's enough to use a descriptor other than 0:

exec 5< t

This works (with cat <&5) because the OS creates a different (new) file on disk for t, while cat still reads from the old file (which has not been deleted since it's still open), correct? If that is correct, then the shell is working with two files, just like when a temporary file is used.

As for the single pipeline problem, is it however correct that the preliminary descriptor must be opened using a stand-alone operation and the command must not be part of a pipeline (or compound command) ending with "> t", otherwise the shell will wipe the file first (or at some unpredictable time)?

Thanks for any answer.
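
Put together, the descriptor 5 variant described above might look like the following sketch. The scratch directory and the variable name result are illustrative assumptions; the three central commands follow the ones quoted in this thread, with descriptor 5 in place of 0.

```shell
# Work in a scratch directory so no real file is touched.
dir=$(mktemp -d)
printf 'B\nC\nD\n' > "$dir/t"

exec 5< "$dir/t"                  # keep the old contents reachable through descriptor 5
rm "$dir/t"                       # drop the name; the data stays while descriptor 5 is open
{ echo A; cat <&5; } > "$dir/t"   # "> t" now creates a brand-new file under the old name
exec 5<&-                         # close the descriptor, releasing the old data

result=$(cat "$dir/t")
printf '%s\n' "$result"
```

The descriptor is opened and the name removed as separate commands before the group that ends in "> t". If they were placed inside that group, the redirection would be applied to the still-existing file first. Between the rm and the end of the group the old contents exist only through the open descriptor, so an interruption or a full filesystem at that point loses them.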

pk wrote:
> As for the single pipeline problem, is it however correct that the
> preliminary descriptor must be opened

, and the original file removed,

> using a stand-alone operation

...etc.

On 3 Apr., 14:48, pk <p...@pk.invalid> wrote:
> pk wrote:
>
> Ok, I think now I understand that better. For this problem, it seems it's
> enough to use a descriptor other than 0:
>
> exec 5< t
>
> This works (with cat <&5) because the OS creates a different (new) file on
> disk for t, while cat still reads from the old file (which has not been
> deleted since it's still open), correct? If that is correct, then the shell
> is working with two files, just like when a temporary file is used.
>
> As for the single pipeline problem, is it however correct that the
> preliminary descriptor must be opened using a stand-alone operation and the
> command must not be part of a pipeline (or compound command) ending with
> "> t", otherwise the shell will wipe the file first (or at some unpredictable
> time)?
>
> Thanks for any answer.

I want to bring, again, to attention that this is an extremely unsafe construct - e.g., you will lose your file completely if the filesystem is full - and it should therefore not be used or advertised. Without need one would buy risks that can be avoided in the first place by using safe shell constructs. Use explicit temporaries and remove the original file only if the modified one could be created completely. Is that really so unappealing to do, and instead to favour a theoretically interesting but inherently unsafe and obfuscating construct?

Janis
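
The explicit-temporary approach recommended here could be sketched as follows. The function name prepend_line, the scratch directory, and the variable names are hypothetical, chosen only for illustration; mktemp is widely available but not POSIX.

```shell
# prepend_line LINE FILE  (hypothetical helper, not a standard command)
# Writes LINE followed by the contents of FILE to a temporary file, and
# replaces FILE only if that temporary file was written completely.
prepend_line() {
    line=$1
    file=$2
    # Create the temporary file next to the target so that mv is a
    # rename within one filesystem rather than a copy.
    tmp=$(mktemp "${file}.XXXXXX") || return 1
    if { printf '%s\n' "$line" && cat "$file"; } > "$tmp"; then
        mv "$tmp" "$file"
    else
        # Something failed (for example a full filesystem):
        # leave the original untouched and clean up.
        rm -f "$tmp"
        return 1
    fi
}

# Usage on the toy example, in a scratch directory.
dir=$(mktemp -d)
printf 'B\nC\nD\n' > "$dir/t"
prepend_line A "$dir/t"
result=$(cat "$dir/t")
printf '%s\n' "$result"
```

The original file is never opened for writing, so a failure while generating the new content leaves it intact. One side effect to be aware of: the replaced file takes on the ownership and permissions of the temporary file rather than those of the original.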