











 

Can I Force Perl to Bypass File Write Buffers?

Hal Vaughan

2005-08-30, 3:56 am

I'm using Perl 5.6.1 (and in some cases 5.8) on Linux. I've noticed that
when I'm processing files, Perl writes in blocks: it processes a number of
items, and instead of one line at a time being written to the file, a whole
block is suddenly written to the disk at once.

Is there any way to avoid this and force Perl to write each line as I use a
"print" statement to output the line? I log (in MySQL) each item as I
finish it, so if power fails or the program is aborted, the system can pick
up right where it left off. Because of the buffers, the log is ahead of
what is written to the file, which would mean I'd lose the data between
what's written and what's logged.

Thanks!

Hal

Simon Taylor

2005-08-30, 3:56 am

Hal Vaughan wrote:
> I'm using Perl 5.6.1 (and in some cases 5.8) on Linux. I've noticed that
> when I'm processing files, that Perl writes in blocks, so it'll process a
> number of items, and instead of the file having one line at a time written
> to it, it'll get a whole block at once suddenly written to the disk.
>
> Is there any way to avoid this and force Perl to write each line as I use a
> "print" statement to output the line?


You'll need to disable buffering by setting $| to non-zero.
See $| in

perldoc perlvar

and also check out

perldoc -f select

This sample should do what you want:

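#!/usr/bin/perl
use strict;
use warnings;

# Open the output file for writing (truncates any existing file).
open (OUTPUT, '>', 'sample') or die "Could not create file: $!";

# $| applies to the currently selected handle, so select OUTPUT,
# turn on auto-flush, then restore the previously selected handle.
my $old_fh = select(OUTPUT);
$| = 1;
select($old_fh);

for (0..20) {
    print OUTPUT "some data...\n";   # flushed after each print
    sleep 2;                         # leaves time to watch the file grow
}
close OUTPUT or die "Could not close file: $!";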
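
The select() sequence in the sample can also be written with the autoflush method from the core IO::Handle module, which sets auto-flush on one handle without touching the selected one. This is only a minimal sketch: the helper name open_autoflushed is made up for illustration, and the file name is the same placeholder as in the sample.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Handle;    # provides the autoflush() method on filehandles

# Hypothetical helper: open $path for writing and turn on auto-flush
# for that one handle, leaving the selected handle (STDOUT) untouched.
sub open_autoflushed {
    my ($path) = @_;
    open(my $fh, '>', $path) or die "Could not create $path: $!";
    $fh->autoflush(1);    # per-handle equivalent of setting $| = 1
    return $fh;
}

my $out = open_autoflushed('sample');
print {$out} "some data...\n" for 0 .. 2;
close $out or die "Could not close sample: $!";
```

A lexical filehandle is used here so the handle can be returned from the helper and passed around like any other value.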


Regards,

Simon Taylor


--
www.perlmeme.org

Anno Siegel

2005-08-30, 7:56 am

Simon Taylor <simon@unisolve.com.au> wrote in comp.lang.perl.misc:
> Hal Vaughan wrote:
>
> You'll need to disable buffering by setting $| to non-zero.


[good advice snipped]

Just one note: "$| = 1" doesn't disable buffering, it enables auto-flushing.
The buffer(s) remain in place and active, but after each print-statement
the buffer is automatically emptied (presumably into the next buffer down
the line). You still have character buffering (and you want it).
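
The chain of buffers described here can be made explicit. In the sketch below, flush() empties Perl's own buffer into the kernel, and sync() (a wrapper around fsync(2) where the platform provides it) asks the kernel to write its cache out to the storage device. Both are methods of the core IO::Handle module. The helper name write_durably and the file name capture.dat are hypothetical, and syncing after every record is assumed to be acceptable for throughput, which it may not be.

```perl
use strict;
use warnings;
use IO::Handle;    # provides flush() and sync()

# Hypothetical helper: write one record and push it through both layers.
sub write_durably {
    my ($fh, $record) = @_;
    print {$fh} $record or die "print failed: $!";
    $fh->flush or die "flush failed: $!";    # Perl's buffer -> kernel
    $fh->sync  or die "sync failed: $!";     # kernel cache -> device
    return 1;
}

open(my $fh, '>>', 'capture.dat') or die "Could not open capture.dat: $!";
write_durably($fh, "item $_\n") for 1 .. 3;
close $fh or die "close failed: $!";
```

Only after write_durably returns would it make sense to record the item as finished in a separate log.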

Anno
--
If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers.

xhoster@gmail.com

2005-08-30, 6:58 pm

Hal Vaughan <hal@thresholddigital.com> wrote:
> I'm using Perl 5.6.1 (and in some cases 5.8) on Linux. I've noticed that
> when I'm processing files, that Perl writes in blocks, so it'll process a
> number of items, and instead of the file having one line at a time
> written to it, it'll get a whole block at once suddenly written to the
> disk.


To answer the question you asked, check out the variable $|.

To answer the question you didn't ask, your method isn't very good. If you
are truly concerned about data integrity, use a transactional database for
both the data and the log, and make sure both data write and log write are
in the same transaction. Or make your program, upon restarting, tail the
existing data file and figure out where to pick up based solely on the data
file, and dispense with the logging altogether. Or do both--write the data
into a database, and let the entry in the database be its own log.
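
The second suggestion (work out the restart point from the data file alone) might look roughly like the sketch below. It assumes one newline-terminated record per item, which the thread does not confirm; the helper name recover_position and the file name data.txt are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: count the complete (newline-terminated) records in
# $path and cut off any partial record left behind by a crash.
# Returns the number of complete records; a missing file counts as 0.
sub recover_position {
    my ($path) = @_;
    return 0 unless -e $path;

    open(my $fh, '+<', $path) or die "Could not open $path: $!";
    my ($count, $good_bytes) = (0, 0);
    while (my $line = <$fh>) {
        last unless $line =~ /\n\z/;    # partial trailing record
        $count++;
        $good_bytes += length $line;
    }
    truncate($fh, $good_bytes) or die "Could not truncate $path: $!";
    close $fh or die "Could not close $path: $!";
    return $count;
}

# Simulate a crash that left a half-written third record.
open(my $out, '>', 'data.txt') or die "Could not create data.txt: $!";
print {$out} "item 1\nitem 2\nite";
close $out or die "Could not close data.txt: $!";

my $done = recover_position('data.txt');
print "resume after record $done\n";    # prints "resume after record 2"
```

With this approach the data file is the only source of truth, so there is no separate log that can get ahead of it.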

>
> Is there any way to avoid this and force Perl to write each line as I use
> a "print" statement to output the line? I log (in MySQL) each item as I
> finish it, so if power fails or the program is aborted, the system can
> pick up right where it left off. Because of the buffers, the log is
> ahead of what is written to the file, which would mean I'd lose the data
> between what's written and what's logged.



Xho

Hal Vaughan

2005-08-30, 6:58 pm

Anno Siegel wrote:

> Simon Taylor <simon@unisolve.com.au> wrote in comp.lang.perl.misc:
>
> [good advice snipped]
>
> Just one note: "$| = 1" doesn't disable buffering, it enables
> auto-flushing. The buffer(s) remain in place and active, but after each
> print-statement the buffer is automatically emptied (presumably into the
> next buffer down the line). You still have character buffering (and you
> want it).
>
> Anno


I have $| = 1 set, since I had to redirect output to a file for debugging
and needed the errors to sync with the output, but it doesn't seem to make
a difference in the problem I'm talking about. You seem to be the only
person that has pointed out this doesn't affect the buffers directly.
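
One possible explanation, assuming the "$| = 1" was set once near the top of the script: $| applies only to the currently selected output handle, which is STDOUT unless select() has changed it, so a data file opened elsewhere keeps its block buffering. The sketch below uses a made-up file name and compares the on-disk size before and after auto-flush is enabled on the file handle itself.

```perl
use strict;
use warnings;
use IO::Handle;    # provides the autoflush() method

$| = 1;    # affects the selected handle only, which is STDOUT here

open(my $fh, '>', 'pitfall.txt') or die "Could not create pitfall.txt: $!";

print {$fh} "first line\n";
my $before = (stat 'pitfall.txt')[7];    # 0: still buffered inside Perl

$fh->autoflush(1);                       # flushes now and after each print
print {$fh} "second line\n";
my $after = (stat 'pitfall.txt')[7];     # 23: both lines reached the file

close $fh or die "close failed: $!";
print "before: $before bytes, after: $after bytes\n";
```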

Hal

Hal Vaughan

2005-08-30, 6:58 pm

xhoster@gmail.com wrote:

> Hal Vaughan <hal@thresholddigital.com> wrote:
>
> To answer the question you asked, check out the variable $|.


Thanks. I've used it and it helps with syncing output, so if I redirect
output to a file, the error messages and other output are synced, but it
doesn't seem to help here.

> To answer the question you didn't ask, your method isn't very good. If
> you are truly concerned about data integrity, use a transactional database
> for both the data and the log, and make sure both data write and log write
> are in the same transaction. Or make your program, upon restarting, tail
> the existing data file and figure out where to pick up based solely on the
> data file, and dispense with the logging altogether. Or do both--write the
> data into a database, and let the entry in the database be its own log.


I seriously thought about putting the info into a database, but there were a
number of reasons I didn't. Part of it is that different programs on
different systems can use this, and it works better to share the directory
through NFS; I'd rather share that than the database. I've also
got a stream of data coming in, and it has been working much better to save
it to a capture file. Trying to break it up into chunks so it could be put
into a database as it comes in would be a nightmare.

Thanks!

Hal

Tad McClellan

2005-08-30, 6:58 pm

Hal Vaughan <hal@thresholddigital.com> wrote:

> Perl writes in blocks,


> Is there any way to avoid this and force Perl to write each line as I use a
> "print" statement to output the line?



Your Question is Asked Frequently:

perldoc -q buffer

How do I flush/unbuffer an output filehandle? Why must I do this?


You must have missed it when you checked the Perl FAQ before
posting to the Perl newsgroup.


--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas

Copyright 2008 codecomments.com