unable to calculate large file size
| himagauri@gmail.com 2006-10-18, 7:00 pm |
| Hi
My perl script is a very big script and performs a lot of complex
tasks. One of the tasks within the script is to calculate the size of a
file as follows:
$size = -s $filename;
$size = -1 unless defined($size);
When the file size is larger than 1GB for example 16376862124 bytes or
16 GB, the script returns $size = -1
For a file size less than 1 GB for example 775752944 bytes,
the correct file size is returned.
When I run another sample script performing just the above simple task
of calculating file size, the correct size of 16376862124 bytes is
returned for the larger file after a long time-about 40 minutes. The
problem arises when the above task is within the larger script.
I'm using Perl Version 5.8.3
If I do a perl -V, I do get the following option in the summary:
uselargefiles=define
So where could the problem lie?
Please suggest a solution to my problem.
-Thanks,
Regards,
Gauri
| |
| Big and Blue 2006-10-18, 7:00 pm |
| himagauri@gmail.com wrote:
>
> $size = -s $filename;
> $size = -1 unless defined($size);
>
> When the file size is larger than 1GB for example 16376862124 bytes or
> 16 GB, the script returns $size = -1
So you've only tested (or at least only reported) < 1GB and 16GB. What
about some intermediate values, say 1.5GB, 3GB, 6GB?
> When I run another sample script performing just the above simple task
> of calculating file size, the correct size of 16376862124 bytes is
> returned for the larger file after a long time-about 40 minutes. The
> problem arises when the above task is within the larger script.
So, your simple 2 line script takes 40 minutes to size one file, but you
don't consider *that* to be a problem?
> So where could the problem lie?
> Please suggest a solution to my problem.
As has been noted - please give more detail, particularly what OS this
is on. It reminds me of a problem sizing files on VAX/VMS systems when they
were being accessed as NFS servers. That OS had various types of file,
including variable length record files (IIRC). So in order to find out the
"real" size of the file you actually did have to read each record
individually and add up all of the record lengths. This made "ls -l" from a
Unix system take *ages* for such files. (FWIW: eventually the server code
was changed to allow approximate sizing...)
--
Just because I've written it doesn't mean that
either you or I have to believe it.
| |
| himagauri@gmail.com 2006-10-18, 10:01 pm |
| The OS is Suse Linux and perl version is 5.8.3.
Point is that I would later have to work on files >16 GB. Hence I
haven't tested for intermediate values.
Initially I thought the '-s' operator doesn't work for file sizes >1
GB. Hence I worked out a sample program and noticed that it works.
Then I thought probably I'm running out of memory..but not sure about
it.
The file in question is being transferred from MVS to Linux using FTP.
The file on MVS has a record format of variable block (RECFM=VB).
I need to calculate the file size on Linux to check if the number of
bytes transferred to linux is equal to the file size on MVS.
Does Linux have variable-length files? How do I define them? How
else can I calculate the file size in perl besides the -s operator?
What about File::stat? Is it more efficient than -s?
-Gauri
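For reference, a minimal sketch of the alternatives asked about above. The core module is spelled File::stat (lowercase "s"). Both it and the built-in stat make the same stat call that -s makes, so there is no reason to expect either to be faster than -s. The helper name sizes_for is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::stat ();    # empty import list keeps the built-in stat() intact

# Hypothetical helper: return the size of $path obtained three ways.
# Each value is undef if the underlying stat call fails.
sub sizes_for {
    my ($path) = @_;

    my $by_op = -s $path;                      # file test operator

    my @fields  = stat($path);                 # built-in stat
    my $by_stat = @fields ? $fields[7] : undef;    # element 7 is the size

    my $obj    = File::stat::stat($path);      # object interface
    my $by_obj = $obj ? $obj->size : undef;

    return ($by_op, $by_stat, $by_obj);
}

my $file  = @ARGV ? $ARGV[0] : $0;             # default: this script itself
my @sizes = sizes_for($file);
print join("\t", map { defined $_ ? $_ : 'undef' } @sizes), "\n";
```

All three values should agree for any file the process can stat; if they are all undef, the problem is the stat call itself and not the way the size is read.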
Big and Blue wrote:
> himagauri@gmail.com wrote:
>
> So you've only tested (or at least only reported) < 1GB and 16GB. What
> about some intermediate values, say 1.5GB, 3GB, 6GB?
>
>
> So, your simple 2 line script takes 40 minutes to size one file, but you
> don't consider *that* to be a problem?
>
>
> As has been noted - please give more detail, particularly what OS this
> is on. It reminds me of a problem sizing files on VAX/VMS systems when they
> were being accessed as NFS servers. That OS had various types of file,
> including variable length record files (IIRC). So in order to find out the
> "real" size of the file you actually did have to read each record
> individually and add up all of the record lengths. This made "ls -l" from a
> Unix system take *ages* for such files. (FWIW: eventually the server code
> was changed to allow approximate sizing...)
>
>
> --
> Just because I've written it doesn't mean that
> either you or I have to believe it.
| |
| Michele Dondi 2006-10-19, 7:58 am |
| (post edited and rearranged for clarity)
On 18 Oct 2006 19:48:19 -0700, himagauri@gmail.com wrote:
>Point is that I would later have to work on files >16 GB. Hence I
>haven't tested for intermediate values.
1) *Please* do not top-post (see remark above);
2) Then why don't you do it now? Just to know what's going on. I'm
writing some sample code for you but I can't terminate it right now...
Michele
--
{$_=pack'B8'x25,unpack'A8'x32,$a^=sub{pop^pop}->(map substr
(($a||=join'',map--$|x$_,(unpack'w',unpack'u','G^<R<Y]*YB='
.'KYU;*EVH[.FHF2W+#"\Z*5TI/ER<Z`S(G.DZZ9OX0Z')=~/./g)x2,$_,
256),7,249);s/[^\w,]/ /g;$ \=/^J/?$/:"\r";print,redo}#JAPH,
| |
| xhoster@gmail.com 2006-10-19, 6:58 pm |
| himagauri@gmail.com wrote:
> The OS is Suse Linux and perl version is 5.8.3.
>
> Point is that I would later have to work on files >16 GB. Hence I
> haven't tested for intermediate values.
The point is that to fix a problem it helps to know what the problem is.
Testing for intermediate sizes might help figure that out.
Since you are using linux, you probably have access to strace. strace
a simple perl program that does nothing but run -s on a large file and
see where it is spending its time.
$ strace perl -le 'print -s "foo2"'
Xho
| |
| Big and Blue 2006-10-19, 9:58 pm |
| himagauri@gmail.com wrote:
> Initially I thought the '-s' operator doesn't work for file sizes >1
> GB. Hence I worked out a sample program and noticed that it works.
So the problem is something else which is not related to size...
> Then I thought probably I'm running out of memory..but not sure about
> it.
: time perl -le 'print -s "star_wreck_in_the_pirkinning_subtitled_xvid.avi"'
567558788
0.00user 0.00system 0:00.00elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+445minor)pagefaults 0swaps
So, 567MB sized in 0 time with virtually no reads. Sizing a file doesn't
use memory - you just stat the entry in the directory and read its size. If
that were not the case then "ls -l" would be an I/O nightmare.
> The file in question is being transferred from MVS to Linux using FTP.
It would help if you explained more things up front...
> The file on MVS has a record format of variable block (RECFM=VB).
...Hmmm - just like the VMS file I mentioned yesterday.
> I need to calculate the file size on Linux to check if the number of
> bytes transferred to linux is equal to the file size on MVS.
How do you get the file size of the MVS file into the perl program?
(Since I presume that the check is being done there).
> does Linux OS have variable length files?
No.
Two things:
a) Your original problem of:
$size = -s $filename;
$size = -1 unless defined($size);
would produce -1 if you happened to chdir for some reason out of the
directory containing $filename (unless it is a fully-qualified path).
b)
> ...the correct size of 16376862124 bytes is
> returned for the larger file after a long time-about 40 minutes.
Is this file on a local file system? If it is on a network file system,
what type of system is it on?
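Point (a) can be checked directly: when -s returns undef, $! says why the stat failed, and turning the name into an absolute path before any chdir removes the dependence on the current directory. A minimal sketch; the helper name size_or_reason is made up for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: returns (size, undef) on success, or
# (undef, reason) when the underlying stat call fails.
sub size_or_reason {
    my ($filename) = @_;
    my $size = -s $filename;
    # -s gives an empty but defined value for a zero-length file;
    # adding 0 turns that into a plain 0.
    return ($size + 0, undef) if defined $size;
    return (undef, "$!");       # e.g. "No such file or directory"
}

# Resolve a relative name against the current directory once, up front,
# so a later chdir elsewhere in a long script cannot break the lookup.
my $name = @ARGV ? $ARGV[0] : $0;
my $path = File::Spec->rel2abs($name);

my ($size, $why) = size_or_reason($path);
print defined $size
    ? "$path: $size bytes\n"
    : "$path: cannot stat: $why\n";
```

Reporting the reason instead of collapsing every failure to -1 shows at once whether the big script is failing on a wrong path or on something else.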
| |
| Peter J. Holzer 2006-10-21, 6:59 pm |
| On 2006-10-19 02:48, himagauri@gmail.com <himagauri@gmail.com> wrote:
> The OS is Suse Linux and perl version is 5.8.3.
>
> Point is that I would later have to work on files >16 GB. Hence I
> haven't tested for intermediate values.
> Initially I thought the '-s' operator doesn't work for file sizes >1
> GB.
If SuSE has for some reason built perl without large file support, -s
won't work for files larger than 2 GB. 1 GB would be a strange file size
limit.
You can check whether perl has large file support by invoking
/usr/bin/perl -e 'print -s "foo", "\n"'
It should print something like
useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
^^^^^^^^^^^^^^^^^^^^
If you don't have large file support, you can't handle files larger than
2GB. It may help to compile perl yourself, but your OS and libraries
need to support large files, too.
hp
--
_ | Peter J. Holzer | > Wieso sollte man etwas erfinden was nicht
|_|_) | Sy min WSR | > ist?
| | | hjp@hjp.at | Was sonst wäre der Sinn des Erfindens?
__/ | http://www.hjp.at/ | -- P. Einstein u. V. Gringmuth in desd
| |
| Michele Dondi 2006-10-21, 6:59 pm |
| On Sat, 21 Oct 2006 18:59:56 +0200, "Peter J. Holzer"
<hjp-usenet2@hjp.at> wrote:
> /usr/bin/perl -e 'print -s "foo", "\n"'
>
>It should print something like
>
> useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
> ^^^^^^^^^^^^^^^^^^^^
Really?!? ;-)
How 'bout
perl -V
perl -V:uselargefiles
instead?
Michele
| |
| Peter J. Holzer 2006-10-21, 6:59 pm |
| On 2006-10-21 17:59, Michele Dondi <bik.mido@tiscalinet.it> wrote:
> On Sat, 21 Oct 2006 18:59:56 +0200, "Peter J. Holzer"
><hjp-usenet2@hjp.at> wrote:
>
>
> Really?!? ;-)
Oops. Wrong line in the clipboard. Should have been
perl -V | grep large
instead.
> perl -V:uselargefiles
This is of course even better but requires you to know the exact
spelling of the option, while for the grep you only need to remember
that it has something to do with "large".
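The same values can also be read from inside a script through the core Config module, which helps when the check has to run in the same interpreter as the big script. A minimal sketch; the helper name largefile_summary is made up for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Config;

# Hypothetical helper: report the build options relevant to large files
# as "name=value" strings, sorted by name.
sub largefile_summary {
    my %info = (
        uselargefiles => $Config{uselargefiles},   # 'define' when enabled
        lseeksize     => $Config{lseeksize},       # bytes per file offset
    );
    return map { "$_=" . (defined $info{$_} ? $info{$_} : 'undef') }
           sort keys %info;
}

print "$_\n" for largefile_summary();
```

An lseeksize of 8 means file offsets are 64 bits wide, which is what files over 2 GB need.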
hp
| |
| Josef Moellers 2006-10-23, 3:59 am |
| Peter J. Holzer wrote:
> On 2006-10-19 02:48, himagauri@gmail.com <himagauri@gmail.com> wrote:
>
> If SuSE has for some reason built perl without large file support, -s
> won't work for files larger than 2 GB. 1 GB would be a strange file size
> limit.
I'm running SuSE Professional 9.2
perl -V
Summary of my perl5 (revision 5 version 8 subversion 5) configuration:
...
cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS
-fno-strict-aliasing -pipe -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
...
lseeksize=8
I use a perl script to process DVB-S files and it sure handles files
larger than 4GB!
--
Josef Möllers (Pinguinpfleger bei FSC)
If failure had no penalty success would not be a prize
-- T. Pratchett
| |
| Peter J. Holzer 2006-10-23, 3:59 am |
| On 2006-10-21 16:59, Peter J. Holzer <hjp-usenet2@hjp.at> wrote:
> On 2006-10-19 02:48, himagauri@gmail.com <himagauri@gmail.com> wrote:
>
> If SuSE has for some reason built perl without large file support, -s
> won't work for files larger than 2 GB. 1 GB would be a strange file
> size limit.
>
> You can check whether perl has large file support by invoking
Sorry about that - I see now that you included that information already
in your first posting. So I don't see a perl-related reason why -s
shouldn't work.
hp
| |
| Michele Dondi 2006-10-30, 7:09 pm |
| On 19 Oct 2006 14:00:48 +0200, Michele Dondi <bik.mido@tiscalinet.it>
wrote:
>
>1) *Please* do not top-post (see remark above);
>2) Then why don't you do it now? Just to know what's going on. I'm
>writing some sample code for you but I can't terminate it right now...
I'm afraid I'm a bit late... but I conjured up the following
(untested), hth:
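#!/usr/bin/perl
use strict;
use warnings;
use constant { K => 2**10, M => 2**20, G => 2**30 };
use Time::HiRes 'time';

# Print one tab-separated row.
sub printout {
    local ($\, $,) = ("\n", "\t");
    print @_;
}

my $chunk = "\0" x M;    # 1 MB of NUL bytes
printout qw/size "fake" "real"/;
for (1..32) {
    {
        open my $fh1, '>:raw', 'test1' or
            die "Can't open test file: $!\n";
        open my $fh2, '>:raw', 'test2' or
            die "Can't open test file: $!\n";
        # "fake": seek to one byte short of $_ GB and write a single
        # byte, which gives a sparse file of that size
        seek $fh1, $_*G-1, 0 or
            die "Can't seek file: $!\n";
        print $fh1 "\0";
        # "real": actually write $_ GB of data, 1 MB at a time
        print $fh2 $chunk for 1 .. $_*K;
    }   # both handles go out of scope and are closed here
    # time how long -s takes on each of the two files
    printout "$_ Gb", map { my $start = time;
        my $size = -s "test$_"; time() - $start } 1,2;
}
unlink qw/test1 test2/;
__END__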
Michele