Code Comments

counting number of occurrences of every possible substring in multiple files
I am trying to write a program that reads multiple files and prints out the
number of occurrences of n-length byte sequences across these files. The
value of n must be specified on the command-line.

Since I'll be dealing with binary files, I want the ASCII codes of the
characters printed out.

e.g. for n=2 and the following 3 files, contents shown as integers,

f1 = {33, 84, 55}, f2 = {84, 55, 12}, f3 = {33, 84, 55}

I want output like this:
3    84 55
2    33 84

I'll be dealing with files up to about one megabyte in size. Efficiency is
not critical, and it does not matter, say, if a length-2 sequence is a
substring of a length-3, or a more frequently occurring sequence. Values of
n will not go above 10.




C3
09-30-04 01:05 AM


Re: counting number of occurrences of every possible substring in multiple files
C3 <> wrote:
> I am trying to write a program that reads multiple files and prints out the
> number of occurrences of n-length byte sequences across these files. The
> value of n must be specified on the command-line.
>
> Since I'll be dealing with binary files,


perldoc -f binmode
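
For readers following along, here is a minimal sketch of the binmode usage that the perldoc entry describes. The helper name slurp_bytes is hypothetical and not from the thread, and the sketch assumes each file fits in memory, which the stated one-megabyte limit suggests.

```perl
use strict;
use warnings;

# Read a whole file as raw bytes.
# slurp_bytes is a hypothetical helper name chosen for this sketch.
sub slurp_bytes {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    binmode($fh);    # raw bytes: no newline translation, no encoding layer
    local $/;        # undefined record separator, so one read returns everything
    my $data = <$fh>;
    close($fh);
    return defined $data ? $data : '';
}
```

Calling slurp_bytes on each file named on the command line would give one byte string per file, with carriage returns, newlines and NUL bytes left exactly as they are on disk.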


> I want the ASCII codes of the
> characters printed out.


Huh?

If it is a text file, then it contains ASCII codes.

If it is a binary file, then it may contain some other encoding.

Anyway,

perldoc -f chr
perldoc -f ord
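
As a small illustration of those two functions, the sketch below converts between characters and their integer codes. The helper name byte_codes is hypothetical, and unpack with the 'C*' template is used here as one convenient way to apply the same conversion to a whole byte string at once; the thread itself only mentions chr and ord.

```perl
use strict;
use warnings;

# Turn a string of raw bytes into a space-separated list of integer codes.
# byte_codes is a hypothetical helper name chosen for this sketch.
sub byte_codes {
    my ($bytes) = @_;
    return join(' ', unpack('C*', $bytes));    # 'C' = unsigned 8-bit value
}

print ord('T'), "\n";                                   # prints 84
print byte_codes(chr(33) . chr(84) . chr(55)), "\n";    # prints 33 84 55
```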


> e.g. for n=2 and the following 3 files, contents shown as integers,
>
> f1 = {33, 84, 55}, f2 = {84, 55, 12}, f3 = {33, 84, 55}
>
> I want output like this:
> 3    84 55
> 2    33 84
>
> I'll be dealing with files up to about one megabyte in size. Efficiency is
> not critical, and it does not matter, say, if a length-2 sequence is a
> substring of a length-3, or a more frequently occurring sequence. Values of
> n will not go above 10.


Did you mean to ask a question?

What is it that you need help with?

Are you asking for someone to write a program to your specification
for you? It kind of sounds that way...


--
Tad McClellan                          SGML consulting
tadmc@augustmail.com                   Perl programming
Fort Worth, Texas

Tad McClellan
09-30-04 01:05 AM


Re: counting number of occurrences of every possible substring in multiple files
"C3" <_> wrote in message
news:415ac218$0$20582$afc38c87@news.optusnet.com.au...
> I am trying to write a program that reads multiple files and prints out the
> number of occurrences of n-length byte sequences across these files. The
> value of n must be specified on the command-line.
>
> Since I'll be dealing with binary files, I want the ASCII codes of the
> characters printed out.
>
> e.g. for n=2 and the following 3 files, contents shown as integers,
>
> f1 = {33, 84, 55}, f2 = {84, 55, 12}, f3 = {33, 84, 55}
>
> I want output like this:
> 3    84 55
> 2    33 84
>
> I'll be dealing with files up to about one megabyte in size. Efficiency is
> not critical, and it does not matter, say, if a length-2 sequence is a
> substring of a length-3, or a more frequently occurring sequence. Values of
> n will not go above 10.

Do you realize that nowhere in here did you ask a question?  What is it
you need help with?  What part are you stuck on?  What have you tried so
far, and how did your attempt fail to work correctly?

Paul Lalli



Paul Lalli
09-30-04 01:05 AM
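
Since the thread ends without any code, here is one possible reading of the specification quoted above, offered as a minimal sketch and not as anyone's posted answer. It slides an n-byte window over each file's contents and uses the raw substring as a hash key. The names count_ngrams and report_lines are hypothetical, and the demo keeps the three example files in memory instead of reading them from disk.

```perl
use strict;
use warnings;

# Count every n-byte window in each byte string, adding to the hash in $counts.
# Keys are the raw n-byte substrings themselves.
sub count_ngrams {
    my ($n, $counts, @buffers) = @_;
    for my $data (@buffers) {
        # Each buffer is scanned on its own, so no window spans two files.
        # If a buffer is shorter than n, the range is empty and nothing is counted.
        for my $i (0 .. length($data) - $n) {
            $counts->{ substr($data, $i, $n) }++;
        }
    }
    return $counts;
}

# Format the counts as "count    b1 b2 ...", most frequent first,
# with ties broken by the byte string itself.
sub report_lines {
    my ($counts) = @_;
    my @keys = sort { $counts->{$b} <=> $counts->{$a} or $a cmp $b } keys %$counts;
    return map { sprintf("%d    %s", $counts->{$_}, join(' ', unpack('C*', $_))) } @keys;
}

# Demo: the three example files from the question, as in-memory byte strings.
my @files = map { pack('C*', @$_) } ([33, 84, 55], [84, 55, 12], [33, 84, 55]);
print "$_\n" for report_lines(count_ngrams(2, {}, @files));
```

The demo prints "3    84 55", "2    33 84" and "1    55 12". The sample output in the question lists only the first two, so the original poster may have wanted sequences seen once to be left out; that would be a small filter on the count. With real files, n would come from the first command-line argument and each remaining argument would be opened, put through binmode, and read whole before being passed in.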


PERL Miscellaneous archive
Copyrights CodeComments.com 2004 - 2006