| igthibau 2006-01-28, 7:01 pm |
| Hello everyone!
I am writing a program to read a text file and pick out its contents:
the words contained within, and put them in vectors for later analysis.
So I wish to take individual words from a file and save each in an array.
Does anyone know if there is an easier way, through Fortran (f77) or
otherwise, to do that? (Apart from writing a routine, which, with f77's
ease of handling text - according to what I know of it - might be fun
(not!)).
thanks for any help
G
| |
| Gary L. Scott 2006-01-28, 7:01 pm |
| igthibau wrote:
> Hello everyone!
> I am writing a program to read a text file and pick out its contents:
> the words contained within, and put them in vectors for later analysis.
> So I wish to take individual words from a file and save each in an array.
> Does anyone know if there is an easier way, through Fortran (f77) or
> otherwise, to do that? (Apart from writing a routine, which, with f77's
> ease of handling text - according to what I know of it - might be fun
> (not!)).
I'm fairly familiar with text handling in C, REXX, Perl, Expect, Basic,
Jovial, Ada, COBOL, IBM's Script/GML and similar text/document
programming languages (one on Harris VOS that I've forgotten the name
of), and a few others. I don't see that Fortran's all that deficient in
comparison, as a general-purpose high-level language. Each has its
quirks, advantages, and disadvantages. I do wish Fortran had a native
variable-length string as a minor convenience (mainly so I don't have to
use trim almost every time I want to concatenate), but I've never been
significantly limited by not having it. Script/GML (superior precursor
to SGML) is amazing though...
>
> thanks for any help
>
> G
>
>
>
--
Gary Scott
mailto:garyscott@ev1.net
Fortran Library: http://www.fortranlib.com
Support the Original G95 Project: http://www.g95.org
-OR-
Support the GNU GFortran Project: http://gcc.gnu.org/fortran/index.html
Why are there two? God only knows.
If you want to do the impossible, don't hire an expert because he knows
it can't be done.
-- Henry Ford
| |
| e p chandler 2006-01-28, 7:01 pm |
|
igthibau wrote:
> Hello everyone!
> I am writing a program to read a text file and pick out its contents:
> the words contained within, and put them in vectors for later analysis.
> So I wish to take individual words from a file and save each in an array.
> Does anyone know if there is an easier way, through Fortran (f77) or
> otherwise, to do that? (Apart from writing a routine, which, with f77's
> ease of handling text - according to what I know of it - might be fun
> (not!)).
>
> thanks for any help
>
> G
1. Does an existing system command or routine such as Unix "wc" do what
you want?
2. How about languages which have built-in facilities to handle text,
such as Perl, AWK, SNOBOL, etc.?
3. Fortran 9x is better suited than Fortran 77. It has
a. more built-in character functions
b. easier creation of data structures
c. better constructs for flow of control
d. libraries which handle variable-length strings and so forth.
4. Fortran 77 does let you index through fixed-length character strings
character by character. This is something you can do in almost any
programming language. See
"Software Tools" or "Software Tools in Pascal" by Kernighan and
Plauger.
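As a rough illustration of point 4, here is a minimal sketch of the character-by-character scan. It is written in Java rather than Fortran 77 only because the loop looks much the same in either language. The class name WordScanner and the method splitWords are hypothetical, and the sketch assumes that words are separated by blanks only (no tabs, no punctuation handling) and that the record is padded with trailing blanks, as a fixed-length CHARACTER variable would be.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not from the thread: splits one fixed-length,
// blank-padded record into words by indexing character by character.
class WordScanner {
    static List<String> splitWords(String record) {
        List<String> words = new ArrayList<>();
        int start = -1;  // index where the current word began; -1 while between words
        for (int i = 0; i < record.length(); i++) {
            boolean blank = record.charAt(i) == ' ';
            if (!blank && start < 0) {
                start = i;                              // first character of a new word
            } else if (blank && start >= 0) {
                words.add(record.substring(start, i));  // a blank closes the word
                start = -1;
            }
        }
        if (start >= 0) {                               // record ended inside a word
            words.add(record.substring(start));
        }
        return words;
    }

    public static void main(String[] args) {
        // Pad to 40 columns with trailing blanks, like a fixed-length record.
        String record = String.format("%-40s", "  the quick  brown fox");
        System.out.println(splitWords(record));
    }
}
```

The same two-state loop (inside a word, between words) carries over to Fortran 77 by stepping an integer index over a CHARACTER variable and comparing each one-character substring with a blank.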
-- Elliot
| |
| Dr Ivan D. Reid 2006-01-28, 9:57 pm |
| On Sat, 28 Jan 2006 18:23:08 +0100, igthibau <igthibau@wanadoo.fr>
wrote in <43dba87f$0$18323$8fcfb975@news.wanadoo.fr>:
> Hello everyone!
> I am writing a program to read a text file and pick out its contents:
> the words contained within, and put them in vectors for later analysis.
> So I wish to take individual words from a file and save each in an array.
> Does anyone know if there is an easier way, through Fortran (f77) or
> otherwise, to do that? (Apart from writing a routine, which, with f77's
> ease of handling text - according to what I know of it - might be fun
> (not!)).
I'd prefer to use awk or one of its successors such as perl; with
awk you can take each word of an input string and increment a count for
that word, creating the array element with a default initial value of 0
if that word hasn't been encountered before. Something like:
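# For every input line, bump the counter for each whitespace-separated field.
{ for (i = 1; i <= NF; i++) count[$i]++ }
# After the last line, print each word with its count (order is unspecified).
END { for (word in count) print word, count[word] }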
--
Ivan Reid, Electronic & Computer Engineering, ___ CMS Collaboration,
Brunel University. Ivan.Reid@[brunel.ac.uk|cern.ch] Room 40-1-B12, CERN
KotPT -- "for stupidity above and beyond the call of duty".
| |
| glen herrmannsfeldt 2006-01-28, 9:57 pm |
| Dr Ivan D. Reid wrote:
(snip)
> I'd prefer to use awk or one of its successors such as perl; with
> awk you can take each word of an input string and increment a count for
> that word, creating the array element with a default initial value of 0
> if that word hasn't been encountered before. Something like:
>
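> # For every input line, bump the counter for each whitespace-separated field.
> { for (i = 1; i <= NF; i++) count[$i]++ }
> # After the last line, print each word with its count (order is unspecified).
> END { for (word in count) print word, count[word] }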
Yes, I like awk for this type of problem.
The important part is a hash table, which I don't believe that Fortran
includes yet.
Last time I did word counting I started in awk, but when that was too
slow (I had 15GB to count) I went to Java. Using StringTokenizer to
find words and Hashtable to keep track of them, it isn't so hard.
With JIT on it runs reasonably fast. (It took some hours to do
15GB, though computers are faster now.)
If you have a hashtable routine for Fortran it probably isn't so
hard to do.
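For anyone curious, the StringTokenizer plus Hashtable approach might look roughly like the sketch below. This is not the original program: the class name WordCount and the method countLine are made up for illustration, it uses current Java syntax (generics, Hashtable.merge), and it assumes words are whatever StringTokenizer's default whitespace delimiters separate. Counts are plain Integer values, so a single word seen more than about two billion times would overflow.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.Hashtable;
import java.util.Map;
import java.util.StringTokenizer;

// Hypothetical sketch: counts whitespace-separated words read from standard input.
class WordCount {
    // Adds the words of one line to the running counts.
    static void countLine(String line, Hashtable<String, Integer> counts) {
        // Default delimiters are space, tab, newline, carriage return, form feed.
        StringTokenizer st = new StringTokenizer(line);
        while (st.hasMoreTokens()) {
            // Insert 1 for a new word, otherwise add 1 to the existing count.
            counts.merge(st.nextToken(), 1, Integer::sum);
        }
    }

    public static void main(String[] args) throws IOException {
        Hashtable<String, Integer> counts = new Hashtable<>();
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        String line;
        // Read line by line so the whole input never has to fit in memory.
        while ((line = in.readLine()) != null) {
            countLine(line, counts);
        }
        // Hashtable iteration order is unspecified, as with awk's "for (word in count)".
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

Compiled and run as "java WordCount < file.txt", it prints one word and its count per line, much like the awk script quoted above.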
-- glen