
Reading Tab Separated Files
Alan Ross

2004-10-24, 3:56 pm

Dear All,

I am currently learning Fortran and am stuck on a problem. I have
scoured the web and this group for a solution but to no avail.
Hopefully someone would be able to point me in the right direction.

I am attempting to read in a formatted file of integer values
separated by tabs. There are three values per line. eg.

180 807 479
27 785 88
145 50 429
317 5 805
131 77 402

I am attempting to use the following code:

integer :: i1, i2, i3
open(unit = 10, file = "list.dat", form = "formatted", action = "read")
do
read(10, *) i1, i2, i3
write(*,*) i1, i2, i3
end do
close(10)

According to the postings in this group, the free form format * should
recognise the tabs as separating the integers (btw. I am using NAGware
F95 compiler). I get the error message "Invalid input for integer
editing"

I also experimented with format statements namely fmt="(i3, t5, i3,
t9, i3)" but since the number of integer characters before the tab
varies, I have problems. So it only outputs the first line and ignores
the rest.

Could anyone please help push me in the right direction. I have spent
days on this problem now.

Thanks,

Alan
Richard Maine

2004-10-24, 3:56 pm

alan@rebeluk.com (Alan Ross) writes:

> According to the postings in this group, the free form format * should
> recognise the tabs as separating the integers...


Umm. That depends on what you mean by "should". There are at least 3
possible meanings of it here.

1. Should according to the standard. If that's the intended meaning,
that is just wrong. The standard does not even mention the tab
character.

2. Personal opinion of what would be nice.... which has little to do
with the way things actually are.

3. Should as in has been observed to work with some particular
compiler. I'm sure some compilers do this. I don't recall
whether NAG is one. (I do use NAG as my most common compiler,
but I never use tabs, so I don't off-hand know... and I don't
have a license key for my home system at the moment, so I
can't check here).

> I also experimented with format statements namely fmt="(i3, t5, i3,
> t9, i3)" but since the number of integer characters before the tab
> varies, I have problems. So it only outputs the first line and ignores
> the rest.
>
> Could anyone please help push me in the right direction. I have spent
> days on this problem now.


There is no standard-conforming, portable way to deal with tabs
directly in a format. You've noted the problem with explicit formats,
and recognition of tabs by list-directed reads (the * format) is
nonstandard.

There are 3 basic approaches (that I can think of off-hand).

1. Change the tabs to something else (blanks will do) external to the
Fortran program. There are lots of ways to do that, depending on
the operating system. Some of them are pretty trivial. Of course,
having this as a separate external step can be an operational
nuisance. It sure is simple if it doesn't cause operational
problems.

2. Read the entire line into a single character variable using an
explicit (a) format (don't use list-directed, which might or
might not stop at tabs and definitely will stop at blanks).

Then parse that character variable. See the string intrinsics
for help. A lot of people keep collections of parsing
subroutines to help with tasks like this.

This is what I usually do for most input-parsing-related stuff.

3. As with (2), read the line into a single character variable.

Then change all the tabs in the variable into blanks (use
the scan or index intrinsic to find the tabs).

Then use a list-directed internal read. (Valid only in f90
or later, but your compiler is).
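
As an illustration of approach (2), here is a minimal sketch that splits one line on tab characters by hand, using the index intrinsic and one internal read per field. The names tab_parse, split_tabs and demo_split are made up for this example and do not come from any post in this thread; it assumes a tab is stored as the single character achar(9) and that every field is an integer.

```fortran
! Sketch of approach (2): split a line on tabs and parse each field.
! tab_parse / split_tabs / demo_split are hypothetical names for this example.
module tab_parse
  implicit none
contains
  subroutine split_tabs(line, vals, n)
    character(*), intent(in)  :: line
    integer,      intent(out) :: vals(:)   ! parsed integers
    integer,      intent(out) :: n         ! number of values actually parsed
    character(1), parameter   :: tab = achar(9)
    integer :: start, pos, ios

    n = 0
    vals = 0
    start = 1
    do while (start <= len(line) .and. n < size(vals))
      pos = index(line(start:), tab)       ! 0 when there is no further tab
      if (pos == 0) then
        pos = len(line) + 1                ! last field runs to the end of the line
      else
        pos = start + pos - 1              ! absolute position of the tab
      end if
      if (len_trim(line(start:pos-1)) > 0) then
        ! internal list-directed read of a single, tab-free field
        read(line(start:pos-1), *, iostat=ios) vals(n+1)
        if (ios == 0) n = n + 1
      end if
      start = pos + 1                      ! continue after the tab
    end do
  end subroutine split_tabs
end module tab_parse

program demo_split
  use tab_parse
  implicit none
  integer :: vals(3), n
  call split_tabs('27' // achar(9) // '785' // achar(9) // '88', vals, n)
  write(*,*) n, vals
end program demo_split
```

Because each field is read on its own, no tab ever reaches the list-directed read, so the result does not depend on how a given compiler treats tabs. Fields that are empty or not integers are simply skipped here; a real parser would probably want to report them.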

--
Richard Maine
email: my last name at domain
domain: summertriangle dot net
Dan Tex1

2004-10-24, 3:56 pm

From: alan@rebeluk.com (Alan Ross)

>Dear All,


>I am currently learning Fortran and am stuck on a problem. I have
>scoured the web and this group for a solution but to no avail.
>Hopefully someone would be able to point me in the right direction.
>
>I am attempting to read in a formatted file of integer values
>separated by tabs. There are three values per line. eg.
>
> 180 807 479
> 27 785 88
> 145 50 429
> 317 5 805
> 131 77 402
>
>I am attempting to use the following code:
>
> integer :: i1, i2, i3
> open(unit = 10, file = "list.dat", form = "formatted", action =
>"read")
> do
> read(10, *) i1, i2, i3
> write(*,*) i1, i2, i3
> end do
> close(10)
>
>According to the postings in this group, the free form format * should
>recognise the tabs as separating the integers (btw. I am using NAGware
>F95 compiler). I get the error message "Invalid input for integer
>editing"


I don't recall anyone here saying that tabs act as separators.

>I also experimented with format statements namely fmt="(i3, t5, i3,
>t9, i3)" but since the number of integer characters before the tab
>varies, I have problems. So it only outputs the first line and ignores
>the rest.


Tabs are not officially recognized characters by Fortran. Any particular
compiler "may" recognize them, however, that isn't guaranteed. So... it's a
compiler-dependent thing. I'm sure others here can give you more in-depth
definitive explanations if needed. Additionally, I know several people here
are familiar with how NAG handles tabs. Of course... you might want to look in
your documentation that comes with your compiler.

You might also read up on use of the "T" edit descriptor. Your use of it above
seems a bit off. On a side note, I tend to think that most compilers that can
handle tabs will read them as a single character rather than as several blank
spaces.
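
To make the point about the "T" edit descriptor concrete: Tn moves to column n of the record, so the format (i3, t5, i3, t9, i3) only fits data that sits in fixed columns. The sketch below is an illustration written for this point, not code from the thread. It reads a blank-padded record, where the fields land in columns 1-3, 5-7 and 9-11, and then tries the same format on a tab-separated record. What happens in the second case is compiler-dependent, so the program only prints the iostat value and makes no assumption about it.

```fortran
program t_descriptor_demo
  implicit none
  character(11) :: fixed
  character(9)  :: tabbed
  integer :: i1, i2, i3, j1, j2, j3
  integer :: ios_fixed, ios_tab

  ! Blank-padded record: fields in columns 1-3, 5-7 and 9-11.
  fixed = '180 807 479'
  read(fixed, '(i3, t5, i3, t9, i3)', iostat=ios_fixed) i1, i2, i3
  write(*,*) 'fixed columns: iostat =', ios_fixed
  if (ios_fixed == 0) write(*,*) i1, i2, i3

  ! Tab-separated record: the first i3 field now covers "27" plus a tab,
  ! and the later fields no longer start at columns 5 and 9.
  tabbed = '27' // achar(9) // '785' // achar(9) // '88'
  read(tabbed, '(i3, t5, i3, t9, i3)', iostat=ios_tab) j1, j2, j3
  write(*,*) 'tab separated: iostat =', ios_tab   ! result depends on the compiler
end program t_descriptor_demo
```

The values j1, j2 and j3 are deliberately not printed, since they may be undefined if the second read fails.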

Here's one way to do what you want ( below code not compile-tested ):
This assumes that your compiler is capable of seeing the Tab character.
It also assumes that the Tab character is indeed a single-character entity.
( On Windows systems, the above assumptions are virtually guaranteed ).
There are other ways, but... this way is simple to understand & implement.
Additionally, you can easily alter this code to do other things with your data.

Character(120) :: line
Character(1) :: tab
Integer :: i, i1, i2, i3, ios
tab = ACHAR(9)
DO
   ! read your entire line of data into a character variable
   Read(10,'(a)',IOSTAT=ios) line
   ! leave the loop at end of file (or on a read error)
   IF ( ios /= 0 ) EXIT
   Do i = 1, LEN(line)
      ! replace each tab in the line with a space.
      IF ( line(i:i)==tab ) line(i:i)=' '
   End Do
   ! read integers directly from your character variable
   ! "line" is an "internal file" with a name rather than a disk file with a unit #.
   Read( line,* ) i1, i2, i3
   Write( *,* ) i1, i2, i3
END DO
! Code to Open/Close your file not shown.

Dan :-)

>Could anyone please help push me in the right direction. I have spent
>days on this problem now.
>
>Thanks,
>
>Alan



James Van Buskirk

2004-10-24, 3:56 pm

"Alan Ross" <alan@rebeluk.com> wrote in message
news:ea321877.0410240829.602a57de@posting.google.com...

> I am attempting to read in a formatted file of integer values
> separated by tabs. There are three values per line. eg.


> 180 807 479
> 27 785 88
> 145 50 429
> 317 5 805
> 131 77 402


> I am attempting to use the following code:


> integer :: i1, i2, i3
> open(unit = 10, file = "list.dat", form = "formatted", action =
> "read")
> do
> read(10, *) i1, i2, i3
> write(*,*) i1, i2, i3
> end do
> close(10)


As you have seen, list-directed formatting doesn't necessarily
work for a tab-delimited file. Also your code even if it worked
would crash due to an end of file during read. Assuming each
record consisted of three tabs each followed by the desired
inputs, the following works:

program tabs
implicit none
character(100) line ! Assuming input lines are <= 100 characters
integer tab1, tab2, tab3
integer i1, i2, i3

open(10,file='list.dat')
do
read(10,'(a)',end=10) line
tab1 = scan(line,achar(9))
tab2 = scan(line(tab1+1:),achar(9))+tab1
tab3 = scan(line(tab2+1:),achar(9))+tab2
read(line(tab1+1:tab2-1),*) i1
read(line(tab2+1:tab3-1),*) i2
read(line(tab3+1:),*) i3
write(*,*) i1, i2, i3
end do
10 continue
end program tabs

--
write(*,*) transfer((/17.392111325966148d0,6.5794487871554595D-85, &
6.0134700243160014d-154/),(/'x'/)); end


beliavsky@aol.com

2004-10-24, 8:56 pm


alan@rebeluk.com (Alan Ross) wrote:
>Dear All,
>
>I am currently learning Fortran and am stuck on a problem. I have
>scoured the web and this group for a solution but to no avail.
>Hopefully someone would be able to point me in the right direction.
>
>I am attempting to read in a formatted file of integer values
>separated by tabs. There are three values per line. eg.
>
> 180 807 479
> 27 785 88
> 145 50 429
> 317 5 805
> 131 77 402
>
>I am attempting to use the following code:
>
> integer :: i1, i2, i3
> open(unit = 10, file = "list.dat", form = "formatted", action =
>"read")
> do
> read(10, *) i1, i2, i3
> write(*,*) i1, i2, i3
> end do
> close(10)
>
>According to the postings in this group, the free form format * should
>recognise the tabs as separating the integers (btw. I am using NAGware
>F95 compiler). I get the error message "Invalid input for integer
>editing"
>
>I also experimented with format statements namely fmt="(i3, t5, i3,
>t9, i3)" but since the number of integer characters before the tab
>varies, I have problems. So it only outputs the first line and ignores
>the rest.
>
>Could anyone please help push me in the right direction. I have spent
>days on this problem now.


I dislike tabs, because of the type of problem you are encountering. Whenever
possible, I convert the data files to comma or space delimited. With the
Unix 'tr' utility (which has also been ported to Windows) one can replace
tabs with spaces with a one-line script. I would tell you how, but I am not
close to one of my own PCs.

For a Fortran solution, I suggest reading the thread I started with the subject
"recognizing a tab character in input" in this newsgroup -- use Google.



Alan Ross

2004-10-24, 8:56 pm

James Van Buskirk wrote:

> "Alan Ross" <alan@rebeluk.com> wrote in message
> news:ea321877.0410240829.602a57de@posting.google.com...
>
>
>
> As you have seen, list-directed formatting doesn't necessarily
> work for a tab-delimited file. Also your code even if it worked
> would crash due to an end of file during read. Assuming each
> record consisted of three tabs each followed by the desired
> inputs, the following works:
>
> program tabs
> implicit none
> character(100) line ! Assuming input lines are <= 100 characters
> integer tab1, tab2, tab3
> integer i1, i2, i3
>
> open(10,file='list.dat')
> do
> read(10,'(a)',end=10) line
> tab1 = scan(line,achar(9))
> tab2 = scan(line(tab1+1:),achar(9))+tab1
> tab3 = scan(line(tab2+1:),achar(9))+tab2
> read(line(tab1+1:tab2-1),*) i1
> read(line(tab2+1:tab3-1),*) i2
> read(line(tab3+1:),*) i3
> write(*,*) i1, i2, i3
> end do
> 10 continue
> end program tabs



Thanks for the quick response. I have tried the above code, but I get the
following error when I try to run the program:

Invalid input for integer editing
Program terminated by fatal I/O error
Aborted

Any ideas?

Cheers,

Alan
James Van Buskirk

2004-10-24, 8:56 pm

"Alan Ross" <alan@rebeluk.com> wrote in message
news:YCVed.149846$BI5.70013@fe2.news.blueyonder.co.uk...

> Invalid input for integer editing
> Program terminated by fatal I/O error
> Aborted


> Any ideas?


Post the actual file someplace that we can download it and
find out what the actual format is -- this simply can't be
determined from your description.

--
write(*,*) transfer((/17.392111325966148d0,6.5794487871554595D-85, &
6.0134700243160014d-154/),(/'x'/)); end


Alan Ross

2004-10-24, 8:56 pm

James Van Buskirk wrote:

> "Alan Ross" <alan@rebeluk.com> wrote in message
> news:YCVed.149846$BI5.70013@fe2.news.blueyonder.co.uk...
>
>
>
> Post the actual file someplace that we can download it and
> find out what the actual format is -- this simply can't be
> determined from your description.
>


I have posted the file at http://alan.rebeluk.com/objectlist.dat

Cheers,

Alan
James Van Buskirk

2004-10-25, 3:59 am

"Alan Ross" <alan@rebeluk.com> wrote in message
news:AZWed.22166$i02.8329@fe1.news.blueyonder.co.uk...

> I have posted the file at http://alan.rebeluk.com/objectlist.dat


OK, now that we know the data layout we can parse the file:

program tabs
implicit none
character(100) line ! Assuming input lines are <= 100 characters
integer tab1, tab2
integer i1, i2, i3

open(10,file='objectlist.dat')
! Skip header line
read(10,'()')
do
read(10,'(a)',end=10) line
! Find the two tabs
tab1 = scan(line,achar(9))
tab2 = scan(line(tab1+1:),achar(9))+tab1
! Parse the three numbers
read(line(:tab1-1),*) i1
read(line(tab1+1:tab2-1),*) i2
read(line(tab2+1:),*) i3
! Print results
write(*,*) i1, i2, i3
end do
10 continue
end program tabs

--
write(*,*) transfer((/17.392111325966148d0,6.5794487871554595D-85, &
6.0134700243160014d-154/),(/'x'/)); end


Alan Ross

2004-10-25, 3:59 am

James Van Buskirk wrote:

> "Alan Ross" <alan@rebeluk.com> wrote in message
> news:AZWed.22166$i02.8329@fe1.news.blueyonder.co.uk...
>
>
> OK, now that we know the data layout we can parse the file:


That did the job. Many thanks for your time and your patience.

Cheers,

Alan
David Frank

2004-10-25, 8:56 am


"Alan Ross" <alan@rebeluk.com> wrote in message
news:AZWed.22166$i02.8329@fe1.news.blueyonder.co.uk...
>
> I have posted the file at http://alan.rebeluk.com/objectlist.dat
>
> Cheers,
>
> Alan


re: using unix utility (gag) to process your file.

Fortran'ers keep ignoring the language's powerful text parsing ability.
Your file's last record is empty ( CR LF ) which is a problem unless
provided for as I have done in below test program which successfully
processes your file from link above using minimal syntax.

! --------------------------
program test
character(80) :: s
character :: a(80) ; equivalence (a,s)
open (1,file='test.txt')
read (1,*) ! skip record #1
do
read (1,'(a)',end=101) s ! read string record
where (a==char(09)) a = ' ' ! remove tabs from string
read (s,*,end=101) i1,i2,i3 ! exit last record == CR LF
write (*,'(3i4)') i1,i2,i3 ! can redirect -> detab file
end do
101 stop
end program
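
For comparison, the same idea can be sketched without EQUIVALENCE or implicit typing, using iostat= to handle both the end of the file and an empty last record. This is a hedged illustration rather than a replacement for the program above: the program name detab_read is made up, and the data is written to a scratch file on unit 10 only so that the example runs on its own. The header text and the two data records are stand-ins, not the contents of the posted file.

```fortran
program detab_read
  implicit none
  character(80) :: line
  character(1), parameter :: tab = achar(9)
  integer :: ios, i, i1, i2, i3, nrec

  ! Build a small tab-separated scratch file (stand-in data for this example):
  ! one header line, two data records and an empty last record.
  open(10, status='scratch', form='formatted', action='readwrite')
  write(10,'(a)') 'x' // tab // 'y' // tab // 'z'
  write(10,'(a)') '180' // tab // '807' // tab // '479'
  write(10,'(a)') '27'  // tab // '785' // tab // '88'
  write(10,'(a)') ''
  rewind(10)

  read(10,'(a)') line                  ! skip the header record
  nrec = 0
  do
    read(10,'(a)',iostat=ios) line     ! read the whole record as text
    if (ios /= 0) exit                 ! end of file (or read error): stop
    do i = 1, len(line)                ! turn every tab into a blank
      if (line(i:i) == tab) line(i:i) = ' '
    end do
    read(line,*,iostat=ios) i1, i2, i3 ! internal list-directed read
    if (ios /= 0) cycle                ! blank or malformed record: skip it
    nrec = nrec + 1
    write(*,'(3i4)') i1, i2, i3
  end do
  close(10)
  write(*,*) 'records read:', nrec
end program detab_read
```

Skipping a record whenever the internal read fails is a design choice for this sketch; it quietly ignores malformed lines, which may or may not be what you want for real data.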

