

Home > Archive > PERL Beginners > June 2006 > Trouble parsing text with bioperl










 

Trouble parsing text with bioperl
jeffcullis@gmail.com

2006-06-28, 6:57 pm

Hi, just having some problems running the following bioperl script.
It's supposed to download all the sequences in a file using the gi
numbers. But when I run it it only downloads one sequence of the five
in each line of the file, and that's it. When I uncomment the commented
line, the five sequences are downloaded no problem! I've even printed
out each element of @gi_nums and they correspond to the correct numbers
so I have no idea why they aren't all being downloaded. Any help on
this would be great.

#!/usr/bin/perl -w
use Bio::DB::GenPept;
open GIS, "<gis_5" or die "Can't open gi file";
my $gb = new Bio::DB::GenPept();

while($line = <GIS> ) {
    @gi_nums = split(' ', $line);
    #@gi_nums = ["78096912", "78096910", "78096909", "78096907", "78096905"];

    my $seqio = $gb->get_Stream_by_id(@gi_nums);
    while( my $seq = $seqio->next_seq ) {
        print "seq id is ", $seq->display_id, "\n";
    }
}

The file "gis_5" has the following contents:
78096912 78096910 78096909 78096907 78096905
82653972 82653970 82653968 82653967 82653965

Paul Lalli

2006-06-28, 6:57 pm


jeffcullis@gmail.com wrote:
> Hi, just having some problems running the following bioperl script.
> It's supposed to download all the sequences in a file using the gi
> numbers. But when I run it it only downloads one sequence of the five
> in each line of the file, and that's it. When I uncomment the commented
> line, the five sequences are downloaded no problem! I've even printed
> out each element of @gi_nums and they correspond to the correct numbers
> so I have no idea why they aren't all being downloaded. Any help on
> this would be great.
>
> #!/usr/bin/perl -w
> use Bio::DB::GenPept;
> open GIS, "<gis_5" or die "Can't open gi file";
> my $gb = new Bio::DB::GenPept();
>
> while($line = <GIS> ) {
> @gi_nums = split(' ', $line);


This would make @gi_nums be one array containing (if you're right about
your data) five elements.

> #@gi_nums = ["78096912", "78096910", "78096909", "78096907", "78096905"];


This would make @gi_nums be one array containing one element. That one
element would be a reference to an array that contains five elements.
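
To make the difference between the two assignments concrete, here is a minimal sketch in plain Perl with no BioPerl involved. The variable names @flat and @nested are made up for illustration, and the input string is simply the first line of the gis_5 file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# First line of the gis_5 file, including its trailing newline.
my $line = "78096912 78096910 78096909 78096907 78096905\n";

# List assignment: @flat holds five plain scalars.
my @flat = split(' ', $line);

# Square brackets build an anonymous array and return a reference to it,
# so @nested holds a single element: that reference.
my @nested = ( [ split(' ', $line) ] );

print "flat:   ", scalar(@flat), " element(s)\n";
print "nested: ", scalar(@nested), " element(s), first is a ",
      ref($nested[0]), " reference\n";
print "inner:  ", scalar(@{ $nested[0] }), " element(s)\n";
```

Printing each element of the flat array shows the right numbers either way, which is why the printout alone does not reveal which of the two shapes a method is being handed.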

>
> my $seqio = $gb->get_Stream_by_id(@gi_nums);


I have no experience with BioPerl, so I have no idea what data this
method is looking for. If your commented line "works", then I would
suggest making your uncommented line match it - make it an array
containing one reference to an array, rather than an array containing
five elements:

@gi_nums = [ split(' ', $line) ];
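
One possible explanation for the "one sequence out of five" symptom is that the method reads only its first argument and expects it to be either a single id or a reference to an array of ids. That is an assumption, not something confirmed in this thread, so check the Bio::DB::GenPept documentation. The sketch below uses a hypothetical stand-in sub, ids_seen, which is not BioPerl code, to show how such a sub behaves with each calling style.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in, NOT the BioPerl implementation: it looks only at
# its first argument and accepts either a single id or a reference to an
# array of ids.
sub ids_seen {
    my ($ids) = @_;    # anything after the first argument is ignored
    return ref($ids) eq 'ARRAY' ? @$ids : ($ids);
}

my @gi_nums = split(' ', "78096912 78096910 78096909 78096907 78096905");

my @from_list = ids_seen(@gi_nums);     # array flattened into five arguments
my @from_ref  = ids_seen(\@gi_nums);    # one argument: a reference to the array

print "passed as a list:      ", scalar(@from_list), " id(s) seen\n";
print "passed as a reference: ", scalar(@from_ref),  " id(s) seen\n";
```

If that assumption holds, keeping @gi_nums as a flat array and calling $gb->get_Stream_by_id(\@gi_nums) should hand the method the same single array reference as the line suggested above, while leaving @gi_nums usable as an ordinary list elsewhere in the script.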

Paul Lalli

jeffcullis@gmail.com

2006-06-29, 7:57 am


Paul Lalli wrote:
> jeffcullis@gmail.com wrote:
>
> This would make @gi_nums be one array containing (if you're right about
> your data) five elements.
>
>
> This would make @gi_nums be one array containing one element. That one
> element would be a reference to an array that contains five elements.
>
>
> I have no experience with BioPerl, so I have no idea what data this
> method is looking for. If your commented line "works", then I would
> suggest making your uncommented line match it - make it an array
> containing one reference to an array, rather than an array containing
> five elements:
>
> @gi_nums = [ split(' ', $line) ];
>
> Paul Lalli


Thanks Paul, that one change made everything work. I still have no idea
why the method needs strange input like that.


Copyright 2009 codecomments.com