read a fasta-file from NCBI
Author: Corinna Schmitt

2008-01-31, 5:27 am

Hello everyone,
I want to read in a FASTA file that looks like this:

>gi|47118324|dbj|BA000018.3| Staphylococcus aureus subsp. aureus N315 DNA, complete genome
CGATTAAAGATAGAAATACA...

I can now read in this file with the commands:

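source=input('txt-File?-','s');     % ask the user for the file name
xyz=repmat('%s',1,1);               % format string; this is just '%s'
fid=fopen(source);
inputData=textscan(fid,xyz);        % '%s' splits the text at whitespace
fclose(fid);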

It works. The result is a cell array that looks like this:

>gi|47118324|dbj|BA000018.3|
Staphylococcus
aureus
subsp.
aureus
N315
DNA,
complete
genome
CGATTAAAGATAGAAATACA...

Because of the FASTA format, the length of the header information
(cells 1-9 in the example above) differs from file to file, so I
cannot simply tell the rest of the program to start at entry 10 of
the cell array. There is also no special keyword that marks where
the information I need begins. The only thing I know is that there
is a \n directly in front of the part I want. How can I build this
into the read-in function so that inputData contains only the
sequence CGATTAAAGATAGAAATACA...? The result should also be a
cell array.

Any ideas?

Thanks, Corinna
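
One possible approach, sketched here under assumptions rather than as a definitive answer: textscan accepts a 'HeaderLines' option that skips a given number of lines before reading, and a 'Delimiter' option that can be set to '\n' so each remaining line ends up in its own cell. The sketch assumes the file holds a single record whose header is exactly one line (the line starting with '>'). The names seqLines and sequence are introduced only for illustration.

```matlab
% Sketch: read only the sequence lines of a single-record FASTA file.
% Assumption: the header occupies exactly one line, the one starting with '>'.
source = input('txt-File?-', 's');
fid = fopen(source);
if fid == -1
    error('Could not open file: %s', source);
end
% Skip the first line, then read every remaining line as one string.
inputData = textscan(fid, '%s', 'HeaderLines', 1, 'Delimiter', '\n');
fclose(fid);

seqLines = inputData{1};      % cell array, one cell per sequence line
sequence = [seqLines{:}];     % optional: join all lines into one char row
```

If the header in the actual file really is wrapped over several lines, 'HeaderLines', 1 would not skip all of it, and the count would have to be adjusted. If the Bioinformatics Toolbox is installed, its fastaread function returns the header and the sequence separately and may be the simpler route.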