Home > Archive > C > June 2006 > data mining
|
|
| vmishra85@gmail.com 2006-06-24, 7:58 am |
| I have to find the common similar subsequence among a large number
(>21,000) of input sequences; it is not necessarily present in all of
the input sequences. The subsequence need not be exact: it can vary by
up to 20%.
Can anybody help design an algorithm for the above problem?
Take it as a challenge.
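One building block for this kind of search can be sketched in C. The sketch below assumes that "vary by up to 20%" means an edit distance (substitutions, insertions, deletions) of at most 20% of the candidate's length; the post does not define the measure, so that reading is an assumption. The function name approx_contains and its parameters are hypothetical. It only tests whether one candidate pattern occurs approximately in one sequence (the classic approximate substring dynamic program); it does not discover the candidate itself.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 if some substring of text is within
   max_edits edits of pattern, 0 if not, and -1 if allocation fails.
   Uses one column of the edit-distance table, O(strlen(pattern)) memory. */
int approx_contains(const char *text, const char *pattern, size_t max_edits)
{
    assert(text != NULL && pattern != NULL);
    size_t n = strlen(text), m = strlen(pattern);
    size_t *col = malloc((m + 1) * sizeof *col);
    if (col == NULL)
        return -1;

    /* Column for the empty text prefix: delete i pattern characters. */
    for (size_t i = 0; i <= m; i++)
        col[i] = i;
    int found = col[m] <= max_edits;

    for (size_t j = 1; j <= n && !found; j++) {
        size_t diag = col[0];   /* value from previous column, row i-1 */
        col[0] = 0;             /* a match may start anywhere in text */
        for (size_t i = 1; i <= m; i++) {
            size_t up_left = diag;
            diag = col[i];      /* previous column, row i, saved before overwrite */
            size_t best = up_left + (pattern[i - 1] != text[j - 1]);
            if (diag + 1 < best)
                best = diag + 1;        /* extra character in text */
            if (col[i - 1] + 1 < best)
                best = col[i - 1] + 1;  /* missing character in text */
            col[i] = best;
        }
        found = col[m] <= max_edits;    /* a match may end at position j */
    }
    free(col);
    return found;
}
```

Under the 20% assumption, a caller would pass strlen(pattern) / 5 as max_edits and count how many of the input sequences return 1 for a given candidate. Each call costs time proportional to the product of the two lengths, so running it over more than 21,000 sequences for many candidates would need pruning or indexing on top.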
| |
| Vladimir Oka 2006-06-24, 7:58 am |
|
vmishra85@gmail.com wrote:
> I have to find the common similar subsequence among a large number
> (>21,000) of input sequences; it is not necessarily present in all of
> the input sequences. The subsequence need not be exact: it can vary by
> up to 20%.
>
> Can anybody help design an algorithm for the above problem?
> Take it as a challenge.
This is a question more suited to comp.programming; here you're only
likely to get advice once you try to implement it in C, and experience
C language problems.
PS
I feel something biotechy about it (looking for a gene in different DNA
strands). Maybe there lies an alternative path to the answer. Are there
any Usenet groups dedicated to it?
| |
|
| Vladimir Oka wrote:
> vmishra85@gmail.com wrote:
>
> This is a question more suited to comp.programming; here you're only
> likely to get advice once you try to implement it in C, and experience
> C language problems.
>
> PS
> I feel something biotechy about it (looking for a gene in different
> DNA strands). Maybe there lies an alternative path to the answer. Are
> there any Usenet groups dedicated to it?
My missus is a population geneticist, and yes, it sounds like the sort of
thing she does with DNA sequences in C++. When casually asked about this,
she replied 'Needleman-Wunsch'. I didn't seek clarification; it usually
does my head in if I do.
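For readers who do want the clarification: Needleman-Wunsch is a dynamic-programming method for globally aligning two sequences. A minimal sketch in C follows; it computes only the optimal alignment score, not the alignment itself. The function name nw_score and the scoring values (+1 match, -1 mismatch, -1 gap) are illustrative assumptions, not anything stated in the thread.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative scoring values (assumed, not from the thread). */
enum { NW_MATCH = 1, NW_MISMATCH = -1, NW_GAP = -1 };

static int max3(int a, int b, int c)
{
    int m = a > b ? a : b;
    return m > c ? m : c;
}

/* Returns the optimal global alignment score of a and b,
   or INT_MIN if memory allocation fails.
   Keeps only two rows of the table, so memory is O(strlen(b)). */
int nw_score(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    size_t n = strlen(a), m = strlen(b);
    int *prev = malloc((m + 1) * sizeof *prev);
    int *curr = malloc((m + 1) * sizeof *curr);
    if (prev == NULL || curr == NULL) {
        free(prev);
        free(curr);
        return INT_MIN;
    }

    /* Row 0: aligning the empty prefix of a against j characters of b. */
    for (size_t j = 0; j <= m; j++)
        prev[j] = (int)j * NW_GAP;

    for (size_t i = 1; i <= n; i++) {
        curr[0] = (int)i * NW_GAP;
        for (size_t j = 1; j <= m; j++) {
            int sub = (a[i - 1] == b[j - 1]) ? NW_MATCH : NW_MISMATCH;
            curr[j] = max3(prev[j - 1] + sub,   /* align a[i-1] with b[j-1] */
                           prev[j] + NW_GAP,    /* gap in b */
                           curr[j - 1] + NW_GAP /* gap in a */);
        }
        int *tmp = prev;    /* the finished row becomes the previous row */
        prev = curr;
        curr = tmp;
    }

    int score = prev[m];
    free(prev);
    free(curr);
    return score;
}
```

With these scores, nw_score("ACGT", "AGT") is 2: three matches and one gap. Note that this aligns whole sequences end to end, so it is only a starting point for the original question, which asks for a shared region within many sequences.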
--
==============
Not a pedant
==============
| |
| osmium 2006-06-24, 7:58 am |
| <vmishra85@gmail.com> wrote:
> I have to find the common similar subsequence among a large number
> (>21,000) of input sequences; it is not necessarily present in all of
> the input sequences. The subsequence need not be exact: it can vary by
> up to 20%.
>
> Can anybody help design an algorithm for the above problem?
> Take it as a challenge.
There is a newsgroup devoted to the specialty of genetic algorithms:
comp.ai.genetic. But I think your chances of getting any detailed help,
even there, are well under 1%.
| |
| Juuso Hukkanen 2006-06-24, 7:59 am |
| On 21 Jun 2006 03:50:58 -0700, "vmishra85@gmail.com"
<vmishra85@gmail.com> wrote:
>I have to find the common similar subsequence among a large number
>(>21,000) of input sequences; it is not necessarily present in all of
>the input sequences. The subsequence need not be exact: it can vary by
>up to 20%.
Many C data-mining code examples are shown in
http://www.cosc.canterbury.ac.nz/ta...manuscript3.pdf
Juuso Hukkanen
(to reply by e-mail, set the address's month and year to the correct values)
| |
| Your Uncle 2006-06-28, 6:56 pm |
|
"Juuso Hukkanen" <juuso_12_2003@tele3d.net> wrote in message
news:am4o9295a4us0naltlh791ctuvpm8sp2fo@4ax.com...
> On 21 Jun 2006 03:50:58 -0700, "vmishra85@gmail.com"
> <vmishra85@gmail.com> wrote:
>
>
> Many C data-mining code examples are shown in
> http://www.cosc.canterbury.ac.nz/ta...manuscript3.pdf
I'd love to hear anything you have to say about this topic. I believe it
has forensic dimensions. bfx