








Author data mining
vmishra85@gmail.com

2006-06-24, 7:58 am

I have to find the common similar subsequence among a large
number (>21,000) of input sequences; it is not necessarily present in all
of the input sequences. The subsequence need not match exactly; it can vary
by up to 20%.

Can anybody help design an algorithm for the above problem?
Take it as a challenge.
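
The post does not define what a 20% variation means. One possible reading, and it is only an assumption, is that a candidate subsequence counts as present in an input sequence when some substring of that sequence is within an edit distance of 20% of the candidate's length. A minimal C sketch of that check follows, using the free-start variant of the edit-distance recurrence (often attributed to Sellers). The names best_approx_match and occurs_within are hypothetical, and the sketch only tests one candidate against one sequence; it does not search for candidates.

```c
#include <stdlib.h>
#include <string.h>

/* Smallest edit distance between `pat` and any substring of `text`.
 * Row 0 of the DP table is all zeros, so a match may begin at any
 * position in `text`. Returns -1 if memory allocation fails. */
static int best_approx_match(const char *pat, const char *text)
{
    size_t m = strlen(pat), n = strlen(text);
    size_t i, j;
    int best;
    int *col = malloc((m + 1) * sizeof *col);

    if (col == NULL)
        return -1;

    /* Column for the empty text prefix: delete i pattern characters. */
    for (i = 0; i <= m; i++)
        col[i] = (int)i;
    best = col[m];

    for (j = 1; j <= n; j++) {
        int diag = col[0];          /* D[0][j-1] */
        col[0] = 0;                 /* D[0][j]: a match may start here */
        for (i = 1; i <= m; i++) {
            int left = col[i];      /* D[i][j-1] */
            int cost = diag + (pat[i - 1] != text[j - 1]);
            if (col[i - 1] + 1 < cost)
                cost = col[i - 1] + 1;   /* D[i-1][j] + 1 */
            if (left + 1 < cost)
                cost = left + 1;         /* D[i][j-1] + 1 */
            col[i] = cost;
            diag = left;
        }
        if (col[m] < best)
            best = col[m];          /* best match ending at text[j-1] */
    }
    free(col);
    return best;
}

/* Nonzero if `pat` occurs in `text` with at most `percent` percent of
 * its length in edits (the assumed meaning of the variation limit). */
static int occurs_within(const char *pat, const char *text, int percent)
{
    int d = best_approx_match(pat, text);
    return d >= 0 && (long)d * 100 <= (long)percent * (long)strlen(pat);
}
```

Calling occurs_within(candidate, sequence, 20) for each input sequence and counting the hits would give a support count for one candidate. Each call costs time proportional to the product of the two lengths, so with more than 21,000 sequences some way of narrowing the candidate set would still be needed.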

Vladimir Oka

2006-06-24, 7:58 am


vmishra85@gmail.com wrote:
> I have to find the common similar subsequence among a large
> number (>21,000) of input sequences; it is not necessarily present in all
> of the input sequences. The subsequence need not match exactly; it can vary
> by up to 20%.
>
> Can anybody help design an algorithm for the above problem?
> Take it as a challenge.


This is a question more suited to comp.programming; here you're only
likely to get advice once you try to implement it in C, and experience
C language problems.

PS
I feel something biotechy about it (looking for a gene in different DNA
strands). Maybe there lies an alternative path to the answer. Are there
any Usenet groups dedicated to it?

pemo

2006-06-24, 7:58 am

Vladimir Oka wrote:
> vmishra85@gmail.com wrote:
>
> This is a question more suited to comp.programming; here you're only
> likely to get advice once you try to implement it in C, and experience
> C language problems.
>
> PS
> I feel something biotechy about it (looking for a gene in different
> DNA strands). Maybe there lies an alternative path to the answer. Are
> there any Usenet groups dedicated to it?


My missus is a population geneticist, and yes, it sounds like the sort of
thing she does with DNA sequences in C++ ... when casually asked about this,
she replied 'Needleman-Wunsch' - I didn't seek clarification - it usually
does my head in if I do.
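
For readers who have not met the name: Needleman-Wunsch is a dynamic-programming algorithm for global alignment of two sequences. A minimal sketch in C that computes only the alignment score is given below. The scoring values (match +1, mismatch -1, gap -1) and the names nw_score and max3 are assumptions chosen for illustration, not anything specified in this thread.

```c
#include <stdlib.h>
#include <string.h>

/* Assumed scoring scheme; real applications usually use a
 * substitution matrix and tuned gap penalties. */
enum { NW_MATCH = 1, NW_MISMATCH = -1, NW_GAP = -1 };

static int max3(int a, int b, int c)
{
    int m = a > b ? a : b;
    return m > c ? m : c;
}

/* Needleman-Wunsch global alignment score of `a` against `b`,
 * keeping only two rows of the DP table in memory.
 * Stores the score in *score and returns 0, or returns -1 if
 * memory allocation fails. */
static int nw_score(const char *a, const char *b, int *score)
{
    size_t m = strlen(a), n = strlen(b);
    size_t i, j;
    int *prev = malloc((n + 1) * sizeof *prev);
    int *cur = malloc((n + 1) * sizeof *cur);
    int *tmp;

    if (prev == NULL || cur == NULL) {
        free(prev);
        free(cur);
        return -1;
    }

    /* Aligning the empty prefix of `a` with b[0..j) costs j gaps. */
    for (j = 0; j <= n; j++)
        prev[j] = (int)j * NW_GAP;

    for (i = 1; i <= m; i++) {
        cur[0] = (int)i * NW_GAP;   /* a[0..i) against empty `b` */
        for (j = 1; j <= n; j++) {
            int sub = a[i - 1] == b[j - 1] ? NW_MATCH : NW_MISMATCH;
            cur[j] = max3(prev[j - 1] + sub,   /* align a[i-1] with b[j-1] */
                          prev[j] + NW_GAP,    /* gap in `b` */
                          cur[j - 1] + NW_GAP  /* gap in `a` */);
        }
        tmp = prev;                 /* current row becomes previous row */
        prev = cur;
        cur = tmp;
    }

    *score = prev[n];
    free(prev);
    free(cur);
    return 0;
}
```

Note that this aligns two whole sequences end to end. Finding a shared region inside longer sequences is usually treated as a local alignment problem (Smith-Waterman is the well-known variant), and neither algorithm by itself handles more than 21,000 inputs at once.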


--
==============
Not a pedant
==============


osmium

2006-06-24, 7:58 am

<vmishra85@gmail.com> wrote:

>I have to find the common similar subsequence among a large
> number (>21,000) of input sequences; it is not necessarily present in all
> of the input sequences. The subsequence need not match exactly; it can vary
> by up to 20%.
>
> Can anybody help design an algorithm for the above problem?
> Take it as a challenge.


There is a newsgroup devoted to the specialty of genetic algorithms: it is
comp.ai.genetic. But I think your chances of getting any detailed help,
even there, are well under 1%.


Juuso Hukkanen

2006-06-24, 7:59 am

On 21 Jun 2006 03:50:58 -0700, "vmishra85@gmail.com"
<vmishra85@gmail.com> wrote:

>I have to find the common similar subsequence among a large
>number (>21,000) of input sequences; it is not necessarily present in all
>of the input sequences. The subsequence need not match exactly; it can vary
>by up to 20%.


Many C data-mining code examples are shown in
http://www.cosc.canterbury.ac.nz/ta...manuscript3.pdf


Juuso Hukkanen
(to reply by e-mail set addresses month and year to correct)

Your Uncle

2006-06-28, 6:56 pm


"Juuso Hukkanen" <juuso_12_2003@tele3d.net> wrote in message
news:am4o9295a4us0naltlh791ctuvpm8sp2fo@4ax.com...
> On 21 Jun 2006 03:50:58 -0700, "vmishra85@gmail.com"
> <vmishra85@gmail.com> wrote:
>
>
> Many C data-mining code examples are shown in
> http://www.cosc.canterbury.ac.nz/ta...manuscript3.pdf

I'd love to hear anything you have to say about this topic. I believe it
has forensic dimensions. bfx



Copyright 2009 codecomments.com