For Programmers: Free Programming Magazines  


Home > Archive > Open Source Software > May 2006 > Re: Algorithms for finding weak patterns in a data stream









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author Re: Algorithms for finding weak patterns in a data stream
narvi

2006-05-11, 4:14 am

Generic Usenet Account wrote:
> Generic Usenet Account wrote:
>
> Before I get slammed again, let me clarify the definition further. A
> weak pattern is a collection (not set, since set implies ordering) of
> symbols, not necessarily contiguous, which repeats itself more than
> once in a data stream. Even though contiguity of symbols within a
> pattern is not important, the order of the symbols within the pattern
> is important.
>
> For example, in the data stream 2-4-7-8-0-1-2-3-5-4-6-0-7 (slighly
> modified version of the original data stream i.e.
> 2-4-7-8-0-1-2-3-5-4-6-7-0) the weak pattern 2-4-7-0 does not exist.
> However, the weak patterns 2-4-7 and 2-4-0 exist in the data stream.
>
> Needless to say, the algorithm does not assume prior knowledge of these
> patterns. The algorithm should be able to discover them.
>
> -Song

Is the entire data stream known before screening for weak pattern? If
so, we can take 50% of the length of the stream, and append each symbol
with a (\d)* in the middle and compare it with the rest of the string.
(50% is kinda obvious I guess). We can keep reducing the pattern size
and continue matching.

If the stream is going to change.. we can use the largest pattern and
increase size till 50% from that to match for newer patterns. However,
I don't think this can be done efficiently in real time.

Sponsored Links







Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive

Copyright 2008 codecomments.com