Generic Usenet Account 2006-05-10, 7:07 pm

Generic Usenet Account wrote:
> Sorry for the confusion. Let me give a "stronger" definition for
> "weak" patterns:)
> A weak pattern is a set of symbols, not necessarily contiguous, which
> repeats itself more than once in a data stream. For example, in the
> data stream 2-4-7-8-0-1-2-3-5-4-6-7-0, the weak pattern 2-4-7-0 is
> repeated twice.
>
> I would also like to re-characterize the problem as finding the longest
> "weak" pattern in a data stream, given the definition of "weak" pattern
> provided above.
>
> Hope this helps!
> Song
Before I get slammed again, let me clarify the definition further. A
weak pattern is a sequence (not a set, since a set implies no ordering)
of symbols, not necessarily contiguous, which occurs more than once in
a data stream. Even though contiguity of symbols within a pattern is
not important, the order of the symbols within the pattern is
important.
For example, in the data stream 2-4-7-8-0-1-2-3-5-4-6-0-7 (a slightly
modified version of the original data stream, i.e.
2-4-7-8-0-1-2-3-5-4-6-7-0) the weak pattern 2-4-7-0 does not exist,
because the second 2-4-7 is not followed by a 0. However, the weak
patterns 2-4-7 and 2-4-0 both exist in the data stream.
Needless to say, the algorithm does not assume prior knowledge of these
patterns. The algorithm should be able to discover them.
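
One possible reading of the problem is the textbook "longest repeated
subsequence": find the longest sequence that can be matched twice in
the stream, where the k-th symbol of one match never sits at the same
position as the k-th symbol of the other. The sketch below is only an
illustration under that assumption, not a confirmed solution. The
function name longest_weak_pattern is made up for this example, and the
approach (a longest-common-subsequence table of the stream against
itself) takes O(n^2) time and memory, so it suits short streams only.

```python
def longest_weak_pattern(stream):
    """Return one longest subsequence that occurs twice in `stream`.

    Assumption: two occurrences count as distinct when, for every index k
    of the pattern, they use different stream positions.
    """
    n = len(stream)
    # dp[i][j] = length of the longest repeated subsequence using
    # stream[:i] for one occurrence and stream[:j] for the other.
    dp = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            if stream[i - 1] == stream[j - 1] and i != j:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])

    # Walk back through the table to recover one longest pattern.
    pattern = []
    i = j = n
    while i > 0 and j > 0:
        if stream[i - 1] == stream[j - 1] and i != j:
            pattern.append(stream[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    pattern.reverse()
    return pattern


original = [2, 4, 7, 8, 0, 1, 2, 3, 5, 4, 6, 7, 0]
modified = [2, 4, 7, 8, 0, 1, 2, 3, 5, 4, 6, 0, 7]
print(longest_weak_pattern(original))       # [2, 4, 7, 0]
print(len(longest_weak_pattern(modified)))  # 3
```

For the modified stream the longest pattern has length 3, and both
2-4-7 and 2-4-0 qualify; the sketch returns only one of them. Patterns
that must repeat three or more times, or whose occurrences must not
interleave, would need a different formulation.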
-Song