Home > Archive > Functional > May 2007 > Number of balanced binary trees
You are viewing an archived Text-only version of the thread.
To view this thread in it's original format and/or if you want to reply to
this thread please [click here]
| Author |
Number of balanced binary trees
|
|
| Jon Harrop 2007-05-20, 7:06 pm |
|
How many AVL-balanced binary tree structures are there for a given number of
nodes?
Ultimately, I'd like to know the complexity of pattern matching over trees
as sequences, e.g. if the pattern match {1,2,3,..} corresponds to matching
all sets containing those "n" elements then how long does it take to find a
match for a given "n"?
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/produc...journal/?usenet
| |
| Joachim Durchholz 2007-05-21, 4:20 am |
| Jon Harrop schrieb:
> How many AVL-balanced binary tree structures are there for a given number of
> nodes?
It's already worst-case exponential for perfectly balanced binary trees:
for each node with an uneven number of subnodes, there are two valid
tree configurations.
In the worst case, each subtree has two configurations, so you get
combinatorial(?) explosion.
This is just a first approximation. If a subtree has an odd number of
subnodes, one of its subtrees must have an even number of nodes.
However, the sub-subnodes can have odd subnode counts again.
I'd expect the net result to be exponential anyway, though a more
precise analysis would be necessary to really establish this.
> Ultimately, I'd like to know the complexity of pattern matching over trees
> as sequences, e.g. if the pattern match {1,2,3,..} corresponds to matching
> all sets containing those "n" elements then how long does it take to find a
> match for a given "n"?
If you can have the match algorithm traverse the tree, you don't need to
explore all possible variants and get linear behaviour.
Not sure whether I answered the right questions,
Jo
| |
| Dirk Thierbach 2007-05-21, 10:05 pm |
| Jon Harrop <jon@ffconsultancy.com> wrote:
> How many AVL-balanced binary tree structures are there for a given number of
> nodes?
http://www.research.att.com/~njas/sequences/A006265
> Ultimately, I'd like to know the complexity of pattern matching over trees
> as sequences, e.g. if the pattern match {1,2,3,..} corresponds to matching
> all sets containing those "n" elements then how long does it take to find a
> match for a given "n"?
I have trouble understanding this. For example, what patterns are
allowed? Why is matching restricted to AVL trees (and not any of the
other umpteen variations of search trees)?
- Dirk
| |
| Jon Harrop 2007-05-21, 10:05 pm |
| Dirk Thierbach wrote:
> Jon Harrop <jon@ffconsultancy.com> wrote:
>
> http://www.research.att.com/~njas/sequences/A006265
Thanks.
>
> I have trouble understanding this. For example, what patterns are
> allowed? Why is matching restricted to AVL trees (and not any of the
> other umpteen variations of search trees)?
I wrote a macro that expands a new kind of pattern:
| [:1; 2:] -> ...
into patterns over AVL trees:
| Node(Node(Empty, 1, Empty), 2, Empty)
| Node(Empty, 1, Node(Empty, 2, Empty)) -> ...
Obviously it is more efficient to generate only valid balanced trees rather
than all (unbalanced) trees. I just wondered how much more efficient.
Now I'd like to know if this is significantly faster than doing it
programmatically at run-time, e.g. by folding over the tree.
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/produc...journal/?usenet
| |
| Daniel C. Wang 2007-05-22, 8:04 am |
| Jon Harrop wrote:
{stuff deleted}
> Obviously it is more efficient to generate only valid balanced trees rather
> than all (unbalanced) trees. I just wondered how much more efficient.
http://www.research.att.com/~njas/sequences/A000108
The Catalan numbers describe the number of different trees with "n"
nodes. It grows exponentially faster than general the AVL sequence.
| |
| Dirk Thierbach 2007-05-22, 8:04 am |
| Jon Harrop <jon@ffconsultancy.com> wrote:
> Dirk Thierbach wrote:
[color=darkred]
[color=darkred]
[color=darkred]
> I wrote a macro that expands a new kind of pattern:
>
> | [:1; 2:] -> ...
>
> into patterns over AVL trees:
>
> | Node(Node(Empty, 1, Empty), 2, Empty)
> | Node(Empty, 1, Node(Empty, 2, Empty)) -> ...
Ok, so the only allowed pattern is to find one subsequence anywhere in
a tree (seen as sequence). Correct? Wouldn't it be simpler to just
search for the first element of the pattern sequence, store the path
in a zipper and then make sure that the remaining subsequence matches?
That should work for all search trees which keep worst-case lookup
logarithmic (not only AVL trees), and should have time complexity
logarithmic in the size of the tree, and linear in the size of the
pattern subsequence.
I am not convinced yet that generating ordinary patterns from the
subsequence pattern isn't worse, because I could imagine that as the
subsequence pattern gets longer, the sheer number of ordinary patterns
after macro expansion makes it slow (even with compiler optimization).
> Obviously it is more efficient to generate only valid balanced trees
> rather than all (unbalanced) trees.
Sure, I was just surprised you chose AVL trees and not red-black trees,
or 2,3-trees, or whatever, or left it just open.
> Now I'd like to know if this is significantly faster than doing it
> programmatically at run-time, e.g. by folding over the tree.
What kind of folding did you have in mind? The naive one that I'd
think of first isn't probably very efficient.
- Dirk
| |
| Jon Harrop 2007-05-22, 7:06 pm |
| Dirk Thierbach wrote:
>
> Ok, so the only allowed pattern is to find one subsequence anywhere in
> a tree (seen as sequence). Correct?
No. I was looking for complete trees. I might consider subsequences in the
future.
> Wouldn't it be simpler to just
> search for the first element of the pattern sequence, store the path
> in a zipper and then make sure that the remaining subsequence matches?
That is simpler but I think it is probably also much slower. At least in
OCaml, just the polymorphism is a serious cost. With a macro, I can expand
the code into patterns that search specifically for certain types of
element.
> That should work for all search trees which keep worst-case lookup
> logarithmic (not only AVL trees), and should have time complexity
> logarithmic in the size of the tree, and linear in the size of the
> pattern subsequence.
>
> I am not convinced yet that generating ordinary patterns from the
> subsequence pattern isn't worse, because I could imagine that as the
> subsequence pattern gets longer, the sheer number of ordinary patterns
> after macro expansion makes it slow (even with compiler optimization).
The pattern match compiler should optimize this back into an efficient
search. However, there is clearly some silliness in my creating lots of
patterns only to try to optimize them away. Perhaps I could combine the
generator and optimizer to avoid the exponentially-complex intermediate.
That might not be so hard. At each level you search for "m" elements in the
left subtree and "n-m-1" elements in the right subtree.
>
> Sure, I was just surprised you chose AVL trees and not red-black trees,
> or 2,3-trees, or whatever, or left it just open.
Allowing other types of balanced tree widens the search, increases the
number of patterns and slows the matching down.
>
> What kind of folding did you have in mind? The naive one that I'd
> think of first isn't probably very efficient.
Exactly. I've no idea how I'd do that either. I'm really just playing with
trees to see if anything interesting can be done.
F# provides views (active patterns), so you could swap between different
tree representations using the same code without losing too much
efficiency.
For example, it would be interesting to take a rewrite engine and replace
the simple AVL tree (with elt, two children and height) in each node with a
tree that also stored the number of elements in the left subtree at each
node. The latter obviously makes indexing asymptotically faster but it adds
memory overhead and could slow things down. It would be interesting to see
which would be faster overall, and when.
Again, this isn't for a serious project.
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/produc...journal/?usenet
| |
| Dirk Thierbach 2007-05-23, 4:22 am |
| Jon Harrop <jon@ffconsultancy.com> wrote:
> Dirk Thierbach wrote:
[color=darkred]
> No. I was looking for complete trees.
Ah. Sorry, I didn't understand that, that's why I was asking :-)
And I assume you also want to bind to variables as part of the
"sequence"?
[color=darkred]
> That is simpler but I think it is probably also much slower. At least in
> OCaml, just the polymorphism is a serious cost.
Why? The data type is boxed, anyway. So you're just doing structural
matching to "extract" the subsequence, and then compare for equality.
That's the same that you'd do with explicit pattern matching (or at
least that's how I imagine OCaml and Haskell do it.)
BTW, with this algorithm of course you cannot bind variables inside
the subsequence. If that's what you want, your original approach is
probably best.
[color=darkred]
> The pattern match compiler should optimize this back into an efficient
> search. However, there is clearly some silliness in my creating lots of
> patterns only to try to optimize them away. Perhaps I could combine the
> generator and optimizer to avoid the exponentially-complex intermediate.
I'd say that this would be better.
[color=darkred]
> Allowing other types of balanced tree widens the search,
Why? Are you sure that there are, say, more red-black trees with a given
number of nodes than AVL trees? And the same is true for any other
search tree compared to AVL trees? Any pointers to literature with
a proof of that? Would it be still true if you don't create all
possible trees, but instead do the hierarchical matching yourself as
described above? (Maybe the hierarchical patterns for some other kind
of search tree are more regular? At least the usual operations on
red-black trees are simpler than those on AVL trees, c.f. Okasaki).
- Dirk
| |
| Jon Harrop 2007-05-23, 7:07 pm |
| Dirk Thierbach wrote:
> Ah. Sorry, I didn't understand that, that's why I was asking :-)
> And I assume you also want to bind to variables as part of the
> "sequence"?
Exactly.
>
> Why? The data type is boxed, anyway. So you're just doing structural
> matching to "extract" the subsequence, and then compare for equality.
> That's the same that you'd do with explicit pattern matching (or at
> least that's how I imagine OCaml and Haskell do it.)
You can avoid dynamic dispatch to a trivial comparison function (e.g. int
compare).
> BTW, with this algorithm of course you cannot bind variables inside
> the subsequence. If that's what you want, your original approach is
> probably best.
I think you can. The example I gave should certainly work:
| Node(Node(Empty, x, Empty), y, Empty)
| Node(Empty, x, Node(Empty, y, Empty)) -> ...
>
> I'd say that this would be better.
I'm not quite sure how it could be done but I'll certainly bear it in mind.
>
>
> Why? Are you sure that there are, say, more red-black trees with a given
> number of nodes than AVL trees?
The union of the two is certainly at least as big, i.e. if my code is to
handle AVL and RB trees.
> And the same is true for any other
> search tree compared to AVL trees? Any pointers to literature with
> a proof of that? Would it be still true if you don't create all
> possible trees, but instead do the hierarchical matching yourself as
> described above? (Maybe the hierarchical patterns for some other kind
> of search tree are more regular? At least the usual operations on
> red-black trees are simpler than those on AVL trees, c.f. Okasaki).
RB trees can't support fast union, AFAIK, which is why I don't use them.
There may well be other trees that do well. I was reading about 3-finger
trees recently...
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/produc...journal/?usenet
| |
| Chris F Clark 2007-05-23, 7:07 pm |
| Jon Harrop <jon@ffconsultancy.com> wrote at the beginnning of this thread:
> How many AVL-balanced binary tree structures are there for a given number of
> nodes?
>
> Ultimately, I'd like to know the complexity of pattern matching over trees
> as sequences, e.g. if the pattern match {1,2,3,..} corresponds to matching
> all sets containing those "n" elements then how long does it take to find a
> match for a given "n"?
>
> --
> Dr Jon D Harrop, Flying Frog Consultancy
> The F#.NET Journal
> http://www.ffconsultancy.com/produc...journal/?usenet
I've been following the thread with interest, but decided there there
is just something I'm missing, so I want to ask.
From this posting, you seem to want to treat the trees as having no
structure (other than the sorting of the sequence) which makes sense
for AVL or any other form of self-reorganizing tree, as the
"structure" of the tree has no information. However, as I read later
discussions, there are conversations where specific shapes of trees
are mentioned.
Is removing that structure your intent? Do you want to specify just
the sequence in your pattern and have your tool/macro/pattern matching
extension map that sequence into all the possible trees?
Next, does your extension also expand out all the possible trees?
And, because of that, is that why you are concerned with the number of
trees--to understand the combinatorial explosion?
I apologize if these questions aren't helpful, but I'm just trying to
follow the conversation as the topic is interesting and I would like
to comprehend it better....
| |
| Jon Harrop 2007-05-23, 7:07 pm |
| Chris F Clark wrote:
> From this posting, you seem to want to treat the trees as having no
> structure (other than the sorting of the sequence) which makes sense
> for AVL or any other form of self-reorganizing tree, as the
> "structure" of the tree has no information. However, as I read later
> discussions, there are conversations where specific shapes of trees
> are mentioned.
>
> Is removing that structure your intent?
Yes. I think that is a very nice way of putting it.
I want my macro to completely hide the fact that the user is manipulating a
tree, so they write code like:
| [:a; b; c:] -> ...
but the macro expands this and the compiler sees pattern matches that
contain explicit tree structures.
> Do you want to specify just
> the sequence in your pattern and have your tool/macro/pattern matching
> extension map that sequence into all the possible trees?
Exactly. Just for fixed-sized trees at first but I'd also like to handle
other cases later. For example, sequences with a given first element:
| [:a; ..:]
or using a new cons operator:
| a ::: t -> ...
where "t" is another sequence (a balanced binary tree with the first element
removed).
> Next, does your extension also expand out all the possible trees?
In this case, yes. For the above extensions it would have to do something
cleverer.
> And, because of that, is that why you are concerned with the number of
> trees--to understand the combinatorial explosion?
Exactly, yes. There are two motivations for this:
1. Add pattern matching over trees to an ML.
2. Keep it fast.
> I apologize if these questions aren't helpful, but I'm just trying to
> follow the conversation as the topic is interesting and I would like
> to comprehend it better....
Not at all. I'm really just thinking out loud.
I have long thought that the O(1) matching imposed by ML-style patterns
could be usefully extended to O(log n) because that is fast enough in
practice. I'll probably try doing this in F# using active patterns...
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/produc...journal/?usenet
| |
| Dirk Thierbach 2007-05-23, 7:07 pm |
| Jon Harrop <jon@ffconsultancy.com> wrote:
> Dirk Thierbach wrote:
[color=darkred]
[color=darkred]
> You can avoid dynamic dispatch to a trivial comparison function
> (e.g. int compare).
Sorry, I don't get it. In both cases you need to do a structural
comparison. AFAIK, equality in OCaml is special and works on all
types, so I am not even sure if it is dynamically dispatched at
all. If it is, then certainly having a concrete pattern will allow the
compiler a better opportunity to specialize. But it can do this in the
other case as well, either by having specialised versions of the
function that does the comparison (say, for integers), or by inlining
this function, and then doing the specialization. I know that GHC can
do both, and I'd say OCaml should be potentially able to do that, too.
In any case, I wouldn't be dead sure beforehand that one is much
slower. I'd at least test it to actually determine the overhead :-)
>
> I think you can. The example I gave should certainly work:
>
> | Node(Node(Empty, x, Empty), y, Empty)
> | Node(Empty, x, Node(Empty, y, Empty)) -> ...
Sorry, misunderstanding. I was trying to say that with your approach,
it will work, just as you outlined it above. With my approach, it won't
work, because one cannot bind to variables outside a pattern.
So it seems we're saying the same thing.
[color=darkred]
> The union of the two is certainly at least as big, i.e. if my code is to
> handle AVL and RB trees.
Sorry, there again seems to be some communication problem. I was
wondering why you picked AVL trees and dismissed immediately all other
types of search trees as "uninteresting" or "worse". If it's just a gut
feeling, and you use AVL trees because you like AVL trees, then that's
of course ok. If OTOH there's a formal reason why AVL trees are better
in this respect than all other search trees, then I haven't heard about
that, and I'd be curious. That's why I am asking.
- Dirk
| |
| Vesa Karvonen 2007-05-23, 7:07 pm |
| Dirk Thierbach <dthierbach@usenet.arcornews.de> wrote:
[...]
> Sorry, there again seems to be some communication problem. I was
> wondering why you picked AVL trees and dismissed immediately all other
> types of search trees as "uninteresting" or "worse". If it's just a gut
> feeling, and you use AVL trees because you like AVL trees, then that's
> of course ok. If OTOH there's a formal reason why AVL trees are better
> in this respect than all other search trees, then I haven't heard about
> that, and I'd be curious. That's why I am asking.
See Wikipedia:
http://en.wikipedia.org/wiki/ AVL_t...tree#Operations . This
property can be used to advantage when designing more complex data
structures based on balanced trees. In particular, one can combine a
balanced tree with a heap and get a treap. See Wikipedia:
http://en.wikipedia.org/wiki/Treap. The constant number of rotations
property is important in such a design as rotations no longer take
constant time due to having to maintain the heap order. Treaps are
useful in, for example, some geometric sweep line algorithms. See
Wikipedia: http://en.wikipedia.org/wiki/Sweep_line_algorithm .
-Vesa Karvonen
| |
| dbenson@eecs.wsu.edu 2007-05-23, 7:07 pm |
| On May 20, 4:30 pm, Jon Harrop <j...@ffconsultancy.com> wrote:
> How many AVL-balanced binary tree structures are there for a given number of
> nodes?
I don't know, becuase RedBlack trees are better.
Wolfram's <b>Mathworld</b> describes the revcurrence formulas for
counting the number of RedBlack trees.
| |
| dbenson@eecs.wsu.edu 2007-05-23, 7:07 pm |
| On May 20, 4:30 pm, Jon Harrop <j...@ffconsultancy.com> wrote:
> How many AVL-balanced binary tree structures are there for a given number of
> nodes?
I don't know, becuase RedBlack trees are better.
Wolfram's <b>Mathworld</b> describes the revcurrence formulas for
counting the number of RedBlack trees.
| |
| Vesa Karvonen 2007-05-23, 7:07 pm |
| dbenson@eecs.wsu.edu wrote:
> On May 20, 4:30 pm, Jon Harrop <j...@ffconsultancy.com> wrote:
[color=darkred]
> I don't know, becuase RedBlack trees are better.
No, they are not! Whether AVL-trees or RB-trees are better depends on
the requirements and characteristics of the application. RB-trees
are, in general, taller than AVL-trees, which makes lookup slower.
(This probably also means that there are more RB-trees than AVL-trees
of size n, but I don't know this for fact offhand.) OTOH, insertion
and deletion on RB-trees require only a constant number of rotations,
which can lead to improved asymptotic complexity bounds for more
complex data structures (e.g. dynamic treaps) and faster overall
execution times on typical operation sequences (but not for sequences
dominated by lookups) compared to AVL-trees.
-Vesa Karvonen
| |
| dbenson@eecs.wsu.edu 2007-05-23, 10:07 pm |
| On May 23, 3:46 pm, Vesa Karvonen <vesa.karvo...@cs.helsinki.fi>
wrote:
> ... (but not for sequences
> dominated by lookups) ...
Yes. Being but a mostly-functional programmer, in this case I would
consider hash tables.
| |
| dbenson@eecs.wsu.edu 2007-05-23, 10:07 pm |
| On May 23, 3:46 pm, Vesa Karvonen <vesa.karvo...@cs.helsinki.fi>
wrote:
> ... (but not for sequences
> dominated by lookups) ...
Yes. Being but a mostly-functional programmer, in this case I would
consider hash tables.
| |
| dbenson@eecs.wsu.edu 2007-05-23, 10:07 pm |
| On May 23, 3:46 pm, Vesa Karvonen <vesa.karvo...@cs.helsinki.fi>
wrote:
> ... (but not for sequences
> dominated by lookups) ...
Yes. In that case, being only a mostly-functional programmer, I would
also consider using a hash table.
| |
| Jon Harrop 2007-05-23, 10:07 pm |
| Dirk Thierbach wrote:
> Sorry, I don't get it. In both cases you need to do a structural
> comparison.
Essentially, my approach avoids the overhead of a polymorphic function.
> AFAIK, equality in OCaml is special and works on all
> types, so I am not even sure if it is dynamically dispatched at
> all.
If the type is atomic and statically known then equality is specialized.
Otherwise (e.g. a tuple of ints or a polymorphic function) it is not.
>
> Sorry, there again seems to be some communication problem. I was
> wondering why you picked AVL trees and dismissed immediately all other
> types of search trees as "uninteresting" or "worse". If it's just a gut
> feeling, and you use AVL trees because you like AVL trees, then that's
> of course ok. If OTOH there's a formal reason why AVL trees are better
> in this respect than all other search trees, then I haven't heard about
> that, and I'd be curious. That's why I am asking.
From my point of view, the main advantage of RB trees is that they consume
less memory and the main advantage of AVL trees is that they offer
union/join with asymptotically better performance.
For example, if I want to join a sequence to itself (e.g. Join[{1,2,3}
{1,2,3}] -> {1,2,3,1,2,3}) then an AVL tree can do this faster than an RB
tree because the height information in the nodes tells you that no
balancing are necessary.
This is used in the Set module of the OCaml and F# to provide fast set
theoretic operations union, intersection and difference.
In my case, my sequences are a functional equivalent of arrays that allow
fast concatenation and splitting.
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/produc...journal/?usenet
| |
| Adrian Hey 2007-05-24, 4:19 am |
| Vesa Karvonen wrote:
> Dirk Thierbach <dthierbach@usenet.arcornews.de> wrote:
> [...]
>
> See Wikipedia:
> [url]http://en.wikipedia.org/wiki/ AVL_tree#Comparison_to_other_structures[
/url] .
> However, the reason to prefer RB-trees to AVL-trees is that insertion
> and deletion in RB-trees require only a constant number of rotations
> (and O(log n) in AVL-trees). See Wikipedia:
For insertion AVL trees will require at most 1 rotation.
For deletion they may require O(log n) rotations worst case.
But for purely functional (i.e. immutable) implementation I think
this is largely irrelevant as the real cost of both operations is
the construction of O(log n) new heap records, whether or not
re-balancing rotation is required.
In this respect AVL trees seem a better choice than red-black trees
as in the common "pathological" case of ordered insertions, the
insertion path for red-black trees seems to be about 40% longer than
for AVL trees. This is an empirical figure BTW, I can't prove this.
http://groups.google.co.uk/group/co...4a422ea04ff1217
Regards
--
Adrian Hey
| |
| Dirk Thierbach 2007-05-24, 4:19 am |
| Jon Harrop <jon@ffconsultancy.com> wrote:
> Dirk Thierbach wrote:
[color=darkred]
> Essentially, my approach avoids the overhead of a polymorphic function.
Yes, and in the other case (with an explicit function) you can avoid
the overhead as well (with a bit of help from the compiler). So
where's the difference?
- Dirk
| |
| Dirk Thierbach 2007-05-24, 7:07 pm |
| Vesa Karvonen <vesa.karvonen@cs.helsinki.fi> wrote:
> [url]http://en.wikipedia.org/wiki/ AVL_tree#Comparison_to_other_structures[
/url] .
That only gives some general comparison, pointing out the different
factors for the asymptotical complexity, but doesn't say "if you want
to reduce the number of shapes of some search tree, AVL trees are
best".
> However, the reason to prefer RB-trees to AVL-trees is that insertion
> and deletion in RB-trees require only a constant number of rotations
> (and O(log n) in AVL-trees).
Well, yes, of course there are "external" reasons why one or the other
or a search may be better with respect to other operations, but that
wasn't the question. :-)
So I suppose the answer is "no, there's no particular reason to prefer
AVL trees over other search tress as 'patterns', it's just that it's
nicer for the other parts of the application".
- Dirk
| |
| Jon Harrop 2007-05-24, 7:07 pm |
| Adrian Hey wrote:
> In this respect AVL trees seem a better choice than red-black trees
> as in the common "pathological" case of ordered insertions, the
> insertion path for red-black trees seems to be about 40% longer than
> for AVL trees. This is an empirical figure BTW, I can't prove this.
The OCaml benchmarks that I've seen agreed.
--
Dr Jon D Harrop, Flying Frog Consultancy
The F#.NET Journal
http://www.ffconsultancy.com/produc...journal/?usenet
| |
| Vesa Karvonen 2007-05-24, 7:07 pm |
| Dirk Thierbach <dthierbach@usenet.arcornews.de> wrote:
> Vesa Karvonen <vesa.karvonen@cs.helsinki.fi> wrote:
[color=darkred]
> That only gives some general comparison, pointing out the different
> factors for the asymptotical complexity, but doesn't say "if you want
> to reduce the number of shapes of some search tree, AVL trees are
> best".
Indeed. I just wanted to point out the different asymptotic factors
as relevant background information.
[color=darkred]
> Well, yes, of course there are "external" reasons why one or the other
> or a search may be better with respect to other operations, but that
> wasn't the question. :-)
Again, my intent was to provide relevant information on the properties
of the data structures. I have no idea what criteria Jon Harrop used
to make the choice.
> So I suppose the answer is "no, there's no particular reason to prefer
> AVL trees over other search tress as 'patterns', it's just that it's
> nicer for the other parts of the application".
I'm not aware of an enumeration of all possible (balanced) search tree
data structures, but when specifically comparing RB-trees and
AVL-trees for efficient pattern matching, AVL-trees seem to have the
advantage of having asymptotically fewer shapes. This is due to the
more rigid (and more expensive) balancing constraints.
Indeed, it would seem that every AVL-tree corresponds to a RB-tree. I
think that a key lemma is that in an AVL-tree the longest path from
root to a leaf (or null) is at most twice as long as the shortest such
path (easy to prove). From this it would seem to follow that it is
always possible to color the nodes of an AVL-tree so as to satisfy the
RB-tree requirements (constructive proof seems possible - I haven't
worked out the details). I leave further development of the proof to
the reader.
-Vesa Karvonen
| |
| Ben Rudiak-Gould 2007-05-25, 10:05 pm |
| Jon Harrop wrote:
> Ultimately, I'd like to know the complexity of pattern matching over trees
> as sequences, e.g. if the pattern match {1,2,3,..} corresponds to matching
> all sets containing those "n" elements then how long does it take to find a
> match for a given "n"?
I haven't read the whole thread, but I'm worried about this on complexity
grounds. Flattening the tree before matching takes O(n) operations and the
actual match then takes n comparisons (assuming it does match). But with the
approach you're suggesting it will take log n comparisons just to determine
whether the top element is on the list, and another 2(log n - 1) or so at
the next level, and so on. It looks like the cost will be around n log n
operations overall, and this might be more expensive in practice for
reasonably-sized n.
-- Ben
| |
|
|
|
|
|