For Programmers: Free Programming Magazines  


Home > Archive > VC STL > January 2006 > Which container I should choose for better performance?









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author Which container I should choose for better performance?
davihigh@gmail.com

2006-01-09, 11:10 pm

Hi Friends:

I am managing a large number of objects which has unique id <ULONG64>.
Which container i should choose to perform best find() performance?

Can choose from STL in vc2003 or Boost 1_33. Currently I am using
std::map<>. Your suggestion?

Rgds, David Xiao

tom_usenet@hotmail.com

2006-01-09, 11:10 pm


davihigh@gmail.com wrote:

> Hi Friends:
>
> I am managing a large number of objects which has unique id <ULONG64>.
> Which container i should choose to perform best find() performance?
>
> Can choose from STL in vc2003 or Boost 1_33. Currently I am using
> std::map<>. Your suggestion?


VC2003 has a hash_map class. If your ULONG64 values are fairly random,
that would probably be best. If they are not random, you'll need to
write a good hash function to make sure you capture as much variation
as possible. If you can't find a good hash function for the values,
then you should probably stick with std::map, which is nice in that is
offers *worst case* O(log n) finds (as opposed to hash_map, the
performance of which depends heavily on quality of the hash function).
Remember that log(n) is pretty good - a search of billions of values
only take, say, 20 times longer than a search of a handful of values.

Finally, if the values are decided up front, and you aren't performing
insertions and deletions, then a sorted vector (and binary searches
with lower_bound) is faster than std::map by a reasonable constant
factor (certainly 2x as fast I think).

I don't think boost has anything to help you here.

Tom

davihigh@gmail.com

2006-01-09, 11:10 pm

Hi Tom:

Thanks for reply. The key is not random, they seems increase regularly,
[1,2,3.....n].

Also there are heavy insert/remove/find operations. So it seems that
std::map<> will be my best choice?

Rgds, David

Stephen Howe

2006-01-09, 11:10 pm


> I am managing a large number of objects which has unique id <ULONG64>.
> Which container i should choose to perform best find() performance?


map gives O(log N) find performance. But the number of objects would have to
be large to beat vector O(N) because of map's overheap (poor locality of
reference).

Better still would be sorted vector if the objects stay static for a long
while.
You would use lower_bound() or equal_range().
You get O(log N)

And the hash containers can give a theoretical O(1) with a perfect hash
function.

Stephen Howe


Stephen Howe

2006-01-09, 11:10 pm

> Thanks for reply. The key is not random, they seems increase regularly,
> [1,2,3.....n].
>
> Also there are heavy insert/remove/find operations. So it seems that
> std::map<> will be my best choice?


Most likely. But I would measure things if uncertain.
For map every insert or remove corresponds to memory allocation/deallocation
which is not insignicant.
vector will settle down after a while. Despite the fact that find() on a
vector is O(N), it might give good performance.

Stephen Howe


Tom Widmer

2006-01-09, 11:10 pm

davihigh@gmail.com wrote:
> Hi Tom:
>
> Thanks for reply. The key is not random, they seems increase regularly,
> [1,2,3.....n].


If n isn't too large (e.g. it's less than 2^32), then this might be a
reasonable hash function:

class myhash_compare : public hash_compare<ULONG64> {
Pr comp;
public:
//increase this if you find the map getting too big
const size_t bucket_size = 3;
//make minimum size suitably big, to minimize rehashing
const size_t min_buckets = 100000;
size_t operator()(ULONG64 key) const
{
return static_cast<size_t>(key);
}
};

> Also there are heavy insert/remove/find operations. So it seems that
> std::map<> will be my best choice?


I'd measure it against hash_map and see for myself. std::vector is out
of the running, anyway.

A good way of increasing performance for node based containers like
std::map is to use a custom allocator, which will improve insert/remove
performance and possibly find performance, if your allocator increases
locality of reference. Take a look at boost's pool allocator - remember
to use a specific pool just for your map.

Tom
Rasmus Dyhr Frederiksen

2006-01-09, 11:10 pm

davihigh@gmail.com wrote:
> Hi Friends:
>
> I am managing a large number of objects which has unique id <ULONG64>.
> Which container i should choose to perform best find() performance?
>
> Can choose from STL in vc2003 or Boost 1_33. Currently I am using
> std::map<>. Your suggestion?
>
> Rgds, David Xiao
>


Hi David

Your choice of container depends on whether your set of objects is
static, or is constantly changing.

In simple cases, a sorted vector + equal_range (binary search) is a
compact and very fast solution.

For more dynamic scenarios, a map is useful, but it has a quite high
memory overhead pr. object.

regards,

Rasmus
Jason Winnebeck

2006-01-17, 8:03 am

davihigh@gmail.com wrote:
> Hi Tom:
>
> Thanks for reply. The key is not random, they seems increase regularly,
> [1,2,3.....n].
>
> Also there are heavy insert/remove/find operations. So it seems that
> std::map<> will be my best choice?
>
> Rgds, David
>


If the IDs are "packed" then why not just use an array or vector, since
a vector is a mapping of int of an index to some piece of data. If the
data is almost completely packed (or can be because you can reuse IDs)
this might be OK. If it is a little less packed but still very packed,
you could consider an array of pointers. If a vector-based solution is
possible then you are guaranteed a constant look-up time and good insert
time if the IDs are reused or always increasing -- the only problem here
is with vector reallocations. If it not very packed, then you'll
probably want to use the solutions presented in this thread.

Jason
Sponsored Links







Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive

Copyright 2008 codecomments.com