For Programmers: Free Programming Magazines  


Home > Archive > VC STL > February 2005 > sizeof std::list<mytype>









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author sizeof std::list<mytype>
Ingo Nolden

2005-02-20, 8:58 am



I want to nest lists with a recursion depth that is predefined by
template parameter. The code goes like follows:



template< typename CharT, unsigned OrderU >
class WordStore
{
public:
WordCounter<CharT>* Add( const CharT* word )
{
return _getList( word ).Add( word );
}
WordList<CharT>& GetList( const CharT* word )
{
unsigned char uc = 0;
uc = Char<CharT>( word[0] ).GetIndex( );
return _store[uc]._getList( word + sizeof( CharT ) );
}
private:
WordStore<CharT, OrderU - 1> _store[27];
};
template<typename CharT>
class WordStore<CharT, 0>
{
public:
WordList<CharT>& GetList( const CharT* word )
{
unsigned char uc = 0;
uc = Char<CharT>( word[0] ).GetIndex( );
return _lists[uc];
}
private:
WordList<CharT> _lists[27];
};

When I instanciate WordStore< char, 2 > everything is ok.
When I use 3 and above, the compiler runs through well. But at runtime I
get a stack overflow even before main( ) has been reached.

Can anyone tell whats the problem, or can anyone tell how much memory a
list does consume for itself and for each item that is stored into the
list ( besides the memory for the item itself )?

Tom Widmer

2005-02-21, 4:04 pm

Ingo Nolden wrote:
>
>
> I want to nest lists with a recursion depth that is predefined by
> template parameter. The code goes like follows:
>
> template< typename CharT, unsigned OrderU >
> class WordStore
> {
> public:
> WordCounter<CharT>* Add( const CharT* word )
> {
> return _getList( word ).Add( word );
> }
> WordList<CharT>& GetList( const CharT* word )
> {
> unsigned char uc = 0;
> uc = Char<CharT>( word[0] ).GetIndex( );
> return _store[uc]._getList( word + sizeof( CharT ) );
> }
> private:
> WordStore<CharT, OrderU - 1> _store[27];
> };
> template<typename CharT>
> class WordStore<CharT, 0>
> {
> public:
> WordList<CharT>& GetList( const CharT* word )
> {
> unsigned char uc = 0;
> uc = Char<CharT>( word[0] ).GetIndex( );
> return _lists[uc];
> }
> private:
> WordList<CharT> _lists[27];
> };
>
> When I instanciate WordStore< char, 2 > everything is ok.
> When I use 3 and above, the compiler runs through well. But at runtime I
> get a stack overflow even before main( ) has been reached.
>
> Can anyone tell whats the problem, or can anyone tell how much memory a
> list does consume for itself and for each item that is stored into the
> list ( besides the memory for the item itself )?


sizeof(Wordstore<char,3> )
= sizeof(WordStore<char,2> ) * 27
= sizeof(WordStore<char,1> ) * 27 * 27
= sizeof(WordStore<char,0> ) * 27 * 27 * 27
= sizeof(WordList<CharT> ) * 27 * 27 * 27 * 27 = sizeof(WordList<CharT> )
* 531441

At a guess, something like 8MB. Something that big you definitely don't
want on the stack, so instead you should create it on the heap with new.
However, I suspect that your data structure is a poor choice. What are
you using it for?

Tom
Ingo Nolden

2005-02-21, 9:02 pm

Tom Widmer wrote:
> Ingo Nolden wrote:
>
>
>
> sizeof(Wordstore<char,3> )
> = sizeof(WordStore<char,2> ) * 27
> = sizeof(WordStore<char,1> ) * 27 * 27
> = sizeof(WordStore<char,0> ) * 27 * 27 * 27
> = sizeof(WordList<CharT> ) * 27 * 27 * 27 * 27 = sizeof(WordList<CharT> )
> * 531441
>
> At a guess, something like 8MB. Something that big you definitely don't
> want on the stack, so instead you should create it on the heap with new.
> However, I suspect that your data structure is a poor choice. What are
> you using it for?
>

Perhaps I understand the problem now. I originally asked for the size,
because I didn't think that just the size on the stack may be a problem.
However, in the meanwhile I used it on the heap and it worked.
The calculation above does not count the memory that the list is
allocating dynamically right? This is what I originally wanted to know.
What is poor about my data structure anyway?
The goal is to find words more efficiently. At the cost of some memory I
use the first character(s) of the word as indices. This way I don't
have to loop over a huge list. And my tests proove that it can save a
lot of time.
Do you have a better Idea?

Ingo



> Tom

red floyd

2005-02-21, 9:02 pm

Ingo Nolden wrote:
> Tom Widmer wrote:
>
> Perhaps I understand the problem now. I originally asked for the size,
> because I didn't think that just the size on the stack may be a problem.
> However, in the meanwhile I used it on the heap and it worked.
> The calculation above does not count the memory that the list is
> allocating dynamically right? This is what I originally wanted to know.
> What is poor about my data structure anyway?
> The goal is to find words more efficiently. At the cost of some memory I
> use the first character(s) of the word as indices. This way I don't
> have to loop over a huge list. And my tests proove that it can save a
> lot of time.
> Do you have a better Idea?


What you're looking for is called a "trie". Google for it. It's a
recursive structure.

struct trie {
struct trie* children[27]; // use a const, not a magic number
// other stuff here
};


Ken Alverson

2005-02-21, 9:02 pm

"Ingo Nolden" <ingo.noldenN@SPAMrecurdyn.de> wrote in message
news:evwtg8EGFHA.428@TK2MSFTNGP15.phx.gbl...
> Perhaps I understand the problem now. I originally asked for the size,
> because I didn't think that just the size on the stack may be a problem.
> However, in the meanwhile I used it on the heap and it worked.
> The calculation above does not count the memory that the list is allocating
> dynamically right? This is what I originally wanted to know.
> What is poor about my data structure anyway?
> The goal is to find words more efficiently. At the cost of some memory I use
> the first character(s) of the word as indices. This way I don't have to loop
> over a huge list. And my tests proove that it can save a lot of time.
> Do you have a better Idea?


Depending on how efficiently you need to find words, a hash table may be
perfectly acceptable. You can make a hash function based on the first three
or four letters of the word and your collision rate will be similar to the
structure you describe.

If you want to keep using the type of structure you currently are, it's worth
noting that in all likelyhood, most of the 531,441 leaf nodes of your
structure are going to be empty. If you don't allocate all of them in the
first place, you won't have to pay the memory cost of having them out there.
Something as simple as changing the array of subnodes to a vector would move
the subnodes out of the structure and into their own heap space, and prevent
all of them from being allocated at the same time as the structure. However,
when you go to use a node, you would have to make sure it is allocated first.
You may find it is memory efficient for you to go more than 4 letters in this
case, perhaps 8 letters or even all letters in a word.

Ken


Ken Alverson

2005-02-21, 9:02 pm

"Ingo Nolden" <ingo.noldenN@SPAMrecurdyn.de> wrote in message
news:u7ieTUzFFHA.3824@TK2MSFTNGP10.phx.gbl...
>
> uc = Char<CharT>( word[0] ).GetIndex( );
> return _store[uc]._getList( word + sizeof( CharT ) );


BTW, this (word+sizeof(CharT)) is incorrect. If sizeof(CharT) is 2, this will
move the pointer ahead 4 bytes, since pointer arithmetic already takes into
account the size of the type of the pointer. No matter what the size of CharT
is, the pointer to the next CharT in the word is (word+1).

Ken


Tom Widmer

2005-02-22, 9:00 am

Ingo Nolden wrote:
> Tom Widmer wrote:
>
> Perhaps I understand the problem now. I originally asked for the size,
> because I didn't think that just the size on the stack may be a problem.


Such a recursive structure gets big, very quickly as the recursion depth
increases (well, it's exponential).

> However, in the meanwhile I used it on the heap and it worked.
> The calculation above does not count the memory that the list is
> allocating dynamically right?


Right, it was purely the memory held contiguously in the Wordstore object.

This is what I originally wanted to know.

You didn't provide enough information to guess this. What is
WordList<CharT>? How many words (std::strings?) and of what lengths does
it hold?

> What is poor about my data structure anyway?


That fact that all nodes are allocated even if they aren't used. And in
any case, if fast lookup speed is your only requirement, then you should
try hash maps before going to the trouble of a trie.

> The goal is to find words more efficiently. At the cost of some memory I
> use the first character(s) of the word as indices. This way I don't
> have to loop over a huge list. And my tests proove that it can save a
> lot of time.
> Do you have a better Idea?


Either a set, a hash set, a sorted array (or vector), or a trie (which
is similar to what you've done, but doesn't allocate memory until
needed). It depends on exactly what you are using it for, how big it
will get, when insertions occur, etc. I would be unsurprised is straight
std::set<std::string> wasn't fast enough for you - it has logarithmic
performance, which while not as fast as the constant performance of a
trie (or best case hash map), is still fast enough unless your store of
words gets to millions of entries (and there aren't even millions of
words in existance). The only thing you shouldn't try is a straight
list, since that gives you linear lookup time as you've discovered.

An algorithms and data structures book would be able to advise you further.

Tom
Sponsored Links







Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive

Copyright 2008 codecomments.com