Compiler bug, surfacing in a map?
| Carl Nettelblad 2005-04-15, 4:01 pm |
| Hello,
I am currently writing a tool which employs a rather large map in one
central part. The keys are of a custom type, basically 4 wrapped ints,
combined with some additional logic. Changing the storage from a map would
break a lot of things, but performance was still crucial. This got me
started on experimenting with the SSE2 intrinsics in various places.
My "key" class currently begins like this:
class word_sequence
{
public:
    int data[4];
    word_sequence()
    {
        for (int x = 0; x < 4; x++)
        {
            data[x] = -1;
        }
    }
    word_sequence(const word_sequence& source)
    {
        __m128i dataval = _mm_loadu_si128((__m128i*) source.data);
        _mm_storeu_si128((__m128i*) data, dataval);
    }
Note the copy constructor. The map itself uses the Boost pool allocator,
with a very large initial size:
typedef map<word_sequence, pair<int, partvector >, less<word_sequence>,
fast_pool_allocator<pair<word_sequence, pair<int, partvector > >,
default_user_allocator_new_delete,
details::pool::null_mutex,
131072> > DATATYPE;
Now, this worked fine until I introduced this copy constructor. If I change
my own code by hand so that it uses this constructor for assignments, while
the map does not use it, everything works. I can also make the initial pool
size even larger; then it also works. But when the Boost pool needs to
reallocate, I end up with an access violation a bit further down the road,
when the map's insert method rearranges the tree to satisfy the colouring
requirements. When debugging, it seems that the register that should
contain a pointer to the current node does not contain that pointer at
all, so I get a stray reference to 0x00000064.
This only happens in a release build. I've tried several different
optimization settings and still get the crash. A pure debug build runs fine.
Is anyone aware of some special requirement on copy constructors in the
context of an STL map? Or a bug in boost::pool? (Or conditions when you're
not allowed to use SIMD intrinsics?) I know that hash_map could be better
and that I should create my own storage if I'm that picky about performance,
but that's not an option right now.
This happens with the Visual C++.NET 2003 compiler. Going to a debug build,
or removing the copy constructor (replacing it with a boiler-plate
member-by-member copy) fixes the problem.
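For reference, the member-by-member workaround mentioned above could look roughly like the sketch below. This is only an illustration under assumptions: the name word_sequence_plain, the lexicographic operator< and the fill_map helper are hypothetical and not taken from the real code, and a plain std::map with the default allocator stands in for the pool-allocated one.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-in for the key class; the real class has more logic.
struct word_sequence_plain {
    int data[4];

    word_sequence_plain() {
        for (int x = 0; x < 4; x++) data[x] = -1;
    }

    // Member-by-member copy: no intrinsics involved.
    word_sequence_plain(const word_sequence_plain& source) {
        for (int x = 0; x < 4; x++) data[x] = source.data[x];
    }
};

// Assumed ordering (not shown in the post): lexicographic over the four ints.
inline bool operator<(const word_sequence_plain& a, const word_sequence_plain& b) {
    for (int x = 0; x < 4; x++) {
        if (a.data[x] != b.data[x]) return a.data[x] < b.data[x];
    }
    return false;
}

// Inserts `count` distinct keys and returns the resulting map size.
inline int fill_map(int count) {
    std::map<word_sequence_plain, int> m;
    for (int i = 0; i < count; i++) {
        word_sequence_plain key;
        key.data[0] = i;  // make each key distinct
        m[key] = i;
    }
    return static_cast<int>(m.size());
}
```

Calling fill_map(1000) inserts a thousand distinct keys, which forces the tree to rebalance many times along the way.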
Any insights would be welcome.
/Carl
| |
| Bo Persson 2005-04-15, 4:01 pm |
|
"Carl Nettelblad" <cnettel@hem.passagen.se.not.working> wrote in
message news:ei7T6udQFHA.3628@TK2MSFTNGP12.phx.gbl...
> Hello,
>
> I am currently writing a tool which employs a rather large map in one
> central part. The keys are of a custom type, basically 4 wrapped ints,
> combined with some additional logic. Changing the storage from a map
> would break a lot of things, but performance was still crucial. This
> got me started on experimenting with the SSE2 intrinsics in various
> places.
>
> My "key" class currently begins like this:
> class word_sequence
> {
> public:
> int data[4];
> word_sequence()
> {
> for (int x = 0; x < 4; x++)
> {
> data[x] = -1;
> }
> }
> word_sequence(const word_sequence& source)
> {
> __m128i dataval = _mm_loadu_si128((__m128i*) source.data);
> _mm_storeu_si128((__m128i*) data, dataval);
> }
>
The intrinsic requires that the data is properly aligned for an SSE2
instruction. This is not guaranteed by the map.
A debug build could include more data in the internal map node and, by
accident, provide a different alignment for the data member.
Bo Persson
| |
| Carl Nettelblad 2005-04-16, 4:00 pm |
|
"Bo Persson" <bop@gmb.dk> wrote in message
news:eSSU1LeQFHA.2132@TK2MSFTNGP09.phx.gbl...
>
> "Carl Nettelblad" <cnettel@hem.passagen.se.not.working> wrote in
> message news:ei7T6udQFHA.3628@TK2MSFTNGP12.phx.gbl...
>
> The intrinsic requires that the data is properly aligned for an SSE2
> instruction. This is not guaranteed by the map.
Nope, I use the *unaligned* version of the intrinsic. That one doesn't
require 16-byte alignment (AFAIK). If I use the aligned version
(_mm_store_si128, not _mm_storeu_si128), just about every piece of code
using that copy constructor fails, unsurprisingly. The access violation is
also different in that case: it occurs inside the copy constructor itself,
with the fake address 0xffffffff. In the map, this copy constructor is only
two lines of inlined assembler in the buynode helper of the insert method,
and they seem to perform exactly as intended.
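To illustrate the point about the unaligned intrinsics, here is a minimal sketch that copies four ints between addresses deliberately placed 4 bytes off a 16-byte boundary. The helper names copy4_unaligned and unaligned_copy_roundtrip are made up for this example, and alignas needs a C++11 compiler, so this would not build as-is on VC++ .NET 2003.

```cpp
#include <cassert>
#include <cstring>
#include <emmintrin.h>  // SSE2 intrinsics

// Copies four ints through an SSE2 register using the unaligned intrinsics.
inline void copy4_unaligned(int* dst, const int* src) {
    __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(src));
    _mm_storeu_si128(reinterpret_cast<__m128i*>(dst), v);
}

// Copies between addresses that are one int (4 bytes) past a 16-byte
// boundary and reports whether the four values arrived intact.
inline bool unaligned_copy_roundtrip() {
    alignas(16) int src_buf[8] = {0, 1, 2, 3, 4, 5, 6, 7};
    alignas(16) int dst_buf[8] = {0};
    copy4_unaligned(dst_buf + 1, src_buf + 1);  // both pointers misaligned
    return std::memcmp(dst_buf + 1, src_buf + 1, 4 * sizeof(int)) == 0;
}
```

Swapping in _mm_load_si128 / _mm_store_si128 here would be expected to fault on these addresses, which matches the different access violation described above.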
I still suspect that the compiler messes up the register assignment in this
specific case.
/Carl