For Programmers: Free Programming Magazines  


Home > Archive > VC STL > February 2006 > stl map/set and multithreading









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author stl map/set and multithreading
He Shiming

2006-02-01, 8:04 am

Hi All,

I've got a question regarding using STL map and set class in a multithreaded
situation.

I'm developing a database program. The program fetches and parses millions
of records from the database and build a map/set to allow faster access to
records by certain rules. The type of map/set is:

map<long, set<long> > m_map;

It maps "category id" to a set of "record ids". So that I can easily get a
set of record IDs by accessing m_map[CATEGORY_ID] later on.

The fetch-and-parsing process is very long, and it takes more than 3 minutes
in usual cases. So you know, during this period, I wanted to give the user
some feedback on the UI, hopefully to show some already parsed records to
users.

So I put the parsing process in a background thread, it's doing all the
fetching job and stuff such as m_map[CATEGORY_ID].insert(RECORD_ID);. And
the UI access the map every few seconds, to fetch out record IDs. Of course,
first, it used map<long, set<long> >::const_iterator to remember
"CATEGORY_ID", and set<long>::const_iterator to remember the current
position of RECORD_ID in the set, do a little iteration, and fetch all the
relevant record IDs out. (well, during this process, the background thread
is still running)

I noticed that if I let the background thread finish before attempting to
refresh the UI, it works flawlessly. But if I do what's planned above,
refresh UI every few seconds, some how lots of records were slipped out. The
debug message shows 100 m_map[CATEGORY_ID].insert(RECORD_ID) insertions (of
course RECORD_IDs are all different), but the size of the set is only like
40 items or 30 items, it's random. I further checked the debug message and
found that the thread is running fine with no errors or exceptions, all
records are parsed the way they supposed to be.

Now, why is the number of set entries less than the number of insertions
(again, keys are all different)? My guess would be some kind of protection,
when another thread is accessing the set, all insertions are dismissed. Is
that true? How do I avoid this? I even tried pausing the thread while the UI
refresh process and got no luck.

I don't think it's a problem with compile settings, relevant options are all
checked. Any help will be much appreciated.

Thanks in advance,
--
He Shiming



Igor Tandetnik

2006-02-01, 8:04 am

"He Shiming" <mailbill(NOSPAM)@21cn.com.nospam> wrote in message
news:eYQYtxyJGHA.1728@TK2MSFTNGP09.phx.gbl
> I'm developing a database program. The program fetches and parses
> millions of records from the database and build a map/set to allow
> faster access to records by certain rules. The type of map/set is:
>
> map<long, set<long> > m_map;
>
> It maps "category id" to a set of "record ids". So that I can easily
> get a set of record IDs by accessing m_map[CATEGORY_ID] later on.
>
> So I put the parsing process in a background thread, it's doing all
> the fetching job and stuff such as
> m_map[CATEGORY_ID].insert(RECORD_ID);. And the UI access the map
> every few seconds, to fetch out record IDs (well, during this process,
> the background
> thread is still running)


You can't do that. Or rather, you can't do that if you expect any kind
of predictable results. STL containers are not thread-safe, you can't
simultaneously write to one and read from it in another thread. You need
to protect against concurrent access.
--
With best wishes,
Igor Tandetnik

With sufficient thrust, pigs fly just fine. However, this is not
necessarily a good idea. It is hard to be sure where they are going to
land, and it could be dangerous sitting under them as they fly
overhead. -- RFC 1925


JH Trauntvein

2006-02-01, 8:04 am


He Shiming (NOSPAM) wrote:
> Hi All,
>
> I've got a question regarding using STL map and set class in a multithreaded
> situation.
>
> I'm developing a database program. The program fetches and parses millions
> of records from the database and build a map/set to allow faster access to
> records by certain rules. The type of map/set is:
>
> map<long, set<long> > m_map;
>
> It maps "category id" to a set of "record ids". So that I can easily get a
> set of record IDs by accessing m_map[CATEGORY_ID] later on.
>
> The fetch-and-parsing process is very long, and it takes more than 3 minutes
> in usual cases. So you know, during this period, I wanted to give the user
> some feedback on the UI, hopefully to show some already parsed records to
> users.
>
> So I put the parsing process in a background thread, it's doing all the
> fetching job and stuff such as m_map[CATEGORY_ID].insert(RECORD_ID);. And
> the UI access the map every few seconds, to fetch out record IDs. Of course,
> first, it used map<long, set<long> >::const_iterator to remember
> "CATEGORY_ID", and set<long>::const_iterator to remember the current
> position of RECORD_ID in the set, do a little iteration, and fetch all the
> relevant record IDs out. (well, during this process, the background thread
> is still running)
>
> I noticed that if I let the background thread finish before attempting to
> refresh the UI, it works flawlessly. But if I do what's planned above,
> refresh UI every few seconds, some how lots of records were slipped out. The
> debug message shows 100 m_map[CATEGORY_ID].insert(RECORD_ID) insertions (of
> course RECORD_IDs are all different), but the size of the set is only like
> 40 items or 30 items, it's random. I further checked the debug message and
> found that the thread is running fine with no errors or exceptions, all
> records are parsed the way they supposed to be.
>
> Now, why is the number of set entries less than the number of insertions
> (again, keys are all different)? My guess would be some kind of protection,
> when another thread is accessing the set, all insertions are dismissed. Is
> that true? How do I avoid this? I even tried pausing the thread while the UI
> refresh process and got no luck.


You have one thread "writing" to the map at the same time as the gui
thread is "reading" from the map. There is no guaranteed intrinsic
synchronisation within the map container. Because of this, you need to
provide your own synchronisation. i.e., use critical sections or
mutexes to provide lock access to the map container.

Regards,

Jon Trauntvein

Brian Muth

2006-02-01, 7:10 pm

> You have one thread "writing" to the map at the same time as the gui
> thread is "reading" from the map.


I find that hard to believe.
Brian


Brian Muth

2006-02-01, 7:10 pm

Sorry, I mis-interpreted what you were saying, JH. Your assessment is
correct, of course.

B


Sponsored Links







Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive

Copyright 2008 codecomments.com