Problems with a multi-threaded program
| Arjen Markus 2004-11-18, 8:58 am |
| Hello all,
I have severe trouble getting an as yet simple multi-threaded program
to work reliably.
The program does the following:
- It starts four threads
- Though ultimately they will have to work together, right now,
they simply run through a few loops, allocating and deallocating
local data, getting a random number and calling a sleep function
(to make sure the threads run simultaneously)
- No global or saved data are used
I have three (four) different platforms at my disposal:
- Sun/Solaris, using an old version of the Sun f90 compiler.
With the option -mt, the program runs smoothly and "correctly"
all the time (at least I have not seen any strange problems
after adding -mt to the compile and link statements)
- RH 7.2 and RH Enterprise Edition, both with Intel Fortran 8.0.
I use the option -reentrancy threaded (otherwise the program
immediately aborts because of recursive I/O).
This version works most of the time, but sometimes a thread
crashes, often in an internal write or a deallocate statement.
- Windows XP with Compaq Visual Fortran 6.6C.
I use a Windows version of the pthreads library that works
well in other contexts.
In this case, the program is very persistent in crashing,
even though I use the multithreaded runtime libraries.
Moreover, the random numbers that I get have a tendency to
converge to 1.0.
This all leads me to be very wary of multithreading.
Does anyone have suggestions to improve the reliability of
my program? (I could post the source code, but it is some
400 lines of code).
Regards,
Arjen
| |
|
| Arjen Markus wrote:
> Hello all,
>
> I have severe trouble making an as yet simple multi-threaded program
> to work reliably.
>
> The program does the following:
> - It starts four threads
> - Though ultimately they will have to work together, right now,
> they simply run through a few loops, allocating and deallocating
> local data, getting a random number and calling a sleep function
> (to make sure the threads run simultaneously)
> - No global or saved data are used
How do you manage threads? Using compiler parallel optimizations or
OpenMP?
>
> I have three (four) different platforms at my disposal:
>
> - Sun/Solaris, using an old version of the Sun f90 compiler.
> With the option -mt, the program runs smoothly and "correctly"
> all the time (at least I have not seen any strange problems
> after adding -mt to the compile and link statements)
>
> - RH 7.2 and RH Enterprise Edition, both with Intel Fortran 8.0.
> I use the option -reentrancy threaded (otherwise the program
> immediately aborts because of recursive I/O).
> This version works most of the time, but sometimes a thread
> crashes, often in an internal write or a deallocate statement.
>
> - Windows XP with Compaq Visual Fortran 6.6C.
> I use a Windows version of the pthreads library that works
> well in other contexts.
> In this case, the program is very persistent in crashing,
> even though I use the multithreaded runtime libraries.
> Moreover the random numbers that I get have a tendency to
> converge to 1.0
Hmm... compiler parallel support then. Take a look at OpenMP
(Intel supports this). You can change the number of threads
conditionally, etc.
>
> This all leads me to be very wary of multithreading.
Start from simple loops using OpenMP parallel directives. Experiment
with the number of threads.
>
> Does anyone have suggestions to improve the reliability of
> my program? (I could post the source code, but it is some
> 400 lines of code).
Hmm... I think you should parallelize your program by adding OpenMP
directives one at a time, loop after loop. Problems will occur at the
single-loop level as well. Then post your limited examples :-)
Regards,
B52B (@pl.pl.pl doesn't exist but I'm watching)
>
> Regards,
>
> Arjen
| |
| Arjen Markus 2004-11-18, 8:58 am |
| B52B wrote:
>
>
> How do you manage threads? Using compiler parallel optimizations or
> OpenMP?
>
I use the pthreads library under UNIX/Linux and a port of that library
for Windows.
>
> Hmm... compiler parallel support then. Take a look at OpenMP
> (Intel support this). You can change number of threads conditionally etc
> etc.
>
Unfortunately, that is not an option, as it imposes one particular
kind of parallelism.
>
> Start from simple loops using OpenMP parallel directives. Experiment
> with number of threads.
>
The program's threads do not do much more than that - but as I said:
I need to use a more general kind of parallelism - something that
is not limited to do-loops.
Regards,
Arjen
| |
|
| Arjen Markus wrote:
> The program's threads do not do much more than that - but as I said:
> I need to use a more general kind of parallelism - something that
> is not limited to do-loops.
You said that:
> - Though ultimately they will have to work together, right now,
> they simply run through a few loops, allocating and deallocating
----------------------------------^
> local data, getting a random number and calling a sleep function
> (to make sure the threads run simultaneously)
> - No global or saved data are used
So I don't understand what you mean by writing:
> I need to use a more general kind of parallelism - something that
> is not limited to do-loops.
Never mind. My advice is:
Understand the behavior of your runtime/OS threads at the loop level.
Regards,
B52
>
> Regards,
>
> Arjen
| |
| Arjen Markus 2004-11-18, 8:58 am |
| B52B wrote:
>
>
> So I don't understand what you mean by writing:
>
>
> Never mind. My advice is:
>
> Understand your runtime/OS threads behavior on the loop level.
>
Okay, some more details about what this test program ultimately
ought to do:
- It is meant to test a library designed for storing and exchanging data
in a structured way - either between threads or different processes
- This library must be made threadsafe.
- Right now the test program's threads will do the following:
- two threads each store 10 arrays in this library
- two other threads try to get these arrays out again
- the first two must wait for this process to finish and then
do a second round of storing and the last two do a second
round of retrieving
- As the library is not threadsafe yet, I use a couple of dummy
routines to take care of the exchange part - they simply
allocate an array, fill it with predefined values and pass
that back. The receiving threads can examine the values, decide
all is well and free the array.
This program is meant to exercise the library's crucial routines,
but right now everything is handled with local variables and
so on - so the only thing multithreaded about it is that
several threads are running in the same process; no synchronisation
is necessary.
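As an illustration of the dummy exchange routines described above, here
is a minimal single-threaded sketch. The names dummy_deliver and
dummy_consume, the array size and the fill pattern are all hypothetical;
the actual test program is not shown in this thread. Pointer arguments
are used so that the array can be allocated in one routine and freed in
another with plain Fortran 90.

```fortran
module dummy_exchange
    implicit none
contains
    ! Hypothetical "store" side: allocate an array, fill it with
    ! predefined values and hand it back to the caller.
    subroutine dummy_deliver(n, array)
        integer, intent(in) :: n
        real, pointer       :: array(:)
        integer             :: i

        allocate(array(n))
        do i = 1, n
            array(i) = real(i)
        end do
    end subroutine dummy_deliver

    ! Hypothetical "retrieve" side: check the predefined values,
    ! report the verdict and free the array.
    subroutine dummy_consume(array, ok)
        real, pointer        :: array(:)
        logical, intent(out) :: ok
        integer              :: i

        ok = associated(array)
        if (ok) then
            do i = 1, size(array)
                if (array(i) /= real(i)) ok = .false.
            end do
            deallocate(array)
        end if
    end subroutine dummy_consume
end module dummy_exchange

program exchange_demo
    use dummy_exchange
    implicit none
    real, pointer :: buffer(:)
    logical       :: ok

    nullify(buffer)
    call dummy_deliver(5, buffer)
    call dummy_consume(buffer, ok)
    if (.not. ok) stop 1
    write(*,*) 'exchange ok'
end program exchange_demo
```

In the threaded test, each delivering thread would call something like
the first routine and each consuming thread something like the second,
always on its own local pointer.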
Still, from the behaviour of the program I must conclude that the
underlying runtime libraries are not threadsafe (on at least three
of the four platforms).
Note on OpenMP: as it is my task to make the library threadsafe
I have no idea nor any control over the way it will be used in
actual applications. Hence OpenMP is not a solution to my problem.
Regards,
Arjen
| |
| Ian Bush 2004-11-18, 8:58 am |
| Arjen Markus wrote:
>
> The program's threads do not do much more than that - but as I said:
> I need to use a more general kind of parallelism - something that
> is not limited to do-loops.
>
OpenMP does a bit more than do loops. Do the parallel sections not
address what you require?
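For reference, a minimal sketch of the parallel sections construct. The
task names deliver_task and consume_task and the counts they return are
made up for illustration; only the !$omp directives are standard OpenMP.

```fortran
program sections_sketch
    implicit none
    integer :: stored, retrieved

    stored    = 0
    retrieved = 0

    ! Each SECTION block is executed once, by one thread of the team,
    ! so unrelated tasks can run side by side without any do-loop.
    !$omp parallel sections
    !$omp section
    call deliver_task(stored)
    !$omp section
    call consume_task(retrieved)
    !$omp end parallel sections

    write(*,*) 'stored:', stored, ' retrieved:', retrieved

contains

    ! Hypothetical task: pretend to store 10 arrays.
    subroutine deliver_task(count)
        integer, intent(out) :: count
        count = 10
    end subroutine deliver_task

    ! Hypothetical task: pretend to retrieve 10 arrays.
    subroutine consume_task(count)
        integer, intent(out) :: count
        count = 10
    end subroutine consume_task

end program sections_sketch
```

Each section writes to its own variable, so no synchronisation is
needed here. Compiled without OpenMP support, the directives are plain
comments and the two calls simply run one after the other.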
Ian
| |
| Jugoslav Dujic 2004-11-18, 8:58 am |
| Arjen Markus wrote:
| Hello all,
|
| I have severe trouble making an as yet simple multi-threaded program
| to work reliably.
|
| - Windows XP with Compaq Visual Fortran 6.6C.
| I use a Windows version of the pthreads library that works
| well in other contexts.
| In this case, the program is very persistent in crashing,
| even though I use the multithreaded runtime libraries.
| Moreover the random numbers that I get have a tendency to
| converge to 1.0
Potentially stupid question: have you used /recursive (or
/automatic -- AFAIK they do 99.99% the same thing) switch on
CVF -- that's the very prerequisite for doing almost anything
with threads?
--
Jugoslav
___________
www.geocities.com/jdujic
Please reply to the newsgroup.
You can find my real e-mail on my home page above.
| |
| Arjen Markus 2004-11-18, 8:58 am |
| Jugoslav Dujic wrote:
>
> Arjen Markus wrote:
> | Hello all,
> |
> | I have severe trouble making an as yet simple multi-threaded program
> | to work reliably.
> |
> | - Windows XP with Compaq Visual Fortran 6.6C.
> | I use a Windows version of the pthreads library that works
> | well in other contexts.
> | In this case, the program is very persistent in crashing,
> | even though I use the multithreaded runtime libraries.
> | Moreover the random numbers that I get have a tendency to
> | converge to 1.0
>
> Potentially stupid question: have you used /recursive (or
> /automatic -- AFAIK they do 99.99% the same thing) switch on
> CVF -- that's the very prerequisite for doing almost anything
> with threads?
>
Yes, this made no difference :(
What is more, the random numbers kept being very non-random.
Regards,
Arjen
| |
| Arjen Markus 2004-11-18, 8:58 am |
| Ian Bush wrote:
>
> Arjen Markus wrote:
>
> OpenMP does a bit more than do loops. Does the parallel sections stuff not
> address what you require ?
>
> Ian
No, because I will not control the very applications that are going to
use this library ... Perhaps this _is_ something to keep in mind, though.
Thanks for reminding me of this feature of OpenMP.
Regards,
Arjen
| |
|
| Arjen Markus wrote:
>
> Note on OpenMP: as it is my task to make the library threadsafe
> I have no idea nor any control over the way it will be used in
> actual applications. Hence OpenMP is not a solution to my problem.
I am not convinced. You can hide the library's thread management from
end users. All you need is to manipulate arrays, pause threads, etc.
You can do this at a high level (OpenMP) or at a low level (pthreads etc).
Since you don't like the OpenMP approach, you must play with thread
internals, which is (in my opinion) non-portable and ugly.
Regards,
B52
>
> Regards,
>
> Arjen
| |
| Jugoslav Dujic 2004-11-18, 8:58 am |
| Arjen Markus wrote:
| Jugoslav Dujic wrote:
||
|| Arjen Markus wrote:
||| Hello all,
|||
||| I have severe trouble making an as yet simple multi-threaded program
||| to work reliably.
|||
||| - Windows XP with Compaq Visual Fortran 6.6C.
||| I use a Windows version of the pthreads library that works
||| well in other contexts.
||| In this case, the program is very persistent in crashing,
||| even though I use the multithreaded runtime libraries.
||| Moreover the random numbers that I get have a tendency to
||| converge to 1.0
||
|| Potentially stupid question: have you used /recursive (or
|| /automatic -- AFAIK they do 99.99% the same thing) switch on
|| CVF -- that's the very prerequisite for doing almost anything
|| with threads?
||
|
| Yes, this made no difference :(
|
| What is more, the random numbers kept being very non-random.
Would you mind sending me your sources? I'm curious (although
I can't make any promises). I've just downloaded pthreads from
http://sources.redhat.com/pthreads-win32/ -- is it the right
place?
Please see my signature.
--
Jugoslav
___________
www.geocities.com/jdujic
Please reply to the newsgroup.
You can find my real e-mail on my home page above.
| |
| Arjen Markus 2004-11-18, 8:58 am |
| B52B wrote:
>
> Arjen Markus wrote:
>
>
> I am not convinced. You can hide library threads management from end
> users. All you need is to manipulate arrays, pause threads etc etc.
> You can do this on high level (OpenMP) or on low level (pthreads etc).
>
> Since you don't like OpenMP approach you must play with threads
> internals which is (in my opinion) non portable and ugly.
>
> Regards,
> B52
>
>
Oh, I would love to be able to use the OpenMP approach, but in this case
all the thread management will be inside the library.
Here is my main program:

    program my_test_threads
        use MY_THREADS
        use TEST_FUNCTIONS
        implicit none

        type(MY_THREAD_DATA) :: data

        call my_start_thread( 'THREAD 1 - DELIVER', thread1, data )
        call my_start_thread( 'THREAD 2 - DELIVER', thread2, data )
        call my_start_thread( 'THREAD 3 - CONSUME', thread3, data )
        call my_start_thread( 'THREAD 4 - CONSUME', thread4, data )

        call my_wait_threads
    end program my_test_threads

The test functions (thread1 to thread4) do not attempt anything
threaded - that is the business of the library.
I can send you the source code - I have not posted it because
of the sheer volume.
Regards,
Arjen
| |
|
| Arjen Markus wrote:
> B52B wrote:
>
>
>
> Oh, I would love to be able to use the OpenMP approach, but in this case
> all the thread management will be inside the library.
>
> Here is my main program:
>
> program my_test_threads
> use MY_THREADS
> use TEST_FUNCTIONS
> implicit none
>
> type(MY_THREAD_DATA) :: data
>
> call my_start_thread( 'THREAD 1 - DELIVER', thread1, data )
> call my_start_thread( 'THREAD 2 - DELIVER', thread2, data )
> call my_start_thread( 'THREAD 3 - CONSUME', thread3, data )
> call my_start_thread( 'THREAD 4 - CONSUME', thread4, data )
>
> call my_wait_threads
>
>
> end program MY_TEST_THREADS
>
> The test function (thread1 to thread4) do not attempt anything
> threaded - that is the business of the library.
>
> I can send you the source code - I have not posted it because
> of the sheer volume.
OK then, below is my address (remove capital letters in domain part):
b1wojcik@REMcyf-kr.edu.pl
B52
>
> Regards,
>
> Arjen
| |
| Arjen Markus 2004-11-19, 8:58 am |
| Jugoslav Dujic wrote:
>
> |
> | What is more, the random numbers kept being very non-random.
>
> Would you mind sending me your sources? I'm curious (although
> I can't make any promises). I've just downloaded pthreads from
> http://sources.redhat.com/pthreads-win32/ -- is it the right
> place?
>
Thanks to Jugoslav, part of the riddle has been solved: there were two
serious flaws in the communication between Fortran and C
(I should have checked that part of the program better) that
affected the Windows version.
One other lesson learned: CVF's random_number generator behaves
oddly in a multithreaded environment - this was solved by
explicitly seeding it per thread.
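A minimal sketch of what such per-thread seeding might look like. The
routine name seed_for_thread and the way the seed values are derived
from the thread number are assumptions, not the code actually used;
only random_seed and random_number are standard. Whether the generator
state is kept per thread or shared by the whole process depends on the
compiler's runtime library, so this may not help everywhere.

```fortran
module seed_sketch
    implicit none
contains
    ! Hypothetical helper: meant to be called once at the start of
    ! each thread, with a number that differs per thread.
    subroutine seed_for_thread(thread_id)
        integer, intent(in)  :: thread_id
        integer              :: n, i
        integer, allocatable :: seed(:)

        call random_seed(size=n)        ! ask how many seed values are needed
        allocate(seed(n))
        do i = 1, n
            ! Arbitrary formula, chosen only so that threads differ
            seed(i) = 104729 * thread_id + 7919 * i
        end do
        call random_seed(put=seed)
        deallocate(seed)
    end subroutine seed_for_thread
end module seed_sketch

program seed_demo
    use seed_sketch
    implicit none
    real :: r(3)

    ! Single-threaded demonstration of the call sequence
    call seed_for_thread(1)
    call random_number(r)
    write(*,*) r
end program seed_demo
```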
This leaves the more awkward problem on Linux ...
Regards,
Arjen
| |
| Arjen Markus 2004-11-19, 8:58 am |
| Arjen Markus wrote:
>
> Jugoslav Dujic wrote:
>
>
> Thanks to Jugoslav, part of the riddle has been solved: two
> serious flaws in the communication between Fortran and C
> (I should have better checked that part of the program) that
> affected the Windows version.
>
> One other lesson learned: CVF's random_number generator behaves
> oddly in a multithreaded environment - this was solved by
> explicitly seeding it per thread.
>
> This leaves the more awkward problem on Linux ...
> Regards,
>
> Arjen
Further information on the Linux platform:
I have now used the "NPTL" version of the pthreads library -
this gives better results, in that there are no more segmentation
faults, but sometimes the program just hangs - presumably one
thread is not finishing its job....
Regards,
Arjen
| |
| Arjen Markus 2004-11-19, 3:58 pm |
| Arjen Markus wrote:
>
>
> Further information on the Linux platform:
> I have now used the "NPTL" version of the pthreads library -
> this gives better results, in that there are no more segmentation
> faults, but sometimes the program just hangs - presumably one
> thread is not finishing its job....
>
I think I have solved the problem! Apparently the option
-reentrancy threaded was not enough or not adequate. I have
declared my subroutines as recursive and this works!
(I will have to wind back the various changes to see what
was essential, but at least things _are_ working predictably)
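A minimal sketch of a thread routine declared this way. The routine
name worker and its contents are hypothetical and only mimic the kind
of work described earlier (local allocation, an internal write). One
plausible reading, not confirmed in this thread, is that without the
recursive keyword some compilers may keep local variables in static
storage shared by all threads, whereas recursive asks for a fresh
instance of the locals on every invocation.

```fortran
module worker_sketch
    implicit none
contains
    ! RECURSIVE requests separate local storage for each invocation,
    ! which is also what concurrently running threads need.
    recursive subroutine worker(id, total)
        integer, intent(in) :: id
        real, intent(out)   :: total
        real, allocatable   :: work(:)
        character(len=20)   :: label
        integer             :: i

        allocate(work(100))
        do i = 1, size(work)
            work(i) = real(id * i)
        end do
        write(label, '(a,i0)') 'thread ', id   ! internal write to a local buffer
        total = sum(work)
        deallocate(work)
        write(*,*) trim(label), ' total:', total
    end subroutine worker
end module worker_sketch

program worker_demo
    use worker_sketch
    implicit none
    real :: total

    ! Single-threaded call; in the real program each thread would
    ! enter a routine like this through the threads library.
    call worker(2, total)
end program worker_demo
```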
Regards,
Arjen