Code Comments
Programming Forum and web based access to our favorite programming groups.Hi, This is related to repeated calls to small-size FFT using FFTW library in fortran. I have written a subroutine that only creates the plan the first time it is called. Then it works fine if I call it many times from one subroutine. But it would crash if I call it from another subroutine. I have to generate another plan to avoid the crash. Since the second call does exactly the same size fft, it would generate a simpler code if the old plan can be reused. My question is how I can reuse the old plan ? An example code is given below. It can be compiled and would crash. Thanks. Xiaogang !*************************************** ************************************ ! This code shows that repeated calls to small fft could crash. ! A second plan call would save the situation (not shown). ! Tested on itanium2 ifort 8.0 and 8.1 ! program main implicit real*8 (a-h,o-z) complex*16,allocatable :: vin1(:,:), vout1(:,:) n = 4 m = 1000 allocate(vin1(n,m), vout1(n,m)) vin1 = (1d0,0d0) write(*,*)'call sub1, sub1 does fft 1000 times on v(4)' call sub1(n,m, vin1, vout1) deallocate(vin1, vout1) call sub2(n,m) end !*************************************** ************************************ *** subroutine sub2(n,m) implicit real*8 (a-h,o-z) integer, intent(in) :: n,m complex*16, allocatable :: vin2(:,:), vout2(:,:) allocate(vin2(n,m), vout2(n,m)) vin2 = (1d0,0d0) write(*,*)'call sub1 from sub2, sub1 does fft 1000 times on v(4)' write(*,*)'it would crash using the old plan' call sub1(n,m, vin2, vout2) deallocate(vin2, vout2) end subroutine sub2 !*************************************** ************************************ *** ! Lesson 1: fftw will crash if vec1(:), vec2(:) are ! allocated/deallocated subroutine sub1(n,m,vin, vout) implicit real*8 (a-h,o-z) integer, intent(in) :: n,m complex*16, intent(in) :: vin(n,m) complex*16, intent(out) :: vout(n,m) complex*16 :: vec1(n), vec2(n) ! do same-size fft, 1000 times do i = 1, m vec1 = vin(:,i) call fft_sub(n, vec1, vec2) vout(:,i) = vec2 end do write(*,*)'Example input' do j=1,n write(*,'(i3,2f12.6)')j,vec1(j) end do write(*,*)'Example output after fftw' do j=1,n write(*,'(i3,2f12.6)')j,vec2(j) end do write(*,*) end subroutine sub1 !*************************************** ************************************ *** ! fft_sub is called repeatedly, plan is generated the first time ! this routine is called subroutine fft_sub(n, vin, vout) implicit real*8 (a-h,o-z) include 'fftw3.inc' complex*16, intent(inout) :: vin(n) complex*16, intent(out) :: vout(n) integer*8 plan data iflag/1/ save plan save iflag if (iflag == 1) then call dfftw_plan_dft_1d(plan,n,vin,vout,FFTW_F ORWARD,FFTW_ESTIMATE) write(*,*)'plan=',plan iflag = 0 end if call dfftw_execute(plan) end subroutine fft_sub
Post Follow-up to this messageHi Xiaogang, Xiaogang Wang wrote: > Hi, > > This is related to repeated calls to small-size FFT using FFTW > library in fortran. I have written a subroutine that only creates the > plan the first > time it is called. Then it works fine if I call it many times from > one > subroutine. But it would crash if I call it from another subroutine. > I have to generate another plan to avoid the crash. > Since the second call does exactly the same size fft, it would > generate > a simpler code if the old plan can be reused. > My question is how I can reuse the old plan ? > An example code is given below. It can be compiled and would crash. > O.K., I haven't done this but some of my colleagues have. It turns out that in latter versions of FFTW ( can't tell you starting from when ) that when evaluating the plan it uses the address that the array to be FFT'ed is at, so gaining a small amount of speed at the expenses of a fair degree of user inconvenience. So yes, if you allocate/deallocate between each call you will have to create a new plan. Right pain in the backside I know .... but that is the way it is, Ian
Post Follow-up to this messageIan Bush <I.J.Bush@nospam.dl.ac.uk> wrote in message news:<cnckor$fjq$1@mserv2.dl.ac.uk>... > > O.K., I haven't done this but some of my colleagues have. It turns > out that in latter versions of FFTW ( can't tell you starting from when ) > that when evaluating the plan it uses the address that the array to be > FFT'ed is at, so gaining a small amount of speed at the expenses of > a fair degree of user inconvenience. So yes, if you allocate/deallocate > between each call you will have to create a new plan. Right pain in the > backside I know .... but that is the way it is, > > Ian Hi, Ian, I can live with not using allocate/deallocate, because I find a solution: using AUTOMATIC arrays (see vec1 and vec2 in sub1). My real pain is that when I call sub1 from another subroutine (sub2 in my example), it does not work unless I create a new plan. I can call sub1 as many times as I want as long as I am calling it from the same subroutine. This is very painful because I want to call sub1 from different subrotines and do not want to transfer another parameter for creating a plan. Xiaogang Xiaogang
Post Follow-up to this messageIan Bush <I.J.Bush@nospam.dl.ac.uk> wrote... > O.K., I haven't done this but some of my colleagues have. It turns > out that in latter versions of FFTW ( can't tell you starting from when ) > that when evaluating the plan it uses the address that the array to be > FFT'ed is at, so gaining a small amount of speed at the expenses of > a fair degree of user inconvenience. So yes, if you allocate/deallocate > between each call you will have to create a new plan. Right pain in the > backside I know .... but that is the way it is, It's not to gain "a small amount of speed". It's to preserve correctness for SIMD machines, since then the plan depends upon the alignment of the array and most systems don't guarantee that all arrays of the same size will be 16-byte aligned. *If* you know that all of your arrays are 16-byte aligned, *or* you don't have a system with SIMD instructions, then there is a routine (fftw_execute_dft) that we provide to apply a given plan to another array of the same size. See also the FAQ (and the fine manual). Cordially, Steven G. Johnson
Post Follow-up to this messageSteven G. Johnson wrote: > It's not to gain "a small amount of speed". It's to preserve > correctness for SIMD machines, since then the plan depends upon the > alignment of the array and most systems don't guarantee that all > arrays of the same size will be 16-byte aligned. > > *If* you know that all of your arrays are 16-byte aligned, *or* you > don't have a system with SIMD instructions, then there is a routine > (fftw_execute_dft) that we provide to apply a given plan to another > array of the same size. > > See also the FAQ (and the fine manual). > > Cordially, > Steven G. Johnson Yet another reason I like c.l.f -- answers often come straight from the source. Thanks, Steven.
Post Follow-up to this messagePowered by vBulletin
Copyright 2000-2006 Jelsoft Enterprises Limited.