Home > Archive > Fortran > June 2005 > wanted fast routines
You are viewing an archived Text-only version of the thread.
To view this thread in it's original format and/or if you want to reply to
this thread please [click here]
| Author |
wanted fast routines
|
|
| student 2005-06-06, 3:58 am |
|
Can anyone provide me with with a superfast routine to find inverses of
symmetric and positive definite matrices. I am currently using the
subroutine provided by numerical recipes website.
Also, is the intrinsic Matrix Multiplication function (MATMUL) optimal?
| |
| Steven G. Kargl 2005-06-06, 3:58 am |
| In article <1118013746.166644.228610@g47g2000cwa.googlegroups.com>,
"student" <adarsh@stat.tamu.edu> writes:
>
> Can anyone provide me with with a superfast routine to find inverses of
> symmetric and positive definite matrices. I am currently using the
> subroutine provided by numerical recipes website.
>
> Also, is the intrinsic Matrix Multiplication function (MATMUL) optimal?
http://www.netlib.org/
Look for ATLAS and LAPACK.
--
Steve
http://troutmask.apl.washington.edu/~kargl/
| |
| Brooks Moses 2005-06-06, 3:58 am |
| student wrote:
> Can anyone provide me with with a superfast routine to find inverses of
> symmetric and positive definite matrices. I am currently using the
> subroutine provided by numerical recipes website.
A rule of thumb that was passed to me with regards to numerical matrix
programming is that if your algorithm needs to actually find the inverse
of a matrix, you're programming it wrong -- and avoiding the matrix
inversion will make it much faster. This isn't always true, of course,
but it was claimed to be true nearly always.
- Brooks
--
The "bmoses-nospam" address is valid; no unmunging needed.
| |
| Rich Townsend 2005-06-06, 3:58 am |
| Brooks Moses wrote:
> student wrote:
>
>
>
> A rule of thumb that was passed to me with regards to numerical matrix
> programming is that if your algorithm needs to actually find the inverse
> of a matrix, you're programming it wrong -- and avoiding the matrix
> inversion will make it much faster. This isn't always true, of course,
> but it was claimed to be true nearly always.
>
> - Brooks
>
>
To be a little more specific for the OP, if you have the linear equation
A x = b,
where A is a matrix, and x and b are vectors or matrices, then it is
very inefficient to solve for x by (i) inverting A and then (ii)
multiplying b by A^-1. It is much better to use an algorithm that is
specifically targeted at solving linear equations, such as a
decomposition followed by a backsubstitution. This and related issues
are disussed in Numerical Recipes -- although when it comes to actual
code, you may be well served by avoiding NR and going for an appropriate
LAPACK routine.
cheers,
Rich
| |
| Tim Prince 2005-06-06, 3:58 am |
|
"student" <adarsh@stat.tamu.edu> wrote in message
news:1118013746.166644.228610@g47g2000cwa.googlegroups.com...
>
> Can anyone provide me with with a superfast routine to find inverses of
> symmetric and positive definite matrices. I am currently using the
> subroutine provided by numerical recipes website.
Evidently, most applications which require a conceptual inverse will be
better implemented by some solver which doesn't actually form the inverse.
Numerical Recipes has an introductory discussion for the case of
eigensystems, and I don't see them proposing any one method as fitting all
needs.
>
> Also, is the intrinsic Matrix Multiplication function (MATMUL) optimal?
Possibly, for a definition of optimality chosen by the individual who
implemented it for your compiler. If you use gfortran, you have the
opportunity to examine the source, form your own opinion, and try
variations. Remember that MATMUL is likely to need to perform well for
matrices with small sizes, while the programmer has the option to use
popular BLAS and other open source calls when those are known to be
suitable. A few compilers will choose a call to the vendor's optimized BLAS
when the size of the matrix is known to be large enough, but expand in line
when that is likely to be preferable.
| |
| glen herrmannsfeldt 2005-06-06, 3:58 am |
| student wrote:
> Also, is the intrinsic Matrix Multiplication function (MATMUL) optimal?
Matrix multiplication is sensitive to the cache of the processor it
runs on, and so is often not optimal. I believe there are some that
try to dynamically optimize for the cache, though. You might look
for those.
-- glen
| |
| David Frank 2005-06-06, 3:58 pm |
|
"student" <adarsh@stat.tamu.edu> wrote in message
news:1118013746.166644.228610@g47g2000cwa.googlegroups.com...
>
> Can anyone provide me with with a superfast routine to find inverses of
> symmetric and positive definite matrices. I am currently using the
> subroutine provided by numerical recipes website.
>
> Also, is the intrinsic Matrix Multiplication function (MATMUL) optimal?
>
For n < 100 below Fortran source for matrix inverse is BRIEFER/FASTER with
less ERROR
compared to any other Fortran source anyone will post.
! ---------------------------
SUBROUTINE Gauss (a,n) ! Invert matrix by Gauss method
IMPLICIT NONE
INTEGER :: n
REAL(8) :: a(n,n), b(n,n), temp(n), c, d
INTEGER :: ipvt(n), j, k, m
! - - - - - - - - - - - - - -
b = a
ipvt = [ 1:n ]
DO k = 1,n
m = k-1+MAXLOC(ABS(b(k:n,k)),dim=1)
IF (m /= k) THEN
ipvt( [m,k] ) = ipvt( [k,m] )
b( [m,k],:) = b( [k,m],:)
END IF
d = 1/b(k,k)
temp = b(:,k)
DO j = 1, n
c = b(k,j)*d
b(:,j) = b(:,j)-temp*c
b(k,j) = c
END DO
b(:,k) = temp*(-d)
b(k,k) = d
END DO
a(:,ipvt) = b
END SUBROUTINE Gauss
| |
| Giorgio Pastore 2005-06-06, 3:58 pm |
| David Frank wrote:
....
> For n < 100 below Fortran source ...
....
> ipvt = [ 1:n ]
....
> ipvt( [m,k] ) = ipvt( [k,m] )
> b( [m,k],:) = b( [k,m],:)
....
Uhm. Fortran source ?... with [..] ?? What does it mean ?
If I am not wrong, standard Fortran 95 doesn't allow [] or {}.
Giorgio
| |
|
| When I had to multiply some triangular matrices with a vector, I made
some test, and the best result I got, was mixing a do loop for one
index, and sum for the other.
| |
| David Frank 2005-06-06, 3:58 pm |
|
"Giorgio Pastore" <pastgio@univ.trieste.it> wrote in message
news:d81mdf$lqb$1@newsfeed.cineca.it...
> David Frank wrote:
> ...
>
> ...
> ...
> ...
>
> Uhm. Fortran source ?... with [..] ?? What does it mean ?
> If I am not wrong, standard Fortran 95 doesn't allow [] or {}.
>
>
> Giorgio
The is acceptable to CVF (a F95+ compiler)
I think replacing (/.../) with [...] is supported by a few other F95+
compilers, it definitely must
be accepted by a CURRENT STANDARD compiler..
ipvt = [ 1:n] creates a 1....n array and assigns it to ipvt
you can replace it with
ipvt = (/ (i,i=1,n) /)
again this is current standard fortran supported by a few F95+ compilers..
| |
| Richard E Maine 2005-06-06, 3:58 pm |
| In article <d81mdf$lqb$1@newsfeed.cineca.it>,
Giorgio Pastore <pastgio@univ.trieste.it> wrote:
> Uhm. Fortran source ?... with [..] ?? What does it mean ?
> If I am not wrong, standard Fortran 95 doesn't allow [] or {}.
Be aware that Dave has a definition of the word "standard" that is
"different" from that used by a lot of other people. I carefully avoid
saying "better" or "worse"; it is just different.
Some of us mean an ISO/ANSI/whatever-other-national-standard document
when we use the word "standard". Dave has specifically stated that this
isn't what he means and that he has a "better" definition.
In the case of the square brackets for array constructors, the matter
can get further by sufficiently "careful" choice of wording.
Allow me to elaborate, in hope of actually explaining instead of
obfuscating.
In particular, Dave is literally correct when he says that the brackets
are standard in the "current standard", even when using my definition of
"standard". Just be aware that the current standard in which that is
true is f2003 - not f95. Some f95 compilers also accept the brackets as
(call it whichever you like) an f2003 feature or an f95 extension. It
would not surprise me for more f95 compilers to start doing this, as it
is a pretty easy f2003 feature to add to an f95 compiler (at least I'd
assume so).
--
Richard Maine | Good judgment comes from experience;
email: my first.last at org.domain | experience comes from bad judgment.
org: nasa, domain: gov | -- Mark Twain
| |
| glen herrmannsfeldt 2005-06-06, 8:57 pm |
| Richard E Maine wrote:
(snip regarding [ and ])
> In particular, Dave is literally correct when he says that the brackets
> are standard in the "current standard", even when using my definition of
> "standard". Just be aware that the current standard in which that is
> true is f2003 - not f95. Some f95 compilers also accept the brackets as
> (call it whichever you like) an f2003 feature or an f95 extension. It
> would not surprise me for more f95 compilers to start doing this, as it
> is a pretty easy f2003 feature to add to an f95 compiler (at least I'd
> assume so).
And once they add it they will likely add "supporting some F2003
features" to the marketing literature!
There was a story once in the ethernet newsgroup about an ethernet
chip which was slightly off on one of the timing parameters. It was
decided that it wouldn't cause any incompatibilities, but it could still
be a marketing problem. The solution, it seems, was to claim ...
"meets or exceeds all ... standards".
-- glen
| |
| NuclearWizard 2005-06-07, 3:57 am |
| This is pretty fast. But instead of d=1/b(k,k), shouldn't you use
d=1._8/b(k,k)? CVF probably fixes this but I don't think all compilers
handle the INTEGER/REAL situation by converting the INTEGER to a REAL
of the same kind as the denomenator. But thanks for the post! I will
definitely add this to my collection (with the proper reference to the
author of course).
-Will
| |
| jamesgiles@att.net 2005-06-07, 3:57 am |
|
NuclearWizard wrote:
> This is pretty fast. But instead of d=1/b(k,k), shouldn't you use
> d=1._8/b(k,k)? CVF probably fixes this but I don't think all compilers
> handle the INTEGER/REAL situation by converting the INTEGER to a REAL
> of the same kind as the denomenator. [...]
The standard requires it.
--
J. Giles
| |
| Greg Lindahl 2005-06-07, 3:57 am |
| In article <1118113991.886183.238770@z14g2000cwz.googlegroups.com>,
NuclearWizard <william.wieselquist@gmail.com> wrote:
>This is pretty fast.
The fastest and most accurate way to invert a matrix is to not invert
it. Check out LU decomposition.
-- greg
| |
| Richard E Maine 2005-06-07, 4:02 pm |
| In article <1118115509.638152.225380@g14g2000cwa.googlegroups.com>,
jamesgiles@att.net wrote:
> NuclearWizard wrote:
>
> The standard requires it.
And I have never heard of a compiler that fails to do it per the
standard. I doubt that any such compiler exists.
(Questions of preferred style are separate.)
--
Richard Maine | Good judgment comes from experience;
email: my first.last at org.domain | experience comes from bad judgment.
org: nasa, domain: gov | -- Mark Twain
| |
| Dick Hendrickson 2005-06-07, 4:02 pm |
|
jamesgiles@att.net wrote:
>
> NuclearWizard wrote:
>
>
>
> The standard requires it.
>
Surely what he means is "convert at compile time rather
than generate run-time code." Probably all compilers do
this now, but they didn't in the far past.
Dick Hendrickson
| |
| NuclearWizard 2005-06-07, 8:58 pm |
| Oh. Good. (I guess I should read the standard.)
| |
| NuclearWizard 2005-06-07, 8:58 pm |
|
Greg Lindahl wrote:
> In article <1118113991.886183.238770@z14g2000cwz.googlegroups.com>,
> NuclearWizard <william.wieselquist@gmail.com> wrote:
>
>
> The fastest and most accurate way to invert a matrix is to not invert
> it. Check out LU decomposition.
>
> -- greg
If you substitute "solve a linear system" for "invert a matrix" above
and we are in complete agreement, but the discussion over whether an
inverse was really needed was already discussed, but as far as
computing inverses go, actually forming the matrix B that satisfies
x=B*f, for linear system Ax=f (nonsingular A of course), I still think
this one is pretty fast.
| |
| David Frank 2005-06-08, 8:58 am |
|
"NuclearWizard" <william.wieselquist@gmail.com> wrote in message
news:1118181279.210654.284970@z14g2000cwz.googlegroups.com...
>
>
> Greg Lindahl wrote:
>
> If you substitute "solve a linear system" for "invert a matrix" above
> and we are in complete agreement, but the discussion over whether an
> inverse was really needed was already discussed, but as far as
> computing inverses go, actually forming the matrix B that satisfies
> x=B*f, for linear system Ax=f (nonsingular A of course), I still think
> this one is pretty fast.
>
Be aware Lindahl is a broken record concerning using inverse..
Years ago we benchmarked a linear system solver here in CLF using inverse
vs. NOT using inverse and found that the least-squares summation data
processing takes 95% of the total time.
So IMO its easier to use an existing very fast inverse to compute the
coefficients after the summation is complete
vs. additional code to run somewhat BUT NOT SIGNIFICANTLY faster for the
total runtime.
If there others here doubt this we could repeat the experiment if someone
posts code for a fav "linear system"
method but Lindahl wont be doing that, he justs likes to XXXXX.
| |
| glen herrmannsfeldt 2005-06-08, 8:58 am |
| David Frank wrote:
(snip)
> Be aware Lindahl is a broken record concerning using inverse..
Along with almost everyone who knows anything about numerical
analysis.
> Years ago we benchmarked a linear system solver here in CLF using inverse
> vs. NOT using inverse and found that the least-squares summation data
> processing takes 95% of the total time.
> So IMO its easier to use an existing very fast inverse to compute the
> coefficients after the summation is complete
> vs. additional code to run somewhat BUT NOT SIGNIFICANTLY faster for the
> total runtime.
The question isn't time, but accuracy of the solutions.
For very small systems, it doesn't make all that much difference.
Numerical Recipes, among others, has a good description
of the problem, so I won't try to repeat it here.
> If there others here doubt this we could repeat the experiment if someone
> posts code for a fav "linear system"
> method but Lindahl wont be doing that, he justs likes to XXXXX.
Do you benchmark the solutions, or just how long it takes to
compute them, no matter how close they are to the right answer?
-- glen
| |
| David Frank 2005-06-08, 4:00 pm |
|
"glen herrmannsfeldt" <gah@ugcs.caltech.edu> wrote in message
news:DJydnWSXEL1PVjvfRVn-ug@comcast.com...
> David Frank wrote:
>
> (snip)
>
>
> Along with almost everyone who knows anything about numerical
> analysis.
>
>
> The question isn't time, but accuracy of the solutions.
> For very small systems, it doesn't make all that much difference.
>
> Numerical Recipes, among others, has a good description
> of the problem, so I won't try to repeat it here.
>
>
> Do you benchmark the solutions, or just how long it takes to
> compute them, no matter how close they are to the right answer?
>
> -- glen
>
>
I just whomped up a simple linear solver benchmark using random 0-based
data:
http://home.earthlink.net/~davegemini/linear.f90 for the COMPLETE
source
http://home.earthlink.net/~davegemini/linear.exe for a Windows exe
Of course valuations other than below could be applied for error/noise in
this test solution.
e.g. pentium 2.8ghz, CVF, time = 0.3438
coef = 0.0003 -0.0004 0.0001 0.0010 0.0010 -0.0003 -0.0017 0.0006 0.0007
0.0018 0.0000
noise = 0.0007
Lets see someone run a translation that doesnt use invert thru THEIR fav
linear solver
and report the time vs. my windows.exe download
| |
|
|
| Rich Townsend 2005-06-08, 4:00 pm |
| David Frank wrote:
> "glen herrmannsfeldt" <gah@ugcs.caltech.edu> wrote in message
> news:DJydnWSXEL1PVjvfRVn-ug@comcast.com...
>
>
>
>
> I just whomped up a simple linear solver benchmark using random 0-based
> data:
>
> http://home.earthlink.net/~davegemini/linear.f90 for the COMPLETE
> source
>
> http://home.earthlink.net/~davegemini/linear.exe for a Windows exe
>
> Of course valuations other than below could be applied for error/noise in
> this test solution.
>
> e.g. pentium 2.8ghz, CVF, time = 0.3438
> coef = 0.0003 -0.0004 0.0001 0.0010 0.0010 -0.0003 -0.0017 0.0006 0.0007
> 0.0018 0.0000
> noise = 0.0007
>
> Lets see someone run a translation that doesnt use invert thru THEIR fav
> linear solver
> and report the time vs. my windows.exe download
>
Well, I've just done that. On my Pentium M 1.6 GHz laptop, running
Gentoo Linux 2.6.11, and compiling using Intel Fortran 8.1.025, I find
the following execution times:
linear.f90 : 0.13213 +/- 2.2e-6
linear_lapack.f90: 0.13041 +/- 1.7e-6
linear.f90 is your program, and linear_lapack.f90 is a modification,
with the matrix inverter replaced by a LAPACK95/ATLAS linear solver
(source code attached). The errors are the standard deviation based on
100 runs. Since the times differ by O(1000) standard deviations, I
believe this result is significant. What this excercise proves is
debatable, but it is clear that your assertion regarding matrix
inversion is wholly unsupportable, and probably without any merit
whatsoever.
cheers,
Rich
| |
| Rich Townsend 2005-06-08, 4:00 pm |
| Rich Townsend wrote:
> David Frank wrote:
>
>
>
> Well, I've just done that. On my Pentium M 1.6 GHz laptop, running
> Gentoo Linux 2.6.11, and compiling using Intel Fortran 8.1.025, I find
> the following execution times:
>
> linear.f90 : 0.13213 +/- 2.2e-6
> linear_lapack.f90: 0.13041 +/- 1.7e-6
>
> linear.f90 is your program, and linear_lapack.f90 is a modification,
> with the matrix inverter replaced by a LAPACK95/ATLAS linear solver
> (source code attached). The errors are the standard deviation based on
> 100 runs. Since the times differ by O(1000) standard deviations, I
> believe this result is significant. What this excercise proves is
> debatable, but it is clear that your assertion regarding matrix
> inversion is wholly unsupportable, and probably without any merit
> whatsoever.
>
> cheers,
>
> Rich
Can I just add that the execution time of the 'linear' subroutine is
totally dominated by the setting up of the design matrix a(:,:), and
depends almost negligibly on the matrix inversion / linear solver code.
So much so, that if I move the cpu_time() calls so that they bracket
only the latter (rather than the whole subroutine), then I get an
apparent execution time of 0s.
Ergo, the whole benchmark stinks. If David Frank can come up with a more
realistic benchmark, then I'd be happy to repeat my comparison.
cheers,
Rich
| |
| James Van Buskirk 2005-06-08, 4:00 pm |
| "Rich Townsend" <rhdt@barVOIDtol.udel.edu> wrote in message
news:d8764l$rm1$1@scrotar.nss.udel.edu...
[color=darkred]
[color=darkred]
inverse[color=darkred]
the[color=darkred]
[color=darkred]
> Can I just add that the execution time of the 'linear' subroutine is
> totally dominated by the setting up of the design matrix a(:,:), and
> depends almost negligibly on the matrix inversion / linear solver code.
> So much so, that if I move the cpu_time() calls so that they bracket
> only the latter (rather than the whole subroutine), then I get an
> apparent execution time of 0s.
> Ergo, the whole benchmark stinks. If David Frank can come up with a more
> realistic benchmark, then I'd be happy to repeat my comparison.
It seems that you have proved exactly David's point quoted above.
I don't think it unusual for a programmer to profile his code and
just leave components that take negligible time in the form that
is to them most understandable, even if there is a faster method
available for such components.
--
write(*,*) transfer((/17.392111325966148d0,6.5794487871554595D-85, &
6.0134700243160014d-154/),(/'x'/)); end
| |
| Rich Townsend 2005-06-08, 4:00 pm |
| James Van Buskirk wrote:
> "Rich Townsend" <rhdt@barVOIDtol.udel.edu> wrote in message
> news:d8764l$rm1$1@scrotar.nss.udel.edu...
>
>
>
>
>
>
>
> inverse
>
>
> the
>
>
>
>
>
>
>
> It seems that you have proved exactly David's point quoted above.
> I don't think it unusual for a programmer to profile his code and
> just leave components that take negligible time in the form that
> is to them most understandable, even if there is a faster method
> available for such components.
>
Agreed -- I had skimmed over those specific remarks by David, instead
focusing on his criticism of Greg Lindahl. But the point stands that
calculating an inverse is almost always the wrong thing to do when
solving linear equations; it's just that in this particular case, the
extra overhead incurred by the inversion is small compared to the
execution costs of the rest of the code.
cheers,
Rich
| |
| Greg Lindahl 2005-06-08, 8:58 pm |
| In article <Xazpe.1047$VK4.788@newsread1.news.atl.earthlink.net>,
David Frank <dave_frank@hotmail.com> wrote:
>Years ago we benchmarked a linear system solver here in CLF using inverse
>vs. NOT using inverse and found that the least-squares summation data
>processing takes 95% of the total time.
You were probably comparing apples to oranges. The way to avoid the
inverse is to do LU-decomposition followed by back-substitution with
one or more RH sides. This has the same operation count as the
inverse, but is much more numerically well-behaved.
Alas, Numerical Recipes (at least the edition I'm looking at) doesn't
really explain it, but it does say to use LU decomposition followed by
back-substitution and that "There is no better way to do it."
One would think that after years of being criticized you would have
spent the time to understand what other people were saying.
-- greg
| |
| Arjen Markus 2005-06-09, 8:57 am |
| Rich Townsend wrote:
>
> James Van Buskirk wrote:
>
> Agreed -- I had skimmed over those specific remarks by David, instead
> focusing on his criticism of Greg Lindahl. But the point stands that
> calculating an inverse is almost always the wrong thing to do when
> solving linear equations; it's just that in this particular case, the
> extra overhead incurred by the inversion is small compared to the
> execution costs of the rest of the code.
>
This is one of the reasons creating and using benchmarks is not a
sinecure! You have to know what it is exactly you are measuring and
scaling the results is a horrible exercise with today's computers,
as caching seems to have the most influence on the performance
once you get into large enough problems.
Regards,
Arjen
|
|
|
|
|