goodness of fit for linear regression
|
| I am wondering how I can calculate the goodness-of-fit
measures (R-square, adjusted R-square, RMSE, etc.) without using
the Curve Fitting Toolbox. Any advice would be appreciated!
| |
| Bhanu Chandar 2008-01-10, 4:41 am |
| "vicky " <vivek_mutalik@yahoo.com> wrote in message
<fm4fte$546$1@fred.mathworks.com>...
> I am wondering how I can calculate the goodness-of-fit
> measures (R-square, adjusted R-square, RMSE, etc.) without
> using the Curve Fitting Toolbox. Any advice would be appreciated!
You can use the correlation coefficient; the MATLAB command
is corrcoef, which compares the actual data with the fitted
data.
| |
| Greg Heath 2008-01-10, 4:41 am |
| On Jan 10, 2:02 am, "vicky " <vivek_muta...@yahoo.com> wrote:
> I am wondering how I can calculate the goodness-of-fit
> measures (R-square, adjusted R-square, RMSE, etc.) without using
> the Curve Fitting Toolbox. Any advice would be appreciated!
Implement the linear model using the backslash operator.
Calculate the residuals and SSE.
Plug SSE into the formulas for the GOF measures.
Or you can use other regression functions:
help regress
help regstats
help stepwisefit
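The three steps above (backslash fit, residuals and SSE, GOF formulas) might be sketched as follows. This is only an illustration on synthetic data: the variable names (Xa, SST, adjR2, and so on) and the data are assumptions, not part of the thread, and rng requires a MATLAB release newer than the one current when this thread was written. The formulas are the usual textbook ones for a model with an intercept.

```matlab
% Synthetic data for illustration only (not from the thread)
rng(0);                              % reproducible random numbers
m = 50;  n = 2;                      % observations, explanatory variables
X = randn(m, n);
Y = 1 + X*[2; -3] + 0.1*randn(m, 1);

% Step 1: fit the linear model with the backslash operator
Xa = [ones(m,1) X];                  % add a column of ones for the intercept
B  = Xa \ Y;                         % least-squares coefficients, B(1) = intercept

% Step 2: residuals and SSE
Yhat = Xa*B;                         % fitted values
res  = Y - Yhat;                     % residuals
SSE  = sum(res.^2);                  % sum of squared errors

% Step 3: plug SSE into the goodness-of-fit formulas
SST   = sum((Y - mean(Y)).^2);       % total sum of squares about the mean
p     = size(Xa, 2);                 % number of fitted coefficients (n+1)
R2    = 1 - SSE/SST;                 % R-square
adjR2 = 1 - (SSE/(m - p)) / (SST/(m - 1));   % adjusted R-square
RMSE  = sqrt(SSE/(m - p));           % some texts divide by m instead of m - p

fprintf('R2 = %.4f, adjusted R2 = %.4f, RMSE = %.4f\n', R2, adjR2, RMSE);
```

Dividing SSE by m - p (the error degrees of freedom) is one common convention for RMSE; check the reference text you follow, since the divisor varies.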
Hope this helps.
Greg
| |
| Greg Heath 2008-01-10, 4:41 am |
| On Jan 10, 2:22 am, "Bhanu Chandar"
<bhanu.brahmanapa...@mathworks.com> wrote:
> You can use the correlation coefficient; the MATLAB command
> is corrcoef, which compares the actual data with the fitted
> data.
Although slope information is important,
you have to include intercept information
to adequately compare a linear fit.
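One way to see the point about the intercept is the small sketch below, which uses synthetic data and illustrative variable names (both are assumptions, not from the thread). Adding a constant offset to the fitted values leaves the correlation coefficient unchanged, while an error measure such as RMSE gets much worse.

```matlab
% Synthetic data for illustration only
rng(0);
x  = (1:20)';
y  = 3 + 2*x + randn(20, 1);         % "actual" data
Xa = [ones(20,1) x];

yhat_good = Xa * (Xa \ y);           % least-squares fit
yhat_bad  = yhat_good + 10;          % same slope, wrong intercept

C = corrcoef(y, yhat_good);  r_good = C(1,2);
C = corrcoef(y, yhat_bad);   r_bad  = C(1,2);

rmse_good = sqrt(mean((y - yhat_good).^2));
rmse_bad  = sqrt(mean((y - yhat_bad).^2));

% The two correlations agree to rounding error; the two RMSE values do not
fprintf('r:    good = %.4f, bad = %.4f\n', r_good, r_bad);
fprintf('RMSE: good = %.4f, bad = %.4f\n', rmse_good, rmse_bad);
```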
Hope this helps.
Greg
| |
|
| Thanks for your suggestions.
The most important issue is that, in my multiple linear equations
Y = BX, the matrix X is not square (more rows than columns, i.e. an
overdetermined problem) and is not full rank. :(
So, of the infinitely many solutions, 'backslash' and 'pinv' each
return one, and the pinv solution has minimum norm.
In short, I'm using PINV to get my
regression coefficients and to calculate the predicted 'Yhat'.
If I use 'regress' or 'regstats', they call the backslash
operator to do the regression, and the statistics they output do
not make sense (negative R-square, even after I included a constant
term). So I was wondering whether there is any function like
'regress' or 'regstats' that I can use to obtain goodness-of-fit
statistics while using PINV.
Thanks again,
| |
| Greg Heath 2008-01-12, 10:35 pm |
| On Jan 10, 1:16 pm, "vicky " <vivek_muta...@yahoo.com> wrote:
> Thanks for your suggestions.
> The most important issue is that, in my multiple linear equations
> Y = BX, the matrix X is not square (more rows than columns, i.e. an
> overdetermined problem) and is not full rank. :(
No.
size(Y) = [m 1] % m = number of observations
size(X) = [m n] % n = number of explanatory variables
Xa = [ones(m,1) X]; % augmented X
% (to account for the intercept)
Y = Xa*B + e; % e = linear model error
size(B) = [n+1 1] % intercept = B(1)
This is the standard scenario for linear regression
with m >= n so that, usually, X is full rank.
> So, of the infinitely many solutions, 'backslash' and 'pinv' each
> return one, and the pinv solution has minimum norm.
No.
If the system were underdetermined there would
probably be an infinite number of solutions.
However, since the system is overdetermined,
an exact solution probably does not exist.
However, do not worry. This is the standard
OLS (Ordinary Least Squares) scenario.
The standard approach is to find a B that
minimizes || Y-Xa*B ||.
If X is full rank (and for m > n, it usually is),
both BACKSLASH and PINV will provide that solution.
If X is rank deficient (and it usually is not when
m > n), the minimizer is not unique and there will
be an infinite number of solutions. Both BACKSLASH
and PINV will yield solutions. The BACKSLASH solution
will usually have a maximum number of zero coefficients,
whereas the PINV solution will usually have no zero
coefficients but will have minimum norm.
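The difference between the two solvers on a rank-deficient design can be checked with a small sketch like the one below. The data are synthetic and the variable names are illustrative assumptions. The third predictor is built as the sum of the first two, so the augmented matrix has four columns but rank three; backslash will print a rank-deficiency warning when it runs.

```matlab
% Synthetic, deliberately collinear data for illustration only
rng(0);
m  = 30;
x1 = randn(m,1);  x2 = randn(m,1);
X  = [x1 x2 x1+x2];                  % third column depends on the first two
Y  = 1 + x1 - 2*x2 + 0.1*randn(m,1);
Xa = [ones(m,1) X];                  % 30-by-4, but rank(Xa) is 3

B_bs   = Xa \ Y;                     % basic solution (warns: rank deficient)
B_pinv = pinv(Xa) * Y;               % minimum-norm solution

SSE_bs   = sum((Y - Xa*B_bs).^2);
SSE_pinv = sum((Y - Xa*B_pinv).^2);

disp([B_bs B_pinv]);                 % coefficients differ between the two
fprintf('norm(B): backslash = %.4f, pinv = %.4f\n', norm(B_bs), norm(B_pinv));
fprintf('SSE:     backslash = %.6f, pinv = %.6f\n', SSE_bs, SSE_pinv);
```

Both coefficient vectors minimize || Y-Xa*B ||, so the fitted values and SSE agree to rounding error even though the coefficients themselves do not. Any goodness-of-fit measure built from SSE is therefore the same for either solver.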
> In short, I'm using PINV to get my
> regression coefficients and to calculate the predicted 'Yhat'.
>
> If I use 'regress' or 'regstats', they call the backslash
> operator to do the regression, and the statistics they output do
> not make sense (negative R-square, even after I included a constant
> term). So I was wondering whether there is any function like
> 'regress' or 'regstats' that I can use to obtain goodness-of-fit
> statistics while using PINV.
You don't need any MATLAB functions. Just
Calculate the residuals and SSE
Plug SSE into the formulas for the GOF measures
obtained from reference texts.
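For a PINV fit, the same recipe might look like the sketch below. The data and names are synthetic assumptions. Using rank(Xa) rather than the number of columns for the degrees of freedom is a choice made here for the rank-deficient case, not something stated in the thread, so compare it against your reference text.

```matlab
% Synthetic rank-deficient design for illustration only
rng(1);
m  = 40;
x1 = randn(m,1);  x2 = randn(m,1);
Xa = [ones(m,1) x1 x2 x1-x2];        % last column is collinear; rank is 3
Y  = 2 + 0.5*x1 + x2 + 0.2*randn(m,1);

B    = pinv(Xa) * Y;                 % minimum-norm least-squares coefficients
Yhat = Xa * B;                       % predicted values
SSE  = sum((Y - Yhat).^2);           % sum of squared residuals
SST  = sum((Y - mean(Y)).^2);        % total sum of squares about the mean

r   = rank(Xa);                      % independent fitted parameters (assumption)
dfe = m - r;                         % error degrees of freedom

R2    = 1 - SSE/SST;
adjR2 = 1 - (SSE/dfe) / (SST/(m - 1));
RMSE  = sqrt(SSE/dfe);

fprintf('R2 = %.4f, adjusted R2 = %.4f, RMSE = %.4f\n', R2, adjR2, RMSE);
```

When the model contains a constant column and B is a least-squares solution, SSE cannot exceed SST, so R-square computed this way lies between 0 and 1. A negative value usually means that the intercept is missing or that R-square was computed with a different formula.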
Hope this helps.
Greg
| |
|
| Thanks for your comments and suggestions.
I have a design matrix with m > n (more rows than columns), and
while solving this using backslash, I get the warning:
"matrix is rank deficient to within machine precision."
I think that having full rank is not related to the
dimensions of the matrix; rather, the rank deficiency is due to
inter-relatedness of the variables (collinearity).
I am using PINV to solve the equations, though I'm not sure
how much faith I can put in the results!
I followed your suggestion. From the residuals and SSE, I've
calculated the other GOF parameters. Thanks again.
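The observation about collinearity can be checked numerically. The sketch below uses synthetic data with illustrative names (assumptions, not from the thread). It builds a tall matrix whose third column is a linear combination of the first two, then inspects its rank and singular values.

```matlab
% Synthetic tall matrix with a collinear column, for illustration only
rng(2);
m = 25;
a = randn(m,1);  b = randn(m,1);
X = [a b 2*a-b];                     % 25 rows, 3 columns, 2 independent columns

r   = rank(X);                       % numerical rank
s   = svd(X);                        % singular values, largest first
tol = max(size(X)) * eps(max(s));    % default tolerance used by rank

fprintf('size = %d-by-%d, rank = %d\n', size(X,1), size(X,2), r);
fprintf('smallest singular value = %.3g (tolerance %.3g)\n', s(end), tol);
```

A singular value below the tolerance signals that some combination of the columns is numerically zero, which is what the rank-deficiency warning from backslash reports.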