
Author goodness of fit for linear regression
vicky

2008-01-10, 4:41 am

I am wondering how I can calculate the goodness-of-fit
measures R-square, adjusted R-square, RMSE, etc. without using
the Curve Fitting Toolbox. Any advice would be appreciated!
Bhanu Chandar

2008-01-10, 4:41 am

"vicky " <vivek_mutalik@yahoo.com> wrote in message
<fm4fte$546$1@fred.mathworks.com>...
> I am wondering how I can calculate the goodness of fit
> measures R-square, adjusted R square, RMSE etc without using
> curve fitting tool box, Any advice would be appreciated!


You can use the correlation coefficient; the MATLAB command
is corrcoef. It compares the actual data with the fitted
data.
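
A minimal sketch of the corrcoef suggestion, assuming an ordinary least-squares fit that includes an intercept (in that case the squared correlation between observed and fitted values equals R-square). The data and variable names are made up for illustration.

```matlab
% Illustrative data (made up for this sketch)
x = (1:10)';
y = [2.1 3.9 6.2 7.8 10.1 12.2 13.8 16.1 18.0 20.2]';

% Straight-line least-squares fit (slope and intercept) and fitted values
p    = polyfit(x, y, 1);
yhat = polyval(p, x);

% corrcoef returns a 2-by-2 matrix; the off-diagonal entry is r
C       = corrcoef(y, yhat);
r       = C(1,2);
R2_corr = r^2;

% Same quantity from the SSE-based definition, for comparison
SSE    = sum((y - yhat).^2);
SST    = sum((y - mean(y)).^2);
R2_sse = 1 - SSE/SST;
```

The two values agree only because the fit is least squares with an intercept; for fitted values obtained some other way they can differ.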

Greg Heath

2008-01-10, 4:41 am

On Jan 10, 2:02 am, "vicky " <vivek_muta...@yahoo.com> wrote:
> I am wondering how I can calculate the goodness of fit
> measures R-square, adjusted R square, RMSE etc without using
> curve fitting tool box, Any advice would be appreciated!


Implement the linear model using the backslash
operator.
Calculate the residuals and SSE
Plug SSE into the formulas for the GOF measures.

or you can use other regression functions

help regress
help regstats
help stepwisefit
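
The backslash recipe above (fit, residuals, SSE, then the GOF formulas) can be sketched as follows. The data is made up, and the RMSE shown uses the residual degrees of freedom, which is one common convention; some texts divide by the number of observations instead.

```matlab
% Illustrative data (made up): 8 observations, 2 explanatory variables
X = [1 2; 2 1; 3 4; 4 3; 5 6; 6 5; 7 8; 8 7];
Y = [3.1; 2.9; 7.2; 6.8; 11.1; 10.9; 15.2; 14.8];

[m, n] = size(X);
Xa = [ones(m,1) X];          % add a column of ones for the intercept

B    = Xa \ Y;               % least-squares coefficients
Yhat = Xa * B;               % fitted values
e    = Y - Yhat;             % residuals

SSE = sum(e.^2);             % sum of squared errors
SST = sum((Y - mean(Y)).^2); % total sum of squares about the mean
p   = n + 1;                 % number of estimated coefficients

R2    = 1 - SSE/SST;
R2adj = 1 - (SSE/(m - p)) / (SST/(m - 1));
RMSE  = sqrt(SSE/(m - p));   % assumption: residual degrees of freedom in the denominator
```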

Hope this helps.

Greg
Greg Heath

2008-01-10, 4:41 am

On Jan 10, 2:22 am, "Bhanu Chandar"
<bhanu.brahmanapa...@mathworks.com> wrote:
> "vicky " <vivek_muta...@yahoo.com> wrote in message
> <fm4fte$54...@fred.mathworks.com>...
>
> You can use the correlation coefficient, the matlab comand
> is corrcoef. Which compares the actual data with the fitted
> data.


Although slope information is important,
you have to include intercept information
to adequately compare a linear fit.
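
One way to see the point: the correlation coefficient does not change when the fitted values are shifted or rescaled, so it cannot detect a wrong intercept or slope. A small sketch with made-up numbers (yhatA and yhatB are hypothetical fitted values, not output from any particular model):

```matlab
y     = [1 2 3 4 5]';
yhatA = y + [0.1 -0.1 0 0.1 -0.1]';  % a close fit
yhatB = 2*yhatA + 10;                % same shape, wrong slope and intercept

CA = corrcoef(y, yhatA);  rA = CA(1,2);
CB = corrcoef(y, yhatB);  rB = CB(1,2);

% The correlations are identical, but the errors are not
SSE_A = sum((y - yhatA).^2);
SSE_B = sum((y - yhatB).^2);
```

Here rA and rB match, while SSE_B is far larger than SSE_A, which is why the SSE-based measures are the safer comparison.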

Hope this helps.

Greg
vicky

2008-01-10, 7:27 pm

Thanks for your suggestions.
The most important issue is that, in my multiple linear equations
Y = BX, matrix X is not square (more rows than columns =
overdetermined problem) and is not full rank. :(
So of the infinitely many solutions, 'backslash' and 'pinv'
each return one, and the pinv solution has minimum norm.
In short, I'm using the PINV function to get my
regression coefficients and calculate the predicted 'Yhat'.

If I use 'regress' or 'regstats', they call the backslash
operator to do the regression, and the statistics output does
not make sense (negative R-square, even after I included a constant
term). So I was wondering whether there is any function like
'regress' or 'regstats' that I can use to obtain goodness-of-fit
statistics while using PINV.

Thanks again,

Greg Heath <heath@alumni.brown.edu> wrote in message
<06d8e9ac-0b17-4734-b4e2-f36d78e1a8e3@k39g2000hsf.googlegroups.com>...
> On Jan 10, 2:02 am, "vicky " <vivek_muta...@yahoo.com> wrote:
>
> Implement the linear model using the backslash
> operator.
> Calculate the residuals and SSE
> Plug SSE into the formulas for the GOF measures.
>
> or you can use other regression functions
>
> help regress
> help regstats
> help stepwisefit
>
> Hope this helps.
>
> Greg


Greg Heath

2008-01-12, 10:35 pm

On Jan 10, 1:16 pm, "vicky " <vivek_muta...@yahoo.com> wrote:
> Thanks for your suggestions.
> The most important issue is, in my multiple linear equations
> Y = BX, matrix X is not square (more rows than columns =
> overdetermined problem) and is not a full rank. :(


No.

[m, n] = size(X);     % m = number of observations
                      % n = number of explanatory variables
                      % size(Y) is [m 1]
Xa = [ones(m,1) X];   % augmented X
                      % (to account for the intercept)

% Model: Y = Xa*B + e   (e = linear model error)
% size(B) is [n+1 1]    (intercept = B(1))

This is the standard scenario for linear regression
with m > n so that, usually, X is full rank.
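
A short sketch of this setup with made-up numbers, checking the full-rank assumption before fitting (isFullRank is a name introduced here for illustration):

```matlab
% Illustrative design: m = 6 observations, n = 2 explanatory variables
X = [1 0; 0 1; 1 1; 2 1; 1 2; 3 1];
[m, n] = size(X);

Xa = [ones(m,1) X];                 % augmented X, size [m n+1]

% Full column rank means rank(Xa) equals the number of columns
isFullRank = (rank(Xa) == n + 1);
```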

> So out of two of the infinitely many solutions using
> 'backslash' and 'pinv' functions, pinv function gives
> minimum norm.


No.

If the system were underdetermined there would
probably be an infinite number of solutions.

However, since the system is overdetermined,
an exact solution probably does not exist.

However, do not worry. This is the standard
OLS (Ordinary Least Squares) scenario.

The standard approach is to find a B that
minimizes || Y-Xa*B ||.

If X is full rank (and for m > n, it usually is),
both BACKSLASH and PINV will provide that solution.

If X is rank deficient (and it usually is not when
m > n), the solution is not unique and there will
be an infinite number of solutions. Both BACKSLASH
and PINV will yield solutions. The BACKSLASH solution
will usually have a maximum number of zero coefficients
whereas the PINV solution will usually have no zero
coefficients but have a minimum norm.
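
A small sketch of the rank-deficient case, using a made-up design matrix whose third column is the sum of the first two. Both solvers minimize the same residual, so the fitted values should agree even though the coefficient vectors differ. Expect MATLAB to print a rank-deficiency warning on the backslash line.

```matlab
% Made-up rank-deficient design: column 3 = column 1 + column 2
Xa = [1 1 2; 1 2 3; 1 3 4; 1 4 5; 1 5 6];
Y  = [2.0; 2.9; 4.1; 5.0; 6.1];

B_bs   = Xa \ Y;          % one least-squares solution
B_pinv = pinv(Xa) * Y;    % the minimum-norm least-squares solution

% Fitted values are the same projection of Y onto the column space
Yhat_bs   = Xa * B_bs;
Yhat_pinv = Xa * B_pinv;

% Coefficient norms: pinv is never larger
norm_bs   = norm(B_bs);
norm_pinv = norm(B_pinv);
```

Because the fitted values match, SSE and every GOF measure built from it are the same for either solver; only the individual coefficients are not uniquely determined.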

> In short, im using PINV function to get my
> regression coefficients and calculate predicted 'Yhat'.
>
> If i use 'regress' or 'regstat', they call backslash
> operator to do the regression and the statistics output do
> not make sense (-ve R square, even after i included constant
> term). So i was wondering is there any function which i can
> use like 'regress' or 'regstat' to obtain goodness fit
> statistics while using PINV.


You don't need any MATLAB functions. Just

Calculate the residuals and SSE
Plug SSE into the formulas for the GOF measures
obtained from reference texts.
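
A hedged sketch of doing this with PINV on a rank-deficient design. The data is made up, and using rank(Xa) rather than the column count for the degrees of freedom is an assumption of this sketch, chosen because redundant columns do not add independent coefficients.

```matlab
% Made-up rank-deficient design: column 3 = column 1 + column 2
Xa = [1 1 2; 1 2 3; 1 3 4; 1 4 5; 1 5 6; 1 6 7];
Y  = [2.0; 2.9; 4.1; 5.0; 6.1; 6.9];
m  = size(Xa, 1);

B    = pinv(Xa) * Y;             % minimum-norm least-squares coefficients
Yhat = Xa * B;

SSE = sum((Y - Yhat).^2);
SST = sum((Y - mean(Y)).^2);

k   = rank(Xa);                  % assumption: effective number of coefficients
dfe = m - k;                     % residual degrees of freedom

R2    = 1 - SSE/SST;
R2adj = 1 - (SSE/dfe) / (SST/(m - 1));
RMSE  = sqrt(SSE/dfe);
```

With a least-squares solution and a column of ones in Xa, R2 computed this way lies between 0 and 1, so a negative value points to a problem elsewhere, such as a missing intercept column or fitted values that are not least-squares.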

Hope this helps.

Greg


vicky

2008-01-14, 4:38 am

Thanks for your comments and suggestions.

I have a design matrix with m > n (more rows than columns), and
while solving this using backslash, I get the warning:
"matrix is rank deficient to within machine precision."

I think that having full rank is not related to the
dimensions of the matrix but depends on how inter-related
the variables are (collinearity).
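
A small sketch of that point with a made-up matrix: it has more rows than columns, yet it is rank deficient because one column is a linear combination of the others. The null space shows which combination it is.

```matlab
% m = 5 rows, 3 columns, but column 3 = column 1 + column 2
Xa = [1 1 2; 1 2 3; 1 3 4; 1 4 5; 1 5 6];

r = rank(Xa);      % 2, not 3

% null returns an orthonormal basis for the null space of Xa;
% each basis vector is a set of weights for which Xa*N is zero
N = null(Xa);
N = N / N(1);      % rescale so the first weight is 1, for readability
```

After rescaling, N holds the weights 1, 1, -1, meaning column 1 + column 2 - column 3 = 0.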

I am using PINV to solve the equations, though I'm not sure
how much faith I can put in the results!

I followed your suggestion. From the residuals and SSE, I've
calculated the other GOF parameters. Thanks again.