Essays/Linear Regression

Linear regression is a statistical method for modeling the relationship between a dependent variable Y and independent variables X by estimating the coefficients $b_{0},\ldots ,b_{p}$ of the linear form:

${\hat {y}}=b_{0}+b_{1}x_{1}+b_{2}x_{2}+...+b_{p}x_{p}$

where each term $x_{i}$ is an expression in the original independent variables ( $X^{(1)},\ldots ,X^{(k)}$ ). For example, with a single variable X, the terms could be $x_{1}=X,x_{2}=X^{2}$ .
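As a minimal sketch of how such terms can be laid out, the following builds a matrix with one row per observation and one column per term (the constant, $x_{1}=X$ and $x_{2}=X^{2}$). The sample values of `X` and the name `A` are hypothetical and serve only as illustration.

```j
NB. Hypothetical sample values of a single independent variable X
X =: 0 1 2 3
NB. One column per term: the constant 1, x1 = X, x2 = X^2
A =: 1 ,. X ,. X^2
```

Here `A` has 4 rows (observations) and 3 columns (terms); the model stays linear in the coefficients even though the second term is a nonlinear function of X.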

Least Squares Method

In the least squares method, the coefficients of the linear regression are chosen to minimize the sum of squared deviations between the observations and their estimates:

$\sum _{i=1}^{n}\left(Y_{i}-{\hat {y}}(X_{i})\right)^{2}\rightarrow \min $
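For reference, the minimization can be written in matrix form. The notation here is an assumption introduced for illustration: $A$ is the matrix whose $i$-th row is $(1,x_{1},\ldots ,x_{p})$ evaluated at observation $X_{i}$, $b=(b_{0},\ldots ,b_{p})^{T}$ is the coefficient vector, and $Y=(Y_{1},\ldots ,Y_{n})^{T}$ is the vector of observations. Setting the gradient of the sum of squares to zero gives the normal equations:

$\sum _{i=1}^{n}\left(Y_{i}-{\hat {y}}(X_{i})\right)^{2}=\|Y-Ab\|^{2},\qquad A^{T}Ab=A^{T}Y$

If the columns of $A$ are linearly independent, the solution is unique:

$b=\left(A^{T}A\right)^{-1}A^{T}Y$

In J, dyadic `%.` (matrix divide) returns this least-squares solution when the system has more observations than coefficients.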

Surface Fit Example

As an example, we will take a quadratic form in two variables

$y(x_{1},x_{2})=1+0.2x_{1}^{2}+0.3x_{2}-0.4x_{2}^{2}$

then add a small amount of noise to simulate observed data and try to reconstruct the coefficients using the least squares method.

Figures: the original form (`'surface'plot X1;X2;FORM`), the noisy data (`'surface'plot X1;X2;DATA`), and the fitted estimate (`'surface'plot X1;X2;COEF mp XMAT`).

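   load 'plot'
   mp =: +/ . *                NB. matrix product

   NB. X1 and X2 are 17-by-17 grids of coordinates taken from i:8 (_8 to 8)
   'X1 X2' =: |: ,"0/~ i:8
   NB. one 17-by-17 layer per term: 1, x1, x1^2, x2, x1*x2, x2^2
   $XMAT   =: 1 , X1 , (X1^2) , X2 , (X1*X2) ,: (X2^2)
6 17 17

   FORM    =: 1   0     0.2     0.3   0    _0.4 mp XMAT
   FORM    -: 1 + (0.2*X1^2) + (0.3*X2) + (_0.4*X2^2)
1

   NB. uniform noise between _2 and 2, from the fixed-seed generator ?.
   NOISE   =: 4 * _0.5 + ($X1) ?.@$ 0
   $DATA   =: FORM + NOISE
17 17
   NB. least-squares fit: 289 observations (rows) against 6 terms (columns)
   COEF    =: (,DATA) %. |:,"2 XMAT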

Now we can compare the coefficients obtained from the noisy data (first row) with those recovered from the exact form (second row), which match the original formula.

   0j4": COEF  ,: (,FORM) %. |:,"2 XMAT
1.0011 _0.0144 0.2005 0.3104 0.0024 _0.4013
1.0000  0.0000 0.2000 0.3000 0.0000 _0.4000

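Additional regression analysis is provided by the 'stats' package. Note that the constant layer is dropped from XMAT with `}.` in the call below; the output suggests that `regression` supplies the constant term (Var. 0) itself.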

   load 'stats'
   (|:}.,"2 XMAT) regression ,DATA

             Var.       Coeff.         S.E.           t
              0        1.00105        0.12654        7.91
              1       _0.01444        0.01375       _1.05
              2        0.20052        0.00316       63.55
              3        0.31036        0.01375       22.56
              4        0.00241        0.00281        0.86
              5       _0.40131        0.00316     _127.17

  Source     D.F.        S.S.          M.S.           F
Regression    5    27192.76720     5438.55344     4144.49
Error       283      371.36300        1.31224
Total       288    27564.13020

S.E. of estimate         1.14553
Corr. coeff. squared     0.98653

The $R^{2}$ value of 0.98653 (reported as "Corr. coeff. squared") shows a high degree of agreement between the observations and their estimates.
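As a check, and assuming the standard definition of $R^{2}$ as one minus the ratio of the error sum of squares to the total sum of squares, the value can be recomputed from the S.S. column of the table above:

$R^{2}=1-{\frac {SS_{Error}}{SS_{Total}}}=1-{\frac {371.36300}{27564.13020}}\approx 0.98653$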
