Linear Model

The scientific method is frequently used as a guided approach to learning. Linear statistical methods are widely used as part of this learning process.

Linear models describe a continuous response variable as a function of one or more predictor variables. They can help you understand and predict the behavior of complex systems or analyze experimental, financial, and biological data. Linear regression is a statistical method used to create a linear model.

Mathematical Modelling:

Let Y be the dependent variable of dimension n \times 1, let X_{n \times p} be the matrix of independent variables, and let \theta_1, \theta_2,\theta_3,\textrm{...},\theta_p and \sigma^2 be unknown parameters. Then the linear model can be written as:

Y_{n \times 1}=X_{n \times p} \theta_{p \times 1} + \epsilon_{n \times 1}

where E(\epsilon)=0 and D(\epsilon)=\sigma^2 I_n, with D(.) denoting the dispersion (covariance) matrix.
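To make the notation concrete, here is a minimal sketch that simulates data from this model in R. The sample size, the parameter values, and the use of normally distributed errors are illustrative assumptions; the model itself only requires errors with mean 0 and dispersion \sigma^2 I_n.

```r
# Simulate data from Y = X theta + epsilon (all numeric values are illustrative)
set.seed(1)                                           # for reproducibility
n <- 100                                              # number of observations
p <- 3                                                # number of parameters
X <- cbind(1, matrix(rnorm(n * (p - 1)), n, p - 1))   # n x p design matrix, first column is the intercept
theta <- c(2, -1, 0.5)                                # hypothetical "true" parameter vector (p x 1)
sigma <- 1.5                                          # hypothetical error standard deviation
epsilon <- rnorm(n, mean = 0, sd = sigma)             # errors: mean 0, dispersion sigma^2 I_n (normality assumed here)
Y <- drop(X %*% theta + epsilon)                      # response vector of length n

dim(X)      # 100 3
length(Y)   # 100
```

Simulated data of this kind is handy for checking the estimation formulas, because the true \theta and \sigma^2 are known.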


A linear model is typically fitted for one or more of the following purposes:

  • Prediction: Estimates of the individual parameters \theta_1, \theta_2, ..., \theta_p are of less importance for prediction than the overall influence of the x variables on y. However, good estimates are needed to achieve good prediction performance.
  • Data Description or Explanation: The scientist or engineer uses the estimated model to summarize or describe the observed data.
  • Parameter Estimation: The values of the estimated parameters may have theoretical implications for a postulated model.
  • Variable Selection or Screening: The emphasis is on determining the importance of each predictor variable in modeling the variation in y. The predictors that are associated with an important amount of variation in y are retained; those that contribute little are deleted.
  • Control of Output: A cause-and-effect relationship between y and the x variables is assumed. The estimated model might then be used to control the output of a process by varying the inputs. By systematic experimentation, it may be possible to achieve the optimal output.
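As an illustration of the prediction use case, the sketch below fits a model to R's built-in mtcars data and predicts the response at new inputs. The choice of dataset, the predictors wt and hp, and the values in new_cars are assumptions made only for this example.

```r
# Fit a linear model and predict the response for new predictor values
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Hypothetical new observations (values chosen only for illustration)
new_cars <- data.frame(wt = c(2.5, 3.5), hp = c(100, 150))

# Point predictions with 95% prediction intervals
pred <- predict(fit, newdata = new_cars, interval = "prediction", level = 0.95)
pred   # columns: fit (point prediction), lwr and upr (interval limits)
```

The interval here is for a single new observation; interval = "confidence" would instead give an interval for the mean response.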


Linear models can mainly be classified into three types:

  • Simple linear regression: models using only one predictor
  • Multiple linear regression: models using multiple predictors
  • Multivariate linear regression: models for multiple response variables
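In R, all three types can be fitted with lm(); only the model formula changes. The following sketch uses the built-in mtcars data, and the particular responses and predictors are arbitrary choices for illustration.

```r
# Simple linear regression: one predictor
simple_fit <- lm(mpg ~ wt, data = mtcars)

# Multiple linear regression: several predictors
multiple_fit <- lm(mpg ~ wt + hp, data = mtcars)

# Multivariate linear regression: several responses, bound together with cbind()
multivar_fit <- lm(cbind(mpg, qsec) ~ wt + hp, data = mtcars)

length(coef(simple_fit))     # 2: intercept and one slope
length(coef(multiple_fit))   # 3: intercept and two slopes
dim(coef(multivar_fit))      # 3 2: one column of coefficients per response
class(multivar_fit)          # "mlm" "lm"
```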

Parameter Estimation:

We apply the method of Least Squares to estimate the parameter \theta, which involves the minimization of the error sum of squares L, given by:

L=\epsilon'\epsilon=(y-X\theta)'(y-X\theta)=\sum_{i=1}^n(y_i-\sum_{j=1}^p x_{ij}\theta_j)^2

Differentiating L w.r.t. \theta and equating the derivative to 0, we obtain the following set of linear equations, also called the Normal Equations:

X'X\hat{\theta}=X'y

where \hat{\theta} is an estimator of \theta, referred to as the least squares estimate.
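The normal equations can be solved directly in R and compared with the output of lm(). This is a minimal sketch on the built-in mtcars data (an assumed example dataset); note that lm() itself uses a QR decomposition rather than forming X'X, so agreement is expected only up to numerical tolerance.

```r
# Build the design matrix and response
X <- model.matrix(~ wt + hp, data = mtcars)   # n x p, includes the intercept column
y <- mtcars$mpg

XtX <- crossprod(X)       # X'X
Xty <- crossprod(X, y)    # X'y

# Solve X'X theta = X'y for theta (requires X'X to be invertible)
theta_hat <- solve(XtX, Xty)

# Compare with the coefficients returned by lm()
fit <- lm(mpg ~ wt + hp, data = mtcars)
all.equal(unname(drop(theta_hat)), unname(coef(fit)))
```

Solving the linear system with solve(XtX, Xty) is generally preferable to computing the inverse of X'X explicitly.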

Predicted values are \hat{Y}=X\hat{\theta}=HY, where H=X(X'X)^{-1}X' is the hat matrix (defined when X'X is invertible). H is symmetric and idempotent, i.e. H'=H and H^2=H.
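The properties of the hat matrix are easy to check numerically. The sketch below again assumes the built-in mtcars data and verifies that H is symmetric, that HH = H, that its trace equals the number of parameters, and that HY reproduces the fitted values.

```r
X <- model.matrix(~ wt + hp, data = mtcars)   # n x p design matrix
y <- mtcars$mpg

# Hat matrix H = X (X'X)^{-1} X'
H <- X %*% solve(crossprod(X)) %*% t(X)

fit <- lm(mpg ~ wt + hp, data = mtcars)

is_idempotent <- isTRUE(all.equal(H %*% H, H))    # H H = H
is_symmetric  <- isTRUE(all.equal(H, t(H)))       # H' = H
trace_H       <- sum(diag(H))                     # approximately p = 3
same_fitted   <- isTRUE(all.equal(unname(drop(H %*% y)), unname(fitted(fit))))

c(is_idempotent, is_symmetric, same_fitted)
trace_H
```

The diagonal entries of H are the leverages, which base R also exposes through hatvalues(fit).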

Exercise: Check that the normal equations are consistent (i.e. admit a solution) whatever the rank of X.

(Hint: X'y \in C(X') \Rightarrow X'y \in C(X'X), where C(X) means the column space of X)
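A numerical check is not a proof, but it can build intuition for the exercise. A linear system is consistent exactly when the coefficient matrix and the augmented matrix have the same rank, so the sketch below compares the two ranks for a deliberately rank-deficient X. The simulated data and the use of qr() for a numerical rank are assumptions of this example.

```r
set.seed(2)
n  <- 50
x1 <- rnorm(n)
x2 <- rnorm(n)

# The last column is x1 + x2, so X has 4 columns but rank 3
X <- cbind(1, x1, x2, x1 + x2)
y <- rnorm(n)

XtX <- crossprod(X)       # X'X, singular here
Xty <- crossprod(X, y)    # X'y

rank_A  <- qr(XtX)$rank               # numerical rank of the coefficient matrix
rank_Ab <- qr(cbind(XtX, Xty))$rank   # numerical rank of the augmented matrix

c(rank_A, rank_Ab)   # equal ranks indicate a consistent system
```

Both ranks should come out below 4, so the system has infinitely many solutions rather than none.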


Now, suppose X \sim N(0,I_n) (here X denotes a random vector, not the design matrix) and A is a symmetric matrix. Then X'AX \sim \chi^2_{(k)} iff A is idempotent, where k=rank(A).
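This result can be explored by simulation. The sketch below builds a symmetric idempotent matrix A of rank k as a projection matrix, draws many standard normal vectors, and compares the sample mean and variance of the quadratic form with k and 2k, the mean and variance of a \chi^2_{(k)} variable. The dimensions, the seed, and the number of replications are arbitrary choices.

```r
set.seed(3)
n <- 10   # dimension of the random vector
k <- 3    # rank of A

# Projection onto the column space of a random n x k matrix:
# symmetric, idempotent, and of rank k
Z <- matrix(rnorm(n * k), n, k)
A <- Z %*% solve(crossprod(Z)) %*% t(Z)

# Draw the quadratic form x'Ax for many standard normal vectors x
q <- replicate(5000, {
  x <- rnorm(n)
  drop(t(x) %*% A %*% x)
})

c(mean = mean(q), var = var(q))   # compare with k and 2k
```

A quantile-quantile comparison against qchisq(ppoints(length(q)), df = k) would give a fuller picture than the first two moments.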

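Now, assume in addition that \epsilon \sim N(0,\sigma^2 I_n). We may then compute the MLE (Maximum Likelihood Estimator) of \theta and \sigma^2. After some routine calculations, we arrive at the following estimates:

\hat{\theta}=(X'X)^{-1}X'y

\hat{\sigma}^2_{ML}=\frac{1}{n}(y-X\hat{\theta})'(y-X\hat{\theta})

assuming rank of X is p. (Otherwise, we can use the Generalized Inverse, but let’s not go into that in this post.) The MLE of \sigma^2 is biased; the unbiased estimator \hat{\sigma}^2=\frac{1}{n-p}(y-X\hat{\theta})'(y-X\hat{\theta}) is commonly used instead, and it is the one referred to below.

Now, we can comment on the distributions of the estimates obtained.

\hat{\theta} \sim N(\theta,\sigma^2(X'X)^{-1})

(n-p)\frac{\hat{\sigma}^2}{\sigma^2} \sim \chi^2_{n-p}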
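These quantities can be computed by hand and compared with what lm() reports. In the sketch below (built-in mtcars data, chosen only for illustration), the residual sum of squares divided by n gives the maximum likelihood estimate of \sigma^2, while dividing by n-p gives the unbiased estimate that appears in the chi-square statement and that R uses in summary() and vcov().

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)
X <- model.matrix(fit)          # n x p design matrix
n <- nrow(X)
p <- ncol(X)

rss <- sum(residuals(fit)^2)    # residual sum of squares

sigma2_mle      <- rss / n          # maximum likelihood estimate (biased)
sigma2_unbiased <- rss / (n - p)    # unbiased estimate

# R reports the square root of the unbiased estimate as the residual standard error
all.equal(sigma2_unbiased, summary(fit)$sigma^2)

# Estimated dispersion matrix of the coefficient estimates: sigma^2 (X'X)^{-1}
V <- sigma2_unbiased * solve(crossprod(X))
all.equal(V, vcov(fit), check.attributes = FALSE)
```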

R code

Here are some useful R commands:

# Multiple Linear Regression Example
# (mydata is assumed to be a data frame with columns y, x1, x2 and x3)
fit <- lm(y ~ x1 + x2 + x3, data = mydata)   # fit the model
summary(fit)                                 # show results


# Other useful functions
coefficients(fit)             # model coefficients
confint(fit, level = 0.95)    # confidence intervals for model parameters
fitted(fit)                   # predicted values
residuals(fit)                # residuals
anova(fit)                    # anova table
vcov(fit)                     # covariance matrix for model parameters
influence(fit)                # regression diagnostics


No topic in statistics is fully understood until it has been applied to real data, so readers should try the method above on a real dataset. You can use R, MATLAB, or Python, whichever suits you best.

You may find datasets in the UCI Machine Learning Repository.
