Interpretation of parameter estimates (Reading: Faraway(2005, 1st edition), 3.7)

 

Here is an example to demonstrate the difficulty of parameter interpretation. Let us use the saving data introduced in previous lab.

> saving.x <- read.table("saving.x",header=T) # read the data into R

> p15 <- saving.x[,1]; p75 <- saving.x[,2]; inc <- saving.x[,3]; gro <- saving.x[,4]; sav <- saving.x[,5] # assign columns

 


In the following, we will

Let us start from the model with all predictors.

> g1 <- lm(sav~p15+p75+inc+gro) # fit the model with all four predictors

> summary(g1, cor=T) # take a look of the fitted model

Call:

lm(formula = sav ~ p15 + p75 + inc + gro)

 

Residuals:

    Min      1Q  Median      3Q     Max

-8.2423 -2.6859 -0.2491  2.4278  9.7509

 

Coefficients:

              Estimate Std. Error t value Pr(>|t|)   

(Intercept) 28.5666100  7.3544986   3.884 0.000334 ***

p15         -0.4612050  0.1446425  -3.189 0.002602 **

p75         -1.6915757  1.0835862  -1.561 0.125508   

inc         -0.0003368  0.0009311  -0.362 0.719296   

gro          0.4096998  0.1961961   2.088 0.042468 * 

---

Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

 

Residual standard error: 3.803 on 45 degrees of freedom

Multiple R-Squared: 0.3385,     Adjusted R-squared: 0.2797

F-statistic: 5.756 on 4 and 45 DF,  p-value: 0.0007902

 

Correlation of Coefficients:

    (Intercept) p15   p75   inc 

p15 -0.98                        

p75 -0.81        0.77           

inc -0.17        0.18 -0.37     

gro -0.19        0.10 -0.05  0.26

Notice that

 

Now, let's drop p15 from the model.

Let us find out.

> g2 <- lm(sav~p75+inc+gro) # fit the model with predictors p75, inc, gro
> summary(g2, cor=T) # take a look of the fitted model

Call:

lm(formula = sav ~ p75 + inc + gro)

 

Residuals:

    Min      1Q  Median      3Q     Max

-8.0577 -3.2144  0.1689  2.4260 10.0763

 

Coefficients:

             Estimate Std. Error t value Pr(>|t|)   

(Intercept) 5.4875193  1.4276623   3.844  0.00037 ***

p75         0.9527173  0.7637785   1.247  0.21857   

inc         0.0001974  0.0010031   0.197  0.84482   

gro         0.4738062  0.2137268   2.217  0.03161 * 

---

Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

 

Residual standard error: 4.164 on 46 degrees of freedom

Multiple R-Squared: 0.189,      Adjusted R-squared: 0.1361

F-statistic: 3.573 on 3 and 46 DF,  p-value: 0.02093

 

Correlation of Coefficients:

    (Intercept) p75   inc 

p75 -0.49                 

inc  0.06       -0.80     

gro -0.50       -0.21  0.24

 

Let's see what happens when we drop inc from the model.

> g3 <- lm(sav~p75+gro) # fit the model with p75 and gro
> summary(g3, cor=T) # take a look of the fitted model

Call:

lm(formula = sav ~ p75 + gro)

 

Residuals:

     Min       1Q   Median       3Q      Max

-8.02229 -3.29489  0.08895  2.45702 10.10688

 

Coefficients:

            Estimate Std. Error t value Pr(>|t|)   

(Intercept)   5.4695     1.4101   3.879 0.000325 ***

p75           1.0726     0.4563   2.351 0.022992 * 

gro           0.4636     0.2052   2.259 0.028562 * 

---

Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

 

Residual standard error: 4.122 on 47 degrees of freedom

Multiple R-Squared: 0.1883,     Adjusted R-squared: 0.1538

F-statistic: 5.452 on 2 and 47 DF,  p-value: 0.007423

 

Correlation of Coefficients:

    (Intercept) p75 

p75 -0.73           

gro -0.53       -0.03

 

Let's try dropping gro.

> g4 <- lm(sav~p75) # fit the model only with p75

> summary(g4, cor=T) # take a look of the fitted model

Call:

lm(formula = sav ~ p75)

 

Residuals:

     Min       1Q   Median       3Q      Max

-9.26566 -3.22947  0.05428  2.33359 11.84979

 

Coefficients:

            Estimate Std. Error t value Pr(>|t|)   

(Intercept)   7.1517     1.2475   5.733  6.4e-07 ***

p75           1.0987     0.4753   2.312   0.0251 * 

---

Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

 

Residual standard error: 4.294 on 48 degrees of freedom

Multiple R-Squared: 0.1002,     Adjusted R-squared: 0.08144

F-statistic: 5.344 on 1 and 48 DF,  p-value: 0.02513

 

Correlation of Coefficients:

    (Intercept)

p75 -0.87     

 

Let's compare the coefficient estimates and p-values for p75 throughout.

model estimates sign significant?
g1 -1.6915757 - no
g2 0.9527173 + no
g3 1.0726 + yes
g4 1.0987 + yes

We see that

 


Notice that

Consider a prediction made using each of the four models above:

> x0 <- data.frame(p15=32, p75=3, inc=700, gro=3) # assign values for prediction
> predict(g1,x0,interval="prediction"); predict(g2,x0,interval="prediction"); predict(g3,x0,interval="prediction"); predict(g4,x0,interval="prediction") # calculate predicted values and confidence intervals using the four models

          fit      lwr      upr

[1,] 9.726689 1.795553 17.65782

          fit      lwr      upr

[1,] 9.905299 1.225853 18.58475

          fit      lwr      upr

[1,] 10.07808 1.672549 18.48361

          fit      lwr      upr

[1,] 10.44777 1.701873 19.19366

We can see