Hypothesis Test (Reading: Faraway (2005, 1st edition), section 3.2)

Here is a data set that is built-in in Splus, called "saving.x". Details about the data can be found in Splus by the command "help(saving.x)", which returns the following results:

Savings Rates for Countries

SUMMARY:

The saving.x data set is originally from unpublished data of Arlie Sterling. It is a matrix with 50 rows representing countries and 5 columns representing variables. The data contain average saving rates over the time period of 1960-1970; the data are averaged to remove business cycle or other short-term fluctuations. Income is per-capita disposable income in U.S. dollars, growth is the percent rate of change in per capita disposable income, and savings rate is aggregate personal saving divided by disposable income.

SOURCE:

Belsley, D.A., Kuh, E., and Welsch, R.E. (1980). Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. New York: John Wiley and Sons, Inc.

Let us read the data into R:

> saving.x <- read.table("saving.x", header=T) # the option "header" is a logical value indicating whether the file contains the names of the variables as its first line. It is set to "TRUE" if and only if the first row contains one fewer field than the number of columns.
> saving.x # take a look of the data

Pop15 Pop75 Disp Growth Savings

1 29.35 2.87 2329.68 2.87 11.43

2 23.32 4.41 1507.99 3.93 12.07

3 23.80 4.43 2108.47 3.82 13.17

... much deleted ...

50 47.20 0.66 242.69 5.08 4.71

The dataset has 50 observations and 5 variables. The variables are:

Pop15	the percentage population under 15
Pop75	the percentage population over 75
Disp	per-capita disposable income in US dollars
Growth	the percent rate of change in per capita disposable income
Savings	aggregate personal saving divided by disposable income

Let's assign the variables:

> p15 <- saving.x[,1] # save the 1st column as p15
> p75 <- saving.x[,2] # save the 2nd column as p75
> inc <- saving.x[,3] # save the 3rd column as inc
> gro <- saving.x[,4] # save the 4th column as gro
> sav <- saving.x[,5] # save the 5th column as sav

Now, let us fit a simple model, with the variable Savings as the response and the rest variables as predictors, i.e.,

sav_i	=	β₀ + β_p15p15_i + β_p75p75_i + β_incinc_i + β_grogro_i + ε_i,
		i=1, 2, ..., 50.

In the following, we demonstrate how the t-statistics and overall F-test can be obtained from R, or by doing the calculations directly.

> g <- lm(sav~p15+p75+inc+gro) # fit the model with sav as response and the rest as predictors
> summary(g) # make a note of the p-values for the overall F-test and the t-test H₀: b_p15=0

Call:
lm(formula = sav ~ p15 + p75 + inc + gro)

Residuals:
Min 1Q Median 3Q Max
-8.2423 -2.6859 -0.2491 2.4278 9.7509

Coefficients:
               Estimate Std. Error t value   Pr(>|t|)
(Intercept) 28.5666100   7.3544986    3.884   0.000334 ***
p15          -0.4612050   0.1446425   -3.189   0.002602 **
p75          -1.6915757   1.0835862   -1.561   0.125508
inc          -0.0003368   0.0009311   -0.362   0.719296
gro           0.4096998   0.1961961    2.088   0.042468 *
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

Residual standard error: 3.803 on 45 degrees of freedom
Multiple R-Squared: 0.3385, Adjusted R-squared: 0.2797
F-statistic: 5.756 on 4 and 45 DF, p-value: 0.0007902

(Q: There are many quantities and many information in the output. What have caught your attention? For different datasets, you might find their most important/useful information existing in different statistics. For this dataset, which statistics/quantities are particularly important?)

Notice that

because the overall F-test (H₀: β_p15=β_p75=β_inc=β_gro=0) has a small p-value, 0.0007902, the null hypothesis that none of the predictors are useful in explaining the response is rejected.
Q: Why the degrees of freedom for the overall F-test are 4 and 45?
although the overall F-test is significant, the R²(=0.3385) is not very high. Only 33.85% of the variation in Savings is explained by the predictors.
- However, it is difficult to judge whether such an R² is much lower than expectation when we don't have further background information about the data.
- Be aware of the relationship between R² and the test statistics of the overall F-test
  
  > (0.3385/(1-0.3385))*(45/4) # calculate the F test statistic from R²
  
  [1] 5.756803
  
  > (5.756*(4/45))/(1+5.756*(4/45)) # calculate R² from the F test statistic
  
  [1] 0.3384688
t-value = Estimate/(Std. Error) (Q: Why?)
because the p-value of the t-test H₀: β_p15=0 is very small, we can conclude that the null should be rejected. But, remember that
- the t-test of p15 is relative to the other predictors in the model, namely, p75, inc, gro, and
- it is not possible to look at the effect of p15 in isolation, i.e, simply stating the null as H₀: β_p15=0 is insufficient because information about what other predictors are not included
- the result of the test may be different if the predictors change
Q: Is the significance for the intercept, i.e, H₀: β₀=0, important?

Test of all the predictors

Now, let's do the calculation directly. First, the overall F-statistic --- we will perform the test of H₀: b_p15=b_p75=b_inc=b_gro=0 "by hand". For the test,

W:	sav=	b₀+ b_p15p15 + b_p75p75 + b_incinc + b_grogro + e, and
w:	sav=	b₀ + e.

> ip <- function(x,y) sum(x*y) # write a function for inner product, use "help("function")" to find out how to write a self-defined function in R
> ip(g$res, g$res) # the RSS_W

[1] 650.706

> g$df # the degrees of freedom of RSS_W, i.e., n-p

[1] 45

> ip(sav-mean(sav), sav-mean(sav)) # the RSS_w, which is TSS in this case

[1] 983.6283

> ((983.6283-650.706)/4)/(650.706/45) # the overall F-statistics

[1] 5.755865

> 1-pf(5.755865, 4, 45) # the p-value of the overall F-test, the command "pf" returns the cdf value of an F distribution

[1] 0.0007902025

A convenient way in R to calculate the F-statistics and its p-value is as follows:

> g1 <- lm(sav~1) # fit the null model w
> summary(g1) # take a look

Call:

lm(formula = sav ~ 1)

Residuals:

Min 1Q Median 3Q Max

-9.071 -2.701 0.839 2.947 11.429

Coefficients:

Estimate Std. Error t value Pr(>|t|)

(Intercept) 9.6710 0.6336 15.26 <2e-16 ***

---

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 4.48 on 49 degrees of freedom

> anova(g1, g) # compare the two models W and w, the command "anova" computes the analysis of variance (or deviance) tables for one or more fitted models. Use "help(anova)" to get more information.

Analysis of Variance Table

Model 1: sav ~ 1
Model 2: sav ~ p15 + p75 + inc + gro
   Res.Df      RSS   Df   Sum of Sq       F     Pr(>F)
1      49   983.63
2      45   650.71    4      332.92 5.7559 0.0007902 ***
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

Check if it agrees with the result of the overall F-test in the summary of g.

In many textbooks, you can find the common format of ANOVA table as follows. From the above overall F-test, can you fill in its ANOVA table?

Source	Degree of Freedom	Sum of Square	Mean Square	F
Regression	p-1	SS_reg	SS_reg/(p-1)	F = [SS_reg/(p-1)]/[RSS/(n-p)]
Residual	n-p	RSS	RSS/(n-p)
Total	n-1=(p-1)+(n-p)	TSS=SS_reg+RSS

Testing just one predictor

Now, let's do the t-test, H₀: b_p15=0, by hand. For the test,

W:	sav=	b₀+ b_p15p15 + b_p75p75 + b_incinc + b_grogro + e, and
w:	sav=	b₀ + b_p75p75 + b_incinc + b_grogro +* e.

> g2 <- lm(sav~p75+inc+gro) # fit the model without p15
> ip(g2$res, g2$res) # the RSS_w

[1] 797.7234

> (797.7234-650.706)/(650.706/45) # the F-statistic

[1] 10.16708

> sqrt(10.16708) # t-statistics is square root of the F-statistics,

[1] 3.188586

Check if it agrees with the result in summary of g (except for the sign).

> 2*(1-pt(3.188586, 45)) # compute the tail probability --- the command "pt" returns the cdf value of a t distribution. (Q: why multiplied by 2? Ans: it's a two-sided test)

[1] 0.002602461

It should agree with the p-value in summary of g.

> anova(g2, g) # a convenient way to compare two models, one nested in another

Analysis of Variance Table

Model 1: sav ~ p75 + inc + gro

Model 2: sav ~ p15 + p75 + inc + gro

Res.Df RSS Df Sum of Sq F Pr(>F)

1 46 797.72

2 45 650.71 1 147.02 10.167 0.002602 **

---

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Testing a pair of predictors

We can find that the t-tests for b_p75 and b_inc are both insignificant from the summary of g (Q: Does this mean that both p75 and inc can be directly eliminated from the model?). We may therefore be interested in testing the pair of predictors: H₀: b_p75=b_inc=0. For the test,

W:	sav=	b₀+ b_p15p15 + b_p75p75 + b_incinc + b_grogro + e, and
w:	sav=	b₀ + b_p15p15 + b_grogro + e.

> g3 <- lm(sav~p15+gro) # fit the model without p75 and inc
> anova(g3, g) # a convenient way to compare two models, one nested in another

Analysis of Variance Table

Model 1: sav ~ p15 + gro

Model 2: sav ~ p15 + p75 + inc + gro

Res.Df RSS Df Sum of Sq F Pr(>F)

1 47 700.55

2 45 650.71 2 49.84 1.7234 0.1900

Because the p-value is not extreme, there is not enough evidence to reject H₀: b_p75=b_inc=0 when p15 and gro are in the model.

Testing a subspace w 　

We might be interested in testing the hypothesis that the effect of young and old people on the savings rate (with estimated value -0.4612 and -1.6915, respectively, in the summary of g) was the same, i.e., test H₀: b_p15=b_p75. For the test,

W:	sav=	b₀+ b_p15p15 + b_p75p75 + b_incinc + b_grogro + e, and
w:	sav=	b₀ + b_p15(p15+p75) + b_incinc + b_grogro +* e.

> g4 <- lm(sav~I(p15+p75)+inc+gro) # The function "I()" ensures that the argument is evaluated rather than interpreted as part of the model formula
> anova(g4, g) # a convenient way to compare two models, one nested in another

Analysis of Variance Table

Model 1: sav ~ I(p15 + p75) + inc + gro

Model 2: sav ~ p15 + p75 + inc + gro

Res.Df RSS Df Sum of Sq F Pr(>F)

1 46 673.62

2 45 650.71 1 22.92 1.5849 0.2146

We find that

the p-value of 0.2146 indicates that the null cannot be rejected here
it means that there is not evidence that young and old people need to be treated separately in the context of this particular model.

Testing a subset w 　

Suppose that we want to test whether one of the coefficients can be set to a particular value. For example, test H₀: b_gro=1 (the estimated value of b_gro is 0.4097 in the summary of g). For the test,

W:	sav=	b₀+ b_p15p15 + b_p75p75 + b_incinc + b_grogro + e, and
w:	sav=	b₀ + b_p15p15 + b_p75p75 + b_incinc + gro +* e.

Notice that now in w there is a fixed coefficient on the gro term. Such a fixed term in the regression equation is called an offset.

> g5 <- lm(sav~p15+p75+inc+offset(gro)) # An offset is a term to be added to a linear predictor. The command "offset" is used to include an offset into model
> anova(g5, g) # a convenient way to compare two models, one nested in another

Analysis of Variance Table

Model 1: sav ~ p15 + p75 + inc + offset(gro)

Model 2: sav ~ p15 + p75 + inc + gro

Res.Df RSS Df Sum of Sq F Pr(>F)

1 46 781.61

2 45 650.71 1 130.90 9.0524 0.004286 **

---

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We see that the p-value is large and the null hypothesis here is rejected.

An alternative is to use t-test:

> (0.4096998-1)/0.1961961 # t-statistic (Q: where in the summary of g can you find the estimate of b_gro and its standard error?)

[1] -3.008725

> 2*pt(-3.008725, 45) # p-value

[1] 0.004286205

> (-3.008725)^2 # agree with the F-statistic?

[1] 9.052426

Note that in the model g5, these is an offset term (i.e., gro). To fit a linear model with offset, can we use the response minus the offset (i.e., sav-gro) as a new response? That is, fit the model

sav - gro = b₀ + b_p15*p15 + b_p75*p75 + b_inc*inc + e.

Let us check how R handle the two models.

> g6 <- lm(I(sav-gro)~p15+p75+inc) # fit a model using the same predictors and a new response which equals the original response minus the offset

> summary(g6) # take a look of the fitted model g6

Call:

lm(formula = I(sav - gro) ~ p15 + p75 + inc)

Residuals:

Min 1Q Median 3Q Max

-10.1949 -2.1483 -0.2972 2.4888 8.8658

Coefficients:

Estimate Std. Error t value Pr(>|t|)

(Intercept) 24.4005843 7.8296970 3.116 0.00315 **

p15 -0.4166095 0.1559668 -2.671 0.01042 *

p75 -1.8699958 1.1728456 -1.594 0.11769

inc 0.0003790 0.0009759 0.388 0.69954

---

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 4.122 on 46 degrees of freedom

Multiple R-Squared: 0.2208, Adjusted R-squared: 0.17

F-statistic: 4.346 on 3 and 46 DF, p-value: 0.008888

> summary(g5) # take a look of the fitted model g5

Call:

lm(formula = sav ~ p15 + p75 + inc + offset(gro))

Residuals:

Min 1Q Median 3Q Max

-10.1949 -2.1483 -0.2972 2.4888 8.8658

Coefficients:

Estimate Std. Error t value Pr(>|t|)

(Intercept) 24.4005843 7.8296970 3.116 0.00315 **

p15 -0.4166095 0.1559668 -2.671 0.01042 *

p75 -1.8699958 1.1728456 -1.594 0.11769

inc 0.0003790 0.0009759 0.388 0.69954

---

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 4.122 on 46 degrees of freedom

Multiple R-Squared: 0.4523, Adjusted R-squared: 0.4166

F-statistic: 12.66 on 3 and 46 DF, p-value: 3.67e-06

Compare the summary outputs of g6 and g5 and find what values are identical and what are different. Explain why they are identical or different?

> cbind(g5$fit, g6$fit, g6$fit+gro) # compare the fitted values under g5 and g6 models

[,1] [,2] [,3]

1 10.559130 7.689130 10.559130

2 10.940081 7.010081 10.940081

3 10.820283 7.000283 10.820283

... much deleted ...

50 8.674396 3.594396 8.674396

> sum(g5$fit*g5$res) # the inner product of fitted values and residuals under g5.

[1] -221.7506

Note that in g5, the fitted values and residuals are not orthogonal.

> sum(g6$fit*g6$res) # the inner product of fitted values and residuals under g6.

[1] -2.220446e-14

Note that in g6, the fitted values and residuals are orthogonal.

For your information, to test a general linear hypothesis H₀:Ab=c in R, you can use the "linear.hypothesis" command in "car" package. Take a look of this document for more information.