Here is an example of an experiment to determine the effects of
column temperature (temp),
gas/liquid ratio (gas), and
packing height (pack)
on reducing the unpleasant odor (odor) of a chemical product that was being sold for household use.
> odor <- read.table("odor.data", header=T) # read in the data
> odor # take a look
The three predictors have been transformed from their original scales of measurement. For example, temp is coded as (Fahrenheit - 80)/40, so the coded values -1, 0, 1 correspond to original temperatures of 40, 80, and 120.
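The coding follows the usual pattern (original - center)/half-range; a quick sketch for the temperature variable, with the center 80 and half-range 40 read off the rescaling command used later in this session:

```r
# coded value = (original - center) / half-range
# for temperature: center 80, half-range 40
(c(40, 80, 120) - 80) / 40   # -1 0 1
```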
The data are presented in John (1971), Statistical Design and Analysis of Experiments, and give an example of a central composite design.
Suppose that we are interested in fitting the model:
odor = β0 + β1(temp) + β2(gas) + β3(pack) + ε
The X-matrix is:
> x <- as.matrix(cbind(rep(1,15),odor[,-1])) # remember to include the constant term
> x # take a look
Check whether the inner product of any two distinct columns is zero.
Here is the XTX matrix:
> t(x)%*%x # calculate XTX
The XTX is diagonal because of orthogonality.
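A compact way to check this all at once (a sketch continuing the session, with x as built above):

```r
xtx <- crossprod(x)            # same as t(x) %*% x
all(xtx[upper.tri(xtx)] == 0)  # TRUE exactly when all columns are mutually orthogonal
```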
Now, let's check what would happen if the temp term is analyzed on its original Fahrenheit scale.
> x[, 2] <- odor[, 2]*40+80 # change temp to its original scale
> x # take a look at the new model matrix
> t(x)%*%x # calculate XTX
Q: Why are the constant and temp terms no longer orthogonal?
Q: Why are the temp, gas, and pack terms still orthogonal to one another?
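A hint for the first question: the coded temp column summed to zero, but the Fahrenheit column does not. Centering it restores orthogonality with the constant (a sketch continuing the session, with x holding the rescaled temp column):

```r
xc <- x
xc[, 2] <- xc[, 2] - mean(xc[, 2])  # subtract the mean (80) from the Fahrenheit column
t(xc) %*% xc                        # the constant/temp entry is zero again
```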
Now fit the model.
> g <- lm(odor~temp+gas+pack, data=odor) # fit the model
> summary(g,cor=T) # take a look at the fitted model
Check out the correlation of the coefficients --- why did that happen?
Also, note that the standard errors for the three coefficients are equal, due to the balanced design.
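To see where the equal standard errors come from, recall that se(β-hat) is sigma times the square root of the diagonal of (X'X)^-1; a sketch continuing the session, using model.matrix() to recover the coded design matrix from the fit:

```r
x0 <- model.matrix(g)                                # the coded design matrix used by g
sqrt(diag(solve(crossprod(x0)))) * summary(g)$sigma  # reproduces the summary's std. errors
```

Because the design is balanced, the last three diagonal entries of (X'X)^-1 are identical, so the three slope standard errors match.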
Now, let us examine the effect of dropping variables when orthogonality exists.
> g1 <- lm(odor~gas+pack,data=odor) # drop temp and fit a reduced model
> summary(g1) # take a look at the fitted model
Compare the summary of the model g1 and that of model g.
Q: Which things have changed and which stayed the same? Explain why.
The estimated values of the β's do not change. The residual standard error does change slightly, which causes small changes in the standard errors of the β-hats, the t-values, and the p-values. But, in this case, these changes are not large enough to affect our qualitative conclusions.
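You can confirm the unchanged estimates directly (continuing the session; all.equal allows for floating-point rounding):

```r
all.equal(coef(g)[c("gas", "pack")], coef(g1)[c("gas", "pack")])  # TRUE: same estimates
```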
Can orthogonality still hold if we fit a more complicated model, such as:
model 1: odor = β0 + β1(temp) + β2(gas) + β3(pack) + β4(temp*gas) + β5(temp*pack) + β6(gas*pack) + ε, or
model 2: odor = β0 + β1(temp) + β2(gas) + β3(pack) + β4(temp*gas) + β5(temp*pack) + β6(gas*pack) + β7(temp^2) + β8(gas^2) + β9(pack^2) + ε
> summary(lm(odor~temp+gas+pack+I(temp*gas)+I(temp*pack)+I(gas*pack), data=odor), cor=T) # fit model 1
> summary(lm(odor~temp+gas+pack+I(temp*gas)+I(temp*pack)+I(gas*pack)+I(temp^2)+I(gas^2)+I(pack^2), data=odor), cor=T) # fit model 2
Take a guess before you read the results in R. A good design should keep the terms in the model as orthogonal as possible, even as the fitted model becomes more and more complicated.
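One way to answer for yourself: build the design matrix for model 2 and inspect X'X; any zero off-diagonal entry marks a pair of terms that stayed orthogonal (a sketch continuing the session):

```r
x2 <- model.matrix(~ temp + gas + pack + I(temp*gas) + I(temp*pack) +
                     I(gas*pack) + I(temp^2) + I(gas^2) + I(pack^2), data=odor)
round(crossprod(x2), 10)   # zero entries = pairs of terms still orthogonal
```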