Box-Cox Transformation (Reading: Faraway (2005, 1st edition), 7.1)


Transforming response

Does the response in the savings data need transformation? The boxcox function in the "MASS" library computes the Box-Cox profile log-likelihood for a fitted linear model. Load the library:

> library(MASS)

Try it out on the savings dataset:

> savings <- read.table("savings.data")

> g <- lm(sav ~ p15 + p75 + inc + gro, data=savings)
> boxcox(g, plotit=TRUE)

> boxcox(g, plotit=TRUE, lambda=seq(0.5,1.5,by=0.1))
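
The plot drawn by boxcox marks the maximizing lambda and an approximate 95% confidence interval. The same numbers can be read off the profile log-likelihood that boxcox returns. The following is a minimal sketch on simulated stand-in data (not the savings data); the variable names are made up for illustration.

```r
library(MASS)

# Simulated stand-in data (hypothetical, NOT the savings data):
# a positive response that is roughly linear on the log scale.
set.seed(1)
x <- runif(100, 1, 10)
y <- exp(0.3 * x + rnorm(100, sd = 0.3))
fit <- lm(y ~ x)

# plotit=FALSE returns the lambda grid (bc$x) and the
# profile log-likelihood (bc$y) without drawing the plot
bc <- boxcox(fit, lambda = seq(-1, 1, by = 0.01), plotit = FALSE)

# lambda that maximizes the profile log-likelihood on the grid
lambda_hat <- bc$x[which.max(bc$y)]

# approximate 95% interval: all lambda whose log-likelihood lies
# within qchisq(0.95, 1)/2 of the maximum
cutoff <- max(bc$y) - qchisq(0.95, 1) / 2
ci <- range(bc$x[bc$y > cutoff])

print(lambda_hat)
print(ci)
```

If the interval contains 1, there is little reason to transform the response. If it runs into either end of the lambda grid, widen the grid and repeat.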

Now consider the Galapagos data analyzed earlier:

> gala <- read.table("gala.data")

> gg <- lm(Species~Area+Elevation+Nearest+Scruz+Adjacent, data=gala)

> boxcox(gg, plotit=TRUE)

> boxcox(gg, lambda=seq(0.0,1.0,by=0.05), plotit=TRUE)
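
Once a value of lambda has been chosen from the plot, the response is transformed and the model is refitted. A minimal sketch, assuming the standard Box-Cox form (y^lambda - 1)/lambda, with log(y) at lambda = 0. The helper bc_transform and the simulated data are hypothetical, and lambda = 0.5 is only an example value, not a result for the Galapagos data.

```r
# Box-Cox transform of a positive response for a given lambda
# (bc_transform is a hypothetical helper, not part of MASS)
bc_transform <- function(y, lambda) {
  stopifnot(all(y > 0))                 # the transform needs y > 0
  if (abs(lambda) < 1e-8) {
    log(y)                              # limiting case lambda = 0
  } else {
    (y^lambda - 1) / lambda
  }
}

# Simulated stand-in data: linear on the square-root scale
set.seed(2)
x <- runif(50, 1, 10)
y <- (2 + 0.5 * x + rnorm(50, sd = 0.2))^2

fit_raw  <- lm(y ~ x)                     # untransformed response
fit_sqrt <- lm(bc_transform(y, 0.5) ~ x)  # example value lambda = 0.5

# Compare residual plots of the two fits to judge the transformation
# plot(fitted(fit_raw),  resid(fit_raw))
# plot(fitted(fit_sqrt), resid(fit_sqrt))
```

In practice a nearby round value of lambda (such as 0, 0.5 or 1) is usually preferred over the exact maximizer, because the fitted model is then easier to interpret.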



Transforming predictors

Let's see if the gro variable in the savings dataset needs transformation:

> g <- lm(sav ~ p15 + p75 + gro + inc, data=savings)
> g2 <- update(g, . ~ . + I(gro*log(gro))) # Add gro*log(gro) to the model
> summary(g2) 

Coefficients:
                    Estimate Std. Error t value Pr(>|t|)
(Intercept)       23.7631766  8.6192503   2.757  0.00846 **
p15               -0.4026101  0.1545631  -2.605  0.01249 *
p75               -1.3604915  1.1258041  -1.208  0.23332
gro                1.6904514  1.2190155   1.387  0.17251
inc               -0.0004147  0.0009326  -0.445  0.65874
I(gro * log(gro)) -0.4675886  0.4392646  -1.064  0.29292
---

Residual standard error: 3.797 on 44 degrees of freedom
Multiple R-Squared: 0.3551,     Adjusted R-squared: 0.2818
F-statistic: 4.845 on 5 and 44 DF,  p-value: 0.001291

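The added term I(gro * log(gro)) is not significant (p-value 0.29), so there is no evidence that gro needs to be transformed. Now see if p15 should be transformed.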

> g3 <- update(g, . ~ . + I(p15*log(p15))) # Add p15*log(p15) to the model
> summary(g3)
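
The check used in this section adds the constructed variable x*log(x) to the model; a significant coefficient on that term suggests that x should enter the model through a power transformation rather than linearly. A minimal sketch on simulated data (hypothetical, not the savings data), where the response is built from sqrt(x) so that the constructed term should pick up the curvature:

```r
# Simulated stand-in data: y depends on sqrt(x), not on x itself
set.seed(3)
x <- runif(100, 1, 50)                 # x must be positive for log(x)
y <- 3 * sqrt(x) + rnorm(100, sd = 0.1)

fit  <- lm(y ~ x)                            # model that is linear in x
fit2 <- update(fit, . ~ . + I(x * log(x)))   # add the constructed variable

# t-test for the constructed variable: a small p-value suggests
# that x needs a transformation
tab <- coef(summary(fit2))
p_constructed <- tab["I(x * log(x))", "Pr(>|t|)"]
print(p_constructed)
```

The test only says whether a transformation of x is worth considering; it does not by itself say which one. Plotting the residuals against x, or trying a few candidate transformations such as log(x) or sqrt(x), is a reasonable next step.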