Moore (1975) reported the results of an experiment to construct a model for total oxygen demand in dairy wastes as a function of five laboratory measurements (data). Data were collected on samples kept in suspension in water in a laboratory for 220 days. Although all observations reported here were taken on the same sample over time, assume that they are independent. The measured variables are:
y log (oxygen demand, mg oxygen per minute)
x1 biological oxygen demand, mg/liter
x2 total Kjeldahl nitrogen, mg/liter
x3 total solids, mg/liter
x4 total volatile solids, a component of x4, mg/liter
x5 chemical oxygen demand, mg/liter
(a) Fit a multiple regression model using y as the dependent variable and all xj°¶s as the independent variables.
(b) Now fit a regression model with only the independent variable x3 and x5. How do the new parameters, the corresponding value of R2 and the t-tests compare with those obtained from the full model in (a)?
(c) Under the full model y=β0+β1x1+β2x2+β3x3+β4x4+β5x5+£`, test the hypothesis β1=β2=β4=0 at the 5% level of significance. What have you found? Use it to explain or rationale your answer to (b). What is the possible reason that cause the insignificance of x3 and x5 in (a)?
In a study of infant mortality, a regression model was constructed using birth weight (which is a measure of prematurity and good indicator of the baby°¶s likelihood of survival) as a dependent variable and several independent variables, including the age of the mother, whether the birth was out of wedlock, whether the mother smoked or took drugs during pregnancy, the amount of medical attention she had, her income, etc. The R2 was only 0.092, but each independent variable was significant at a 1% significance level. An obstetrician has asked you to explain the significance of the study as it relates to his practice. What would you say to him? What are the possible reasons that can cause the significance?