Moore (1975) reported the results of an experiment to construct a model for total oxygen demand in dairy wastes as a function of five laboratory measurements (data). Data were collected on samples kept in suspension in water in a laboratory for 220 days. Although all observations reported here were taken on the same sample over time, assume that they are independent. The measured variables are:
y = log(oxygen demand, mg oxygen per minute)
x_{1} = biological oxygen demand, mg/liter
x_{2} = total Kjeldahl nitrogen, mg/liter
x_{3} = total solids, mg/liter
x_{4} = total volatile solids, a component of x_{3}, mg/liter
x_{5} = chemical oxygen demand, mg/liter
(a) Fit a multiple regression model using y as the dependent variable and all x_{j}'s as the independent variables.
(b) Now fit a regression model with only the independent variables x_{3} and x_{5}. How do the new parameter estimates, the corresponding value of R^{2}, and the t-tests compare with those obtained from the full model in (a)?
(c) Under the full model y=β_{0}+β_{1}x_{1}+β_{2}x_{2}+β_{3}x_{3}+β_{4}x_{4}+β_{5}x_{5}+ε, test the hypothesis β_{1}=β_{2}=β_{4}=0 at the 5% level of significance. What have you found? Use it to explain or rationalize your answer to (b). What is a possible reason for the insignificance of x_{3} and x_{5} in (a)?
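The test in (c) compares the full model with the reduced model from (b) through a partial F statistic, F = [(SSE_reduced - SSE_full)/q] / [SSE_full/(n - p - 1)], where q = 3 is the number of coefficients set to zero and p = 5 is the number of predictors in the full model. The following is a minimal sketch of that computation, not a prescribed solution. The helper names sse and partial_f are hypothetical, and the data are randomly generated placeholders, not Moore's measurements; substitute the real observations before drawing any conclusion.

```python
import numpy as np

def sse(y, X):
    """Fit OLS of y on X (an intercept column is added here).

    Returns the residual sum of squares and the number of fitted
    coefficients, intercept included.
    """
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid), A.shape[1]

def partial_f(y, X_full, X_reduced):
    """Partial F statistic for a reduced model nested in a full model.

    Returns (F, numerator df, denominator df).
    """
    sse_full, k_full = sse(y, X_full)
    sse_red, k_red = sse(y, X_reduced)
    q = k_full - k_red              # number of coefficients tested
    df_resid = len(y) - k_full      # n - p - 1 for the full model
    f_stat = ((sse_red - sse_full) / q) / (sse_full / df_resid)
    return f_stat, q, df_resid

# Placeholder data only (assumption): 30 synthetic rows, 5 predictors.
rng = np.random.default_rng(0)
n = 30
X = rng.normal(size=(n, 5))
y = 1.0 + 0.5 * X[:, 2] + 0.3 * X[:, 4] + rng.normal(scale=0.2, size=n)

# Columns 2 and 4 play the roles of x_{3} and x_{5}.
f_stat, q, df_resid = partial_f(y, X, X[:, [2, 4]])
print(f"F = {f_stat:.3f} on ({q}, {df_resid}) degrees of freedom")
```

The resulting F is compared with the upper 5% point of the F distribution with (q, n - p - 1) degrees of freedom; a value below that point means the data give no evidence against β_{1}=β_{2}=β_{4}=0.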
In a study of infant mortality, a regression model was constructed using birth weight (which is a measure of prematurity and a good indicator of the baby's likelihood of survival) as the dependent variable and several independent variables, including the age of the mother, whether the birth was out of wedlock, whether the mother smoked or took drugs during pregnancy, the amount of medical attention she had, her income, etc. The R^{2} was only 0.092, but each independent variable was significant at the 1% significance level. An obstetrician has asked you to explain the significance of the study as it relates to his practice. What would you say to him? What are possible reasons for the independent variables being statistically significant despite the low R^{2}?
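One way to see how a small R^{2} and strong statistical significance can coexist is the identity linking the overall F statistic to R^{2}: F = (R^{2}/k) / ((1 - R^{2})/(n - k - 1)), with k predictors and n observations. The sketch below evaluates it for R^{2} = 0.092. The study's sample size and number of predictors are not given in the problem, so the values of n and k used here are purely hypothetical, as is the function name overall_f.

```python
def overall_f(r2, n, k):
    """Overall regression F statistic implied by R^2, sample size n,
    and k predictors (intercept not counted in k)."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))

# Hypothetical n and k (assumptions, not from the study).
k = 5
for n in (50, 200, 1000, 5000):
    print(n, round(overall_f(0.092, n, k), 2))
```

For a fixed R^{2}, the statistic grows roughly in proportion to n, so the same weak fit can be far from or far beyond a significance threshold depending only on how many births were sampled.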