# Assignment 3

1. Moore (1975) reported the results of an experiment to construct a model for total oxygen demand in dairy wastes as a function of five laboratory measurements (data). Data were collected on samples kept in suspension in water in a laboratory for 220 days. Although all observations reported here were taken on the same sample over time, assume that they are independent. The measured variables are:

y   log (oxygen demand, mg oxygen per minute)

x1   biological oxygen demand, mg/liter

x2   total Kjeldahl nitrogen, mg/liter

x3   total solids, mg/liter

x4   total volatile solids, a component of x3, mg/liter

x5   chemical oxygen demand, mg/liter

(a) Fit a multiple regression model using y as the dependent variable and all xj’s as the independent variables.

(b) Now fit a regression model with only the independent variables x3 and x5. How do the new parameter estimates, the corresponding value of R², and the t-tests compare with those obtained from the full model in (a)?

(c) Under the full model y = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β5x5 + ε, test the hypothesis β1 = β2 = β4 = 0 at the 5% level of significance. What do you find? Use the result to explain or justify your answer to (b). What is a possible reason for the insignificance of x3 and x5 in (a)?
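The test in (c) is a partial F-test: fit the full and the reduced model, then compare their residual sums of squares through F = [(SSE_reduced - SSE_full) / q] / [SSE_full / df_full], where q is the number of coefficients set to zero. A minimal, dependency-free sketch of the mechanics follows. It assumes simulated placeholder data (not Moore's measurements) and a hypothetical sample size of 20; the helper names `solve`, `fit_ols`, and `partial_f` are illustrative only.

```python
import random

def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix [A | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def fit_ols(rows, y):
    """Least-squares fit of y on the predictor rows plus an intercept.

    Returns (coefficients, SSE, residual degrees of freedom).
    """
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    xtx = [[sum(xi[j] * xi[k] for xi in X) for k in range(p)] for j in range(p)]
    xty = [sum(xi[j] * yi for xi, yi in zip(X, y)) for j in range(p)]
    beta = solve(xtx, xty)  # normal equations (X'X) beta = X'y
    sse = sum((yi - sum(bj * v for bj, v in zip(beta, xi))) ** 2
              for xi, yi in zip(X, y))
    return beta, sse, len(y) - p

def partial_f(rows, y, keep):
    """F statistic for H0: the coefficients of all predictors NOT in `keep` are zero."""
    _, sse_full, df_full = fit_ols(rows, y)
    reduced_rows = [[r[j] for j in keep] for r in rows]
    _, sse_red, df_red = fit_ols(reduced_rows, y)
    q = df_red - df_full  # number of coefficients being tested
    f_stat = ((sse_red - sse_full) / q) / (sse_full / df_full)
    return f_stat, q, df_full

# Simulated placeholder data (an assumption for illustration, NOT Moore's data).
random.seed(0)
n = 20  # hypothetical sample size
rows = [[random.gauss(0, 1) for _ in range(5)] for _ in range(n)]
y = [0.5 + 0.8 * r[2] + 0.6 * r[4] + random.gauss(0, 0.3) for r in rows]

# Keep x3 and x5 (0-based columns 2 and 4); test beta1 = beta2 = beta4 = 0.
f_stat, q, df_full = partial_f(rows, y, keep=[2, 4])
print(f"F = {f_stat:.3f} on ({q}, {df_full}) degrees of freedom")
```

Under the null hypothesis the statistic is compared with the upper 5% point of an F distribution with (q, df_full) degrees of freedom. A statistics package would normally do the fitting; the hand-rolled solver is here only to keep the sketch self-contained.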

2. In a study of infant mortality, a regression model was constructed using birth weight (which is a measure of prematurity and a good indicator of the baby’s likelihood of survival) as the dependent variable and several independent variables, including the age of the mother, whether the birth was out of wedlock, whether the mother smoked or took drugs during pregnancy, the amount of medical attention she had, her income, etc. The R² was only 0.092, but each independent variable was significant at the 1% significance level. An obstetrician has asked you to explain the significance of the study as it relates to his practice. What would you say to him? What are possible reasons for the significance?
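One way to see how a small R² and strong statistical significance can coexist is the identity linking R² to the overall F statistic, F = (R² / k) / ((1 - R²) / (n - k - 1)). The sketch below evaluates it for the reported R² of 0.092. The number of predictors `K` and the sample sizes are hypothetical, since the problem does not state them, and the overall F is used only as an analogy for the individual t-tests.

```python
def overall_f(r_squared, k, n):
    """Overall F statistic implied by R^2 with k predictors and n observations."""
    return (r_squared / k) / ((1.0 - r_squared) / (n - k - 1))

R2 = 0.092  # value reported in the problem
K = 6       # hypothetical number of predictors (not given in the problem)

# Hypothetical sample sizes: the same R^2 gives a larger F as n grows.
for n in (50, 500, 5000):
    print(n, round(overall_f(R2, K, n), 2))
```

With R² held fixed, the statistic grows roughly in proportion to n, so a very large study can declare every effect significant while still explaining little of the variation in individual birth weights.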