Assignment 2

  1. The data set gives per capita output (output) in Chinese yuan, the number of workers (SI) in the factory, the land area (SP) of the factory in square meters per worker, and the investment (I) in yuan per worker for 17 factories in Shanghai.
    1. Using least squares, fit a model expressing output in terms of the other variables.
    2. In addition to the variables in part 1, add SI² and SP×I and obtain another model.
    3. Using the model of part 2, find the values of SP, SI, and I that maximize per capita output.
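
One way to approach the three parts above is sketched below in Python with NumPy. This is only an illustration under stated assumptions: the function names (`design_matrix`, `fit_ols`, `stationary_point`) are hypothetical, the arrays `si`, `sp`, `inv` and `y` stand for the SI, SP, I and output columns of the data set (which is not reproduced here), and the extended model is assumed to be output = b0 + b1·SI + b2·SP + b3·I + b4·SI² + b5·SP×I.

```python
import numpy as np

def design_matrix(si, sp, inv, extended=False):
    """Columns: intercept, SI, SP, I, and optionally SI^2 and SP*I."""
    si, sp, inv = (np.asarray(a, dtype=float) for a in (si, sp, inv))
    cols = [np.ones_like(si), si, sp, inv]
    if extended:
        cols += [si ** 2, sp * inv]
    return np.column_stack(cols)

def fit_ols(X, y):
    """Least-squares coefficients via numpy.linalg.lstsq."""
    coef, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return coef

def stationary_point(coef):
    """Solve the first-order conditions of the assumed extended model.

    coef = (b0, b_SI, b_SP, b_I, b_SI2, b_SPxI)
    """
    b0, b_si, b_sp, b_i, b_si2, b_spi = coef
    si_star = -b_si / (2.0 * b_si2)  # d/dSI: b_SI + 2*b_SI2*SI = 0
    inv_star = -b_sp / b_spi         # d/dSP: b_SP + b_SPxI*I  = 0
    sp_star = -b_i / b_spi           # d/dI:  b_I  + b_SPxI*SP = 0
    return si_star, sp_star, inv_star
```

Note that the Hessian of this model has eigenvalues 2·b4, +b5 and -b5, so whenever b5 is nonzero the solution of the first-order conditions is a stationary point rather than a guaranteed maximum. It is worth checking the sign of b4 and whether the point lies inside the range of the observed data before interpreting it.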

  2. The dataset prostate comes from a study on 97 men with prostate cancer who were due to receive a radical prostatectomy. This data contains the following variables:
        lcavol: log(cancer volume)
        lweight: log(prostate weight)
        age: age
        lbph: log(benign prostatic hyperplasia amount)
        svi: seminal vesicle invasion
        lcp: log(capsular penetration)
        gleason: Gleason score
        pgg45: percentage Gleason scores 4 or 5
        lpsa: log(prostate specific antigen)
    1. Fit a model with lpsa as the response and lcavol as the predictor. Report the residual standard error and the R².
    2. Now add lweight, svi, lbph, age, lcp, pgg45, and gleason to the model one at a time. For each model record the residual standard error and the R². Plot the trends in these two statistics and comment on any features that you find interesting.
    3. Plot lpsa against lcavol. Fit the simple regressions of lpsa on lcavol and lcavol on lpsa. Display both simple regression lines on the plot and comment on any features that you find interesting. At what point do the two lines intersect?
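
The two statistics and the two simple regression lines in the parts above can be computed along the following lines. This is a minimal sketch with NumPy only: the function names are hypothetical, `X` is assumed to be a design matrix that already contains an intercept column, and `x` and `y` stand for the lcavol and lpsa columns, which are not reproduced here.

```python
import numpy as np

def rse_and_r2(X, y):
    """Residual standard error and R^2 of an OLS fit (X includes the intercept)."""
    n, p = X.shape
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    rss = float(resid @ resid)
    tss = float(((y - y.mean()) ** 2).sum())
    return np.sqrt(rss / (n - p)), 1.0 - rss / tss

def simple_slope_intercept(x, y):
    """Slope and intercept of the simple regression of y on x."""
    slope = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
    return slope, y.mean() - slope * x.mean()

def intersection(x, y):
    """Intersection of the y-on-x line and the x-on-y line, in (x, y) coordinates."""
    b, a = simple_slope_intercept(x, y)  # y = a + b*x
    d, c = simple_slope_intercept(y, x)  # x = c + d*y
    # Substitute the first line into the second: x = c + d*(a + b*x)
    x0 = (c + d * a) / (1.0 - d * b)     # 1 - d*b = 1 - r^2, zero only if |r| = 1
    return x0, a + b * x0
```

Two facts are useful when commenting on the results. R² cannot decrease when a predictor is added, whereas the residual standard error can rise because its divisor n - p shrinks. Both simple regression lines pass through the point of sample means, so that is where they intersect unless the correlation is perfect, in which case the lines coincide.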

  3. The data set gives information on capital, labor and value added for each of three economic sectors: food and kindred products (20), electrical and electronic machinery, equipment and supplies (36), and transportation equipment (37). For each sector:
    1. Consider the model $V_t = \alpha K_t^{\beta_1} L_t^{\beta_2} \varepsilon_t$, where the subscript $t$ indicates the year, $V_t$ is value added, $K_t$ is capital, $L_t$ is labor, and $\varepsilon_t$ is an error term with $E(\log \varepsilon_t) = 0$ and $\mathrm{var}(\log \varepsilon_t)$ constant. Assuming that the errors are independent, and taking logs of both sides of the model, estimate $\beta_1$ and $\beta_2$.
    2. The model in part 1 is said to be of the Cobb-Douglas form. It is easier to interpret if $\beta_1 + \beta_2 = 1$. Estimate $\beta_1$ and $\beta_2$ under this constraint.
    3. Sometimes the model $V_t = \alpha \gamma^t K_t^{\beta_1} L_t^{\beta_2} \varepsilon_t$ is considered, where $\gamma^t$ is assumed to account for technological development. Estimate $\beta_1$ and $\beta_2$ for this model.
    4. Estimate $\beta_1$ and $\beta_2$ in the model of part 3 under the constraint $\beta_1 + \beta_2 = 1$.
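
Taking logs turns each of the four models above into a linear regression, which the sketch below fits with NumPy. It is an illustration under assumptions, not a prescribed solution: the function name `cobb_douglas` and its arguments are hypothetical, `V`, `K`, `L` stand for one sector's value added, capital and labor series, and `t` is the year index. The constraint is imposed by substitution: putting β2 = 1 - β1 into the log model gives log(V/L) = log α [+ t·log γ] + β1·log(K/L).

```python
import numpy as np

def cobb_douglas(V, K, L, t=None, constrained=False):
    """Estimate (beta1, beta2) from the log-linearised Cobb-Douglas model.

    t=None           -> model without the technology term
    t=array of years -> adds a linear trend whose coefficient estimates log(gamma)
    constrained=True -> imposes beta1 + beta2 = 1 by substitution
    """
    V, K, L = (np.asarray(a, dtype=float) for a in (V, K, L))
    lv, lk, ll = np.log(V), np.log(K), np.log(L)
    cols = [np.ones_like(lv)]  # intercept estimates log(alpha)
    if t is not None:
        cols.append(np.asarray(t, dtype=float))
    if constrained:
        # Regress log(V/L) on log(K/L); beta2 is recovered as 1 - beta1.
        X = np.column_stack(cols + [lk - ll])
        coef, *_ = np.linalg.lstsq(X, lv - ll, rcond=None)
        beta1 = coef[-1]
        return beta1, 1.0 - beta1
    X = np.column_stack(cols + [lk, ll])
    coef, *_ = np.linalg.lstsq(X, lv, rcond=None)
    return coef[-2], coef[-1]
```

Calling the function once per sector with the four combinations of `t` and `constrained` corresponds to parts 1 through 4.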
