Assignment 2

The data is taken from the Places Rated Almanac, by Richard Boyer and David Savageau, copyrighted and published by Rand McNally. The nine rating criteria used by Places Rated Almanac are:

For all but two of the above criteria, the higher the score, the better. For Housing and Crime, the lower the score the better. The scores are computed using the following component statistics for each criterion (see the Places Rated Almanac for details):

In addition, latitude and longitude, population, state, and case number are given for each city. Use principal components analysis to identify the major components of variation in the ratings amongst cities.
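As a starting point, here is a minimal sketch of how a principal components analysis can be run in R with prcomp(). The data frame `demo` and its column names are hypothetical, simulated stand-ins (not the Places Rated data), and the log transform and standardisation are assumptions about sensible preprocessing rather than a required method.

```r
# Stand-in data: 50 simulated "cities" with 4 made-up rating columns.
# These values are NOT from the Places Rated Almanac.
set.seed(1)
demo <- data.frame(
  rating1 = rlnorm(50, meanlog = 6, sdlog = 0.5),
  rating2 = rlnorm(50, meanlog = 8, sdlog = 0.3),
  rating3 = rlnorm(50, meanlog = 7, sdlog = 0.8),
  rating4 = rlnorm(50, meanlog = 5, sdlog = 0.4)
)

# Ratings on very different scales are often log-transformed and
# standardised before PCA; whether to do so is a modelling choice.
pca <- prcomp(log(demo), center = TRUE, scale. = TRUE)

# Proportion of total variance explained by each component.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained, 3))
print(round(cumsum(var_explained), 3))

# Loadings: each column of the rotation matrix gives the weights of the
# original variables in one component, which helps with interpretation.
print(round(pca$rotation, 2))

# Component scores for each row (one row per simulated city).
scores <- pca$x
```

With the real data, the same calls would be applied to the rating columns only (leaving out latitude, longitude, population, state and case number); screeplot(pca) draws the matching scree plot.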

In particular:

  1. How many principal components are needed?
  2. Interpret your principal components.
  3. If you could only use a few variables from the original dataset, which would they be?
  4. Identify unusual cities.
To identify the unusual cities, the identify() function is useful. Here is an example of its use:
> plot(places[,1],places[,2])
> identify(places[,1],places[,2],row.names(places))
Now click on the plot with the left mouse button to identify points. To finish, use the middle button; on some graphics devices the right button (or the Esc key) ends the session instead.
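Since identify() needs an interactive graphics device, a non-interactive alternative can be handy for scripts and reports. The sketch below labels the points farthest from the centre of a scatterplot using text(). The data frame `demo`, the names `city1` to `city30`, the planted outlier, and the choice of the three most distant points are all hypothetical, chosen only for illustration.

```r
# Stand-in data: 30 simulated points with made-up names (not real cities).
set.seed(2)
demo <- data.frame(x = rnorm(30), y = rnorm(30))
row.names(demo) <- paste0("city", 1:30)

# Plant one obvious outlier so there is something to find.
demo["city7", ] <- c(6, -5)

# Distance of each point from the centre, in standardised units.
z <- scale(demo)
dist_from_centre <- sqrt(rowSums(z^2))

# Indices of the three most distant points (the cutoff of 3 is arbitrary).
unusual <- order(dist_from_centre, decreasing = TRUE)[1:3]

# When run as a script, draw on a null PDF device so no file is written.
if (!interactive()) pdf(NULL)
plot(demo$x, demo$y)
text(demo$x[unusual], demo$y[unusual],
     labels = row.names(demo)[unusual], pos = 3)
if (!interactive()) dev.off()

print(row.names(demo)[unusual])
```

The same idea carries over to a plot of the first two principal component scores, where points far from the origin are candidates for unusual cities.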