Question.
The data for this question give the number of students in a given age group in different communities around the Swiss city of Lausanne, cross-classified by highest level of schooling. The approximate translations of the levels are:
| Swiss Name | International Equivalent |
|---|---|
| Aucune formation | No schooling |
| Scolarité obligatoire | Mandatory schooling |
| Formation professionnelle | Learned a trade directly in the workplace with courses once a week |
| Maturité | A little more than High School, finishing between 18 and 20 |
| Formation professionnelle supérieure | Learned a specialized trade |
| Ecole professionnelle supérieure | Very specialized trade learned at a full time school |
| Université / Haute école | University or College |
| Autre | Other |
1. Display the data in a two-way table. Make a graphical display of the data and comment on the evidence of dependence between the two variables.
2. What statistical model(s) would be appropriate for analyzing the relationship between the variables? Briefly explain your reasoning.
3. Fit a model for the count response and use it to check for independence between community and level of schooling. Does this model fit the data?
4. Use the xtabs command to test for independence. Explain why the test statistic and p-value differ numerically from those in the previous test. Do both tests lead to the same conclusion?
5. Make a two-way table of the residuals from the model in part 3. Comment on the larger residuals.
6. Perform a correspondence analysis and interpret the results.
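
As a rough illustration of the mechanics behind the model-fitting, xtabs, and residual-table parts, the sketch below fits a Poisson independence model and compares it with the summary of an xtabs table. It is not a solution: the data frame `toy`, its column names, and all counts are invented for illustration and are not the Lausanne data. It assumes the data are in long format, with one row per community and schooling combination.

```r
# Toy counts, invented purely for illustration (NOT the Lausanne data).
toy <- data.frame(
  community = rep(c("A", "B", "C"), each = 3),
  schooling = rep(c("Mandatory", "Maturite", "University"), times = 3),
  count     = c(30, 12, 8, 22, 18, 15, 10, 14, 26)
)

# Two-way table of counts.
tab <- xtabs(count ~ community + schooling, data = toy)

# Poisson log-linear model with main effects only: the independence model.
indep <- glm(count ~ community + schooling, family = poisson, data = toy)

# Likelihood-ratio (deviance) statistic G^2, its degrees of freedom and p-value.
G2   <- deviance(indep)
df   <- df.residual(indep)
p_G2 <- pchisq(G2, df, lower.tail = FALSE)

# Pearson chi-squared statistic X^2 as reported by summary() of the table.
pearson <- summary(tab)
X2   <- unname(pearson$statistic)
p_X2 <- pearson$p.value

# Pearson residuals from the independence model, arranged as a two-way table.
res <- xtabs(residuals(indep, type = "pearson") ~ community + schooling,
             data = toy)

print(c(G2 = G2, X2 = X2, df = df))
print(round(res, 2))
```

The two statistics measure departure from the same fitted values in different ways. The deviance is 2 * sum(observed * log(observed / fitted)), while the Pearson statistic is sum((observed - fitted)^2 / fitted), which equals the sum of the squared Pearson residuals. Both are referred to a chi-squared distribution with the same degrees of freedom, so they usually differ numerically but tend to agree when the counts are large.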