Matched Pairs (Reading: Faraway (2006), section 4.3)

@

In Stuart (1955), data on the vision of a sample of women is presented. The performance of the left and right eyes is graded into four categories. Let us read the data into R and take a look at it:

> eyegrade <- read.table("eyegrade.txt")
> eyegrade

      y  right   left
1  1520   best   best
2   266   best second
3   124   best  third
4    66   best  worst
5   234 second   best
6  1512 second second
7   432 second  third
8    78 second  worst
9   117  third   best
10  362  third second
11 1772  third  third
12  205  third  worst
13   36  worst   best
14   82  worst second
15  179  worst  third
16  492  worst  worst

> (ct <- xtabs(y ~ right+left, eyegrade))

        left
right    best second third worst
  best   1520    266   124    66
  second  234   1512   432    78
  third   117    362  1772   205
  worst    36     82   179   492
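If eyegrade.txt is not at hand, the same data frame can be rebuilt from the listing above. The following is only a sketch: the helper name grades is introduced here for illustration, and the factor levels are set explicitly so that the table keeps the best-to-worst ordering.

```r
# Rebuild the eye-grade data from the printed listing (no file needed).
# 'grades' is a helper name introduced for this sketch only.
grades <- c("best", "second", "third", "worst")
eyegrade <- data.frame(
  y     = c(1520,  266,  124,  66,
             234, 1512,  432,  78,
             117,  362, 1772, 205,
              36,   82,  179, 492),
  right = factor(rep(grades, each  = 4), levels = grades),
  left  = factor(rep(grades, times = 4), levels = grades)
)

# Cross-tabulate the counts: rows are the right eye, columns the left eye.
ct <- xtabs(y ~ right + left, eyegrade)
ct
sum(ct)  # total number of women in the sample
```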

In the table,

@

Let us check for independence:

> summary(ct)

Call: xtabs(formula = y ~ right + left, data = eyegrade)
Number of cases in table: 7477
Number of factors: 2
Test for independence of all factors:
        Chisq = 8097, df = 9, p-value = 0

We are not surprised to find strong evidence of dependence in matched-pair data.
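To make the independence test concrete, the Pearson chi-squared statistic can be computed by hand from the expected counts (row total times column total, divided by the grand total). This is a minimal sketch that retypes the counts from the table above; the names n, expected and X2 are introduced here for illustration.

```r
# Counts from the table above: rows are the right eye, columns the left eye.
grades <- c("best", "second", "third", "worst")
n <- matrix(c(1520,  266,  124,  66,
               234, 1512,  432,  78,
               117,  362, 1772, 205,
                36,   82,  179, 492),
            nrow = 4, byrow = TRUE,
            dimnames = list(right = grades, left = grades))

# Expected counts under independence.
expected <- outer(rowSums(n), colSums(n)) / sum(n)

# Pearson chi-squared statistic and its degrees of freedom, (I-1)(J-1).
X2 <- sum((n - expected)^2 / expected)
df <- (nrow(n) - 1) * (ncol(n) - 1)
X2
df
pchisq(X2, df, lower.tail = FALSE)  # far too small to print as anything but 0
```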

@

Let us check whether the matched-pair data has the symmetry property. We can fit such a model by first defining a factor where the levels represent the symmetric pairs for the off-diagonal elements. There is only one observation for each level down the diagonal while there are two observations for each level off the diagonal:

> (symfac <- factor(apply(eyegrade[,2:3],1,function(x) paste(sort(x),collapse="-"))))

 [1] best-best     best-second   best-third    best-worst  
 [5] best-second   second-second second-third  second-worst
 [9] best-third    second-third  third-third   third-worst 
[13] best-worst    second-worst  third-worst   worst-worst 
10 Levels: best-best best-second best-third ... worst-worst

We can see that the new factor has I(I+1)/2=4(4+1)/2=10 levels. We now fit a model for the new factor:

> mods <- glm(y ~ symfac, eyegrade, family=poisson)
> summary(mods)

Call:
glm(formula = y ~ symfac, family = poisson, data = eyegrade)

Deviance Residuals:
       Min          1Q      Median          3Q         Max 
-2.219e+00  -4.776e-01  -1.475e-07   4.700e-01   2.008e+00 

Coefficients:
                     Estimate Std. Error z value Pr(>|z|)   
(Intercept)          7.326466   0.025649 285.638  < 2e-16 ***
symfacbest-second   -1.805005   0.051555 -35.011  < 2e-16 ***
symfacbest-third    -2.534816   0.069334 -36.559  < 2e-16 ***
symfacbest-worst    -3.394640   0.102283 -33.189  < 2e-16 ***
symfacsecond-second -0.005277   0.036322  -0.145    0.884   
symfacsecond-third  -1.342529   0.043787 -30.660  < 2e-16 ***
symfacsecond-worst  -2.944439   0.083114 -35.427  < 2e-16 ***
symfacthird-third    0.153399   0.034960   4.388 1.15e-05 ***
symfacthird-worst   -2.068970   0.057114 -36.225  < 2e-16 ***
symfacworst-worst   -1.127987   0.051869 -21.747  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 8692.334  on 15  degrees of freedom
Residual deviance:   19.249  on  6  degrees of freedom
AIC: 156.63

Number of Fisher Scoring iterations: 4

> pchisq(deviance(mods),df.residual(mods),lower=F)

[1] 0.003762852
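The deviance and p-value above can also be reproduced without glm. Under the symmetry model, the fitted value for a cell is the average of that cell and its mirror image across the diagonal. The sketch below retypes the counts; the names n, mu and G2 are introduced here for illustration.

```r
# Counts from the table above: rows are the right eye, columns the left eye.
grades <- c("best", "second", "third", "worst")
n <- matrix(c(1520,  266,  124,  66,
               234, 1512,  432,  78,
               117,  362, 1772, 205,
                36,   82,  179, 492),
            nrow = 4, byrow = TRUE,
            dimnames = list(right = grades, left = grades))

# Fitted values under symmetry: average each cell with its mirror cell.
mu <- (n + t(n)) / 2

# Deviance (likelihood-ratio statistic); diagonal cells contribute zero.
G2 <- 2 * sum(n * log(n / mu))

# Degrees of freedom: 16 cells minus I(I+1)/2 = 10 parameters.
df <- length(n) - nrow(n) * (nrow(n) + 1) / 2
G2
df
pchisq(G2, df, lower.tail = FALSE)
```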

Here, we see evidence of a lack of symmetry: a deviance of 19.249 on 6 degrees of freedom is too large (p-value 0.0038). It is worth checking the residuals:

> round(xtabs(residuals(mods) ~ right+left, eyegrade),3)

        left
right      best second  third  worst
  best    0.000  1.001  0.317  2.008
  second -1.023  0.000  1.732 -0.225
  third  -0.320 -1.783  0.000  0.928
  worst  -2.219  0.223 -0.949  0.000

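We see that the residuals above the diagonal are mostly positive, while those below the diagonal are mostly negative. Since the rows correspond to the right eye and the columns to the left eye, combinations with a better right eye than left eye are observed more often than the symmetry model predicts, and the reverse combinations less often.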

Let us examine whether the data has the quasi-symmetry property. To do so, we can fit a model with main-effect terms for the row and column variables together with the new factor, symfac:

> modq <- glm(y ~ right+left+symfac, eyegrade, family=poisson)
> summary(modq)

Call:
glm(formula = y ~ right + left + symfac, family = poisson, data = eyegrade)

Deviance Residuals:
         1           2           3           4           5           6           7           8 
 0.000e+00   1.612e-01  -8.394e-01   8.894e-01  -1.706e-01  -1.460e-07   6.325e-01  -1.128e+00 
         9          10          11          12          13          14          15          16 
 9.114e-01  -6.760e-01  -3.501e-07   2.409e-01  -1.093e+00   1.200e+00  -2.548e-01  -1.490e-07 

Coefficients: (3 not defined because of singularities)
                    Estimate Std. Error z value Pr(>|z|)   
(Intercept)          7.32647    0.02565 285.638  < 2e-16 ***
rightsecond         -2.43955    0.09055 -26.942  < 2e-16 ***
rightthird          -1.61523    0.06955 -23.223  < 2e-16 ***
rightworst          -0.72288    0.05641 -12.816  < 2e-16 ***
leftsecond          -2.33241    0.09149 -25.493  < 2e-16 ***
leftthird           -1.39721    0.07012 -19.927  < 2e-16 ***
leftworst           -0.40510    0.05641  -7.182 6.88e-13 ***
symfacbest-second    0.57954    0.09462   6.125 9.07e-10 ***
symfacbest-third    -1.03453    0.08633 -11.983  < 2e-16 ***
symfacbest-worst    -2.84322    0.10266 -27.696  < 2e-16 ***
symfacsecond-second  4.76669    0.16668  28.598  < 2e-16 ***
symfacsecond-third   2.54814    0.11038  23.085  < 2e-16 ***
symfacsecond-worst        NA         NA      NA       NA   
symfacthird-third    3.16584    0.11415  27.734  < 2e-16 ***
symfacthird-worst         NA         NA      NA       NA   
symfacworst-worst         NA         NA      NA       NA   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 8692.3336  on 15  degrees of freedom
Residual deviance:    7.2708  on  3  degrees of freedom
AIC: 150.65

Number of Fisher Scoring iterations: 4

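We can see that the residual deviance is 7.2708 on 3 degrees of freedom, which corresponds to a p-value of about 0.064. The quasi-symmetry model therefore fits much better than the symmetry model, although the fit is only marginally adequate at the 5% level. Three of the symfac coefficients are reported as NA because, once the main effects of right and left are in the model, they are aliased with the other terms.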
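The goodness of fit of the quasi-symmetry model can be checked in the same way as for the symmetry model, by referring the residual deviance to a chi-squared distribution. The sketch below is self-contained: it retypes the data, and it builds symfac from integer grade codes with pmin and pmax rather than with apply and sort, which is an alternative construction assumed here to give the same pairing of cells.

```r
# Rebuild the data from the listing above ('grades' is a helper name).
grades <- c("best", "second", "third", "worst")
eyegrade <- data.frame(
  y     = c(1520, 266, 124, 66, 234, 1512, 432, 78,
            117, 362, 1772, 205, 36, 82, 179, 492),
  right = factor(rep(grades, each  = 4), levels = grades),
  left  = factor(rep(grades, times = 4), levels = grades)
)

# Symmetric-pair factor: the unordered pair of grades for each cell.
i <- as.integer(eyegrade$right)
j <- as.integer(eyegrade$left)
eyegrade$symfac <- factor(paste(grades[pmin(i, j)], grades[pmax(i, j)], sep = "-"))

# Quasi-symmetry model: main effects plus the symmetric-pair factor.
modq <- glm(y ~ right + left + symfac, family = poisson, data = eyegrade)

# Goodness-of-fit p-value from the residual deviance.
p_qs <- pchisq(deviance(modq), df.residual(modq), lower.tail = FALSE)
c(deviance = deviance(modq), df = df.residual(modq), p.value = p_qs)
```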

Marginal homogeneity together with quasi-symmetry implies symmetry. Hence, provided that quasi-symmetry holds, one can test for marginal homogeneity by comparing the symmetry and quasi-symmetry models using the difference-in-deviance test:

> anova(mods,modq,test="Chi")

Analysis of Deviance Table

Model 1: y ~ symfac
Model 2: y ~ right + left + symfac
  Resid. Df Resid. Dev Df Deviance P(>|Chi|)
1         6    19.2492                     
2         3     7.2708  3  11.9784    0.0075

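We can see that the difference in deviance is 11.9784 on 3 degrees of freedom, with a p-value of 0.0075, so there is evidence of a lack of marginal homogeneity. Note that this test is appropriate only if the quasi-symmetry model already fits.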
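Marginal homogeneity says that the row totals and the column totals follow the same distribution, so it can help to look at the two sets of margins side by side. The sketch below retypes the counts and also recomputes the difference-in-deviance p-value from the two deviances reported above; the object names are introduced here for illustration.

```r
# Counts from the table above: rows are the right eye, columns the left eye.
grades <- c("best", "second", "third", "worst")
n <- matrix(c(1520,  266,  124,  66,
               234, 1512,  432,  78,
               117,  362, 1772, 205,
                36,   82,  179, 492),
            nrow = 4, byrow = TRUE,
            dimnames = list(right = grades, left = grades))

# Marginal totals for each eye, and their difference.
margins <- rbind(right = rowSums(n), left = colSums(n))
margins
margins["right", ] - margins["left", ]

# Difference-in-deviance test using the deviances reported above.
dev_diff <- 19.2492 - 7.2708
df_diff  <- 6 - 3
pchisq(dev_diff, df_diff, lower.tail = FALSE)
```

In these margins the right eye is graded best or second more often than the left eye, and third or worst less often.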

When we examine the table here, we see that the counts on the diagonal are much larger than those off the diagonal. A table with this property will almost never pass a test for independence. We might ask whether there is independence among the off-diagonal cells, i.e., whether left and right eye performance are independent among those people whose vision is not symmetric. This is called the quasi-independence hypothesis, and we can test it by removing the data on the diagonal:

> modqi <- glm(y ~ right+left, eyegrade, family=poisson, subset=-c(1,6,11,16))
> summary(modqi)

Call:
glm(formula = y ~ right + left, family = poisson, data = eyegrade, subset = -c(1, 6, 11, 16))

Deviance Residuals:
      2        3        4        5        7        8        9       10       12       13       14       15 
 4.7352  -5.2834  -0.1783   4.1642   0.5516  -6.2875  -3.5775  -1.4254   5.6736  -2.2159  -4.0338   4.8330 

Coefficients:
            Estimate Std. Error z value Pr(>|z|)   
(Intercept)  4.38841    0.07454  58.870  < 2e-16 ***
rightsecond  0.78176    0.06409  12.197  < 2e-16 ***
rightthird   0.68723    0.06470  10.622  < 2e-16 ***
rightworst  -0.45698    0.07552  -6.051 1.44e-09 ***
leftsecond   0.88999    0.06789  13.110  < 2e-16 ***
leftthird    0.87160    0.06689  13.030  < 2e-16 ***
leftworst   -0.17689    0.07481  -2.364   0.0181 * 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 900.99  on 11  degrees of freedom
Residual deviance: 199.11  on  5  degrees of freedom
AIC: 294.81

Number of Fisher Scoring iterations: 4
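As a cross-check, the quasi-independence fit can be turned into a goodness-of-fit p-value, and the same model can be written without dropping rows by giving each diagonal cell its own parameter. This second formulation is not used in the notes above; it is an assumption of this sketch that it is wanted at all, and the names same, diagcell and modqi2 are introduced here for illustration.

```r
# Rebuild the data from the listing above ('grades' is a helper name).
grades <- c("best", "second", "third", "worst")
eyegrade <- data.frame(
  y     = c(1520, 266, 124, 66, 234, 1512, 432, 78,
            117, 362, 1772, 205, 36, 82, 179, 492),
  right = factor(rep(grades, each  = 4), levels = grades),
  left  = factor(rep(grades, times = 4), levels = grades)
)

# Flag the diagonal cells (both eyes in the same grade).
eyegrade$same <- as.character(eyegrade$right) == as.character(eyegrade$left)

# (a) Quasi-independence by dropping the diagonal, as in the fit above.
modqi <- glm(y ~ right + left, family = poisson, data = eyegrade,
             subset = !same)

# (b) Alternative: keep all 16 cells and fit each diagonal cell exactly.
eyegrade$diagcell <- factor(
  ifelse(eyegrade$same, as.character(eyegrade$right), "off"),
  levels = c("off", grades)
)
modqi2 <- glm(y ~ right + left + diagcell, family = poisson, data = eyegrade)

# Both formulations should give the same deviance and degrees of freedom.
c(deviance(modqi), deviance(modqi2))
c(df.residual(modqi), df.residual(modqi2))

# Goodness-of-fit p-value for quasi-independence.
pchisq(deviance(modqi), df.residual(modqi), lower.tail = FALSE)
```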

Notice that: