Exercise 10. Examining the proportional hazards hypothesis (localised melanoma)


Load the melanoma data, restricted to localised stage, using time-on-study as the timescale with a maximum of 10 years follow-up.

You may have to install the required packages the first time you use them. To install a package, run install.packages("package_of_interest") once for each package you require.

Load melanoma data and explore it.
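
A minimal sketch of the data preparation, assuming the variable names that appear in the output further down (`stage`, `status`, `surv_mm`, `trunc_yy`, `death_cancer`). The helper `prepare_localised` is hypothetical, and the rule that a cancer death only counts within the first 120 months is our assumption about how follow-up was truncated. The small data frame is a stand-in: its first three rows copy values printed under (h), and the last three rows are invented to show the truncation rule. With the biostat3 package installed, one would presumably pass `biostat3::melanoma` instead.

```r
# Hypothetical helper: keep localised stage and truncate follow-up at max_years.
prepare_localised <- function(df, max_years = 10) {
  out <- df[df$stage == "Localised", , drop = FALSE]
  # Follow-up time in years, capped at max_years
  out$trunc_yy <- pmin(out$surv_mm / 12, max_years)
  # A cancer death is an event only if it happens within the follow-up window
  out$death_cancer <- as.integer(out$status == "Dead: cancer" &
                                   out$surv_mm <= 12 * max_years)
  out
}

# Stand-in data: rows 1-3 mirror the printed output, rows 4-6 are invented.
toy <- data.frame(
  id      = 1:6,
  stage   = c(rep("Localised", 5), "Regional"),
  surv_mm = c(26.5, 55.5, 177.5, 30.5, 130.5, 12.5),
  status  = c("Dead: other", "Dead: other", "Dead: other",
              "Dead: cancer", "Dead: cancer", "Dead: cancer")
)
localised_toy <- prepare_localised(toy)
print(localised_toy[, c("id", "trunc_yy", "death_cancer")])
```

Subject 5 dies of cancer after the 10-year window, so the follow-up time is capped at 10 and the death is not counted as an event.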

(b)

There is no strong evidence against an assumption of proportional hazards since we see (close to) parallel curves when plotting the instantaneous cause-specific hazard on the log scale.

(c)

If the proportional hazards assumption is appropriate then we should see parallel lines. This looks okay; we shouldn’t put too much weight on the fact that the curves cross early in the follow-up since there are so few deaths there. The difference between the two log-cumulative hazard curves is similar during the part of the follow-up where we have the most information (most deaths). Note that these curves are not based on the estimated Cox model (i.e., they are unadjusted).
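
To see why parallel lines are expected, note that under proportional hazards \(H_1(t) = \mathrm{HR} \times H_0(t)\), so the log-cumulative hazards differ by the constant \(\log(\mathrm{HR})\). The sketch below illustrates this on simulated stand-in data (not the melanoma data); the rates, sample size and variable names are our own assumptions.

```r
library(survival)

set.seed(1)
n <- 5000
# Simulated data: exponential hazards with a true hazard ratio of 0.78
sim <- data.frame(period = rep(c("early", "late"), each = n))
rate <- ifelse(sim$period == "early", 0.05, 0.05 * 0.78)
t_event <- rexp(2 * n, rate = rate)
sim$time   <- pmin(t_event, 10)            # administrative censoring at 10 years
sim$status <- as.integer(t_event <= 10)

fit <- survfit(Surv(time, status) ~ period, data = sim)
s <- summary(fit, times = c(2, 5, 8))

# Log-cumulative hazard, using H(t) = -log S(t)
log_cumhaz <- log(-log(s$surv))
early <- log_cumhaz[s$strata == "period=early"]
late  <- log_cumhaz[s$strata == "period=late"]

print(late - early)   # roughly constant over time under proportional hazards
print(log(0.78))      # the true log hazard ratio used in the simulation

# plot(fit, fun = "cloglog") draws log(-log(S)) for each group on a log time axis
```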

(d)

The estimated hazard ratio from the Cox model is \(0.78\), which is similar (as it should be) to the estimate made by looking at the hazard function plot.

## Call:
## coxph(formula = Surv(trunc_yy, death_cancer == 1) ~ year8594, 
##     data = localised)
## 
##   n= 5318, number of events= 960 
## 
##                             coef exp(coef) se(coef)      z Pr(>|z|)    
## year8594Diagnosed 85-94 -0.25297   0.77649  0.06579 -3.845 0.000121 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##                         exp(coef) exp(-coef) lower .95 upper .95
## year8594Diagnosed 85-94    0.7765      1.288    0.6825    0.8834
## 
## Concordance= 0.533  (se = 0.008 )
## Likelihood ratio test= 14.83  on 1 df,   p=1e-04
## Wald test            = 14.78  on 1 df,   p=1e-04
## Score (logrank) test = 14.86  on 1 df,   p=1e-04
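
As a check on how the columns of this output relate to each other, the hazard ratio, its Wald confidence interval and the p-value can be recomputed from the printed coefficient and standard error. This is a sketch using the rounded values shown above, so the last digit may differ slightly from the output.

```r
# Values copied from the printed coxph summary for year8594
b  <- -0.25297   # coef
se <-  0.06579   # se(coef)

hr <- exp(b)                                   # exp(coef)
ci <- exp(b + c(-1, 1) * qnorm(0.975) * se)    # lower .95, upper .95
z  <- b / se                                   # Wald statistic
p  <- 2 * pnorm(-abs(z))                       # two-sided p-value

print(round(c(HR = hr, lower = ci[1], upper = ci[2]), 4))
print(signif(c(z = z, p = p), 3))
```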

(e)

The plot shows the scaled Schoenfeld residuals for the effect of period. Under proportional hazards, the smoother would be a horizontal line. The line is not, however, perfectly horizontal; it appears that the effect of period is greater earlier in the follow-up.

## Call:
## coxph(formula = Surv(trunc_yy, death_cancer == 1) ~ sex + year8594 + 
##     agegrp, data = localised)
## 
##   n= 5318, number of events= 960 
## 
##                             coef exp(coef) se(coef)      z Pr(>|z|)    
## sexFemale               -0.53061   0.58825  0.06545 -8.107 5.19e-16 ***
## year8594Diagnosed 85-94 -0.33339   0.71649  0.06618 -5.037 4.72e-07 ***
## agegrp45-59              0.28283   1.32688  0.09417  3.003  0.00267 ** 
## agegrp60-74              0.62006   1.85904  0.09088  6.823 8.90e-12 ***
## agegrp75+                1.21801   3.38045  0.10443 11.663  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##                         exp(coef) exp(-coef) lower .95 upper .95
## sexFemale                  0.5882     1.7000    0.5174    0.6688
## year8594Diagnosed 85-94    0.7165     1.3957    0.6293    0.8157
## agegrp45-59                1.3269     0.7536    1.1032    1.5959
## agegrp60-74                1.8590     0.5379    1.5557    2.2215
## agegrp75+                  3.3804     0.2958    2.7547    4.1483
## 
## Concordance= 0.646  (se = 0.009 )
## Likelihood ratio test= 212.7  on 5 df,   p=<2e-16
## Wald test            = 217.9  on 5 df,   p=<2e-16
## Score (logrank) test = 226.8  on 5 df,   p=<2e-16

(f)

No solution written for this part.

(g)

There is evidence of non-proportional hazards by age (p = 0.0012 for agegrp; particularly for the comparison of the oldest to youngest) but not by calendar period (p = 0.21). The plot of Schoenfeld residuals suggested non-proportionality for period, but this was not statistically significant.

##          chisq df      p
## sex       1.17  1 0.2784
## year8594  1.57  1 0.2096
## agegrp   15.93  3 0.0012
## GLOBAL   20.45  5 0.0010
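
A table of this form is what `cox.zph` from the survival package prints. The sketch below shows the call on simulated stand-in data (not the melanoma data), where the covariate `x` is built to violate proportional hazards and `z` is not; all names and parameter values are assumptions made for illustration.

```r
library(survival)

set.seed(2)
n <- 2000
# x: hazard ratio 4 during the first year, 1 afterwards (non-proportional)
# z: constant hazard ratio exp(0.5) (proportional)
x <- rbinom(n, 1, 0.5)
z <- rbinom(n, 1, 0.5)
base <- 0.2 * exp(0.5 * z)
t_first <- rexp(n, rate = base * ifelse(x == 1, 4, 1))  # hazard before 1 year
t_later <- 1 + rexp(n, rate = base)                     # hazard after 1 year
t_event <- ifelse(t_first < 1, t_first, t_later)
sim <- data.frame(time = pmin(t_event, 10),             # censor at 10 years
                  status = as.integer(t_event <= 10),
                  x = x, z = z)

fit <- coxph(Surv(time, status) ~ x + z, data = sim)
zph <- cox.zph(fit)   # tests based on scaled Schoenfeld residuals
print(zph)

# plot(zph) would draw the smoothed residuals for each term
```

Each row tests one term, and the GLOBAL row tests all terms jointly.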

(h)

The hazard ratios for the age main effects (agegrp45-59, agegrp60-74, agegrp75+) apply to the first two years after diagnosis. To obtain the hazard ratios for the period two years or more after diagnosis, we multiply each main-effect hazard ratio by the corresponding interaction hazard ratio (agegrp:fu2). That is, during the first two years following diagnosis, patients aged 75 years or more at diagnosis have 5.4 times the cancer-specific mortality of patients aged 0–44 at diagnosis. During the period two years or more following diagnosis the corresponding hazard ratio is \(5.415 \times 0.4927 \approx 2.67\).
Using survSplit to split on time will give you the same results as above. We see that the age-by-follow-up interaction is statistically significant, at least for the oldest age group (p = 0.002 for agegrp75+:fu2).
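
The multiplication can be checked from the printed coefficients of the model with `agegrp * fu`. This is a sketch; the vector names `main` and `change` are ours.

```r
# Log hazard ratios copied from the printed model with agegrp * fu
main   <- c("45-59" = 0.53058,  "60-74" = 0.90046,  "75+" = 1.68918)   # 0-2 years
change <- c("45-59" = -0.32093, "60-74" = -0.36715, "75+" = -0.70783)  # agegrp:fu2

hr_0_2   <- exp(main)
hr_2plus <- exp(main + change)   # same as exp(main) * exp(change)

print(round(cbind("0-2 years" = hr_0_2, "2+ years" = hr_2plus), 2))
```

Adding the coefficients on the log scale is equivalent to multiplying the hazard ratios, and it avoids compounding rounding error.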

##   id start  trunc_yy
## 1  1     0  2.000000
## 2  1     2  2.208333
## 3  2     0  2.000000
## 4  2     2  4.625000
## 5  3     0  2.000000
## 6  3     2 10.000000
##      sex age     stage mmdx yydx surv_mm surv_yy      status       subsite
## 1 Female  81 Localised    2 1981    26.5     2.5 Dead: other Head and Neck
## 2 Female  81 Localised    2 1981    26.5     2.5 Dead: other Head and Neck
## 3 Female  75 Localised    9 1975    55.5     4.5 Dead: other Head and Neck
## 4 Female  75 Localised    9 1975    55.5     4.5 Dead: other Head and Neck
## 5 Female  78 Localised    2 1978   177.5    14.5 Dead: other         Limbs
## 6 Female  78 Localised    2 1978   177.5    14.5 Dead: other         Limbs
##          year8594         dx       exit agegrp id      ydx    yexit start  trunc_yy
## 1 Diagnosed 75-84 1981-02-02 1983-04-20    75+  1 1981.088 1983.298     0  2.000000
## 2 Diagnosed 75-84 1981-02-02 1983-04-20    75+  1 1981.088 1983.298     2  2.208333
## 3 Diagnosed 75-84 1975-09-21 1980-05-07    75+  2 1975.720 1980.348     0  2.000000
## 4 Diagnosed 75-84 1975-09-21 1980-05-07    75+  2 1975.720 1980.348     2  4.625000
## 5 Diagnosed 75-84 1978-02-21 1992-12-07    75+  3 1978.140 1992.934     0  2.000000
## 6 Diagnosed 75-84 1978-02-21 1992-12-07    75+  3 1978.140 1992.934     2 10.000000
##   death_cancer fu
## 1            0  1
## 2            0  2
## 3            0  1
## 4            0  2
## 5            0  1
## 6            0  2
## Call:
## coxph(formula = Surv(start, trunc_yy, death_cancer) ~ sex + year8594 + 
##     agegrp * fu, data = melanoma2p8Split)
## 
##   n= 9856, number of events= 960 
## 
##                             coef exp(coef) se(coef)      z Pr(>|z|)    
## sexFemale               -0.52742   0.59012  0.06543 -8.061 7.58e-16 ***
## year8594Diagnosed 85-94 -0.33548   0.71499  0.06623 -5.065 4.08e-07 ***
## agegrp45-59              0.53058   1.69992  0.19634  2.702  0.00689 ** 
## agegrp60-74              0.90046   2.46074  0.18741  4.805 1.55e-06 ***
## agegrp75+                1.68918   5.41503  0.19175  8.809  < 2e-16 ***
## fu2                           NA        NA  0.00000     NA       NA    
## agegrp45-59:fu2         -0.32093   0.72547  0.22382 -1.434  0.15161    
## agegrp60-74:fu2         -0.36715   0.69270  0.21467 -1.710  0.08720 .  
## agegrp75+:fu2           -0.70783   0.49271  0.23207 -3.050  0.00229 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##                         exp(coef) exp(-coef) lower .95 upper .95
## sexFemale                  0.5901     1.6946    0.5191    0.6709
## year8594Diagnosed 85-94    0.7150     1.3986    0.6280    0.8141
## agegrp45-59                1.6999     0.5883    1.1569    2.4978
## agegrp60-74                2.4607     0.4064    1.7043    3.5529
## agegrp75+                  5.4150     0.1847    3.7186    7.8853
## fu2                            NA         NA        NA        NA
## agegrp45-59:fu2            0.7255     1.3784    0.4678    1.1250
## agegrp60-74:fu2            0.6927     1.4436    0.4548    1.0550
## agegrp75+:fu2              0.4927     2.0296    0.3126    0.7765
## 
## Concordance= 0.645  (se = 0.009 )
## Likelihood ratio test= 222.5  on 8 df,   p=<2e-16
## Wald test            = 224.5  on 8 df,   p=<2e-16
## Score (logrank) test = 238  on 8 df,   p=<2e-16
## Call:
## coxph(formula = Surv(start, trunc_yy, death_cancer) ~ sex + year8594 + 
##     agegrp + I(agegrp == "75+" & fu == "2"), data = melanoma2p8Split)
## 
##   n= 9856, number of events= 960 
## 
##                                        coef exp(coef) se(coef)      z Pr(>|z|)    
## sexFemale                          -0.52813   0.58970  0.06543 -8.072 6.90e-16 ***
## year8594Diagnosed 85-94            -0.33500   0.71534  0.06622 -5.059 4.22e-07 ***
## agegrp45-59                         0.28459   1.32921  0.09417  3.022  0.00251 ** 
## agegrp60-74                         0.62377   1.86595  0.09089  6.863 6.73e-12 ***
## agegrp75+                           1.48077   4.39633  0.14309 10.348  < 2e-16 ***
## I(agegrp == "75+" & fu == "2")TRUE -0.43812   0.64525  0.16994 -2.578  0.00994 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##                                    exp(coef) exp(-coef) lower .95 upper .95
## sexFemale                             0.5897     1.6958    0.5187    0.6704
## year8594Diagnosed 85-94               0.7153     1.3979    0.6283    0.8145
## agegrp45-59                           1.3292     0.7523    1.1052    1.5987
## agegrp60-74                           1.8660     0.5359    1.5615    2.2298
## agegrp75+                             4.3963     0.2275    3.3211    5.8196
## I(agegrp == "75+" & fu == "2")TRUE    0.6452     1.5498    0.4625    0.9003
## 
## Concordance= 0.646  (se = 0.009 )
## Likelihood ratio test= 219.3  on 6 df,   p=<2e-16
## Wald test            = 225.4  on 6 df,   p=<2e-16
## Score (logrank) test = 236.9  on 6 df,   p=<2e-16
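
The time-splitting step used under (h) can be sketched with `survSplit` from the survival package. The data frame below holds only the three subjects whose follow-up times and event indicators are printed above. The formula interface is one way of making the call and is not necessarily the call that produced the output.

```r
library(survival)

# Three subjects with the follow-up times printed above (none died of cancer)
toy <- data.frame(id = 1:3,
                  trunc_yy = c(26.5 / 12, 4.625, 10),
                  death_cancer = c(0, 0, 0))

# Split each record at 2 years; `fu` numbers the resulting intervals
toy_split <- survSplit(Surv(trunc_yy, death_cancer) ~ ., data = toy,
                       cut = 2, start = "start", episode = "fu")
toy_split$fu <- factor(toy_split$fu)   # so that fu enters a model as categorical
print(toy_split)
```

Each subject contributes one row for the interval from 0 to 2 years and a second row from 2 years to the end of follow-up, which is why the split data have more rows than subjects.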

(i)

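Hazard ratios by age group and follow-up period:

| Age group | 0–2 years | 2+ years |
|---|---|---|
| Agegrp0-44 | 1.00 | 1.00 |
| Agegrp45-59 | 1.70 | 1.23 |
| Agegrp60-74 | 2.46 | 1.70 |
| Agegrp75+ | 5.42 | 2.67 |
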
## Call:
## coxph(formula = Surv(start, trunc_yy, death_cancer) ~ sex + year8594 + 
##     fu + fu:agegrp, data = melanoma2p8Split)
## 
##   n= 9856, number of events= 960 
## 
##                             coef exp(coef) se(coef)      z Pr(>|z|)    
## sexFemale               -0.52742   0.59012  0.06543 -8.061 7.58e-16 ***
## year8594Diagnosed 85-94 -0.33548   0.71499  0.06623 -5.065 4.08e-07 ***
## fu2                           NA        NA  0.00000     NA       NA    
## fu1:agegrp45-59          0.53058   1.69992  0.19634  2.702  0.00689 ** 
## fu2:agegrp45-59          0.20965   1.23325  0.10774  1.946  0.05167 .  
## fu1:agegrp60-74          0.90046   2.46074  0.18741  4.805 1.55e-06 ***
## fu2:agegrp60-74          0.53331   1.70456  0.10479  5.089 3.59e-07 ***
## fu1:agegrp75+            1.68918   5.41503  0.19175  8.809  < 2e-16 ***
## fu2:agegrp75+            0.98135   2.66806  0.13157  7.458 8.75e-14 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##                         exp(coef) exp(-coef) lower .95 upper .95
## sexFemale                  0.5901     1.6946    0.5191    0.6709
## year8594Diagnosed 85-94    0.7150     1.3986    0.6280    0.8141
## fu2                            NA         NA        NA        NA
## fu1:agegrp45-59            1.6999     0.5883    1.1569    2.4978
## fu2:agegrp45-59            1.2332     0.8109    0.9985    1.5232
## fu1:agegrp60-74            2.4607     0.4064    1.7043    3.5529
## fu2:agegrp60-74            1.7046     0.5867    1.3881    2.0932
## fu1:agegrp75+              5.4150     0.1847    3.7186    7.8853
## fu2:agegrp75+              2.6681     0.3748    2.0616    3.4530
## 
## Concordance= 0.645  (se = 0.009 )
## Likelihood ratio test= 222.5  on 8 df,   p=<2e-16
## Wald test            = 224.5  on 8 df,   p=<2e-16
## Score (logrank) test = 238  on 8 df,   p=<2e-16
## Call:
## coxph(formula = Surv(trunc_yy, death_cancer) ~ sex + year8594 + 
##     agegrp + tt(agegrp), data = localised, tt = function(x, t, 
##     ...) (x == "75+") * (t >= 2))
## 
##   n= 5318, number of events= 960 
## 
##                             coef exp(coef) se(coef)      z Pr(>|z|)    
## sexFemale               -0.52813   0.58970  0.06543 -8.072 6.90e-16 ***
## year8594Diagnosed 85-94 -0.33500   0.71534  0.06622 -5.059 4.22e-07 ***
## agegrp45-59              0.28459   1.32921  0.09417  3.022  0.00251 ** 
## agegrp60-74              0.62377   1.86595  0.09089  6.863 6.73e-12 ***
## agegrp75+                1.48077   4.39633  0.14309 10.348  < 2e-16 ***
## tt(agegrp)              -0.43812   0.64525  0.16994 -2.578  0.00994 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##                         exp(coef) exp(-coef) lower .95 upper .95
## sexFemale                  0.5897     1.6958    0.5187    0.6704
## year8594Diagnosed 85-94    0.7153     1.3979    0.6283    0.8145
## agegrp45-59                1.3292     0.7523    1.1052    1.5987
## agegrp60-74                1.8660     0.5359    1.5615    2.2298
## agegrp75+                  4.3963     0.2275    3.3211    5.8196
## tt(agegrp)                 0.6452     1.5498    0.4625    0.9003
## 
## Concordance= 0.646  (se = 0.009 )
## Likelihood ratio test= 219.3  on 6 df,   p=<2e-16
## Wald test            = 225.4  on 6 df,   p=<2e-16
## Score (logrank) test = 236.9  on 6 df,   p=<2e-16
## Call:
## coxph(formula = Surv(trunc_yy, death_cancer) ~ sex + year8594 + 
##     agegrp + tt(agegrp), data = localised, tt = function(x, t, 
##     ...) (x == "75+") * t)
## 
##   n= 5318, number of events= 960 
## 
##                             coef exp(coef) se(coef)      z Pr(>|z|)    
## sexFemale               -0.52607   0.59092  0.06540 -8.044 8.71e-16 ***
## year8594Diagnosed 85-94 -0.33673   0.71410  0.06625 -5.083 3.72e-07 ***
## agegrp45-59              0.28592   1.33099  0.09418  3.036  0.00240 ** 
## agegrp60-74              0.62623   1.87054  0.09089  6.890 5.59e-12 ***
## agegrp75+                1.66488   5.28506  0.17061  9.758  < 2e-16 ***
## tt(agegrp)              -0.15804   0.85381  0.05048 -3.131  0.00174 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##                         exp(coef) exp(-coef) lower .95 upper .95
## sexFemale                  0.5909     1.6923    0.5198    0.6717
## year8594Diagnosed 85-94    0.7141     1.4004    0.6271    0.8131
## agegrp45-59                1.3310     0.7513    1.1066    1.6008
## agegrp60-74                1.8705     0.5346    1.5653    2.2353
## agegrp75+                  5.2851     0.1892    3.7829    7.3837
## tt(agegrp)                 0.8538     1.1712    0.7734    0.9426
## 
## Concordance= 0.646  (se = 0.009 )
## Likelihood ratio test= 223.7  on 6 df,   p=<2e-16
## Wald test            = 230.6  on 6 df,   p=<2e-16
## Score (logrank) test = 242.8  on 6 df,   p=<2e-16
##                        Estimate    2.5 %   97.5 %    Chisq   Pr(>Chisq)
## agegrp75+ + tt(agegrp)  4.51245 3.465555 5.875597 125.1785 4.651533e-29
## Call:
## coxph(formula = Surv(trunc_yy, death_cancer) ~ sex + year8594 + 
##     agegrp + tt(agegrp), data = localised, tt = function(x, t, 
##     ...) (x == "75+") * log(t))
## 
##   n= 5318, number of events= 960 
## 
##                             coef exp(coef) se(coef)      z Pr(>|z|)    
## sexFemale               -0.52565   0.59117  0.06539 -8.038 9.11e-16 ***
## year8594Diagnosed 85-94 -0.33703   0.71388  0.06626 -5.087 3.64e-07 ***
## agegrp45-59              0.28630   1.33150  0.09418  3.040  0.00236 ** 
## agegrp60-74              0.62732   1.87259  0.09089  6.902 5.13e-12 ***
## agegrp75+                1.61984   5.05230  0.13844 11.701  < 2e-16 ***
## tt(agegrp)              -0.49528   0.60940  0.11848 -4.180 2.91e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##                         exp(coef) exp(-coef) lower .95 upper .95
## sexFemale                  0.5912     1.6916    0.5201    0.6720
## year8594Diagnosed 85-94    0.7139     1.4008    0.6269    0.8129
## agegrp45-59                1.3315     0.7510    1.1071    1.6014
## agegrp60-74                1.8726     0.5340    1.5670    2.2377
## agegrp75+                  5.0523     0.1979    3.8517    6.6272
## tt(agegrp)                 0.6094     1.6410    0.4831    0.7687
## 
## Concordance= 0.646  (se = 0.009 )
## Likelihood ratio test= 230.8  on 6 df,   p=<2e-16
## Wald test            = 237.6  on 6 df,   p=<2e-16
## Score (logrank) test = 255.6  on 6 df,   p=<2e-16

Note that each tt() term transforms a single variable (in our case, the indicator for agegrp75+). We have used the lincom function to estimate the hazard ratio for agegrp75+ at one year after diagnosis under the model that is linear in time, \(\exp(1.665 - 0.158) \approx 4.51\). We will later describe a more flexible approach to modelling time-dependent effects using stpm2.
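
To compare what the three tt() models imply, the hazard ratio for agegrp75+ can be written as a function of time since diagnosis. This sketch uses the printed coefficients; the function names are ours.

```r
# Main effect of agegrp75+ and tt() coefficient, copied from the three printed models
hr_step   <- function(t) exp(1.48077 - 0.43812 * (t >= 2))  # tt = (x == "75+") * (t >= 2)
hr_linear <- function(t) exp(1.66488 - 0.15804 * t)         # tt = (x == "75+") * t
hr_log    <- function(t) exp(1.61984 - 0.49528 * log(t))    # tt = (x == "75+") * log(t)

years <- c(0.5, 1, 2, 5, 10)
print(round(data.frame(years  = years,
                       step   = hr_step(years),
                       linear = hr_linear(years),
                       log    = hr_log(years)), 2))
```

At one year the linear model gives the value reported by lincom. In the log(t) model the coefficient of agegrp75+ is itself the log hazard ratio at one year, because \(\log(1) = 0\).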

(j)

## `summarise()` has grouped output by 'mid', 'sex', 'year8594'. You can override using the `.groups` argument.