RJafroc Documentation

Dev P. Chakraborty

2018-11-14

Introduction

This vignette is intended for those seeking a quick transition from Windows JAFROC to RJafroc. It is assumed that the user is familiar with the JAFROC data format and can analyze a dataset using the Windows program. First, let me describe the structure in R of an RJafroc dataset. Later I will tell you how to read a JAFROC format file to create an RJafroc dataset.

An ROC dataset

Let us start with a predefined dataset, dataset03, corresponding to the Franken ROC data. Let us examine the structure of this dataset.

str(dataset03)
#> List of 8
#>  $ NL          : num [1:2, 1:4, 1:100, 1] 3 3 4 3 3 ...
#>  $ LL          : num [1:2, 1:4, 1:67, 1] 5 5 4 4 5 4 4 5 2 2 ...
#>  $ lesionNum   : int [1:67] 1 1 1 1 1 1 1 1 1 1 ...
#>  $ lesionID    : num [1:67, 1] 1 1 1 1 1 1 1 1 1 1 ...
#>  $ lesionWeight: num [1:67, 1] 1 1 1 1 1 1 1 1 1 1 ...
#>  $ dataType    : chr "ROC"
#>  $ modalityID  : Named chr [1:2] "0" "1"
#>   ..- attr(*, "names")= chr [1:2] "0" "1"
#>  $ readerID    : Named chr [1:4] "0" "1" "2" "3"
#>   ..- attr(*, "names")= chr [1:4] "0" "1" "2" "3"

It shows a list with 8 members. The false positive ratings are contained in NL, an array with dimensions [1:2,1:4,1:100,1]. The first index corresponds to treatments, and since the dataset has 2 treatments, the corresponding dimension is 2. The second index corresponds to readers, and since the dataset has 4 readers, the corresponding dimension is 4. The third index corresponds to cases, and since the dataset has 100 cases in total, the corresponding dimension is 100. But, as you can see from the code below, the entries in this array for cases 34 through 100 are -Inf.

dataset03$NL[1,1,34:100,1]
#>  [1] -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf
#> [15] -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf
#> [29] -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf
#> [43] -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf
#> [57] -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf

This is because in the ROC paradigm false positives are not possible on diseased cases. So the actual FP ratings are contained in the first 33 elements of the array. How did I know that there are 33 non-diseased cases? One way is to note that LL, which holds the ratings of the diseased cases, has a third dimension of 67, so 100 - 67 = 33 cases are non-diseased.
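The count of non-diseased cases can be read off the array dimensions. The sketch below uses base R only; toy is a hypothetical stand-in with the same array shapes as dataset03 (its ratings are made up), and with RJafroc loaded one could presumably apply the same three lines to dataset03 itself.

```r
# Hypothetical stand-in for dataset03: same array shapes, made-up ratings.
toy <- list(
  NL = array(-Inf, dim = c(2, 4, 100, 1)),  # treatments x readers x all cases x 1
  LL = array(3,    dim = c(2, 4, 67, 1))    # treatments x readers x diseased cases x 1
)

nCases       <- dim(toy$NL)[3]      # total number of cases
nDiseased    <- dim(toy$LL)[3]      # number of diseased cases
nNonDiseased <- nCases - nDiseased  # the remainder are non-diseased
nNonDiseased
#> [1] 33
```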

The dataType list member is the character string "ROC", characterizing the ROC dataset. Alternatives are "FROC" and "LROC".

dataset03$dataType
#> [1] "ROC"

The modalityID list member is a character vector with two entries, "0" and "1", corresponding to the two treatments (i.e., modalities). These can be longer strings, if you please, that label the two treatments.

dataset03$modalityID
#>   0   1 
#> "0" "1"

The readerID list member is a character vector with four entries, "0", "1", "2" and "3", corresponding to the four readers. These can be longer strings that label the four readers.

dataset03$readerID
#>   0   1   2   3 
#> "0" "1" "2" "3"

Here are the actual ratings for cases 1:34. Case 34 is the first diseased case, so its entry is -Inf.

dataset03$NL[1,1,1:34,1]
#>  [1]    3 -Inf    2    2    2    2    2    4 -Inf -Inf    4    2 -Inf    2
#> [15]    4    2 -Inf    2 -Inf    2    4    2    3    2    2    2    4    3
#> [29]    2    2    2    5    3 -Inf

This says that for treatment 1 and reader 1, (non-diseased) case 1 was rated 3, case 3 was rated 2, case 8 was rated 4, etc. For the non-diseased cases, -Inf is equivalent to a rating of 1. The reason for this is that the ratings are ordered labels: as far as the ordering is concerned, nothing is changed by replacing -Inf with 1 and vice versa.

As another example, for treatment 2 and reader 3,

dataset03$NL[2,3,1:34,1]
#>  [1]    3 -Inf    2    2    2    2    4    4    2    3    2    2 -Inf    3
#> [15]    2    4    2    3    2    2    2    2    2    4    2    2 -Inf    2
#> [29]    2    2    2    4    2 -Inf

As you can see, there are no cases that are explicitly rated 1, so changing the -Inf to 1 does not change the ordering of the ratings.
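The claim about ordering can be checked directly. The sketch below, in base R, takes the ratings of treatment 1 and reader 1 for the 33 non-diseased cases (copied from the first printed vector above), recodes -Inf as 1, and compares the ranks. The names x and y are just illustrative.

```r
# NL ratings of treatment 1, reader 1 for the 33 non-diseased cases,
# copied from the printed output of dataset03$NL[1,1,1:34,1].
x <- c(3, -Inf, 2, 2, 2, 2, 2, 4, -Inf, -Inf, 4, 2, -Inf, 2,
       4, 2, -Inf, 2, -Inf, 2, 4, 2, 3, 2, 2, 2, 4, 3,
       2, 2, 2, 5, 3)

y <- x
y[y == -Inf] <- 1   # recode -Inf as an explicit 1-rating

# Ranks depend only on the ordering, so they are unchanged by the recoding.
identical(rank(x), rank(y))
#> [1] TRUE
```

This works because the smallest explicit rating is 2, so 1 sits below every other rating just as -Inf does.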

Creating a dataset from a JAFROC format file

There is a file includedRocData.xlsx that is part of the package installation. Since it is a system file, one must get its full path as follows.

fileName <- "includedRocData.xlsx"
sysFileName <- system.file(paste0("extdata/",fileName), package = "RJafroc", mustWork = TRUE)

Next, one uses DfReadDataFile() as follows, assuming it is a JAFROC format file.

ds <- DfReadDataFile(sysFileName)

Now ds is the desired dataset.

str(ds)
#> List of 8
#>  $ NL          : num [1:2, 1:5, 1:114, 1] 1 3 2 3 2 2 1 2 3 2 ...
#>  $ LL          : num [1:2, 1:5, 1:45, 1] 5 5 5 5 5 5 5 5 5 5 ...
#>  $ lesionNum   : int [1:45] 1 1 1 1 1 1 1 1 1 1 ...
#>  $ lesionID    : num [1:45, 1] 1 1 1 1 1 1 1 1 1 1 ...
#>  $ lesionWeight: num [1:45, 1] 1 1 1 1 1 1 1 1 1 1 ...
#>  $ dataType    : chr "ROC"
#>  $ modalityID  : Named chr [1:2] "0" "1"
#>   ..- attr(*, "names")= chr [1:2] "0" "1"
#>  $ readerID    : Named chr [1:5] "0" "1" "2" "3" ...
#>   ..- attr(*, "names")= chr [1:5] "0" "1" "2" "3" ...

Analysis is illustrated for dataset03, but one could have used the newly created dataset ds.
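After reading a file it can be reassuring to confirm that the pieces of the dataset agree with each other. The helper below is a hypothetical sketch, not part of RJafroc: it only compares array dimensions with the lengths of the ID vectors. The toy list mimics the shapes reported by str(ds); its ratings and its treatment and reader labels are made up.

```r
# Hypothetical helper (not part of RJafroc): basic shape checks on a dataset list.
checkDatasetShape <- function(ds) {
  dNL <- dim(ds$NL)
  dLL <- dim(ds$LL)
  c(
    treatmentsMatch = dNL[1] == length(ds$modalityID) && dLL[1] == dNL[1],
    readersMatch    = dNL[2] == length(ds$readerID) && dLL[2] == dNL[2],
    casesPlausible  = dLL[3] < dNL[3]  # diseased cases are a subset of all cases
  )
}

# Made-up stand-in with the array shapes shown by str(ds); labels are placeholders.
toy <- list(
  NL = array(-Inf, dim = c(2, 5, 114, 1)),
  LL = array(5,    dim = c(2, 5, 45, 1)),
  modalityID = c("t1", "t2"),
  readerID   = paste0("r", 1:5)
)
checkDatasetShape(toy)
```

All three entries should be TRUE for a consistent dataset; a FALSE would suggest that the file was not in the expected format.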

Analyzing the ROC dataset

This illustrates the StSignificanceTesting() function. The significance testing method is specified as "DBMH" and the figure of merit (FOM) is specified as "Wilcoxon".

ret <- StSignificanceTesting(dataset03, method = "DBMH", FOM = "Wilcoxon")
print(ret)
#> $fomArray
#>           Rdr - 0   Rdr - 1   Rdr - 2   Rdr - 3
#> Trt - 0 0.8534600 0.8649932 0.8573044 0.8152420
#> Trt - 1 0.8496156 0.8435097 0.8401176 0.8143374
#> 
#> $anovaY
#>   Source           SS  DF          MS
#> 1      T   0.02356541   1 0.023565410
#> 2      R   0.20521800   3 0.068406000
#> 3      C  52.52839868  99 0.530589886
#> 4     TR   0.01506079   3 0.005020264
#> 5     TC   6.41004881  99 0.064747968
#> 6     RC  39.24295381 297 0.132131158
#> 7    TRC  22.66007764 297 0.076296558
#> 8  Total 121.08532315 799          NA
#> 
#> $anovaYi
#>   Source  DF          0          1
#> 1      R   3 0.04926635 0.02415991
#> 2      C  99 0.29396753 0.30137032
#> 3     RC 297 0.10504787 0.10337984
#> 
#> $varComp
#>                  varComp
#> Var(R)      3.775568e-05
#> Var(C)      5.125091e-02
#> Var(T*R)   -7.127629e-04
#> Var(T*C)   -2.887147e-03
#> Var(R*C)    2.791730e-02
#> Var(Error)  7.629656e-02
#> 
#> $fRRRC
#> [1] 4.694058
#> 
#> $ddfRRRC
#> [1] 3
#> 
#> $pRRRC
#> [1] 0.1188379
#> 
#> $ciDiffTrtRRRC
#>   Treatment   Estimate      StdErr DF        t    Pr > t     CI Lower
#> 1     0 - 1 0.01085482 0.005010122  3 2.166577 0.1188379 -0.005089627
#>     CI Upper
#> 1 0.02679926
#> 
#> $ciAvgRdrEachTrtRRRC
#>   Treatment      Area     StdErr        DF  CI Lower  CI Upper
#> 1         0 0.8477499 0.02440215  70.12179 0.7990828 0.8964170
#> 2         1 0.8368951 0.02356642 253.64403 0.7904843 0.8833058
#> 
#> $fFRRC
#> [1] 0.363956
#> 
#> $ndf
#> [1] 1
#> 
#> $ddfFRRC
#> [1] 99
#> 
#> $pFRRC
#> [1] 0.547697
#> 
#> $ciDiffTrtFRRC
#>   Treatment   Estimate     StdErr DF         t   Pr > t    CI Lower
#> 1     0 - 1 0.01085482 0.01799277 99 0.6032876 0.547697 -0.02484675
#>     CI Upper
#> 1 0.04655638
#> 
#> $ciAvgRdrEachTrtFRRC
#>   Treatment      Area     StdErr DF  CI Lower  CI Upper
#> 1         0 0.8477499 0.02710939 99 0.7939590 0.9015408
#> 2         1 0.8368951 0.02744860 99 0.7824311 0.8913591
#> 
#> $ssAnovaEachRdr
#>   Source DF            0           1           2            3
#> 1      T  1 7.389761e-04  0.02307702  0.01476929 4.091217e-05
#> 2      C 99 2.018360e+01 22.12074893 21.21043057 2.825657e+01
#> 3     TC 99 9.064315e+00  7.94764631  6.06166901 5.996496e+00
#> 
#> $msAnovaEachRdr
#>   Source DF            0          1          2            3
#> 1      T  1 0.0007389761 0.02307702 0.01476929 4.091217e-05
#> 2      C 99 0.2038747746 0.22344191 0.21424677 2.854199e-01
#> 3     TC 99 0.0915587344 0.08027926 0.06122898 6.057067e-02
#> 
#> $ciDiffTrtEachRdr
#>   Reader Treatment     Estimate     StdErr DF          t    Pr > t
#> 1      0     0 - 1 0.0038444143 0.04279223 99 0.08983908 0.9285966
#> 2      1     0 - 1 0.0214834916 0.04006975 99 0.53615233 0.5930559
#> 3      2     0 - 1 0.0171867933 0.03499399 99 0.49113552 0.6244176
#> 4      3     0 - 1 0.0009045681 0.03480536 99 0.02598933 0.9793182
#>      CI Lower   CI Upper
#> 1 -0.08106465 0.08875348
#> 2 -0.05802359 0.10099057
#> 3 -0.05224888 0.08662247
#> 4 -0.06815683 0.06996596
#> 
#> $fRRFC
#> [1] 4.694058
#> 
#> $ddfRRFC
#> [1] 3
#> 
#> $pRRFC
#> [1] 0.1188379
#> 
#> $ciDiffTrtRRFC
#>   Treatment   Estimate      StdErr DF        t    Pr > t     CI Lower
#> 1     0 - 1 0.01085482 0.005010122  3 2.166577 0.1188379 -0.005089627
#>     CI Upper
#> 1 0.02679926
#> 
#> $ciAvgRdrEachTrtRRFC
#>   Treatment      Area     StdErr DF  CI Lower  CI Upper
#> 1         0 0.8477499 0.01109801  3 0.8124311 0.8830687
#> 2         1 0.8368951 0.00777173  3 0.8121620 0.8616282

Explanation of the output

The function returns a long, unwieldy list. Let us consider its members one by one. The function UtilOutputReport() can generate an Excel file report, making it much easier to visualize the results. This is described in another vignette.

FOMs

ret$fomArray
#>           Rdr - 0   Rdr - 1   Rdr - 2   Rdr - 3
#> Trt - 0 0.8534600 0.8649932 0.8573044 0.8152420
#> Trt - 1 0.8496156 0.8435097 0.8401176 0.8143374

This shows the 2 x 4 array of FOM values.
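The reader-averaged FOMs reported elsewhere in the output (the Area column of the confidence interval tables, and the Estimate of the treatment difference) can be recovered from this array. A minimal sketch in base R, with the FOM values copied from the output above; the names fom, avgFom and diffFom are illustrative.

```r
# FOM values copied from ret$fomArray.
fom <- matrix(
  c(0.8534600, 0.8649932, 0.8573044, 0.8152420,
    0.8496156, 0.8435097, 0.8401176, 0.8143374),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("Trt - 0", "Trt - 1"), paste("Rdr -", 0:3))
)

avgFom  <- rowMeans(fom)                # reader-averaged FOM for each treatment
diffFom <- avgFom[[1]] - avgFom[[2]]    # treatment 0 minus treatment 1

round(avgFom, 7)
round(diffFom, 8)
```

These should agree with 0.8477499 and 0.8368951 for the two treatments, and with 0.01085482 for the difference.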

Pseudovalue ANOVA table

ret$anovaY
#>   Source           SS  DF          MS
#> 1      T   0.02356541   1 0.023565410
#> 2      R   0.20521800   3 0.068406000
#> 3      C  52.52839868  99 0.530589886
#> 4     TR   0.01506079   3 0.005020264
#> 5     TC   6.41004881  99 0.064747968
#> 6     RC  39.24295381 297 0.132131158
#> 7    TRC  22.66007764 297 0.076296558
#> 8  Total 121.08532315 799          NA
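The columns of this table are related in the usual ANOVA way: each mean square is the sum of squares divided by its degrees of freedom, and the degrees of freedom follow from the numbers of treatments, readers and cases. The sketch below re-derives them in base R from the values printed above; I, J and K are my own names for the three counts.

```r
I <- 2; J <- 4; K <- 100   # treatments, readers, cases

anovaY <- data.frame(
  Source = c("T", "R", "C", "TR", "TC", "RC", "TRC"),
  SS = c(0.02356541, 0.20521800, 52.52839868, 0.01506079,
         6.41004881, 39.24295381, 22.66007764),
  DF = c(1, 3, 99, 3, 99, 297, 297)
)

# Mean square = sum of squares / degrees of freedom.
anovaY$MS <- anovaY$SS / anovaY$DF

# Degrees of freedom for a fully crossed treatment x reader x case layout.
expectedDF <- c(I - 1, J - 1, K - 1,
                (I - 1) * (J - 1), (I - 1) * (K - 1), (J - 1) * (K - 1),
                (I - 1) * (J - 1) * (K - 1))

all(anovaY$DF == expectedDF)   # DF column matches the layout
sum(anovaY$SS)                 # should reproduce the Total row
sum(anovaY$DF)
```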

Pseudovalue ANOVA table, each treatment

ret$anovaYi
#>   Source  DF          0          1
#> 1      R   3 0.04926635 0.02415991
#> 2      C  99 0.29396753 0.30137032
#> 3     RC 297 0.10504787 0.10337984

The 0 and 1 headers come from the treatment names.

Pseudovalue Variance Components

ret$varComp
#>                  varComp
#> Var(R)      3.775568e-05
#> Var(C)      5.125091e-02
#> Var(T*R)   -7.127629e-04
#> Var(T*C)   -2.887147e-03
#> Var(R*C)    2.791730e-02
#> Var(Error)  7.629656e-02
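These variance components can be reproduced from the mean squares in the pseudovalue ANOVA table. The sketch below assumes the standard expected-mean-squares relations for a fully crossed three-factor design; I have not checked the package source, only that these relations give back the printed values. Note that estimates obtained this way can be negative, as Var(T*R) and Var(T*C) are here.

```r
I <- 2; J <- 4; K <- 100   # treatments, readers, cases

# Mean squares copied from ret$anovaY.
ms <- c(T = 0.023565410, R = 0.068406000, C = 0.530589886,
        TR = 0.005020264, TC = 0.064747968, RC = 0.132131158,
        TRC = 0.076296558)

# Assumed relations between mean squares and variance components.
varComp <- c(
  "Var(R)"     = (ms[["R"]] - ms[["TR"]] - ms[["RC"]] + ms[["TRC"]]) / (I * K),
  "Var(C)"     = (ms[["C"]] - ms[["TC"]] - ms[["RC"]] + ms[["TRC"]]) / (I * J),
  "Var(T*R)"   = (ms[["TR"]] - ms[["TRC"]]) / K,
  "Var(T*C)"   = (ms[["TC"]] - ms[["TRC"]]) / J,
  "Var(R*C)"   = (ms[["RC"]] - ms[["TRC"]]) / I,
  "Var(Error)" = ms[["TRC"]]
)
signif(varComp, 7)
```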

Random-reader random-case (RRRC) analysis

F-statistic and p-value for RRRC analysis

  • fRRRC is the F-statistic for random-reader random-case analysis.
ret$fRRRC
#> [1] 4.694058
  • ddfRRRC is the denominator degrees of freedom of the F-statistic.
ret$ddfRRRC
#> [1] 3
  • pRRRC is the p-value of the test.
ret$pRRRC
#> [1] 0.1188379
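The RRRC quantities can be traced back to the mean squares of the pseudovalue ANOVA table. The sketch below uses what I understand to be the Hillis form of the DBM statistic (the "H" in "DBMH"); treat the formulas as an assumption, supported here only by the fact that they reproduce the printed values. The variable names are illustrative.

```r
I <- 2; J <- 4   # treatments, readers

# Mean squares copied from ret$anovaY.
msT   <- 0.023565410
msTR  <- 0.005020264
msTC  <- 0.064747968
msTRC <- 0.076296558

# Denominator: MS(TR), increased by MS(TC) - MS(TRC) only when that is positive.
den <- msTR + max(msTC - msTRC, 0)

fStat <- msT / den                                 # F-statistic
ddf   <- den^2 / (msTR^2 / ((I - 1) * (J - 1)))    # denominator degrees of freedom
pVal  <- pf(fStat, df1 = I - 1, df2 = ddf, lower.tail = FALSE)

c(F = fStat, ddf = ddf, p = pVal)
```

For this dataset MS(TC) is smaller than MS(TRC), so the denominator reduces to MS(TR) and the denominator degrees of freedom to (I - 1)(J - 1) = 3.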

Confidence Intervals for RRRC analysis

  • ciDiffTrtRRRC is the 95% confidence interval of reader-averaged differences between treatments.
ret$ciDiffTrtRRRC
#>   Treatment   Estimate      StdErr DF        t    Pr > t     CI Lower
#> 1     0 - 1 0.01085482 0.005010122  3 2.166577 0.1188379 -0.005089627
#>     CI Upper
#> 1 0.02679926
  • ciAvgRdrEachTrtRRRC is the 95% confidence interval of reader-averaged FOMs for each treatment.
ret$ciAvgRdrEachTrtRRRC
#>   Treatment      Area     StdErr        DF  CI Lower  CI Upper
#> 1         0 0.8477499 0.02440215  70.12179 0.7990828 0.8964170
#> 2         1 0.8368951 0.02356642 253.64403 0.7904843 0.8833058
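Each confidence interval in these tables is the estimate plus or minus a t-quantile times the standard error. A minimal sketch in base R for the treatment difference, using the Estimate, StdErr and DF printed in ciDiffTrtRRRC; alpha = 0.05 corresponds to the 95% level stated above.

```r
est   <- 0.01085482    # Estimate
se    <- 0.005010122   # StdErr
df    <- 3             # DF
alpha <- 0.05          # for a 95% confidence interval

halfWidth <- qt(1 - alpha / 2, df) * se
ci <- c(lower = est - halfWidth, upper = est + halfWidth)
signif(ci, 7)
```

The result should match the CI Lower and CI Upper columns, -0.005089627 and 0.02679926. The interval includes zero, in keeping with the p-value exceeding 0.05.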

Fixed-reader random-case (FRRC) analysis

F-statistic and p-value for FRRC analysis

  • fFRRC is the F-statistic for fixed-reader random-case analysis.
ret$fFRRC
#> [1] 0.363956
  • ndf is the numerator degrees of freedom of the F-statistic, always one less than the number of treatments.
ret$ndf
#> [1] 1
  • ddfFRRC is the denominator degrees of freedom of the F-statistic, for fixed-reader random-case analysis.
ret$ddfFRRC
#> [1] 99
  • pFRRC is the p-value for fixed-reader random-case analysis.
ret$pFRRC
#> [1] 0.547697
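The four FRRC quantities above fit together as follows. The sketch assumes that, with readers fixed, the F-statistic is the ratio of the treatment mean square to the treatment-by-case mean square; this is my reading of the method, checked only against the printed numbers. The local names mirror the list members but are ordinary variables.

```r
I <- 2; K <- 100   # treatments, cases

# Mean squares copied from ret$anovaY.
msT  <- 0.023565410
msTC <- 0.064747968

fFRRC   <- msT / msTC          # assumed form of the FRRC F-statistic
ndf     <- I - 1               # numerator degrees of freedom
ddfFRRC <- (I - 1) * (K - 1)   # denominator degrees of freedom
pFRRC   <- pf(fFRRC, ndf, ddfFRRC, lower.tail = FALSE)

c(F = fFRRC, ndf = ndf, ddf = ddfFRRC, p = pFRRC)
```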

Confidence Intervals for FRRC analysis

  • ciDiffTrtFRRC is the 95% CI of reader-averaged differences between treatments for fixed-reader random-case analysis.
ret$ciDiffTrtFRRC
#>   Treatment   Estimate     StdErr DF         t   Pr > t    CI Lower
#> 1     0 - 1 0.01085482 0.01799277 99 0.6032876 0.547697 -0.02484675
#>     CI Upper
#> 1 0.04655638
  • ciAvgRdrEachTrtFRRC is the 95% CI of reader-averaged FOMs of each treatment for fixed-reader random-case analysis.
ret$ciAvgRdrEachTrtFRRC
#>   Treatment      Area     StdErr DF  CI Lower  CI Upper
#> 1         0 0.8477499 0.02710939 99 0.7939590 0.9015408
#> 2         1 0.8368951 0.02744860 99 0.7824311 0.8913591
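The t and Pr > t columns of ciDiffTrtFRRC follow from the Estimate, StdErr and DF columns, and with only two treatments the square of t equals the FRRC F-statistic. A short base R sketch using the printed values; the variable names are illustrative.

```r
est <- 0.01085482   # Estimate
se  <- 0.01799277   # StdErr
df  <- 99           # DF

tStat <- est / se                  # t-statistic
pVal  <- 2 * pt(-abs(tStat), df)   # two-sided p-value

c(t = tStat, p = pVal, tSquared = tStat^2)
```

The three values should agree with 0.6032876, 0.547697 and fFRRC = 0.363956, respectively.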

ANOVA for FRRC analysis

  • ssAnovaEachRdr is the sum of squares ANOVA for each reader
ret$ssAnovaEachRdr
#>   Source DF            0           1           2            3
#> 1      T  1 7.389761e-04  0.02307702  0.01476929 4.091217e-05
#> 2      C 99 2.018360e+01 22.12074893 21.21043057 2.825657e+01
#> 3     TC 99 9.064315e+00  7.94764631  6.06166901 5.996496e+00
  • msAnovaEachRdr is the mean squares ANOVA for each reader
ret$msAnovaEachRdr
#>   Source DF            0          1          2            3
#> 1      T  1 0.0007389761 0.02307702 0.01476929 4.091217e-05
#> 2      C 99 0.2038747746 0.22344191 0.21424677 2.854199e-01
#> 3     TC 99 0.0915587344 0.08027926 0.06122898 6.057067e-02
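The two tables are linked by the degrees of freedom, and the per-reader mean squares in turn give the magnitude of the t-statistics in the ciDiffTrtEachRdr table of the printed output. The sketch below assumes that each reader's t is the square root of MS(T) / MS(TC) for that reader; this is an assumption on my part, checked only against the printed values.

```r
# Sums of squares copied from ret$ssAnovaEachRdr (rows T and TC, readers 0 to 3).
ss <- rbind(
  T  = c(7.389761e-04, 0.02307702, 0.01476929, 4.091217e-05),
  TC = c(9.064315, 7.94764631, 6.06166901, 5.996496)
)
colnames(ss) <- c("0", "1", "2", "3")
dfRow <- c(T = 1, TC = 99)

# Divide each row by its degrees of freedom (dfRow is recycled down the columns).
ms <- ss / dfRow

fEachRdr <- ms["T", ] / ms["TC", ]   # one F-statistic per reader
tEachRdr <- sqrt(fEachRdr)           # magnitude of the per-reader t-statistic
signif(tEachRdr, 7)
```

The square root only recovers the magnitude of t; here all four estimated differences are positive, so the signs agree as well.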

Random-reader fixed-case (RRFC) analysis

F-statistic and p-value for RRFC analysis

  • fRRFC is the F-statistic for random-reader fixed-case analysis.
ret$fRRFC
#> [1] 4.694058
  • ddfRRFC is the denominator degrees of freedom of the F-statistic, for random-reader fixed-case analysis.
ret$ddfRRFC
#> [1] 3
  • pRRFC is the p-value for random-reader fixed-case analysis.
ret$pRRFC
#> [1] 0.1188379
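The RRFC values are identical to the RRRC values for this dataset. A plausible explanation, sketched below under the assumption that the RRFC statistic is MS(T) / MS(TR) with (I - 1)(J - 1) denominator degrees of freedom: the RRRC denominator adds MS(TC) - MS(TRC) to MS(TR) only when that difference is positive, and here it is negative.

```r
I <- 2; J <- 4   # treatments, readers

# Mean squares copied from ret$anovaY.
msT   <- 0.023565410
msTR  <- 0.005020264
msTC  <- 0.064747968
msTRC <- 0.076296558

fRRFC   <- msT / msTR          # assumed form of the RRFC F-statistic
ddfRRFC <- (I - 1) * (J - 1)   # denominator degrees of freedom
pRRFC   <- pf(fRRFC, I - 1, ddfRRFC, lower.tail = FALSE)

msTC - msTRC < 0   # TRUE here, so the case-related term drops out of RRRC too
c(F = fRRFC, ddf = ddfRRFC, p = pRRFC)
```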

Confidence Intervals for RRFC analysis

  • ciDiffTrtRRFC is the CI for reader-averaged inter-treatment FOM differences for random-reader fixed-case analysis
ret$ciDiffTrtRRFC
#>   Treatment   Estimate      StdErr DF        t    Pr > t     CI Lower
#> 1     0 - 1 0.01085482 0.005010122  3 2.166577 0.1188379 -0.005089627
#>     CI Upper
#> 1 0.02679926
  • ciAvgRdrEachTrtRRFC is the CI for reader-averaged FOMs of each treatment for random-reader fixed-case analysis
ret$ciAvgRdrEachTrtRRFC
#>   Treatment      Area     StdErr DF  CI Lower  CI Upper
#> 1         0 0.8477499 0.01109801  3 0.8124311 0.8830687
#> 2         1 0.8368951 0.00777173  3 0.8121620 0.8616282