In this vignette, I use the space-time permutation scan to demonstrate generating data in R, running SaTScan on the generated data, and collecting the results.

I begin by generating data on a 10 x 10 grid of locations over 30 days. Each day, each location has a 0.1 probability of having a single case.

set.seed(42)
mygeo = expand.grid(1:10, 1:10)        # 100 grid locations
daysbase = 30
locid = rep(1:100, times = daysbase)   # location IDs, repeated for each day
basecas = rbinom(3000, 1, 0.1)         # 0/1 case indicator: 100 locations x 30 days
day = rep(1:30, each = 100)            # day index for each row
mycas = data.frame(locid, basecas, day)
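
As a quick sanity check, the overall case rate should be close to the 0.1 probability used above:

mean(mycas$basecas)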

Here’s what the geography and case data look like. I’m using generic time, for convenience.

head(mygeo)
##   Var1 Var2
## 1    1    1
## 2    2    1
## 3    3    1
## 4    4    1
## 5    5    1
## 6    6    1
head(mycas)
##   locid basecas day
## 1     1       1   1
## 2     2       1   1
## 3     3       0   1
## 4     4       0   1
## 5     5       0   1
## 6     6       0   1

Now I can write the data into the OS. The row names of the mygeo data.frame object are the location IDs for SaTScan, so I use the userownames option to use, rather than ignore, the row names from R in the geography file; the case file already contains an explicit column with the same information.

library("rsatscan")
td = tempdir()
write.geo(mygeo, location = td, file = "mygeo", userownames=TRUE)
write.cas(mycas, location = td, file = "mycas")
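
To confirm that the row names were carried over as location IDs, you can peek at the first few lines of the geography file that was just written:

readLines(file.path(td, "mygeo.geo"), n = 3)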

Now I’m ready to build the parameter file. This is adapted pretty closely from the NYCfever example in the rsatscan vignette.

invisible(ss.options(reset=TRUE))
ss.options(list(CaseFile="mycas.cas", PrecisionCaseTimes=4))                # generic time precision
ss.options(list(StartDate="1", CoordinatesType=0, TimeAggregationUnits=4))  # Cartesian coordinates, generic time units
ss.options(list(EndDate="30", CoordinatesFile="mygeo.geo", AnalysisType=4, ModelType=2))  # prospective space-time analysis, space-time permutation model
ss.options(list(UseDistanceFromCenterOption="y", MaxSpatialSizeInDistanceFromCenter=3))   # spatial clusters up to a radius of 3 grid units
ss.options(list(NonCompactnessPenalty=0, MaxTemporalSizeInterpretation=1, MaxTemporalSize=7))  # temporal window of at most 7 time units
ss.options(list(ProspectiveStartDate="30", ReportGiniClusters="n", LogRunToHistoryFile="n"))  # prospective surveillance begins at time 30
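
Before writing the parameter file, it can be worth glancing at the in-memory parameter set; calling ss.options() with no arguments returns the current settings:

head(ss.options(), 3)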

Then I write the parameter file into the OS and run SaTScan with it. I’ll peek at the summary cluster table to see what we got.

write.ss.prm(td, "mybase")
# This step omitted in compliance with CRAN policies
# Please install SaTScan and run the vignette with this and following code uncommented
# SaTScan can be downloaded free of charge from www.satscan.org,
# where you will also find fully compiled versions of this vignette, with results

# mybase = satscan(td, "mybase")
# mybase$col[3:10]

As one would hope, there’s no evidence of a meaningful cluster.

Now, let’s add a day just like the others. I’ll stick it onto the end of the previous data, then write out a new case file.

newday = data.frame(locid = 1:100, basecas = rbinom(100,1,.1), day = 31)
newcas = rbind(mycas,newday)
write.cas(newcas, location = td, file = "mycas")
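
A quick look at the tail of the combined data confirms that day 31 was appended:

tail(newcas, 3)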

I don’t need to re-assign any parameter values that don’t change between runs. In this case, since I used the same name for the data file, I only need to change the end date of the surveillance period.

ss.options(list(EndDate="31"))
write.ss.prm(td, "day1")

# day1 = satscan(td, "day1")
# day1$col[3:10]

Again, no clusters, as we would expect.

But now let’s make a cluster appear. I create an additional time unit as before, but then select a location to get a heap of extra cases. Glue the new day to the end of the old case file, write it to the OS, change the end date, and re-run SaTScan.

newday = data.frame(locid = 1:100, basecas = rbinom(100, 1, 0.1), day = 32)
newday$basecas[20] = 5                  # inject a burst of 5 cases at location 20
newcas = rbind(newcas, newday)          # append day 32 to the data that already includes day 31
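
A quick look confirms the extra cases at location 20 on the new day:

subset(newcas, day == 32 & basecas > 1)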

write.cas(newcas, location = td, file = "mycas")

ss.options(list(EndDate="32"))
write.ss.prm(td, "day2")

# day2 = satscan(td,"day2")
# day2$col[3:10]

This demonstrates that the scan detected the cluster I inserted. I can also extract the wordier section of the report about this cluster.

# summary(day2)
# cat(day2$main[20:31],fill=1)
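
Finally, since several files were written into the temporary directory, it’s tidy to remove them at the end of the session; the file names below assume the .geo, .cas, and .prm extensions that the write functions add:

file.remove(file.path(td, c("mygeo.geo", "mycas.cas", "mybase.prm", "day1.prm", "day2.prm")))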