santoku

santoku is a versatile cutting tool for R. It provides chop(), a replacement for base::cut().

Installation

Install from r-universe:

install.packages("santoku", repos = c("https://hughjonesd.r-universe.dev", 
                                      "https://cloud.r-project.org"))

Or from CRAN:

install.packages("santoku")

Or get the development version from GitHub:

# install.packages("remotes")
remotes::install_github("hughjonesd/santoku")

Advantages

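Here are some advantages of santoku:

- By default, chop() covers the whole range of the data, so you won't get unexpected NA values.
- chop() can handle single values as well as intervals.
- chop() works on dates as well as numbers.
- chop_* functions create intervals in several ways, such as fixed-width intervals or fixed-size groups.
- Intervals are easy to label: use names in the breaks vector, or use a lbl_* function.
- tab_* functions chop data, then tabulate it.

These advantages make santoku especially useful for exploratory analysis, where you may not know the range of your data in advance.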
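
As a rough illustration of why range coverage matters, the sketch below compares base::cut() and chop() on the same breaks. It assumes santoku is installed; the variable names are only for illustration.

```r
library(santoku)

x <- 1:5

# base::cut() with breaks c(2, 4) only covers the interval (2, 4],
# so every value outside it is returned as NA
base_result <- cut(x, breaks = c(2, 4))
sum(is.na(base_result))

# chop() extends the outer intervals to cover the data by default,
# so no value is dropped
chop_result <- chop(x, c(2, 4))
sum(is.na(chop_result))
```

With cut(), the values 1, 2 and 5 fall outside (2, 4] and become NA. With chop(), every value gets a level.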

Examples

library(santoku)

chop() returns a factor:

chop(1:5, c(2, 4))
#> [1] [1, 2) [2, 4) [2, 4) [4, 5] [4, 5]
#> Levels: [1, 2) [2, 4) [4, 5]

Include a number twice to match it exactly:

chop(1:5, c(2, 2, 4))
#> [1] [1, 2) {2}    (2, 4) [4, 5] [4, 5]
#> Levels: [1, 2) {2} (2, 4) [4, 5]

Use names in breaks for labels:

chop(1:5, c(Low = 1, Mid = 2, High = 4))
#> [1] Low  Mid  Mid  High High
#> Levels: Low Mid High

Or use lbl_* functions:

chop(1:5, c(2, 4), labels = lbl_dash())
#> [1] 1—2 2—4 2—4 4—5 4—5
#> Levels: 1—2 2—4 4—5
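
A third option, sketched here on the assumption that the labels argument also accepts a plain character vector, is to pass one label per interval. The labels "low", "mid" and "high" are made up for the example. Because chop() extends the breaks to cover the data, two breaks give three intervals and so need three labels.

```r
library(santoku)

# One label for each of the three intervals [1, 2), [2, 4) and [4, 5]
sizes <- chop(1:5, c(2, 4), labels = c("low", "mid", "high"))
sizes
```

This is handy when the labels carry meaning of their own and do not need to show the interval endpoints.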

Chop into fixed-width intervals:

chop_width(runif(10), 0.1)
#>  [1] [0.1068, 0.2068)   [0.6068, 0.7068)   [0.9068, 1.007]    [0.006763, 0.1068)
#>  [5] [0.9068, 1.007]    [0.3068, 0.4068)   [0.6068, 0.7068)   [0.1068, 0.2068)  
#>  [9] [0.4068, 0.5068)   [0.5068, 0.6068)  
#> 7 Levels: [0.006763, 0.1068) [0.1068, 0.2068) ... [0.9068, 1.007]

Or into fixed-size groups:

chop_n(1:10, 5)
#>  [1] [1, 6)  [1, 6)  [1, 6)  [1, 6)  [1, 6)  [6, 10] [6, 10] [6, 10] [6, 10]
#> [10] [6, 10]
#> Levels: [1, 6) [6, 10]

Chop dates by calendar month, then tabulate:

library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#> 
#>     date, intersect, setdiff, union

dates <- as.Date("2021-12-31") + 1:90

tab_width(dates, months(1), labels = lbl_discrete(fmt = "%d %b"))
#> 01 Jan—31 Jan 01 Feb—28 Feb 01 Mar—31 Mar 
#>            31            28            31
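
The same chop-then-tabulate pattern can be used with explicit breaks. The following is a minimal sketch using tab(), which is assumed here to take the same arguments as chop() and to return a table of counts per interval.

```r
library(santoku)

# Chop 1:10 at 2, 5 and 8, then count the values in each interval
counts <- tab(1:10, c(2, 5, 8))
counts
```

Doing this in one call saves wrapping chop() in table() during quick exploratory checks.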

For more information, see the vignette.
