The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

Title: Customisable Ranking of Numerical and Categorical Data
Version: 0.1.1
Description: Provides a flexible alternative to the built-in rank() function called smartrank(). Optionally rank categorical variables by frequency (instead of in alphabetical order), and control whether ranking is based on descending/ascending order. smartrank() is suitable for both numerical and categorical data.
License: MIT + file LICENSE
Suggests: covr, dplyr, knitr, rmarkdown, testthat (≥ 3.0.0)
Config/testthat/edition: 2
Encoding: UTF-8
RoxygenNote: 7.3.2
URL: https://github.com/selkamand/rank, https://selkamand.github.io/rank/
BugReports: https://github.com/selkamand/rank/issues
VignetteBuilder: knitr
NeedsCompilation: no
Packaged: 2024-12-01 21:59:58 UTC; selkamand
Author: Sam El-Kamand ORCID iD [aut, cre, cph]
Maintainer: Sam El-Kamand <sam.elkamand@gmail.com>
Repository: CRAN
Date/Publication: 2024-12-01 22:30:02 UTC

Rank a vector based on either alphabetical or frequency order

Description

This function acts as a drop-in replacement for the base rank() function with the added option to:

  1. Rank categorical factors based on frequency instead of alphabetically

  2. Rank in descending or ascending order

Usage

smartrank(
  x,
  sort_by = c("alphabetical", "frequency"),
  desc = FALSE,
  ties.method = "average",
  na.last = TRUE,
  verbose = TRUE
)

Arguments

x

A numeric, character, or factor vector

sort_by

Sort ranking either by "alphabetical" or "frequency" . Default is "alphabetical"

desc

A logical indicating whether the ranking should be in descending ( TRUE ) or ascending ( FALSE ) order. When input is numeric, ranking is always based on numeric order.

ties.method

a character string specifying how ties are treated, see ‘Details’; can be abbreviated.

na.last

a logical or character string controlling the treatment of NAs. If TRUE, missing values in the data are put last; if FALSE, they are put first; if NA, they are removed; if "keep" they are kept with rank NA.

verbose

verbose (flag)

Details

If x includes ‘ties’ (equal values), the ties.method argument determines how the rank value is decided. Must be one of:

NA values are never considered to be equal: for na.last = TRUE and na.last = FALSE they are given distinct ranks in the order in which they occur in x.

Value

The ranked vector

Note

When sort_by = "frequency", ties based on frequency are broken by alphabetical order of the terms

When sort_by = "frequency" and input is character, ties.method is ignored. each distinct element level gets its own rank, and each rank is 1 unit away from the next element, irrespective of how many duplicates

Examples


# ------------------
## CATEGORICAL INPUT
# ------------------

fruits <- c("Apple", "Orange", "Apple", "Pear", "Orange")

# rank alphabetically
smartrank(fruits)
#> [1] 1.5 3.5 1.5 5.0 3.5

# rank based on frequency
smartrank(fruits, sort_by = "frequency")
#> [1] 2.5 4.5 2.5 1.0 4.5

# rank based on descending order of frequency
smartrank(fruits, sort_by = "frequency", desc = TRUE)
#> [1] 1.5 3.5 1.5 5.0 3.5

# sort fruits vector based on rank
ranks <- smartrank(fruits,sort_by = "frequency", desc = TRUE)
fruits[order(ranks)]
#> [1] "Apple"  "Apple"  "Orange" "Orange" "Pear"


# ------------------
## NUMERICAL INPUT
# ------------------

# rank numerically
smartrank(c(1, 3, 2))
#> [1] 1 3 2

# rank numerically based on descending order
smartrank(c(1, 3, 2), desc = TRUE)
#> [1] 3 1 2

# always rank numeric vectors based on values, irrespective of sort_by
smartrank(c(1, 3, 2), sort_by = "frequency")
#> smartrank: Sorting a non-categorical variable. Ignoring `sort_by` and sorting numerically
#> [1] 1 3 2

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.