Version: 0.1.1
What | kibior is an R package dedicated to easing the pain of data handling in science, notably with biological data. |
Where | kibior uses Elasticsearch as its database and search engine. |
Who | kibior is built for data science and data manipulation, that is, for any data-related task, notably sharing data. It mainly targets bioinformaticians and, more broadly, data scientists. |
When | Available now from this repository or from CRAN. |
Public instances | Use the $get_kibio_instance() method to connect to Kibio and access known datasets. See Kibio datasets at the end of this document for a complete list. |
Cite this package | In an R session, run citation("kibior") |
Publication | Coming soon. |
This package allows:

- Pushing, pulling, joining, sharing and searching tabular data between an R session and one or multiple Elasticsearch instances/clusters.
- Massive data query and filter with the Elasticsearch engine.
- Multiple live Elasticsearch connections to different addresses.
- Method autocompletion in proper environments (e.g. R cli, RStudio).
- Import and export of datasets from and to files.
- Server-side execution for most operations (i.e. on Elasticsearch instances/clusters).

# Get from CRAN
install.packages("kibior")
# or get the latest from GitHub
devtools::install_github("regisoc/kibior")
# load
library(kibior)
# Get a specific instance
kc <- Kibior$new("server_or_address", port)
# Or try something bigger...
kibio <- Kibior$get_kibio_instance()
kibio$list()

Here is an extract of some of the features offered by KibioR. See the Introduction vignette for more advanced usage.

push datasets

# Push data (R memory -> Elasticsearch)
dplyr::starwars %>% kc$push("sw")
dplyr::storms %>% kc$push("st")

pull datasets

# Pull data with columns selection (Elasticsearch -> R memory)
kc$pull("sw", query = "homeworld:(naboo || tatooine)",
        columns = c("name", "homeworld", "height", "mass", "species"))
# see vignette for query syntax
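
To make the query string above easier to read, here is a minimal in-memory sketch of roughly what it selects, written with dplyr only and without any Elasticsearch connection. The name `sw_local` is hypothetical, and the case-insensitive match is an assumption about how the `homeworld` field is analyzed on the server, not something this README states.

```r
library(dplyr)

# Hypothetical in-memory stand-in for the "sw" dataset pushed earlier.
sw_local <- dplyr::starwars

# "homeworld:(naboo || tatooine)" keeps rows whose homeworld matches either term.
# Lowercasing mimics a case-insensitive text match (assumption, see above).
subset_local <- sw_local %>%
  filter(tolower(homeworld) %in% c("naboo", "tatooine")) %>%
  select(name, homeworld, height, mass, species)

print(subset_local)
```

Unlike this local filter, `$pull()` runs the query on the Elasticsearch side and only transfers the matching rows and selected columns to the R session.
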

copy datasets

# Copy dataset (Elasticsearch internal operation)
kc$copy("sw", "sw_copy")

delete datasets

# Delete datasets
kc$delete("sw_copy")

list, match dataset names

# List available datasets
kc$list()
# Search for index names starting with "s"
kc$match("s*")

columns names and list unique keys in values

# Get columns of all datasets starting with "s"
kc$columns("s*")
# Get unique values of a column
kc$keys("sw", "homeworld")
# Count number of lines in dataset
kc$count("st")
# Count number of lines with query (name of the storm is Anita)
kc$count("st", query = "name:anita")
# Generic stats on two columns
kc$stats("sw", c("height", "mass"))
# Specific descriptive stats with query
kc$avg("sw", c("height", "mass"), query = "homeworld:naboo")

join

# Inner join between:
# 1/ an Elasticsearch-based dataset with query ("sw"),
# 2/ and an in-memory R dataset (dplyr::starwars)
kc$inner_join("sw", dplyr::starwars,
              left_query = "hair_color:black",
              left_columns = c("name", "mass", "height"),
              by = "name")
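
For intuition, the join above can be approximated entirely in memory with dplyr: filter and trim the left side first, then join on `name`. This is only a sketch under assumptions. The variable names `left` and `joined` are hypothetical, and kibior's handling of duplicated column names and its query matching rules may differ from what dplyr does here.

```r
library(dplyr)

# Left side: mimic left_query = "hair_color:black" and left_columns.
left <- dplyr::starwars %>%
  filter(hair_color == "black") %>%
  select(name, mass, height)

# Right side: the full in-memory dataset, joined on the shared "name" column.
# dplyr disambiguates the duplicated columns as mass.x/mass.y and height.x/height.y.
joined <- inner_join(left, dplyr::starwars, by = "name")

print(joined)
```

Filtering and trimming the left dataset before the join keeps the joined result small, which is presumably the point of the `left_query` and `left_columns` arguments.
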