The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

Introduction to rchroma

Introduction

The rchroma package provides an R interface to ChromaDB, a vector database for storing and querying embeddings. This vignette demonstrates the basic usage of the package.

Installation

You can install the development version of rchroma from GitHub:

# install.packages("remotes")
remotes::install_github("cynkra/rchroma")

Installing ChromaDB

Before using rchroma, you need to have a running ChromaDB instance. The easiest way to get started is using Docker:

docker pull chromadb/chroma
docker run -p 8000:8000 chromadb/chroma

This will start a ChromaDB server on http://localhost:8000.

For other installation methods and configuration options, please refer to the ChromaDB documentation.

Basic Usage

Connecting to ChromaDB

First, we need to establish a connection to ChromaDB:

library(rchroma)

# Connect to a local ChromaDB instance
client <- chroma_connect()

# Check the connection
heartbeat(client)
version(client)

Managing Collections

Collections are the main way to organize your data in ChromaDB:

# Create a new collection
create_collection(client, "my_collection")

# List all collections
list_collections(client)

# Get a specific collection
get_collection(client, "my_collection")

Working with Documents

Documents are the basic unit of data in ChromaDB. Each document consists of text content and its associated embedding:

# Add documents with embeddings
docs <- c(
  "apple fruit",
  "banana fruit",
  "carrot vegetable"
)
embeddings <- list(
  c(1.0, 0.0, 0.0),  # apple
  c(0.8, 0.2, 0.0),  # banana (similar to apple)
  c(0.0, 0.0, 1.0)   # carrot (different)
)

# Add documents to the collection
add_documents(
  client,
  "my_collection",
  documents = docs,
  ids = c("doc1", "doc2", "doc3"),
  embeddings = embeddings
)

# Query similar documents using embeddings
results <- query(
  client,
  "my_collection",
  query_embeddings = list(c(1.0, 0.0, 0.0)),  # should match apple best
  n_results = 2
)

Updating and Deleting

You can update or delete documents as needed:

# Update embedding separately
update_documents(
  client,
  "my_collection",
  ids = "doc1",
  embeddings = list(c(0.9, 0.1, 0.0))  # slightly different from original apple
)

# Delete documents
delete_documents(client, "my_collection", ids = "doc2")  # removes banana

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.