rtiktoken: A Byte-Pair-Encoding (BPE) Tokenizer for OpenAI's Large Language Models

A thin wrapper around the tiktoken-rs crate that encodes text into Byte-Pair-Encoding (BPE) tokens and decodes tokens back to text. This is useful for understanding how Large Language Models (LLMs) perceive text.
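
To make the encode/decode round trip concrete, here is a minimal toy sketch of byte-level BPE in Python. It is not the rtiktoken or tiktoken-rs API, and it does not use OpenAI's real vocabularies: the function names (train_merges, apply_merge, build_vocab, encode, decode) and the tiny training corpus are hypothetical and chosen only for illustration. It shows the general idea under those assumptions: frequent adjacent byte pairs are merged into single tokens, and decoding concatenates the token bytes again.

```python
from collections import Counter


def apply_merge(seq, pair):
    """Replace every adjacent occurrence of `pair` in `seq` with one fused token."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(seq[i] + seq[i + 1])
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def train_merges(corpus, num_merges):
    """Learn merge rules by repeatedly fusing the most frequent adjacent pair."""
    seq = [bytes([b]) for b in corpus.encode("utf-8")]
    merges = []
    for _ in range(num_merges):
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        seq = apply_merge(seq, best)
    return merges


def build_vocab(merges):
    """Map token bytes to integer ids: 256 single bytes first, then merged tokens."""
    vocab = {bytes([i]): i for i in range(256)}
    for a, b in merges:
        vocab.setdefault(a + b, len(vocab))
    return vocab


def encode(text, merges, vocab):
    """Turn text into token ids by applying the merges in the order they were learned."""
    seq = [bytes([b]) for b in text.encode("utf-8")]
    for pair in merges:
        seq = apply_merge(seq, pair)
    return [vocab[tok] for tok in seq]


def decode(ids, vocab):
    """Turn token ids back into text by concatenating the token bytes."""
    id_to_bytes = {i: tok for tok, i in vocab.items()}
    return b"".join(id_to_bytes[i] for i in ids).decode("utf-8")


# Toy corpus (an assumption for illustration, not real training data).
merges = train_merges("low lower lowest low low", 10)
vocab = build_vocab(merges)
ids = encode("lowest lower", merges, vocab)
print(ids)
print(decode(ids, vocab))
```

Production tokenizers such as those wrapped by this package ship fixed, pre-trained merge tables per model, so only the encode and decode steps happen at run time; the training step is included here solely to keep the sketch self-contained.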

Version: 0.0.6
Suggests: testthat (≥ 3.0.0)
Published: 2024-11-06
DOI: 10.32614/CRAN.package.rtiktoken
Author: David Zimmermann-Kollenda [aut, cre], Roger Zurawicki [aut] (tiktoken-rs Rust library), Authors of the dependent Rust crates [aut] (see AUTHORS file)
Maintainer: David Zimmermann-Kollenda <david_j_zimmermann at hotmail.com>
BugReports: https://github.com/DavZim/rtiktoken/issues
License: MIT + file LICENSE
URL: https://davzim.github.io/rtiktoken/, https://github.com/DavZim/rtiktoken/
NeedsCompilation: yes
SystemRequirements: Cargo (Rust's package manager), rustc >= 1.65.0
Materials: README NEWS
CRAN checks: rtiktoken results

Documentation:

Reference manual: rtiktoken.pdf

Downloads:

Package source: rtiktoken_0.0.6.tar.gz
Windows binaries: r-devel: rtiktoken_0.0.6.zip, r-release: rtiktoken_0.0.6.zip, r-oldrel: rtiktoken_0.0.6.zip
macOS binaries: r-release (arm64): rtiktoken_0.0.6.tgz, r-oldrel (arm64): rtiktoken_0.0.6.tgz, r-release (x86_64): rtiktoken_0.0.6.tgz, r-oldrel (x86_64): rtiktoken_0.0.6.tgz

Linking:

Please use the canonical form https://CRAN.R-project.org/package=rtiktoken to link to this page.
