PyRank

BPE Tokenizer Python Packages

Python packages with the GitHub topic bpe-tokenizer. Sorted by relevance, with stars and monthly downloads.
gweidart
rs-bpe

A ridiculously fast Python BPE (Byte Pair Encoder) implementation written in Rust

18K 38 5
neluca
tinybpe

🐍 This is a fast, lightweight, and clean CPython extension for the Byte Pair Encoding (BPE) algorithm, which is commonly used in LLM tokenization and NLP tasks.

1K 4 0
SauravP97
hf-tokenizer-visualizer

Visualize HuggingFace Byte-Pair Encoding (BPE) Tokenizer encoding process

664 2 0
jaco-bro
tokenizerz

Minimal BPE tokenizer in Zig

573 7 3
U4RASD
rbpe

R-BPE: Improving BPE-Tokenizers with Token Reuse

453 8 2
sefineh-ai
amharic-tokenizer

Syllable-aware BPE tokenizer for the Amharic language (አማርኛ) – fast, accurate, trainable.

440 99 14
TnsaAi
tokenize2

Official Repository of Tokenize2 Tokenizers by TNSA

176 0 0
    • Data from PyPI, GitHub, ClickHouse, and BigQuery