constrained-decoding
The official zero-trust, high-throughput kinetic execution engine for the coreason-manifest ontology.
Speculative grammar backtracking algorithm for LLM decoding that conforms to a given Lark context-free grammar (CFG)
ReFactX: Scalable Reasoning with Reliable Facts via Constrained Generation
[PyTorch] Efficient tokenization library for recommendations and generative retrieval using pre-trained language models. Inspired by "Vectorizing the Trie: Efficient Constrained Decoding for LLM-based Generative Retrieval on Accelerators"
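
For readers new to the topic, the trie-based constrained decoding that the last entry refers to can be illustrated with a minimal sketch. This is not code from any of the repositories listed above, and it is not the vectorized method of the cited paper; it is a plain-Python illustration. All names in it (`build_trie`, `allowed_next`, `constrained_greedy`, `END`, and the `score` callback standing in for a language model) are hypothetical.

```python
from typing import Callable, Dict, List, Sequence

END = -1  # hypothetical sentinel: marks the end of a complete valid sequence


def build_trie(sequences: Sequence[Sequence[int]]) -> Dict:
    """Build a nested-dict trie over the valid token-id sequences."""
    root: Dict = {}
    for seq in sequences:
        node = root
        for tok in seq:
            node = node.setdefault(tok, {})
        node[END] = {}  # this prefix is a complete, valid item
    return root


def allowed_next(trie: Dict, prefix: Sequence[int]) -> List[int]:
    """Return the token ids that may follow `prefix` (END if it may stop)."""
    node = trie
    for tok in prefix:
        if tok not in node:
            return []  # prefix is not in the trie, nothing is allowed
        node = node[tok]
    return sorted(node.keys())


def constrained_greedy(
    trie: Dict,
    score: Callable[[List[int], int], float],
    max_len: int = 16,
) -> List[int]:
    """Greedy decoding that only ever picks tokens permitted by the trie.

    `score(prefix, token)` is an assumed stand-in for a model's logit.
    """
    out: List[int] = []
    for _ in range(max_len):
        candidates = allowed_next(trie, out)
        if not candidates:
            break
        best = max(candidates, key=lambda t: score(out, t))
        if best == END:
            break
        out.append(best)
    return out


if __name__ == "__main__":
    trie = build_trie([[1, 2, 3], [1, 2, 4], [5]])
    print(allowed_next(trie, [1, 2]))
    print(constrained_greedy(trie, lambda prefix, tok: -tok))
```

In a real decoder the same mask would typically be applied to the logits at each step, setting disallowed tokens to negative infinity before sampling; the accelerator-oriented approach named above replaces the pointer-chasing dictionary walk with tensor operations.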