bitsandbytes
A "standard library" of Triton kernels.
Effortlessly quantize, benchmark, and publish Hugging Face models with cross-platform support for CPU/GPU. Reduce model size by up to 75% while maintaining performance.
🦖 X—LLM: Cutting-Edge & Easy LLM Finetuning