GPU Inference Python Packages

krasis

Krasis is a hybrid LLM runtime that focuses on running larger models efficiently on consumer-grade, VRAM-limited hardware.

6K 477 27

tightwad

Mixed-vendor GPU inference cluster manager with speculative decoding.

809 25 3

vulkan-ilm

Pythonic LLM inference on legacy GPUs using Vulkan: GPU-accelerated local AI for AMD, Intel, and NVIDIA without CUDA.

159 39 0