tensor-parallelism
Easy way to efficiently run 100B+ language models without high-end GPUs
Slicing a PyTorch Tensor Into Parallel Shards
🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading