fp8
cudnn_frontend provides a C++ wrapper for the cuDNN backend API and samples showing how to use it
Repository of Intel® Extension for Transformers
FP8 per-tile scaled linear solver for consumer NVIDIA GPUs
JAX Scalify: end-to-end scaled arithmetics