grouped-gemm
cudnn_frontend provides a C++ wrapper for the cuDNN backend API, along with samples showing how to use it