cosine-normalization
Quick implementation of nGPT, learning entirely on the hypersphere, from NvidiaAI