autoregressive-transformers
Explorations into adding adversarial losses on top of the standard autoregressive (next-token cross-entropy) loss for language modeling.
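
As a rough illustration of what "an adversarial loss on top of the autoregressive loss" can mean, here is a minimal pure-Python sketch. It is not taken from this repository: the description does not say which adversarial formulation is used, so the non-saturating GAN generator loss, the function names, and the `adv_weight` parameter are all assumptions made for the example. It only computes scalar loss values and does not deal with gradients or training.

```python
import math
from typing import Sequence


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def autoregressive_loss(logits: Sequence[Sequence[float]],
                        targets: Sequence[int]) -> float:
    """Mean next-token cross-entropy.

    logits[t] holds unnormalized scores over the vocabulary for the
    token at position t, and targets[t] is the index of the true token.
    """
    total = 0.0
    for row, target in zip(logits, targets):
        m = max(row)
        # log-sum-exp, shifted by the row maximum for stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[target]
    return total / len(targets)


def generator_adversarial_loss(disc_logits: Sequence[float]) -> float:
    """Non-saturating GAN generator loss (an assumed choice).

    disc_logits are a hypothetical discriminator's raw scores on
    generated sequences; the loss is the mean of -log(sigmoid(d)).
    """
    return sum(softplus(-d) for d in disc_logits) / len(disc_logits)


def combined_loss(logits: Sequence[Sequence[float]],
                  targets: Sequence[int],
                  disc_logits: Sequence[float],
                  adv_weight: float = 0.1) -> float:
    """Autoregressive loss plus a weighted adversarial term.

    adv_weight is a hypothetical hyperparameter, not a value from the repo.
    """
    ar = autoregressive_loss(logits, targets)
    adv = generator_adversarial_loss(disc_logits)
    return ar + adv_weight * adv


if __name__ == "__main__":
    # Two positions, vocabulary of 4, uniform predictions.
    example_logits = [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
    example_targets = [1, 3]
    # The discriminator is undecided about one generated sequence.
    example_disc_logits = [0.0]
    print(combined_loss(example_logits, example_targets,
                        example_disc_logits, adv_weight=0.5))
```

Keeping the two terms as separate functions makes it easy to swap in a different adversarial objective (for example a hinge or least-squares variant) without touching the cross-entropy term.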