automated-audio-captioning
CoNeTTE: An efficient Audio Captioning system leveraging multiple datasets with Task Embedding