audio-captioning
Metrics for evaluating Automated Audio Captioning systems, designed for PyTorch.
Audio Captioning datasets for PyTorch.
Using pretrained encoders and language models to generate captions from multimedia inputs.
CoNeTTE: An efficient Audio Captioning system leveraging multiple datasets with Task Embedding