©2021 by Gwena Cunha

Datasets

Title Generation

Dataset collected from arXiv articles for the task of abstract-to-title generation.

Annotated MV

Pre-processing of COGNIMUSE (annotated music video dataset) for the task of Emotional Music Generation.

Meeting Summary

Extracts transcripts and summaries (abstractive and extractive) from the AMI Meeting Corpus.

STT Error

Repository for building a dataset with speech-to-text (STT) errors by running text through text-to-speech (TTS) and then STT.

Error Correction

Extracts original and corrected essays from the FCE Corpus, converting them from XML to TXT format.