A dataset containing sample IDs and file paths from Ardila et al. (2019),
'Common voice: A massively-multilingual speech corpus',
as used in Zabala (2023), 'voice: new approaches to audio analysis'.
The sample contains 34,425 rows associated with
838 distinct IDs (p_s = 838/34,425 ≈ 2.4%).
Usage
mozilla_id_path
References
Ardila R, Branson M, Davis K, Henretty M, Kohler M, Meyer J, Morais R, Saunders L, Tyers FM, Weber G (2019). "Common voice: A massively-multilingual speech corpus." arXiv preprint arXiv:1912.06670. URL https://arxiv.org/abs/1912.06670.