Is the training data available? #6
Comments
Hi @dirk61, thanks for your interest. Unfortunately, the training data is not available, but I am happy to provide further information if needed.
Thanks. @faroit For the data taken from the clean-360 dataset, did you simply sum the audio signals according to the value of k? Before the transformation to a time-frequency matrix, what else did you do to form the wav files for training?
No, for the time-domain signals there were a few important steps involved to sample the data:
the mixing was done by normalizing each track to have the same SNR relative to the others, and the final mix was peak normalized
yes
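For readers trying to reproduce the mixing step described above, here is a minimal sketch in NumPy. It is not the author's actual pipeline: it assumes "same SNR" means every speaker is scaled to the same RMS level (0 dB relative to each other), that tracks are truncated to the shortest one, and that the peak target is 1.0. The function names `rms` and `mix_equal_level` are hypothetical.

```python
import numpy as np


def rms(x):
    """Root-mean-square level of a 1-D signal."""
    return np.sqrt(np.mean(np.square(x)))


def mix_equal_level(tracks, peak=1.0, eps=1e-8):
    """Scale each track to a common RMS, sum them, then peak-normalize the mix.

    Assumptions (not confirmed by the thread): equal RMS stands in for
    "same SNR to each other", and all tracks are cut to the shortest length.
    """
    length = min(len(t) for t in tracks)
    # Bring every speaker to unit RMS so no one dominates the mixture.
    scaled = [t[:length] / (rms(t[:length]) + eps) for t in tracks]
    mix = np.sum(scaled, axis=0)
    # Peak-normalize the final mixture to avoid clipping when writing a wav file.
    return mix * (peak / (np.max(np.abs(mix)) + eps))
```

Calling `mix_equal_level([speaker_a, speaker_b, speaker_c])` with three mono arrays at the same sample rate would give one training mixture for k = 3 under these assumptions.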
Thanks for the detailed explanation! Awesome :)
Are these utterances randomly chosen from the
There's another thing in the article I can't quite understand, which is:
If randomly chosen, why do these possibilities occur?
@dirk61 sorry for the late reply (feel free to close):
yes, they were chopped.
not sure if I understand correctly: ideally the overlap should be 100% for all k, but since speakers still make pauses between words, the actual overlap is less than that.
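To illustrate why pauses push the measured overlap below 100%, here is a rough sketch that estimates the fraction of frames in which all k speakers are active at once. The energy-threshold activity detector, the 400-sample frame size, and the -40 dB threshold are assumptions chosen for illustration, and the names `frame_activity` and `overlap_ratio` are hypothetical. None of this is taken from the paper or the repository.

```python
import numpy as np


def frame_activity(x, frame=400, threshold_db=-40.0):
    """Mark a frame as active if its energy is within threshold_db of the loudest frame."""
    n = len(x) // frame
    frames = x[: n * frame].reshape(n, frame)
    energy = np.mean(frames ** 2, axis=1)
    ref = np.max(energy) + 1e-12
    return 10 * np.log10(energy / ref + 1e-12) > threshold_db


def overlap_ratio(tracks, frame=400, threshold_db=-40.0):
    """Fraction of frames in which every speaker is active simultaneously."""
    length = min(len(t) for t in tracks)
    active = np.stack(
        [frame_activity(t[:length], frame, threshold_db) for t in tracks]
    )
    return float(np.mean(np.all(active, axis=0)))
```

Even when all k utterances span the full excerpt, silent frames inside any one of them lower this ratio, which matches the explanation given above.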
Hey! I really love your work and I'm wondering whether you can provide the training data you synthesized from the LibriSpeech clean-360 dataset? That would help a lot!