Dump IPA transcription. Also fix some wrong words #115

Open
wants to merge 11 commits into
base: master

@Tomotz Tomotz commented Sep 13, 2024

  • Add an option to dump the IPA transcription of the input text instead of creating a wav file.
    This is enabled by passing -i to the CLI. Input can still come from a file or an argument; output goes to the screen by default, but can also be written to a file.
  • When the input contained non-existent words that flite doesn't know how to pronounce (for example wannth), they were replaced by the names of their letters. That sounded terrible and was incomprehensible in the IPA transcription. I changed it to sound and look better (wannth - dʌbəljueɪɛnɛntieɪtʃ->wɑnθ etc.)
  • Remove the "fix_ah" function. This function changed all ʌ sounds into ɑ, which just sounded wrong to me (example ʌv->ɑv)
  • Manual fixes to a few words in the lex data file:
    1. Challenge - tʃælədʒ->tʃæləndʒ
    2. Isn't - ɪsnt -> ɪzənt
    3. Suggest - səɡˈdʒɛst->səˈdʒɛst
      note - I did not fix the byte numbers in the lines after the words I fixed. I'm not sure whether they matter for anything, but since I made the fixes manually, it was too much of a hassle to update them.
  • Hacky fix for any word containing Suggest (suggestion, suggested, etc.) - apply_model gave them all a wrong g sound, but since changing the model table is very complicated (I assume it is generated automatically somehow), I solved it with a hacky find and replace
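
To illustrate the find-and-replace workaround from the last bullet, here is a minimal sketch in Python. It is not the code from this PR: the function name `fix_suggest` is hypothetical, and it assumes the model output for every affected word contains the sequence səɡˈdʒɛst quoted above.

```python
# Hypothetical sketch of a post-processing fix; names are illustrative only.
WRONG = "səɡˈdʒɛst"  # transcription with the spurious g (quoted in the PR text)
RIGHT = "səˈdʒɛst"   # corrected transcription (quoted in the PR text)


def fix_suggest(word: str, ipa: str) -> str:
    """Drop the spurious g from the IPA of any word containing 'suggest'."""
    if "suggest" in word.lower():
        return ipa.replace(WRONG, RIGHT)
    # Leave every other word untouched.
    return ipa
```

Checking the spelling of the word before touching the transcription keeps the replacement from firing on unrelated words, which is about as much safety as a find and replace can offer without changing the model table itself.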

I don't think this pull request will (or even should) be merged as is, but I would love to get some of this into master, and for you to fix all the wrongly transcribed words I found in a less hacky way

I think it's better for them to sound a bit weird than to have the names of all the letters read out
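
For context, the old letter-by-letter behaviour can be reproduced with a small Python sketch. The table `LETTER_NAMES` and the function `spell_out` are hypothetical names, not part of flite; the letter names are simply read off the wannth transcription quoted in the description.

```python
# Hypothetical illustration; only the letters needed for the example are listed.
# The IPA letter names are taken from the "wannth" transcription in this PR.
LETTER_NAMES = {
    "w": "dʌbəlju",
    "a": "eɪ",
    "n": "ɛn",
    "t": "ti",
    "h": "eɪtʃ",
}


def spell_out(word: str) -> str:
    """Transcribe a word by concatenating the names of its letters."""
    return "".join(LETTER_NAMES[ch] for ch in word.lower())


print(spell_out("wannth"))  # dʌbəljueɪɛnɛntieɪtʃ
```

Six letters turn into a long string of letter names, compared with the four symbols of wɑnθ.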
səɡˈdʒɛst->səˈdʒɛst
I assume the correct fix would be to change the rules in cst_lts_model, but I'm afraid any change I try to make there will break many other things
@Tomotz changed the title from "Tomm dump ipa" to "Dump IPA transcription. Also fix some wrong words" on Sep 13, 2024