Adds multi-language support for VITS onnx, fixes onnx exporting and inference errors #2816

SystemPanic · 2023-07-29T20:26:48Z

Adds multi-language support for VITS ONNX.
Fixes #2753 ([Bug] RuntimeError: Given groups=1, weight of size [196, 196, 1], expected input[1, 192, 100] to have 196 channels, but got 192 channels instead): the number of channels was incorrect when language_emb_dim > 0.
Fixes the ONNX inference error raised when speaker_id is None or not passed as an argument.
Fixes the ONNX export error AttributeError: 'Vits' object has no attribute 'disc' for models with init_discriminator=false.
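The channel mismatch reported in #2753 can be illustrated at the level of tensor shapes. This is a minimal sketch, not the code changed in this PR: it assumes the 4 extra channels (196 - 192) come from a language embedding concatenated onto the text features, and both helper names are hypothetical.

```python
def concat_language_embedding(features_shape, language_emb_dim):
    """Shape of (batch, channels, frames) features after a language
    embedding of size language_emb_dim is concatenated on the channel axis.
    Hypothetical helper, for illustration only."""
    batch, channels, frames = features_shape
    return (batch, channels + language_emb_dim, frames)


def check_conv1d_input(weight_shape, input_shape):
    """Mimic the channel check behind the error in #2753 (groups=1).
    weight_shape is (out_channels, in_channels, kernel_size)."""
    _, in_channels, _ = weight_shape
    if input_shape[1] != in_channels:
        raise ValueError(
            f"expected {in_channels} channels, but got {input_shape[1]}"
        )


weight = (196, 196, 1)          # shape taken from the error message
features = (1, 192, 100)        # shape taken from the error message
language_emb_dim = 4            # assumption: 196 - 192

# With the language embedding accounted for, the shapes agree.
check_conv1d_input(weight, concat_language_embedding(features, language_emb_dim))
```

Passing the 192-channel features directly to check_conv1d_input reproduces the mismatch described in the issue.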

Tested with multi-speaker, multi-language models and with single-speaker, single-language models, using the following script:

import torch
import os
import numpy as np
from TTS.tts.models.vits import Vits
from TTS.tts.configs.vits_config import VitsConfig
from TTS.utils.audio.numpy_transforms import save_wav

modelPath = "MULTILANG_MULTISPEAKER_PATH"
speaker_id = 0  # set to None if the model is not multi-speaker
language_id = 0  # set to None if the model is not multi-language

config = VitsConfig()
config.load_json(os.path.join(modelPath, "config.json"))
vits = Vits.init_from_config(config)

vits.load_onnx(os.path.join(modelPath, "MULTILANG_MULTISPEAKER_PATH.onnx"))

text = "LONG TEXT HERE"
text_inputs = np.asarray(
    vits.tokenizer.text_to_ids(text),
    dtype=np.int64,
)[None, :]

audio = vits.inference_onnx(text_inputs, speaker_id=speaker_id, language_id=language_id)
save_wav(wav=audio[0], path=os.path.join(os.path.dirname(__file__), 'test.wav'), sample_rate=config.audio.sample_rate)
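The handling of a missing speaker_id or language_id can be pictured as follows. This is a minimal sketch under stated assumptions, not the code from this PR: build_optional_inputs and the keys "sid" and "langid" are hypothetical names chosen for illustration.

```python
from typing import Dict, List, Optional


def build_optional_inputs(
    speaker_id: Optional[int] = None,
    language_id: Optional[int] = None,
) -> Dict[str, List[int]]:
    """Return only the ID inputs that were actually provided.

    Hypothetical helper: the key names are illustrative, not taken from the PR.
    """
    inputs: Dict[str, List[int]] = {}
    # Compare against None explicitly: 0 is a valid ID and must not be dropped.
    if speaker_id is not None:
        inputs["sid"] = [speaker_id]
    if language_id is not None:
        inputs["langid"] = [language_id]
    return inputs
```

The explicit `is not None` check matters because the test script uses 0 for both IDs, and a plain truthiness test would silently discard them.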

CLAassistant · 2023-07-29T20:26:52Z

All committers have signed the CLA.

erogol · 2023-07-31T08:20:03Z

Thanks for the PR. There is one CI error but it is not about your PR.

Adds multi-language support for VITS onnx, fixes onnx inference error…

2b30405

… when speaker_id is None or not passed, fixes onnx exporting for models with init_discriminator=false

erogol merged commit c140df5 into coqui-ai:dev Jul 31, 2023
41 of 44 checks passed

SystemPanic deleted the onnxmultilang branch August 1, 2023 18:13

Tindell pushed a commit to pugtech-co/TTS that referenced this pull request Sep 4, 2023

Adds multi-language support for VITS onnx, fixes onnx inference error…

e645035

… when speaker_id is None or not passed, fixes onnx exporting for models with init_discriminator=false (coqui-ai#2816)
