feat/903: enabling vad when config.vad.enable is true #905

Open
wants to merge 2 commits into
base: main


rammohan-y
Contributor

@rammohan-y rammohan-y commented Sep 18, 2024

feat/903: enabling vad when config.vad.enable is true

#903

@vdharashive
Contributor

@davehorton can you please review it?

@davehorton
Contributor

@rammohan-y @vdharashive
I am not sure what you are trying to do here, and I think we are confusing two different vad-related features.

  1. Recently, you spoke of wanting to delay the ASR connection for Microsoft until vad detects speech. We used to have a feature like that built into mod_azure_transcribe, and you can see remnants of it here. The code that set that env var has been removed from the feature server, though (the feature was thought to be problematic in terms of creating support issues if we connected "too late" to catch some speech). So this feature, which is what I thought you had been talking about recently, is not currently available, and your changes in this PR do not resurrect it.
  2. If someone wants bargein to happen even faster, that is, before we get a partial transcript back with even one word in it, they can specify minBargeinWords = 0, and then we use vad to detect speech energy and barge in quickly.
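
To make the second case concrete, here is a minimal sketch of how an application might request the fastest bargein. Only `minBargeinWords` comes from the comment above; the other field names, the `buildGather` and `bargeinTrigger` helpers, and the `/gather-result` path are hypothetical and chosen for illustration, so they may not match the actual verb schema or the feature server's internal logic.

```typescript
// Hypothetical shape of a gather-style verb. Apart from minBargeinWords,
// every field name here is an assumption made for illustration.
interface GatherVerb {
  verb: "gather";
  input: string[];
  bargein: boolean;
  minBargeinWords: number;
  actionHook: string;
  say: { text: string };
}

// Build a verb payload with the requested bargein threshold.
function buildGather(minBargeinWords: number): GatherVerb {
  return {
    verb: "gather",
    input: ["speech"],
    bargein: true,
    minBargeinWords,
    actionHook: "/gather-result", // hypothetical webhook path
    say: { text: "How can I help you today?" },
  };
}

// Illustrative only: mirrors the behaviour described in the comment, where
// a threshold of 0 means speech energy (vad) interrupts the prompt, and any
// higher threshold waits for a transcript with at least that many words.
function bargeinTrigger(g: GatherVerb): "none" | "vad" | "transcript" {
  if (!g.bargein) return "none";
  return g.minBargeinWords === 0 ? "vad" : "transcript";
}

const fastest = buildGather(0);
console.log(JSON.stringify([fastest], null, 2));
console.log(bargeinTrigger(fastest));
```

With `minBargeinWords` set to 1 or more, the same sketch would report `"transcript"`, which corresponds to waiting for a partial transcript before interrupting the prompt.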

Now, with that explanation, what exactly are you hoping to accomplish with this PR?

3 participants