This repository has been archived by the owner on Mar 5, 2020. It is now read-only.

Study recognizability of speech over different methods #161

Open
5 of 10 tasks
neumantm opened this issue Jul 9, 2018 · 3 comments

Comments

@neumantm
Member

neumantm commented Jul 9, 2018

Do: with Spotify running, say "Amy wake up".

Do this with multiple microphones and/or devices/recognizers.
Smartphone vs. laptop.

Also try streaming audio from the smartphone to CMU Sphinx.

Find out whether the recognition problem is more of a software or a hardware (microphone) problem.

Also test with DeepSpeech from Mozilla.

x SP

Tasks:

  • Define the testing procedure (what to test, how often)
  • Test with CMU Sphinx and laptop mic
  • Test with CMU Sphinx and better mic (Samson Go Mic (~30€) in this case)
  • Test with CMU Sphinx and PS3 EyeToy
  • Test with Google and laptop mic
  • Test with Google and better mic
  • Test with Google and PS3 EyeToy
  • Test with DeepSpeech and laptop mic
  • Test with DeepSpeech and better mic
  • Test with DeepSpeech and PS3 EyeToy
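The first task, defining the testing procedure, can be sketched as a small tally over a test log: each attempt records the recognizer, the microphone, and whether the wake-up phrase was recognized, and the script reports a success rate per combination. A minimal Python sketch; the log entries and names below are illustrative, not results from this issue.

```python
from collections import defaultdict

# Hypothetical test log: (recognizer, microphone, recognized?) per attempt.
log = [
    ("cmu-sphinx", "laptop-mic", True),
    ("cmu-sphinx", "laptop-mic", False),
    ("cmu-sphinx", "samson-go", True),
    ("google", "laptop-mic", True),
    ("google", "samson-go", True),
]

def success_rates(entries):
    """Tally recognition successes per (recognizer, microphone) pair."""
    hits = defaultdict(int)
    total = defaultdict(int)
    for recognizer, mic, ok in entries:
        total[(recognizer, mic)] += 1
        hits[(recognizer, mic)] += ok
    return {key: hits[key] / total[key] for key in total}

for (rec, mic), rate in sorted(success_rates(log).items()):
    print(f"{rec:10s} + {mic:10s}: {rate:.0%}")
```

Running each combination a fixed number of times and feeding the outcomes into a tally like this would make the per-microphone and per-recognizer comparisons in the later comments directly comparable.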
@neumantm neumantm changed the title Study recongizability of speech over different methods Study recognizability of speech over different methods Jul 9, 2018
@neumantm neumantm added this to the Sprint 3 milestone Jul 9, 2018
@MakinZeel
Contributor

MakinZeel commented Jul 16, 2018

Test wake-up call with and without music (CMU & DeepSpeech & Google Cloud Speech)
Text: "Amy wake up"
Microphone: headset microphone (SADES SA903)

Info:

CMU uses an input stream from the microphone
DeepSpeech uses .wav files (16 kHz, mono)
Google uses .wav files (16 kHz, mono)

The whole Amy system uses ~1.5 GB RAM
DeepSpeech recognition uses ~1.0 GB RAM
Google runs online (https://cloud.google.com/speech-to-text/)

Amy runs on Windows
DeepSpeech runs on an Ubuntu subsystem on Windows
Google runs online (https://cloud.google.com/speech-to-text/)

CMU uses a grammar
DeepSpeech uses a pretrained free model
Google uses the cloud and a free model

Time Consumed:

CMU: ~1-2 s after finishing talking
DeepSpeech: ~5× the .wav file length
Google: <1/5 of the .wav file length
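The timing figures above are essentially real-time factors (processing time divided by audio length): DeepSpeech at roughly 5, Google below 0.2. A small sketch of how such a factor could be computed from a .wav clip, using Python's standard wave module; the 2-second silent clip is synthetic, for illustration only.

```python
import io
import wave

def wav_duration_seconds(wav_bytes):
    """Duration of a WAV clip = frame count / sample rate."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        return w.getnframes() / w.getframerate()

def real_time_factor(processing_seconds, wav_bytes):
    """RTF > 1 means slower than real time (DeepSpeech-like ~5);
    RTF < 1 means faster than real time (Google-like < 0.2)."""
    return processing_seconds / wav_duration_seconds(wav_bytes)

# Build a synthetic 2-second silent clip: 16 kHz, mono, 16-bit,
# matching the format used for DeepSpeech and Google above.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)       # mono
    w.setsampwidth(2)       # 16-bit samples
    w.setframerate(16000)   # 16 kHz
    w.writeframes(b"\x00\x00" * 32000)  # 2 s of silence
clip = buf.getvalue()

print(real_time_factor(10.0, clip))  # DeepSpeech-like: 5.0
print(real_time_factor(0.3, clip))   # Google-like: 0.15
```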

Test1 - Without Music:

cmu:
    Recognized: True
ds:
    Recognized: True (amy wake up)
Google:
    Recognized: True (Amy wake up)

Test2 - With loud Music:

cmu:
    Recognized: False (amy does not wake up)
ds:
    Recognized: False (Random words, sometimes 'wake' or 'up')
Google:
    Recognized: True (Amy wake up)

Test3 - With medium loud Music (voice louder than music by large margin):

cmu:
    Recognized: True (sometimes you have to repeat the command)
ds:
    Recognized: False (Random words, sometimes 'wake' or 'up')
Google:
    Recognized: True (Amy wake up)

@Hobbitsloth
Member

Hobbitsloth commented Jul 16, 2018

Test wake-up call with and without music (CMU)
Text: "Amy wake up"
Microphone: Samson Go Mic, PS3 EyeToy
Setup: Distance to mic ~75 cm, Sound from Monitor directly behind the mic
Test: Say the text 50 times back to back in every case. (I say "Amy wake up" without sending Amy back to sleep in between.)

Time Consumed:
cmu: <1s after finishing to talk

Test 1 - Without Music:
No background noise.

  • cmu:
    • Samson Go Mic: good 7/10
    • PS3 EyeToy: good 8/10

Test 2 - With medium loud Music:
I can speak at a normal volume. Sound volume is at 30% in my case.

  • cmu:
    • Samson Go Mic: good 8/10
    • PS3 EyeToy: moderate 6/10

Test 3 - With loud Music:
I have to scream in the hope that Amy can understand me. Sound volume is at 100% in my case.

  • cmu:
    • Samson Go Mic: bad 1/10
    • PS3 EyeToy: not at all 0/10
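With only ten scored attempts per condition, the ratios above carry a lot of sampling uncertainty. A Wilson score interval makes that explicit; the sketch below applies it to the reported scores and is purely illustrative.

```python
import math

def wilson_interval(successes, trials, z=1.96):
    """95% Wilson score interval for a success proportion."""
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials
                         + z * z / (4 * trials * trials)) / denom
    return center - half, center + half

# Scores reported above (successes out of 10 scored attempts).
for label, k in [("Samson, no music", 7), ("EyeToy, no music", 8),
                 ("Samson, loud music", 1), ("EyeToy, loud music", 0)]:
    lo, hi = wilson_interval(k, 10)
    print(f"{label}: {k}/10 -> 95% CI [{lo:.2f}, {hi:.2f}]")
```

For example, 7/10 yields an interval of roughly [0.40, 0.89], so a 7/10 and an 8/10 condition are statistically indistinguishable at this sample size, while the no-music vs. loud-music gap clearly is not.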

@buddy200
Contributor

buddy200 commented Jul 23, 2018

The tests were made with the Seacue USB microphone (10€) and the built-in microphone of an HP ProBook 440 G4.
Comparative tests have shown that the notebook's built-in microphone performs on par with the USB microphone.

All tests were run with several commands from different plugins (the same commands for each speech recognizer).

Normal commands:
CMU recognizes commands of 4 words best (8/10).
Commands with fewer than 4 words were recognized as well as 4-word commands, but were often recognized at the wrong time, e.g. when nobody was talking.
Commands with more than 4 words were recognized clearly worse (6/10).
Commands with 10 or more words: only 2/10.

Google Speech recognizes commands of all lengths very well (9/10).

Numbers:
CMU can't recognize numbers at the moment. Out of 100 tests with 5 different numbers, the correct number was recognized only 2 times.
Google recognizes numbers very well, 9/10 on average.

Times:
CMU: not possible, see numbers above.
Google can recognize times in different formats most of the time (8/10).
But single times are formatted strangely, e.g. "8 o'clock pm" becomes "2000" (every time for this time). The same behavior can be seen with long commands.
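A right/wrong tally hides partial matches such as "8 o'clock pm" coming back as "2000". Word error rate (word-level edit distance divided by reference length) would score those cases more finely. A minimal sketch; the example phrases are illustrative.

```python
def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # Classic dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
    return d[len(ref)][len(hyp)] / len(ref)

print(word_error_rate("amy wake up", "amy wake up"))  # 0.0
print(word_error_rate("amy wake up", "wake up"))      # one deletion -> ~0.33
print(word_error_rate("set a timer for eight pm", "set a timer for 2000"))
```

Averaging WER per command length would turn observations like "commands with 10 or more words only 2/10" into a single curve per recognizer.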

@Legion2 Legion2 modified the milestones: Sprint 3, Sprint 4 Jul 31, 2018
@Legion2 Legion2 pinned this issue Jan 4, 2019