Use Watson Speech to Text, Natural Language Understanding, and Knowledge Studio in a web app with React components.

arjungujral24/Watson-Danger-Response

Building a Watson Danger Response Tool with Custom NLU Domain Model

When disasters such as fires, floods, and shootings strike, victims need first-response care as soon as possible. For example, the 2018 ‘Camp Fire’ in Northern California wiped out the town of Paradise, killing 85 people and causing $16.5 billion in damages. In this case, and in many similar situations, a difference of a few seconds in response time can mean not only lives saved, but also millions or even billions of dollars in valuable assets spared from destruction.

How can we develop a solution that reduces the response time of first responders and has a positive impact on those who need it most? Normally, when a victim is in danger, they call 911 and begin a dialogue with a dispatcher. While this conversation is ongoing, the Watson Danger Response Tool will screen it for select dangers, including fires and floods. If one of those dangers is identified, the appropriate authorities will be alerted immediately, shaving precious seconds, and sometimes even minutes, off the response time. In this code pattern, we will use Watson AI to create this danger response web app.

Below is a demo of the final product to inspire you to complete the code pattern!

[Image: demo of the final product]

Flow

  1. Audio input is captured by Watson Speech-to-Text service.
  2. Once input text is transcribed, it is sent to the Watson Natural Language Understanding (NLU) service.
  3. Within the NLU, a custom Knowledge Studio machine learning model processes the text for danger.
  4. The danger analysis from the machine learning model is then sent to the NLU.
  5. The final output of the NLU is presented, and a decision is made whether or not to alert the authorities.

architecture1

Built with React components and a Node.js server, the app will capture audio input and stream it to a Watson Speech to Text service. After the input speech is transcribed, it will be sent to a Watson Natural Language Understanding service that will identify, categorize, and score the danger threat in the text. Both the input speech and the danger analysis will be displayed in the app.
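To make the streaming step more concrete, here is a minimal sketch of the control messages a client exchanges with the Speech to Text WebSocket interface. It only builds the URL and the JSON messages and does not open a connection. The helper names (`buildRecognizeUrl`, `startMessage`, `stopMessage`) are hypothetical and not taken from this repository, and authentication is deliberately left out.

```javascript
// Sketch only: helper names are illustrative, not part of this repository.
// No network connection is opened and authentication is omitted.

// Build the WebSocket recognize URL from a service URL and a model name.
function buildRecognizeUrl(serviceUrl, model) {
  // The WebSocket endpoint uses wss:// where the REST endpoint uses https://.
  const wsBase = serviceUrl.replace(/^http/, 'ws');
  return `${wsBase}/v1/recognize?model=${encodeURIComponent(model)}`;
}

// Text message sent once after the socket opens, before any audio chunks.
function startMessage(contentType) {
  return JSON.stringify({
    action: 'start',
    'content-type': contentType,
    interim_results: true, // ask for partial transcripts while the user speaks
  });
}

// Text message sent after the last audio chunk to end the request.
function stopMessage() {
  return JSON.stringify({ action: 'stop' });
}
```

In this sketch, the client would send `startMessage(...)` as text, then the audio as binary frames, then `stopMessage()` when the user stops speaking.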

The key aspect of this tool is the NLU. Using Knowledge Studio, we will train a custom machine learning model for the ‘relations’ feature of the Watson NLU in order to drive the decision-making process of identifying the danger. We will create an ‘entity’ for the danger itself, and corresponding subcategories for each of the dangers-of-interest, for example ‘fire’ and ‘flood’. We will also create an ‘entity’ for the object that the danger is acting on.

architecture2

The model will then identify and categorize the danger by reaching a minimum confidence threshold for a particular subcategory of danger. Additionally, the model will determine the severity of the danger by analyzing the strength of the relation between the danger and object entities.
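As a rough illustration of the thresholding described above, the sketch below scans the `relations` array of an NLU response and reports the strongest relation that involves a danger entity. The entity type names (`fire`, `flood`), the `0.5` threshold, and the use of the relation score as a stand-in for both confidence and severity are assumptions made for this example; the actual values depend on how the Knowledge Studio model is trained.

```javascript
// Assumed danger subtypes and threshold; the real ones come from the custom model.
const DANGER_TYPES = ['fire', 'flood'];
const MIN_CONFIDENCE = 0.5;

// Return { alert, danger, score, sentence } for the strongest qualifying relation.
function detectDanger(nluResult, minConfidence = MIN_CONFIDENCE) {
  const relations = (nluResult && nluResult.relations) || [];
  let best = null;
  for (const relation of relations) {
    // Collect the entity types mentioned by this relation's arguments.
    const entityTypes = (relation.arguments || [])
      .flatMap((arg) => arg.entities || [])
      .map((entity) => String(entity.type).toLowerCase());
    const danger = DANGER_TYPES.find((type) => entityTypes.includes(type));
    // Skip relations without a danger entity or below the threshold.
    if (!danger || relation.score < minConfidence) continue;
    if (!best || relation.score > best.score) {
      best = { danger, score: relation.score, sentence: relation.sentence };
    }
  }
  return { alert: best !== null, ...(best || {}) };
}
```

A caller could alert the authorities when `alert` is true and treat `score` as a rough severity signal.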

architecture3

When you have completed this code pattern, you will understand how to:

  • Stream audio to Speech to Text using a WebSocket
  • Use Natural Language Understanding with a REST API
  • Retrieve and parse text from Speech to Text using a REST API
  • Integrate Speech to Text, Natural Language Understanding, and Knowledge Studio in a web app
  • Use React components and a Node.js server
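The REST call to Natural Language Understanding mentioned in the list above can be sketched as follows. The function only builds the request for the `/v1/analyze` endpoint with the `relations` feature pointed at a custom model; it does not send anything. The function name `buildAnalyzeRequest`, the default version date, and the use of API-key basic authentication (the IBM Cloud case) are assumptions for this example, and the server in this repository may well use the Watson SDK instead.

```javascript
// Sketch only: builds, but does not send, an NLU analyze request.
// The version date and parameter names here are assumptions for illustration.
function buildAnalyzeRequest({ serviceUrl, apiKey, modelId, text, version = '2022-04-07' }) {
  // IBM Cloud API keys are sent as basic auth with the literal user name "apikey".
  const credentials = Buffer.from(`apikey:${apiKey}`).toString('base64');
  return {
    url: `${serviceUrl}/v1/analyze?version=${version}`,
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Basic ${credentials}`,
      },
      // Ask only for relations, scored by the custom Knowledge Studio model.
      body: JSON.stringify({ text, features: { relations: { model: modelId } } }),
    },
  };
}
```

With a recent Node.js version that provides a global `fetch`, the result could be passed along as `fetch(url, options)`.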

NOTE: This code pattern includes instructions for running Watson services on IBM Cloud or with the Watson API Kit on IBM Cloud Pak for Data. Click here for more information about IBM Cloud Pak for Data.

Steps

  1. Create the Watson services
  2. Deploy the server
  3. Use the web app

Create the Watson services

Provision the following services:

  • Speech to Text
  • Natural Language Understanding

The instructions will depend on whether you are provisioning services using IBM Cloud Pak for Data or on IBM Cloud.

Click to expand one:

IBM Cloud Pak for Data

Use the following instructions for each of the services.

Install and provision service instances

The services are not available by default. An administrator must install them on the IBM Cloud Pak for Data platform, and you must be given access to the service. To determine whether the service is installed, click the Services icon and check whether the service is enabled.

Gather credentials

  1. For production use, create a user to use for authentication. From the main navigation menu (☰), select Administer > Manage users and then + New user.
  2. From the main navigation menu (☰), select My instances.
  3. On the Provisioned instances tab, find your service instance, and then hover over the last column to find and click the ellipses icon. Choose View details.
  4. Copy the URL to use as the {SERVICE_NAME}_URL when you configure credentials.
  5. Optionally, copy the Bearer token to use in development testing only. It is not recommended to use the bearer token except during testing and development because that token does not expire.
  6. Use the Menu and select Users and + Add user to grant your user access to this service instance. This is the user name (and password) you will use when you configure credentials to allow the Node.js server to authenticate.

IBM Cloud

Create the service instances

  • If you do not have an IBM Cloud account, register for a free trial account here.
  • Click here to create a Speech to Text instance.
  • Click here to create a Natural Language Understanding instance.

Gather credentials

  1. From the main navigation menu (☰), select Resource list to find your services under Services.
  2. Click on each service to find the Manage view where you can collect the API Key and URL to use for each service when you configure credentials.
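Before starting the Node.js server, it can help to check that the gathered credentials are actually present. The sketch below assumes they are supplied as environment variables that follow the `{SERVICE_NAME}_URL` pattern mentioned above, with either an API key (IBM Cloud) or a user name and password (IBM Cloud Pak for Data). The exact variable names and the helper `missingCredentials` are assumptions for this example, not something the repository confirms.

```javascript
// Assumed service prefixes, following the {SERVICE_NAME}_URL pattern.
const SERVICES = ['SPEECH_TO_TEXT', 'NATURAL_LANGUAGE_UNDERSTANDING'];

// Return a list of human-readable descriptions of missing settings.
function missingCredentials(env) {
  const missing = [];
  for (const service of SERVICES) {
    if (!env[`${service}_URL`]) missing.push(`${service}_URL`);
    // IBM Cloud uses an API key; Cloud Pak for Data uses a user name and password.
    const hasApiKey = Boolean(env[`${service}_APIKEY`]);
    const hasUserPass = Boolean(env[`${service}_USERNAME`] && env[`${service}_PASSWORD`]);
    if (!hasApiKey && !hasUserPass) {
      missing.push(`${service}_APIKEY or ${service}_USERNAME/${service}_PASSWORD`);
    }
  }
  return missing;
}
```

Calling `missingCredentials(process.env)` at startup and logging the result gives an early, readable error instead of a failed request later.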

Prerequisites (Local)

If you would like to run this pattern locally, without deploying it to a cloud platform, all you need is VS Code and the IBM credentials gathered above.

Deploy the server here

local

Use the web app

NOTE: The app was developed using Chrome on macOS. Browser compatibility issues are still being worked out.

watson-speech-translator.gif

  1. Browse to your app URL

    • Use the URL provided at the end of your selected deployment option.
  2. Select the language of your choice

    • The drop-down will be populated with models supported by your Speech to Text service.
  3. Use the Speech to Text toggle

    • Use the Speak Here button (which becomes Stop Listening) to begin recording audio and streaming it to Speech to Text. Press the button again to stop listening/streaming.
  4. Use the Detect Danger toggle

    • Use the Detect Danger button (which becomes Detecting Danger) to begin running the NLU and executing danger analysis. Press the button again to stop when output has been printed.
  5. Reset the transcribed text

    • The transcribed text will be cleared when you do any of the following:

      • Press Speech to Text to restart listening
      • Refresh the page

Future Features

  • With additional time and resources, the following features could be implemented to supplement the current code pattern:
    • Maps API - Parse the location of the danger incident using ‘Entities’ NLU feature and pin the location on a visual map.

    • Database - Store voice inputs in a database that can be used to periodically retrain and improve the model over time.

    • User Log - Using ‘Concepts’ NLU feature, summarize important information like location, time, and environment from voice input. This summary can be passed on to first responders so they are prepared, even before they approach the danger.

    • Diversify Input Sources - Scrape social media platforms (Twitter, Facebook) and other miscellaneous sources (e.g., police radio) for danger statements and analyze them accordingly.

License

This code pattern is licensed under the Apache License, Version 2. Separate third-party code objects invoked within this code pattern are licensed by their respective providers pursuant to their own separate licenses. Contributions are subject to the Developer Certificate of Origin, Version 1.1 and the Apache License, Version 2.

Apache License FAQ
