diff --git a/.github/workflows/build-doc.yml b/.github/workflows/build-doc.yml index 111054b6a..979712c8d 100644 --- a/.github/workflows/build-doc.yml +++ b/.github/workflows/build-doc.yml @@ -110,6 +110,7 @@ jobs: ./generate-tts.py ./generate-tts-engine.py ./generate-speaker-identification.py + ./generate-speaker-diarization.py ./generate-audio-tagging.py ./generate-audio-tagging-wearos.py ./generate-slid.py @@ -122,6 +123,7 @@ jobs: mv -v apk.html ../build/html/onnx/tts/ mv -v apk-engine.html ../build/html/onnx/tts/ mv -v apk-speaker-identification.html ../build/html/onnx/speaker-identification/apk.html + mv -v apk-speaker-diarization.html ../build/html/onnx/speaker-diarization/apk.html mv -v apk-audio-tagging.html ../build/html/onnx/audio-tagging/apk.html mv -v apk-audio-tagging-wearos.html ../build/html/onnx/audio-tagging/apk-wearos.html mv -v apk-slid.html ../build/html/onnx/spoken-language-identification/apk.html @@ -134,6 +136,7 @@ jobs: mv -v apk-cn.html ../build/html/onnx/tts/ mv -v apk-engine-cn.html ../build/html/onnx/tts/ mv -v apk-speaker-identification-cn.html ../build/html/onnx/speaker-identification/apk-cn.html + mv -v apk-speaker-diarization-cn.html ../build/html/onnx/speaker-diarization/apk-cn.html mv -v apk-audio-tagging-cn.html ../build/html/onnx/audio-tagging/apk-cn.html mv -v apk-audio-tagging-wearos-cn.html ../build/html/onnx/audio-tagging/apk-wearos-cn.html mv -v apk-slid-cn.html ../build/html/onnx/spoken-language-identification/apk-cn.html diff --git a/docs/source/intro.rst b/docs/source/intro.rst index cdf6d3694..31c8bffe5 100644 --- a/docs/source/intro.rst +++ b/docs/source/intro.rst @@ -62,6 +62,7 @@ The differences are compared below: | C#, Java, Kotlin, | Swift, Go, | JavaScript, Dart + | Pascal, Rust - | C, C++, Python, | C#, Kotlin, | Swift, Go @@ -71,13 +72,15 @@ The differences are compared below: - | streaming speech recognition, | non-streaming speech recognition, | text-to-speech, + | speaker diarization, | speaker 
identification, | speaker verification, | spoken language identification, | audio tagging, | VAD, | keyword spotting, - - streaming speech recognition + - | streaming speech recognition, + | VAD, We also support `Triton`_. Please see :ref:`triton_overview`. diff --git a/docs/source/onnx/android/build-sherpa-onnx.rst b/docs/source/onnx/android/build-sherpa-onnx.rst index 8c137f24b..c4caea3fa 100644 --- a/docs/source/onnx/android/build-sherpa-onnx.rst +++ b/docs/source/onnx/android/build-sherpa-onnx.rst @@ -38,6 +38,7 @@ and ``text-to-speech`` (TTS). - ``SherpaOnnxVad`` - ``SherpaOnnxVadAsr`` - ``SherpaOnnxSpeakerIdentification`` + - ``SherpaOnnxSpeakerDiarization`` - ``SherpaOnnxAudioTagging`` - ``SherpaOnnxAudioTaggingWearOs`` diff --git a/docs/source/onnx/index.rst b/docs/source/onnx/index.rst index 54af1bacc..c083d1d55 100644 --- a/docs/source/onnx/index.rst +++ b/docs/source/onnx/index.rst @@ -50,6 +50,12 @@ Also, we show how to use it for speech recognition with pre-trained models. ./pretrained_models/index ./sense-voice/index +.. toctree:: + :maxdepth: 5 + :caption: Speaker diarization + + ./speaker-diarization/index + .. toctree:: :maxdepth: 5 :caption: Speaker Identification diff --git a/docs/source/onnx/speaker-diarization/android.rst b/docs/source/onnx/speaker-diarization/android.rst new file mode 100644 index 000000000..817a0b513 --- /dev/null +++ b/docs/source/onnx/speaker-diarization/android.rst @@ -0,0 +1,21 @@ +Android APKs for speaker diarization +==================================== + +You can find Android APKs for speaker diarization at the following page + + ``_ + +For users from China, you can also visit + + ``_ + + +The source code for the APKs can be found at + + ``_ + +You can find the script for building the APKs at + + ``_ + +Please see :ref:`sherpa-onnx-android` for more details. 
diff --git a/docs/source/onnx/speaker-diarization/c.rst b/docs/source/onnx/speaker-diarization/c.rst new file mode 100644 index 000000000..1e3dde1d1 --- /dev/null +++ b/docs/source/onnx/speaker-diarization/c.rst @@ -0,0 +1,8 @@ +C API examples +============== + +Please see + + ``_ + +and :ref:`sherpa-onnx-c-api`. diff --git a/docs/source/onnx/speaker-diarization/cpp.rst b/docs/source/onnx/speaker-diarization/cpp.rst new file mode 100644 index 000000000..877cd9fc8 --- /dev/null +++ b/docs/source/onnx/speaker-diarization/cpp.rst @@ -0,0 +1,6 @@ +C++ API examples +================ + +Please see + + ``_ diff --git a/docs/source/onnx/speaker-diarization/csharp.rst b/docs/source/onnx/speaker-diarization/csharp.rst new file mode 100644 index 000000000..efddb1b43 --- /dev/null +++ b/docs/source/onnx/speaker-diarization/csharp.rst @@ -0,0 +1,6 @@ +C# API examples +=============== + +Please see + + ``_ diff --git a/docs/source/onnx/speaker-diarization/dart.rst b/docs/source/onnx/speaker-diarization/dart.rst new file mode 100644 index 000000000..33367b748 --- /dev/null +++ b/docs/source/onnx/speaker-diarization/dart.rst @@ -0,0 +1,6 @@ +Dart API examples +================= + +Please see + + ``_ diff --git a/docs/source/onnx/speaker-diarization/go.rst b/docs/source/onnx/speaker-diarization/go.rst new file mode 100644 index 000000000..b2bc248de --- /dev/null +++ b/docs/source/onnx/speaker-diarization/go.rst @@ -0,0 +1,6 @@ +Go API examples +=============== + +Please see + + ``_ diff --git a/docs/source/onnx/speaker-diarization/index.rst b/docs/source/onnx/speaker-diarization/index.rst new file mode 100644 index 000000000..7eea86e68 --- /dev/null +++ b/docs/source/onnx/speaker-diarization/index.rst @@ -0,0 +1,32 @@ +Speaker Diarization +=================== + +This page describes how to use `sherpa-onnx`_ for speaker diarization. + +Pre-trained models for speaker segmentation can be found +at ``_ + +Pre-trained models for speaker embedding extraction can be found +at ``_ + +.. 
hint:: + + + +In the following, we describe different programming language APIs for speaker diarization. + +.. toctree:: + :maxdepth: 5 + + ./android.rst + ./c.rst + ./cpp.rst + ./csharp.rst + ./dart.rst + ./go.rst + ./java.rst + ./javascript.rst + ./kotlin.rst + ./pascal.rst + ./python.rst + ./swift.rst diff --git a/docs/source/onnx/speaker-diarization/java.rst b/docs/source/onnx/speaker-diarization/java.rst new file mode 100644 index 000000000..6f9b3315e --- /dev/null +++ b/docs/source/onnx/speaker-diarization/java.rst @@ -0,0 +1,9 @@ +Java API examples +================= + +Please see + + ``_ + +and :ref:`sherpa-onnx-java-api`. + diff --git a/docs/source/onnx/speaker-diarization/javascript.rst b/docs/source/onnx/speaker-diarization/javascript.rst new file mode 100644 index 000000000..094754d24 --- /dev/null +++ b/docs/source/onnx/speaker-diarization/javascript.rst @@ -0,0 +1,36 @@ +JavaScript API examples +======================= + +We provide two npm packages. + +WebAssembly based npm package +----------------------------- + +You can find the package at + + ``_ + +This package does not support multi-threading. + +The example for speaker diarization can be found at + + ``_ + +node-addon based npm package +---------------------------- + +You can find the package at + + ``_ + +This package supports multi-threading. + +Please see + + ``_ + +for installation. 
+ +The example for speaker diarization can be found at + + ``_ diff --git a/docs/source/onnx/speaker-diarization/kotlin.rst b/docs/source/onnx/speaker-diarization/kotlin.rst new file mode 100644 index 000000000..39d7eb963 --- /dev/null +++ b/docs/source/onnx/speaker-diarization/kotlin.rst @@ -0,0 +1,6 @@ +Kotlin API examples +=================== + +Please see + + ``_ diff --git a/docs/source/onnx/speaker-diarization/pascal.rst b/docs/source/onnx/speaker-diarization/pascal.rst new file mode 100644 index 000000000..e4c4a7f02 --- /dev/null +++ b/docs/source/onnx/speaker-diarization/pascal.rst @@ -0,0 +1,7 @@ +Pascal API examples +=================== + +Please see + + ``_ + diff --git a/docs/source/onnx/speaker-diarization/python.rst b/docs/source/onnx/speaker-diarization/python.rst new file mode 100644 index 000000000..c19814b65 --- /dev/null +++ b/docs/source/onnx/speaker-diarization/python.rst @@ -0,0 +1,10 @@ +Python API examples +=================== + +.. note:: + + You need to install `sherpa-onnx>=1.10.28`. + + +Please see ``_ +for usage. diff --git a/docs/source/onnx/speaker-diarization/swift.rst b/docs/source/onnx/speaker-diarization/swift.rst new file mode 100644 index 000000000..0092cdd90 --- /dev/null +++ b/docs/source/onnx/speaker-diarization/swift.rst @@ -0,0 +1,6 @@ +Swift API examples +================== + +Please see + + ``_