Search results for "speech recognition": 1 - 12 of 12

  • GitHub - openai/whisper: Robust Speech Recognition via Large-Scale Weak Supervision

  • GitHub - alphacep/vosk-api: Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

    Vosk is an offline open source speech recognition toolkit. It enables speech recognition for 20+ languages and dialects - English, Indian English, German, French, Spanish, Portuguese, Chinese, Russian, Turkish, Vietnamese, Italian, Dutch, Catalan, Arabic, Greek, Farsi, Filipino, Ukrainian, Kazakh, Swedish, Japanese, Esperanto, Hindi, Czech, Polish. More to come. Vosk models are small (50 Mb) but p

  • VOSK Offline Speech Recognition API

    Vosk is a speech recognition toolkit. The best things in Vosk are: Supports 20+ languages and dialects - English, Indian English, German, French, Spanish, Portuguese, Chinese, Russian, Turkish, Vietnamese, Italian, Dutch, Catalan, Arabic, Greek, Farsi, Filipino, Ukrainian, Kazakh, Swedish, Japanese, Esperanto, Hindi, Czech, Polish, Uzbek, Korean, Breton, Gujarati. More to come. Works offlin

  • Speech Recognition For Mac - rulesland

    This video shows how to set up Dictation and how to use it with ease on your Mac for the purposes of speech recognition. I updated this post about speech to text software and dictation on January 2, 2018. When I was employed as a journalist, I spent a lot of time interviewing people. One of the most painful things I had to

  • Using the Web Speech API's Speech Recognition in Noodl, with an explanation of how to use the Noodl Javascript node - Qiita

    As shown here, it seems you can also add more than one. In Noodl 1.3 this kind of processing was written with if statements; 2.0 looks like it allows cleaner code. change:function runs when any of the input values changes. Contents of this project's Javascript node: in the sample, tapping the ramen sends a true signal to mySignal, which triggers speech recognition. define({ // The input ports of the Javascript node, name of input and type inputs:{ // ExampleInput:'number', // Available types are 'number', 'string', 'boolean', 'color' and 'signal', mySignal:'signal'

  • GitHub - sdkcarlos/artyom.js: A voice control - voice commands - speech recognition and speech synthesis javascript library. Create your own siri,google now or cortana with Google Chrome within your website.

    Due to abuse of users through the Speech Synthesis API (ads, fake system warnings), Google decided to remove the usage of the API in the browser when it's not triggered by a user gesture (click, touch etc.). This means that calling, for example, artyom.say("Hello") won't work if it's not wrapped inside a user event. So on every page load, the user will need to click at least once per page to all

  • GitHub - NVIDIA/NeMo: A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

    Large Language Models and Multimodal: Accelerate your generative AI journey with NVIDIA NeMo Framework on GKE (2024/03/16). An end-to-end walkthrough to train generative AI models on the Google Kubernetes Engine (GKE) using the NVIDIA NeMo Framework is available at https://github.com/GoogleCloudPlatform/nvidia-nemo-on-gke. The walkthrough includes detailed instructions on how to set up a Google Clou

  • GitHub - m-bain/whisperX: WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

    This repository provides fast automatic speech recognition (70x realtime with large-v2) with word-level timestamps and speaker diarization. ⚡️ Batched inference for 70x realtime transcription using whisper large-v2 🪶 faster-whisper backend, requires <8GB gpu memory for large-v2 with beam_size=5 🎯 Accurate word-level timestamps using wav2vec2 alignment 👯‍♂️ Multispeaker ASR using speaker diariza

  • GitHub - ccoreilly/vosk-browser: A speech recognition library running in the browser thanks to a WebAssembly build of Vosk

  • Wav2vec: Semi-supervised and Unsupervised Speech Recognition

    Word2vec for audio quantizes phonemes, transforms, GAN trains on text and audio from Facebook AI. Wav2vec is fascinating in that it combines several neural network architectures and methods: CNN, transformer, quantization, and GAN tra

  • GitHub - argmaxinc/WhisperKit: Swift native on-device speech recognition with Whisper for Apple Silicon

  • wav2vec Unsupervised: Speech recognition without supervision

    High-performance speech recognition with no supervision at all. What the research is: Whether it’s giving directions, answering questions, or carrying out requests, speech recognition makes life easier in countless ways. But today the technology is available for only a small fraction of the thousands of languages spoken around the globe. This is because high-quality systems need to be trained with l