タイトル「"speech recognition"」を検索

1 - 12 件 / 12件

新着順人気順

絞り込み

検索対象
ブックマーク数
期間
セーフサーチ

"speech recognition"の検索結果1 - 12 件 / 12件

GitHub - openai/whisper: Robust Speech Recognition via Large-Scale Weak Supervision
- 56 users
- github.com/openai
- 学び
- 2022/09/17
You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session. You switched accounts on another tab or window. Reload to refresh your session. Dismiss alert
- OpenAI
- Whisper
- translate
- voice
- audio
- 機械学習
- AI
- Tech
GitHub - alphacep/vosk-api: Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
- 18 users
- github.com/alphacep
- テクノロジー
- 2022/05/16
Vosk is an offline open source speech recognition toolkit. It enables speech recognition for 20+ languages and dialects - English, Indian English, German, French, Spanish, Portuguese, Chinese, Russian, Turkish, Vietnamese, Italian, Dutch, Catalan, Arabic, Greek, Farsi, Filipino, Ukrainian, Kazakh, Swedish, Japanese, Esperanto, Hindi, Czech, Polish. More to come. Vosk models are small (50 Mb) but p
VOSK Offline Speech Recognition API
- 16 users
- alphacephei.com
- テクノロジー
- 2021/07/31
РУС 中文 Vosk is a speech recognition toolkit. The best things in Vosk are: Supports 20+ languages and dialects - English, Indian English, German, French, Spanish, Portuguese, Chinese, Russian, Turkish, Vietnamese, Italian, Dutch, Catalan, Arabic, Greek, Farsi, Filipino, Ukrainian, Kazakh, Swedish, Japanese, Esperanto, Hindi, Czech, Polish, Uzbek, Korean, Breton, Gujarati. More to come. Works offlin
- AI
- あとで読む
Speech Recognition For Mac - rulesland
- 9 users
- rulesland.eklablog.com
- 暮らし
- 2019/07/28
Visiteurs depuis le 28/01/2019 : 862 Connectés : 1 Record de connectés : 14 This video shows hot to set up Dictation and how to use it with ease on your Mac for the purposes of speech recognition. I updated this post about speech to text software and dictation in January 2, 2018 When I was employed as a journalist, I spent a lot of time interviewing people. One of the most painful things I had to
NoodlでWeb Speech API Speech Recognitionを使う！Noodl Javascriptノードの使い方も解説 - Qiita
- 7 users
- qiita.com/kisaichi07
- テクノロジー
- 2020/01/02
このように、複数追加もできるようです。 Noodl1.3ではこのような処理はif文で書いていました。2.0のほうがスッキリかけそうですね。 change:function inputの値のどれかが変更されたときに実行される。このプロジェクトのJavascriptノードの中身サンプルでは、ラーメンをタップしたときにmySignalにtrueのシグナルを送り、音声認識を実行させています。 define({ // The input ports of the Javascript node, name of input and type inputs:{ // ExampleInput:'number', // Available types are 'number', 'string', 'boolean', 'color' and 'signal', mySignal:'signal'
GitHub - sdkcarlos/artyom.js: A voice control - voice commands - speech recognition and speech synthesis javascript library. Create your own siri,google now or cortana with Google Chrome within your website.
- 5 users
- github.com/sdkcarlos
- テクノロジー
- 2020/04/11
Due to abuse of users with the Speech Synthesis API (ADS, Fake system warnings), Google decided to remove the usage of the API in the browser when it's not triggered by an user gesture (click, touch etc.). This means that calling for example artyom.say("Hello") if it's not wrapped inside an user event won't work. So on every page load, the user will need to click at least once time per page to all
GitHub - NVIDIA/NeMo: A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
- 5 users
- github.com/NVIDIA
- テクノロジー
- 2020/11/30
Large Language Models and Multimodal Accelerate your generative AI journey with NVIDIA NeMo Framework on GKE (2024/03/16) An end-to-end walkthrough to train generative AI models on the Google Kubernetes Engine (GKE) using the NVIDIA NeMo Framework is available at https://github.com/GoogleCloudPlatform/nvidia-nemo-on-gke. The walkthrough includes detailed instructions on how to set up a Google Clou
GitHub - m-bain/whisperX: WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
- 4 users
- github.com/m-bain
- テクノロジー
- 2023/03/16
This repository provides fast automatic speech recognition (70x realtime with large-v2) with word-level timestamps and speaker diarization. ⚡️ Batched inference for 70x realtime transcription using whisper large-v2 🪶 faster-whisper backend, requires <8GB gpu memory for large-v2 with beam_size=5 🎯 Accurate word-level timestamps using wav2vec2 alignment 👯‍♂️ Multispeaker ASR using speaker diariza
GitHub - ccoreilly/vosk-browser: A speech recognition library running in the browser thanks to a WebAssembly build of Vosk
- 3 users
- github.com/ccoreilly
- テクノロジー
- 2022/05/19
You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session. You switched accounts on another tab or window. Reload to refresh your session. Dismiss alert
Wav2vec: Semi-supervised and Unsupervised Speech Recognition
- 3 users
- vaclavkosar.com
- テクノロジー
- 2021/07/04
Word2vec for audio quantizes phonemes, transforms, GAN trains on text and audio from Facebook AI. JS disabled! Watch Wav2vec: Semi-supervised and Unsupervised Speech Recognition on Youtube Watch video "Wav2vec: Semi-supervised and Unsupervised Speech Recognition" Wav2vec is fascinating in that it combines several neural network architectures and methods: CNN, transformer, quantization, and GAN tra
GitHub - argmaxinc/WhisperKit: Swift native on-device speech recognition with Whisper for Apple Silicon
- 3 users
- github.com/argmaxinc
- テクノロジー
- 2024/03/01
You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session. You switched accounts on another tab or window. Reload to refresh your session. Dismiss alert
wav2vec Unsupervised: Speech recognition without supervision
- 3 users
- ai.meta.com
- テクノロジー
- 2021/05/24
High-performance speech recognition with no supervision at all What the research is:Whether it’s giving directions, answering questions, or carrying out requests, speech recognition makes life easier in countless ways. But today the technology is available for only a small fraction of the thousands of languages spoken around the globe. This is because high-quality systems need to be trained with l
- 機械学習
- ai