Title search for "speech" - Hatena Bookmark

Results 81 - 120 of 182


GitHub - reriiasu/speech-to-text: Real-time transcription using faster-whisper
- 5 users
- github.com/reriiasu
- Technology
- 2023/08/03
Tried "Speech to Text Webcam Overlay," which enables real-time captions and translation through speech recognition, on Zoom - DENET 技術ブログ
- 5 users
- blog.denet.co.jp
- Technology
- 2021/03/07
Data2vec 2.0: Highly efficient self-supervised learning for vision, speech and text
- 5 users
- ai.meta.com
- Technology
- 2022/12/14
Data2vec 2.0: Highly efficient self-supervised learning for vision, speech and text Many recent breakthroughs in AI have been powered by self-supervised learning, which enables machines to learn without relying on labeled data. But current algorithms have several significant limitations, often including being specialized for a single modality (such as images or text) and requiring lots of computat
- AI
- read later
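パプリカ on Twitter: "To begin with, a debate doesn't need cheering. When I belonged to the academic debate section of the ESS at a certain university in Yotsuya, I heard a story about a cross-style match against the oratory club of W University, which has produced many politicians: during our side's speech, their bystanders heckled and roared with laughter, and to call that a debate… https://t.co/jk76c2kXTc"
- 5 users
- twitter.com/papurika_dreams
- Learning
- 2019/11/22
To begin with, a debate doesn't need cheering. When I belonged to the academic debate section of the ESS at a certain university in Yotsuya, I heard a story about a cross-style match against the oratory club of W University, which has produced many politicians: during our side's speech, their bystanders heckled and roared with laughter, and to call that a debate… https://t.co/jk76c2kXTc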
GitHub - snakers4/silero-models: Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
- 5 users
- github.com/snakers4
- Technology
- 2022/06/20
Silero Models: pre-trained enterprise-grade STT / TTS models and benchmarks. Enterprise-grade STT made refreshingly simple (seriously, see benchmarks). We provide quality comparable to Google's STT (and sometimes even better) and we are not Google. As a bonus: No Kaldi; No compilation; No 20-step instructions; Also we have published TTS models that satisfy the following criteria: One-line usage; A
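[Movie] Watched "The King's Speech" (2010). (Recommendation: ★★★★★) - 「言葉こそ人生」読むだけ元気お届け人の"今ここを生きる心"の裏側
- 5 users
- imakokowoikiru.hatenablog.com
- Entertainment
- 2022/09/27
The story of the friendship between King George VI of Britain, who struggled with a stammer for many years, and Lionel Logue, the Australian-born speech therapist who treated him. It is the story of the father of the recently deceased Queen Elizabeth II and is based on historical fact. He had expected his older brother to take the throne, but the brother chose an American woman and abdicated (a scene we have seen somewhere before...); Europe was in a tangled situation heading toward the Second World War, with the fight against Hitler's Germany imminent; and he had to protect the royal family and, by extension, the British people. His worries must truly have piled up. Amid all this, the speech therapist Logue was an overly easygoing Australian fellow, so the two clashed at first, but through the king's earnest training they came to understand each other, and Logue eventually led the speech to success. At the film's climax, the words George VI delivers wholeheartedly into the microphone are, each one, careful and slow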
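[Japanese subtitles] Hitler's inaugural speech as chancellor - Hitler Speech "Proclamation to the German Nation"
- 5 users
- www.youtube.com
- Politics & Economy
- 2021/11/27
▼Subscribe to HISTORY CHANNEL http://bit.ly/2BQ4Kns Adolf Hitler and the Nazi Party took into their hands, in an outwardly legal manner, a degree of power that Germany's previous cabinets, presidents, and monarchs had been unable to obtain. This seizure of power can be broadly divided into two periods: the period from the Nazi Party becoming one of the country's leading parties until the formation of the Hitler cabinet on January 30, 1933, and the period in which Hitler and the Nazi Party, now in government, all but eliminated their political opponents at home and abroad, until the party, the state, and Hitler controlled power within Germany, including the legislative, executive, and judicial powers. The latter process was short, taking roughly two years or less from taking office. 00:00 Opening act by Goebbels 08:16 Live commentary from the venue 12:00 Hitler's speech 14:48 The guilt of the German people in the First World War 15:22 The current state of politics 17:35 On Marxism 19:36 Weima
- politics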
DeepSpeech 0.6: Mozilla’s Speech-to-Text Engine Gets Fast, Lean, and Ubiquitous – Mozilla Hacks - the Web developer blog
- 5 users
- hacks.mozilla.org
- Technology
- 2019/12/06
DeepSpeech 0.6: Mozilla’s Speech-to-Text Engine Gets Fast, Lean, and Ubiquitous The Machine Learning team at Mozilla continues work on DeepSpeech, an automatic speech recognition (ASR) engine which aims to make speech recognition technology and trained models openly available to developers. DeepSpeech is a deep learning-based ASR engine with a simple API. We also provide pre-trained English models
GitHub - NVIDIA/NeMo: A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
- 5 users
- github.com/NVIDIA
- Technology
- 2020/11/30
Large Language Models and Multimodal Accelerate your generative AI journey with NVIDIA NeMo Framework on GKE (2024/03/16) An end-to-end walkthrough to train generative AI models on the Google Kubernetes Engine (GKE) using the NVIDIA NeMo Framework is available at https://github.com/GoogleCloudPlatform/nvidia-nemo-on-gke. The walkthrough includes detailed instructions on how to set up a Google Clou
A new AI-powered speech translation system for Hokkien pioneers a new approach for a primarily oral language
- 5 users
- ai.meta.com
- Technology
- 2022/10/20
Meta’s new AI-powered speech translation system for Hokkien pioneers a new approach for an unwritten language Until now, AI translation has mainly focused on written languages. Yet nearly half of the world’s 7,000+ living languages are primarily oral and do not have a standard or widely used writing system. This makes it impossible to build machine translation tools using standard techniques, whic
- Taiwan
- artificial intelligence
- facebook
- technology
Text to Speech & AI Voice Generator
- 5 users
- elevenlabs.io
- Technology
- 2023/11/24
AI Voice Changer: Change Your Voice For Free. AI Speech to Speech Converter. Transform your voice into another character and control its emotion and delivery. Easily create custom voices for games, videos, podcasts, and more with a single click. Perfect Delivery, Every Time. Edit and fine-tune your voiceovers using our voice changer. Get consistent, clear results that keep the feel and nuance of your or
- voice
- webservice
GitHub - log1stics/voice-generator-webui: A multi-speaker, multilingual speech generation tool
- 5 users
- github.com/log1stics
- Technology
- 2023/04/18
- machine learning
AI Voice Generator, Text To Speech, #1 Best AI Voice
- 5 users
- speechify.com
- Technology
- 2022/03/07
Best AI text to speech for Chrome, iOS, Android, Mac, & Edge. Speechify is the #1 rated AI text to speech app in its category with over 250,000 5 star reviews.
- web service
Elon Musk bans several prominent journalists from Twitter, calling into question his commitment to free speech | CNN Business
- 5 users
- www.cnn.com
- Society
- 2022/12/16
- Twitter
- news
GitHub - coqui-ai/STT: 🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
- 5 users
- github.com/coqui-ai
- Technology
- 2021/04/15
- deep learning
- read later
GitHub - NATSpeech/NATSpeech: A Non-Autoregressive Text-to-Speech (NAR-TTS) framework, including official PyTorch implementation of PortaSpeech (NeurIPS 2021) and DiffSpeech (AAAI 2022)
- 4 users
- github.com/NATSpeech
- Technology
- 2022/02/17
- tech
- read later
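Brief review: "Virtue Signaling: Essays on Darwinian Politics & Free Speech" - 道徳的動物日記
- 4 users
- davitrice.hatenadiary.jp
- Learning
- 2020/12/10
Virtue Signaling: Essays on Darwinian Politics & Free Speech (English Edition). Author: Miller, Geoffrey. Release date: 2019/09/18. Format: Kindle edition. This book had interested me a little for some time, and I read it after receiving it from my wish list. However, with apologies to the person who gave it to me, it was a rather tough read. It has far too little appeal as a book or as writing.*1 This book collects the articles and essays that the evolutionary psychologist Geoffrey Miller published online and elsewhere. Miller is the author of "The Mating Mind: How Sexual Choice Shaped the Evolution of Human Nature" and "Spent: Sex, Evolution, and Consumer Behavior," and has long argued that various human traits can be explained by sexual selection and signaling theory. In this book as well, "Sexual selection for moral virtues (Sex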
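Google open-sources the Live Transcribe Speech Engine
- 4 users
- news.mynavi.jp
- Technology
- 2019/08/21
Google announced on its official blog on the 16th (local time) that it has released the engine of its Android application "Live Transcribe" as open source on GitHub. "Live Transcribe," which transcribes speech to text in real time, is an Android app the company released earlier this year for deaf and hard-of-hearing people, and it supports 70 languages including Japanese. ("Live Transcribe" official website.) Speech recognition uses the state-of-the-art Google Cloud Speech API and achieves high transcription accuracy under most conditions, but the post describes issues found over several months of user testing, such as the Cloud Speech API not supporting the sending of indefinitely long audio streams, and reliance on the cloud causing problems that differ by country, such as network latency and data costs, and then, so that anyone, anywhere
- read later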
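Free audiobooks from Google using text-to-speech | Sangmin Ahn
- 4 users
- note.com/sangmin
- Technology
- 2020/12/23
Hello, this is Sangmin from Choimirai School. 0 Introduction: The text-to-speech (TTS) feature I have been introducing for a while keeps evolving. There are also more cases where freely accessible books are offered as audiobooks using (1) TTS and (2) WaveNet. The reading accuracy also seems to be steadily approaching that of a human. In my personal impression, Gulliver's Travels is at a level where you would not notice unless told. In this note, I introduce several books that can be downloaded from Google Play. 1 Fiction: The Legend of Sleepy Hollow, Dracula, Gulliver’s Travels, The Strange Case of Dr Jekyll and Mr Hyde, Frankenstein, The War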
Trump gives speech to event linked to controversial ‘cult’ on 9/11 anniversary
- 4 users
- www.independent.co.uk
- Society
- 2021/09/14
Trump gives virtual speech to event linked to controversial religious ‘cult’ on 9/11 anniversary. Prominent GOP officials including Mike Pence and Mike Pompeo have also appeared at events of the Universal Peace Federation, which has links t
- society
GitHub - espeak-ng/espeak-ng: eSpeak NG is an open source speech synthesizer that supports more than hundred languages and accents.
- 4 users
- github.com/espeak-ng
- Technology
- 2020/07/05
The eSpeak NG is a compact open source software text-to-speech synthesizer for Linux, Windows, Android and other operating systems. It supports more than 100 languages and accents. It is based on the eSpeak engine created by Jonathan Duddington. eSpeak NG uses a "formant synthesis" method. This allows many
Why people get away with hate speech in India
- 4 users
- www.bbc.com
- Society
- 2022/04/14
As hate speech cases make headlines, experts say the problem isn't lack of laws but political will.
GitHub - m-bain/whisperX: WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
- 4 users
- github.com/m-bain
- Technology
- 2023/03/16
This repository provides fast automatic speech recognition (70x realtime with large-v2) with word-level timestamps and speaker diarization. ⚡️ Batched inference for 70x realtime transcription using whisper large-v2 🪶 faster-whisper backend, requires <8GB gpu memory for large-v2 with beam_size=5 🎯 Accurate word-level timestamps using wav2vec2 alignment 👯‍♂️ Multispeaker ASR using speaker diariza
Introducing Voicebox: The Most Versatile AI for Speech Generation | Meta
- 4 users
- about.fb.com
- Technology
- 2023/06/17
Voicebox is a generative AI model that can help with audio editing, sampling and styling. This type of technology could be used in the future to help creators easily edit audio tracks, allow visually impaired people to hear written messages from friends in their voices, and enable people to speak any foreign language in their own voice. Today, we’re announcing a breakthrough in generative AI for s
Stephen King on Twitter: "Sir, free speech does not include the right to yell “Fire!” in a crowded theater. That is what Donald Trump was doi… https://t.co/KjxsqFzpGf"
- 4 users
- twitter.com/StephenKing
- Politics & Economy
- 2021/01/09
Sir, free speech does not include the right to yell “Fire!” in a crowded theater. That is what Donald Trump was doi… https://t.co/KjxsqFzpGf
- Trump
- America
Unlike Blizzard, Epic Games says it won’t ban players for political speech
- 4 users
- www.theverge.com
- Politics & Economy
- 2019/10/10
Unlike Blizzard, Epic Games says it won’t ban players for political speech / Woke Fortnite. By Makena Kelly, a reporter who covers the politics and power influencing the tech industry. Before joining The Verge in 2018, she covered Congress and breaking news.
- Hong Kong
- China
- games
Amazon Transcribe Now Supports Speech-to-text in 7 Additional Languages
- 4 users
- aws.amazon.com
- Technology
- 2019/11/22
Amazon Transcribe now supports transcription for audio and video in Gulf Arabic, Swiss German, Hebrew, Japanese, Malay, Telugu, and Turkish languages. Amazon Transcribe is an automatic speech recognition (ASR) service that makes it easy to add speech-to-text capability to applications. Organizations can use Amazon Transcribe to create text transcripts of audio and video files quickly. Amazon Trans
- aws
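Improved models and features now available for newly supported languages in Cloud Speech-to-Text | Google Cloud 公式ブログ
- 4 users
- cloud.google.com
- Technology
- 2020/03/13
*This post is an abridged translation of a post published on the Google Cloud blog on March 6, 2020 (US time). Speech interfaces such as call analytics and automatic video captioning are transforming how people interact with their surroundings and creating new business opportunities. Speech recognition technology is the driving force behind these changes and helps bring ideas to life. At Google Cloud, we work every day to make this remarkable technology as widely available as possible. To offer Google Cloud products and features to more customers and make them useful to companies around the world, we have introduced new features, models, and languages to our speech input system. Google Cloud Speech-to-Text transcribes what is said in long or short recordings and streamed audio submitted by users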
GitHub - coqui-ai/TTS: 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
- 4 users
- github.com/coqui-ai
- Technology
- 2022/04/12
GitHub - facebookresearch/seamless_communication: Foundational Models for State-of-the-Art Speech and Text Translation
- 4 users
- github.com/facebookresearch
- Technology
- 2023/08/23
- github
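Real-time transcription in the browser using the Google Speech to Text API - Qiita
- 4 users
- qiita.com/kawazu255
- Technology
- 2020/07/19
TL;DR: This is achieved by combining the Google Speech to Text API and the Web Speech API. Using the Web Speech API only for speech detection and the Google Speech to Text API for the transcription itself gives browser transcription both a real-time feel and high accuracy. Background: For a product currently in development, I was researching various ways to introduce a speech-to-text mechanism. The product is "Telelogger," a service that transcribes conversations during online meetings and manages agendas and minutes in one place, and its core feature is transcribing conversations during meetings. Because the service is provided as a web application, transcription is expected to happen in the browser. After narrowing the target browser to Google Chrome, I first tried the Web Speech API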
- Article
Universal Speech Model (USM): State-of-the-art speech AI for 100+ languages
- 4 users
- ai.googleblog.com
- Technology
- 2023/03/07
X to be investigated for allegedly breaking EU laws on hate speech and fake news
- 4 users
- www.theguardian.com
- Politics & Economy
- 2023/12/19
Elon Musk. The European Commission asked X to provide evidence of compliance with the new laws. Photograph: Antonio Masiello/Getty Images
- read later
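JVS (Japanese versatile speech) corpus, a corpus collected from 100 professional speakers (voice actors, actors, etc.), has been released by Assistant Professor Takamichi of the University of Tokyo - 糞糞糞ネット弁慶
- 4 users
- repose.hatenadiary.jp
- Anime & Games
- 2019/08/18
I have released a corpus for speech synthesis research. It contains 100 professional speakers (voice actors and actors) × 100 utterances (parallel). You can download it right now!! https://t.co/FJXrl3owrX https://t.co/qGuUCSqIyA — Shinnosuke Takamichi (高道慎之介) (@forthshinji) August 17, 2019 Shinnosuke Takamichi (高道慎之介) - jvs_corpus I assume everyone reading this blog already knows, but Assistant Professor Takamichi of the University of Tokyo has released the JVS (Japanese versatile speech) corpus. The JVS corpus contains a variety of speech collected from 100 professional speakers, and in particular "parallel100" ... read speech shared across speakers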
GitHub - pyannote/pyannote-audio: Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
- 4 users
- github.com/pyannote
- Technology
- 2022/02/28
How to recognize speech in the browser using the Web Speech API
- 4 users
- www.twilio.com
- Technology
- 2021/06/05
- web
Full transcript of Joe Biden's inauguration speech
- 4 users
- www.bbc.com
- Society
- 2021/01/21
'This is our historic moment of crisis and challenge'. Read the 46th president's address in full.
- US
GitHub - yl4579/StyleTTS2: StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
- 4 users
- github.com/yl4579
- Technology
- 2023/11/20
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units
- 4 users
- arxiv.org
- Learning
- 2021/09/15
Self-supervised approaches for speech representation learning are challenged by three unique problems: (1) there are multiple sound units in each input utterance, (2) there is no lexicon of input sound units during the pre-training phase, and (3) sound units have variable lengths with no explicit segmentation. To deal with these three problems, we propose the Hidden-Unit BERT (HuBERT) approach for
- read later
War in Ukraine: Zelensky WW2 speech accuses Russia of Nazi atrocities
- 4 users
- www.bbc.com
- Society
- 2022/05/09
The Ukrainian leader compares Russia's invasion of his country to wartime Nazi bombings.