Title search for "speech" - Hatena Bookmark

Results 41 - 80 of 171

Trying OpenAI TTS (Text to Speech) in Node.js
- 9 users
- hyper-text.org
- Technology
- 2023/11/08
The recent OpenAI Dev Day brought major feature additions along with several new APIs. Among them, the OpenAI TTS (Text To Speech) API, which generates speech from text, looked interesting, so I gave it a quick try in a Node.js environment. An overview of Text to speech and the API reference are available here: Text to speech - OpenAI API Create speech - API Referenc
- JavaScript
Lyra V2 - a better, faster, and more versatile speech codec
- 9 users
- opensource.googleblog.com
- Technology
- 2022/10/04
Since we open sourced the first version of Lyra on GitHub last year, we are delighted to see a vibrant community growing around it, with thousands of stars, hundreds of forks, and many comments and pull requests. There are people who fixed and formatted our code, built continuous integration
GitHub - google/lyra: A Very Low-Bitrate Codec for Speech Compression
- 9 users
- github.com/google
- Technology
- 2021/04/06
The basic architecture of the Lyra codec is quite simple. Features are extracted from speech every 20ms and are then compressed for transmission at a desired bitrate between 3.2kbps and 9.2kbps. On the other end, a generative model uses those features to recreate the speech signal. Lyra harnesses the power of new natural-sounding generative models to maintain the low bitrate of parametric codecs w
- github
- linux
- google
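The figures quoted in the Lyra README snippet (one feature frame every 20 ms, bitrates between 3.2 kbps and 9.2 kbps) imply a small per-frame bit budget. The sketch below only works through that arithmetic; it is not Lyra code, the function names are hypothetical, and it assumes 1 kbps means 1000 bits per second.

```python
# Back-of-the-envelope bit budget for a codec that sends one feature frame
# every 20 ms, using only the numbers quoted in the snippet above.
# Assumption: 1 kbps = 1000 bits per second. Function names are hypothetical.

FRAME_MS = 20  # frame interval quoted in the snippet


def frames_per_second(frame_ms: int = FRAME_MS) -> float:
    """Number of feature frames produced per second of speech."""
    return 1000 / frame_ms


def bits_per_frame(bitrate_bps: int, frame_ms: int = FRAME_MS) -> float:
    """Bits available to encode the features of a single frame."""
    return bitrate_bps * frame_ms / 1000


if __name__ == "__main__":
    for bitrate_bps in (3200, 9200):  # 3.2 kbps and 9.2 kbps
        print(bitrate_bps, "bps ->", bits_per_frame(bitrate_bps), "bits per frame")
```

Under these assumptions the encoder has 64 bits per frame at the lowest quoted bitrate and 184 bits per frame at the highest, which is why the decoder side relies on a generative model to reconstruct the waveform.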
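Meta announces "Massively Multilingual Speech (MMS)," a speech recognition model that can transcribe speech and read text aloud in more than 1,100 languages
- 9 users
- gigazine.net
- Technology
- 2023/05/23
Meta, which has been focusing on AI development, has announced "Massively Multilingual Speech (MMS)," a speech recognition model that can transcribe speech and read text aloud in more than 1,100 languages. MMS supports far more languages than previous large-scale multilingual speech recognition models, and is expected to make information easier to access even in languages with few speakers. Today we're sharing new progress on our AI speech work. Our Massively Multilingual Speech (MMS) project has now scaled speech-to-text & text-to-speech to support over 1,100 languages, a 10x increase from previous work. Deta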
Hate Speech’s Rise on Twitter Is Unprecedented, Researchers Find (Published 2022)
- 9 users
- www.nytimes.com
- Society
- 2022/12/03
Before Elon Musk bought Twitter, slurs against Black Americans showed up on the social media service an average of 1,282 times a day. After the billionaire became Twitter’s owner, they jumped to 3,876 times a day. Slurs against gay men appeared on Twitter 2,506 times a day on average before Mr. Musk took over. Afterward, their use rose to 3,964 times a day. And antisemitic posts referring to Jews
- discrimination
- read later
Shinzo Abe Shot: Shinzo Abe of Japan Dies After Being Shot During Speech (Published 2022)
- 9 users
- www.nytimes.com
- Society
- 2022/07/08
SEOUL — Though the current South Korean president, Yoon Suk-yeol, has vowed to mend ties with Japan, the two countries have had a history of icy relations, especially under Mr. Abe. Mr. Yoon sent a telegram to Mr. Abe’s wife on Friday, expressing his “condolences and sympathy to the bereaved family and the people of Japan over the loss of its longest-serving former prime minister and a respected p
- history
- Japan
- politics
[Speech to Text] Transcribe now supports Japanese! [Japanese speech to text] | DevelopersIO
- 9 users
- dev.classmethod.jp
- Technology
- 2019/11/22
Hello, this is Kyo from the consulting department. Amazon Transcribe now supports Japanese, so I tested it. Transcribe is what is commonly called Speech to Text. Amazon Transcribe Now Supports Speech-to-text in 7 Additional Languages. This release adds the following 7 languages, bringing the total to 31 supported languages: Gulf Arabic, Swiss German, Hebrew, Japanese, Malay, Telugu, and Turkish. Trying it out: since this is speech to text, we first prepare a source audio file. This time I used Simple Recorder and read the text aloud myself. Note that with Simple Recorder's default settings, the recording
- read later
Universal Speech Model (USM): State-of-the-art speech AI for 100+ languages
- 8 users
- ai.googleblog.com
- Technology
- 2023/03/07
- machine learning
- read later
Low-cost measurement of face mask efficacy for filtering expelled droplets during speech | Science Advances
- 8 users
- advances.sciencemag.org
- Learning
- 2020/08/11
Low-cost measurement of face mask efficacy for filtering expelled droplets during speech. Emma P. Fischer (1), Martin C. Fischer (2,3,*), David Grass (2), Isaac Henrion (4), Warren S. Warren (2,3,5,6) and Eric Westman (7). (1) Department of Psychology & Neuroscience, Duke University, Durham, NC 27708, USA. (2) Department of Chemistry, D
Narakeet - Easily Create Voiceovers and Narrated Videos Using Realistic Text to Speech!
- 8 users
- www.narakeet.com
- Entertainment
- 2020/03/30
Easily Create Voiceovers Using Realistic Text to Speech Stop wasting time on recording your voice, editing out mistakes and synchronising picture with sound. Just type or upload your script, select one of our 700 voices, and get a professionally sounding audio or video in minutes. Try Narakeet realistic text to speech free, no need to register. Get Started Text to SpeechWord PDF EPUB… to Audio Sli
- markdown
- video
- text
- assets
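[Speech-language therapy (ST: Speech Therapy)] Training once every six months - 晴れ時々コジコジ blog
- 7 users
- haretokidokiyuki.com
- Society
- 2023/03/29
Thank you as always. [Speech-language therapy (ST: Speech Therapy)] Training once every six months. My son Jagu-chan (origin of the name) has difficulty producing speech because of an intractable disease, and has been receiving speech-language therapy since his days at a kindergarten specializing in developmental support. This is what is known as "ST." As developmental support for children, ST provides treatment and guidance for children with language-related concerns such as delayed speech, unclear pronunciation, or difficulty communicating, and offers advice for children with concerns about eating, among other things. Once he entered elementary school, the various supports he had in kindergarten ended, so we had to look for providers on our own. I asked more experienced parents, and there happened to be a well-regarded specialist hospital nearby, so we go there once every six months. His ST and OT teachers from kindergarten often said that therapists specializing in child developmental support are still scarce. At the hospital we currently attend, people with disabilities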
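Trying the OpenAI API's Text-to-Speech on Google Colab｜npaka
- 7 users
- note.com/npaka
- Technology
- 2023/11/08
I tried the "Text-to-Speech" feature of the "OpenAI API" on "Google Colab," so here is a summary. Previous post. 1. Text-to-Speech: "Text-to-Speech" is an API that reads text aloud. It comes with six built-in voices and can be used for the following purposes: narrating a written blog post; generating speech in multiple languages; real-time audio output using streaming. 2. Setup: the setup steps on Colab are as follows. (1) Install the package. # Install the package !pip install openai (2) Prepare the environment variable. In the code below, replace <OpenAI_API key> with the API key obtained from the OpenAI site (paid). import os os.environ["OPENAI_API_KEY"] = "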
- API
- google
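The setup steps in the snippet above stop at setting `OPENAI_API_KEY`. As a rough illustration of what a text-to-speech request could look like, here is a minimal sketch that calls the HTTP endpoint with only the Python standard library. It is not the article's code, which uses the `openai` package. The helper names are hypothetical, and the model name `tts-1`, the voice `alloy`, and the MP3 default output are assumptions based on the API as announced at the time.

```python
import json
import os
import urllib.request

# Assumed endpoint for the "Create speech" API.
API_URL = "https://api.openai.com/v1/audio/speech"


def build_speech_request(text, api_key, voice="alloy", model="tts-1"):
    """Build (but do not send) a POST request asking for synthesized speech.

    The model and voice defaults are assumptions, not values from the article.
    """
    body = json.dumps({"model": model, "voice": voice, "input": text}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize(text, out_path="speech.mp3"):
    """Send the request and write the returned audio bytes to out_path.

    Requires network access and a valid (paid) key in OPENAI_API_KEY.
    """
    request = build_speech_request(text, os.environ["OPENAI_API_KEY"])
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as f:
        f.write(response.read())
    return out_path
```

Calling `synthesize("Hello from Colab")` would write the audio to `speech.mp3` if the key is valid. Keeping request construction separate from sending makes the payload easy to inspect without spending API credits.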
Speech by President of Ukraine Volodymyr Zelenskyy in the Parliament of Japan — Official website of the President of Ukraine
- 7 users
- www.president.gov.ua
- Politics & Economy
- 2022/03/23
Speech by President of Ukraine Volodymyr Zelenskyy in the Parliament of Japan 23 March 2022 - 12:37 Dear Mr. Hosoda! Dear Mrs. Santō! Mr. Prime Minister Kishida! Distinguished Members of the Japanese Parliament! Dear Japanese people! It is a great honor for me, the President of Ukraine, to address you for the first time in the history of the Japanese Parliament. Our capitals are separated by a dis
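Latest details released on Google's translation AI "Universal Speech Model (USM)," trained on more than 300 languages, with plans to eventually translate more than 1,000 languages
- 7 users
- gigazine.net
- Technology
- 2023/03/08
Machine learning has dramatically improved translation software, but some of the world's languages have few speakers and too little data for training. Google has now reported that it trained "Universal Speech Model (USM)," a large language model used to generate YouTube captions, on more than 300 languages, and that the model performed very well on translation tasks that include relatively minor languages. Universal Speech Model https://sites.research.google/usm/ Universal Speech Model (USM): State-of-the-art speech AI for 100+ languages – Google AI Blog https://ai.googleblog.com/2023/03/universal-sp
- artificial intelligence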
- techfeed
GitHub - fishaudio/fish-speech: Brand new TTS solution
- 7 users
- github.com/fishaudio
- Technology
- 2024/07/03
- github
Using Web Speech API Speech Recognition in Noodl! Also explains how to use the Noodl Javascript node - Qiita
- 7 users
- qiita.com/kisaichi07
- Technology
- 2020/01/02
As shown here, it seems you can add more than one. In Noodl 1.3 this kind of processing was written with if statements. It looks like 2.0 lets you write it more cleanly. change:function runs when any of the input values changes. Contents of this project's Javascript node: in the sample, tapping the ramen sends a true signal to mySignal, which triggers speech recognition. define({ // The input ports of the Javascript node, name of input and type inputs:{ // ExampleInput:'number', // Available types are 'number', 'string', 'boolean', 'color' and 'signal', mySignal:'signal'
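Investigating the Azure Cognitive Service Speech to Text API and comparing it with the Google Cloud Speech-to-Text API - OPTiM TECH BLOG
- 7 users
- tech-blog.optim.co.jp
- Technology
- 2020/03/04
Hello, I'm Yamaguchi, due to join the company as a new graduate in 2020, now that moving and the hassle around it are behind me. I investigated the Azure Cognitive Service Speech to Text API (hereafter AST) and report the results here. I also compared it with the Google Cloud Speech-to-Text API (hereafter GST), so I describe that as well. Contents: Introducing AST; Investigating AST's supported file formats and related details; Writing an audio file transcription program; Comparing AST and GST (1. Comparison of results, 2. Comparison of processing speed, 3. Comparison of pricing); Summary. Introducing AST: this time we follow the explanation at this link. Azure-side setup explanation link. Create an Azure account (a Microsoft account is required). Create a resource: this time I created the account as shown in the image below. The resource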
- cloud
- api
- microsoft
- python
- read later
English Text-to-speech software | Ondoku
- 6 users
- ondoku3.com
- Technology
- 2020/12/14
text-to-speech software When you enter text in the text box below, you will hear it in your favorite voice. You can not only listen to the read text on the spot but also download it as an audio file (.mp3).
Exploring the Web Speech API
- 6 users
- www.voorhoede.nl
- Technology
- 2020/07/05
Experimenting with voice on the web using the Web Speech Synthesis and Recognition API The Web Speech API is one of those web technologies that no one ever talks about or writes about. In this blog post, we are going to take a closer look at what the API is capable of, what its limitations and strengths are, and how web developers can utilize it to enhance the user’s browsing experience. “The Web
Musk gets Twitter for $44 billion, to cheers and fears of 'free speech' plan
- 6 users
- www.reuters.com
- Technology
- 2022/04/25
April 26, 2022, 3:45 AM UTC. NEW YORK, April 25 (Reuters) - Elon Musk clinched a deal to buy Twitter Inc (TWTR.N) for $44 billion cash on Monday in a transaction that will shift control of the social media platform populated by millions of users and global leaders to the world's richest person. It
- read later
AudioLDM: Text-to-Audio Generation with Latent Diffusion Models - Speech Research
- 6 users
- audioldm.github.io
- Technology
- 2023/08/14
AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining Haohe Liu 📮,1, Qiao Tian2,Yi Yuan1, Xubo Liu1, Xinhao Mei1,Qiuqiang Kong2 Yuping Wang2, Wenwu Wang1, Yuxuan Wang2, Mark D. Plumbley1 1CVSSP, University of Surrey, Guildford, UK 2Speech, Audio & Music Intelligence (SAMI), ByteDance 📮Corresponding author 😃 For text-to-audio generation, we generated a total of 350 audi
- machine learning
- artificial intelligence
Lyra: A New Very Low-Bitrate Codec for Speech Compression
- 6 users
- ai.googleblog.com
- Learning
- 2021/02/26
Protocols, Not Platforms: A Technological Approach to Free Speech
- 6 users
- knightcolumbia.org
- Technology
- 2019/12/13
Essays and Scholarship Protocols, Not Platforms: A Technological Approach to Free Speech Altering the internet's economic and digital infrastructure to promote free speech By Mike Masnick August 21, 2019 After a decade or so of the general sentiment being in favor of the internet and social media as a way to enable more speech and improve the marketplace of ideas, in the last few years the view ha
Modern CSS Tooltips And Speech Bubbles (Part 1) — Smashing Magazine
- 6 users
- www.smashingmagazine.com
- Technology
- 2024/03/02
Tooltips are a very common pattern used in CSS for years. There are a lot of ways to approach tooltips in CSS, though some evoke headaches with all the magic numbers they require. In this article, Temani Afif presents modern techniques to create tooltips with the smallest amount of markup and the greatest amount of flexibility. In a previous article, we explored ribbon shapes and different ways to
- css
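[English speech] Steve Jobs 2005 Stanford University commencement address | Steve Jobs speech | steve jobs | Japanese subtitles | English subtitles | Full speech
- 6 users
- www.youtube.com
- Entertainment
- 2021/07/09
This is Steve Jobs's legendary 2005 Stanford University commencement speech. More than 10 years have passed since I first heard it. At the time I thought Steve Jobs's life was like a movie, and after he died a movie about him really was made. His speech contains a great deal of wisdom about life. Unlike wisdom learned from books, it is wisdom you learn through the way he lived. His speech portrays his way of life. I don't know how you will feel, but what moved me 10 years ago and what moves me now were a little different. It is such a famous speech that many people have already uploaded it to YouTube, but I digitally edited the video to clean it up and added larger text and subtitles. I also shortened the English sentences to match the rhythm of the speech. Why not start 2020 with Steve Jobs's speech?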
Speechify: Text to Speech Reader & AI Voice Generator
- 6 users
- speechify.com
- Technology
- 2022/03/07
CUT YOUR READING TIME IN HALF. LET SPEECHIFY READ TO YOU.
- web service
The ‘Comfort Women’ Issue, Freedom of Speech, and Academic Integrity: A Study Aid - The Asia-Pacific Journal: Japan Focus
- 5 users
- apjjf.org
- Technology
- 2021/02/21
Abstract: In December 2020, an article by J. Mark Ramseyer of Harvard University about the so-called ‘comfort women’ issue was published in the International Review of Law and Economics. This article caused widespread controversy amongst scholars, many of whom responded with serious criticisms of its content. On the other hand, some commentators argued that Ramseyer’s critics were seeking to suppr
- security
VS Code Speech - Visual Studio Marketplace
- 5 users
- marketplace.visualstudio.com
- Technology
- 2023/11/02
Speech extension for Visual Studio Code. The Speech extension for Visual Studio Code adds speech-to-text capabilities to the chat interfaces in Visual Studio Code. No internet connection is required, the voice audio data is processed locally on your computer. Getting Started: Install the GitHub Copilot Chat extension a
- read later
Announcing the Preview of OpenAI Whisper in Azure OpenAI service and Azure AI Speech
- 5 users
- techcommunity.microsoft.com
- Technology
- 2023/09/17
In July we shared with this audience that OpenAI Whisper would be coming soon to Azure AI services, and today – we are very happy to announce – is the day! Customers of Azure OpenAI service and Azure AI Speech can now use Whisper. The OpenAI Whisper model is an encoder-decoder Transformer that can transcribe audio into text in 57 languages. Additionally, it offers translation services from those l
- artificial intelligence
- AI
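まうり塩🍊 FREEDOM OF SPEECH!!!!! on Twitter: "Jordan Peterson, professor of psychology at the University of Toronto. Pursuing the truth means taking the risk of offending people. In other words, if no one may ever be offended, the truth cannot be pursued. Hearing that, the feminist Cathy, unable to say anything back… https://t.co/UE0aSq8glU"
- 5 users
- twitter.com/anaiscalico
- Society
- 2020/09/07
Jordan Peterson, professor of psychology at the University of Toronto. Pursuing the truth means taking the risk of offending people. In other words, if no one may ever be offended, the truth cannot be pursued. Hearing that, the feminist Cathy, unable to say anything back… https://t.co/UE0aSq8glU
- *read later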
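[Movie] 英国王のスピーチ (subtitled version) (original title: The King's Speech) - どんこのブログ
- 5 users
- donko7bu.hatenablog.com
- Entertainment
- 2022/03/21
A 2010 film (released in Japan in 2011) from the United Kingdom, Australia, and the United States. www.gaga.co.jp 英国王のスピーチ (original title: The King's Speech). Cast: Colin Firth, Geoffrey Rush, Helena Bonham Carter, Guy Pearce, Timothy Spall, Derek Jacobi, Jennifer Ehle, Michael Gambon, Anthony Andrews, Roger Parrott, Eve Best, Freya Wilson, Ramona Marquez, Claire Bloom, Tim Downie, Andrew Havill, Adrian Scarborough, and others. Director: Tom Hooper. youtu.be I watched it on a whim, but it turned out to be really good.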
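Free audiobooks from Google made with text to speech｜Sangmin Ahn
- 5 users
- note.com/sangmin
- Technology
- 2020/12/23
Hello, this is Sangmin from Choimirai School. 0 Introduction: the text-to-speech (TTS) feature I have been covering for a while keeps evolving. There are more and more cases where freely accessible books are offered as audiobooks using (1) TTS and (2) WaveNet. The read-aloud quality also seems to be getting steadily closer to a human voice. In my personal opinion, Gulliver's Travels is at a level where you would not notice unless told. In this note, I introduce several books that can be downloaded from Google Play. 1 Fiction: The Legend of Sleepy Hollow, Dracula, Gulliver's Travels, The Strange Case of Dr Jekyll and Mr Hyde, Frankenstein, The War
- read later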
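Apple previews Live Speech, Personal Voice, and other new accessibility features
- 5 users
- www.apple.com
- Technology
- 2023/05/17
Cupertino, California. Apple today previewed software features for cognitive, vision, hearing, and mobility accessibility, along with innovative tools for people who are nonspeaking or at risk of losing their ability to speak. These updates draw on advances in hardware and software, use on-device machine learning to protect user privacy, and build on Apple's long-standing commitment to making products for everyone. Apple works closely with community groups representing users with a wide range of disabilities to develop accessibility features that make a real impact on people's lives. Starting later this year, users with cognitive disabilities will be able to use iPhone and iPad more easily and independently with Assistive Access. Nonspeaking people will be able to use Live Speech to type to speak during calls and conversations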
GitHub - espeak-ng/espeak-ng: eSpeak NG is an open source speech synthesizer that supports more than hundred languages and accents.
- 5 users
- github.com/espeak-ng
- Technology
- 2020/07/05
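田中康夫 Speech To Text Online on Twitter: "@loveyassy Here is the digest version of the once-in-a-century three-way talk among the three wise ones of "#しょぼいウイルスなのに全世界が大騒ぎ" (a feeble virus, yet the whole world is in an uproar): Hiroki Azuma (#現在ツイ垢失踪中, Twitter account currently missing), Lully Miura @lullymiura, and Yoshinori Kobayashi (Twitter account unknown)!… https://t.co/QXrXeOHBH2"
- 5 users
- twitter.com/MolotovAbe
- Learning
- 2020/04/10
@loveyassy Here is the digest version of the once-in-a-century three-way talk among the three wise ones of "#しょぼいウイルスなのに全世界が大騒ぎ" (a feeble virus, yet the whole world is in an uproar): Hiroki Azuma (#現在ツイ垢失踪中, Twitter account currently missing), Lully Miura @lullymiura, and Yoshinori Kobayashi (Twitter account unknown)!… https://t.co/QXrXeOHBH2
- fool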
GitHub - sdkcarlos/artyom.js: A voice control - voice commands - speech recognition and speech synthesis javascript library. Create your own siri,google now or cortana with Google Chrome within your website.
- 5 users
- github.com/sdkcarlos
- Technology
- 2020/04/11
GitHub - shi3z/speech-to-speech-japanese
- 5 users
- github.com/shi3z
- Technology
- 2024/08/21
- LLM
- japanese
- dev
- github
- ai
GitHub - m-bain/whisperX: WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
- 5 users
- github.com/m-bain
- Technology
- 2023/03/16
This repository provides fast automatic speech recognition (70x realtime with large-v2) with word-level timestamps and speaker diarization. ⚡️ Batched inference for 70x realtime transcription using whisper large-v2 🪶 faster-whisper backend, requires <8GB gpu memory for large-v2 with beam_size=5 🎯 Accurate word-level timestamps using wav2vec2 alignment 👯‍♂️ Multispeaker ASR using speaker diariza
GitHub - reriiasu/speech-to-text: Real-time transcription using faster-whisper
- 5 users
- github.com/reriiasu
- Technology
- 2023/08/03
Trying "Speech to Text Webcam Overlay," which enables real-time captions and translation via speech recognition, on Zoom - DENET Tech Blog
- 5 users
- blog.denet.co.jp
- Technology
- 2021/03/07