
Search results for "recognition": 1-40 of 61

Few results matched the tag search, so title search results are shown instead.

There are 61 entries about recognition. Related tags include machine learning, AI, and JavaScript. Popular entries include "Practical Use of Named Entity Recognition".
  • Practical Use of Named Entity Recognition

    Event: 自然言語処理勉強会 (Natural Language Processing Study Group) https://sansan.connpass.com/event/190157/
    Talk title: Practical Use of Named Entity Recognition
    Speaker: 高橋 寛治, DSOC R&D researcher
    Twitter: https://twitter.com/SansanRandD
  • GitHub - openai/whisper: Robust Speech Recognition via Large-Scale Weak Supervision

  • GitHub - kha-white/manga-ocr: Optical character recognition for Japanese text, with the main focus being Japanese manga

    Optical character recognition for Japanese text, with the main focus being Japanese manga. It uses a custom end-to-end model built with Transformers' Vision Encoder Decoder framework. Manga OCR can be used as a general purpose printed Japanese OCR, but its main goal was to provide a high quality text recognition, robust against various scenarios specific to manga: both vertical and horizontal text
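  • Text Recognition on iOS

    From iOS 13 onward, the long-awaited text recognition feature is available. It can read [1] the text in images such as those captured with the camera. How it differs from the "text detection" available since iOS 9: text recognition was added as a feature of the Vision framework. Core Image's CIDetector class, on the other hand, accepts the CIDetectorTypeText type and can detect text. CIDetectorTypeText and CIFeatureTypeText have existed since iOS 9, but they only detect the region that contains text and could not recognize what was written. Until now the only options were therefore Tesseract [2], an open-source OCR engine, and SwiftOCR [3], an OSS project (probably maintained by an individual), but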
  • GitHub - alphacep/vosk-api: Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

    Vosk is an offline open source speech recognition toolkit. It enables speech recognition for 20+ languages and dialects - English, Indian English, German, French, Spanish, Portuguese, Chinese, Russian, Turkish, Vietnamese, Italian, Dutch, Catalan, Arabic, Greek, Farsi, Filipino, Ukrainian, Kazakh, Swedish, Japanese, Esperanto, Hindi, Czech, Polish. More to come. Vosk models are small (50 Mb) but p
  • VOSK Offline Speech Recognition API

    Vosk is a speech recognition toolkit. The best things in Vosk are: Supports 20+ languages and dialects - English, Indian English, German, French, Spanish, Portuguese, Chinese, Russian, Turkish, Vietnamese, Italian, Dutch, Catalan, Arabic, Greek, Farsi, Filipino, Ukrainian, Kazakh, Swedish, Japanese, Esperanto, Hindi, Czech, Polish, Uzbek, Korean, Breton, Gujarati. More to come. Works offlin
  • PimEyes: Face Recognition Search Engine and Reverse Image Search

    Face Search Engine. Reverse Image Search. Upload photo and find out where images are published
  • GitHub - xuebinqin/U-2-Net: The code for our newly accepted paper in Pattern Recognition 2020: "U^2-Net: Going Deeper with Nested U-Structure for Salient Object Detection."

    (2022-Aug.-24) We are glad to announce that our U2-Net published in Pattern Recognition has been awarded the 2020 Pattern Recognition BEST PAPER AWARD !!!
    (2022-Aug.-17) Our U2-Net models are now available on PlayTorch, where you can build your own demo and run it on your Android/iOS phone. Try out this demo on and bring your ideas about U2-Net to truth in minutes!
    (2022-Jul.-5) O
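  • GitHub - PaddlePaddle/PaddleOCR: Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server,

    🔥 The PaddleOCR Algorithm Model Challenge is now open! Registration runs 1/15-3/31, with a prize pool of 300,000 yuan. Come and show your skills 😎!
    🔨 2023.11: Released PP-ChatOCRv2, a single SDK covering 20+ high-frequency application scenarios. It supports five text-image intelligent analysis capabilities and their deployment: key information extraction in general scenarios (courier waybills, business licenses, motor vehicle licenses, etc.), key information extraction in complex document scenarios (handling rare characters, special punctuation, multi-page PDFs, tables, and similar difficulties), general OCR, document-specific OCR, and general table recognition. For vertical business scenarios it also supports model training, fine-tuning, and prompt optimization.
    🔥 2023.8.7: Released PaddleOCR release/2.7. Released PP-OCRv4, which provides mobile and server models. PP-OCRv4-mobile: at comparable speed, accuracy in Chinese scenarios improves by a further 4.5% over PP-OCRv3 and by 10% in English scenarios, and the average recognition accuracy of the 80-language multilingual model improves by more than 8%. PP-OCRv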
  • Google contractors reportedly targeted homeless people for Pixel 4 facial recognition

    They need facial scans of people with darker skin. By Sean Hollister, a senior editor and founding member of The Verge who covers gadgets, games, and toys. He spent 15 years editing the likes of CNET, Gizmodo, and En
  • How Disney uses PyTorch for animated character recognition

    Authors: Miquel Àngel Farré, Anthony Accardo, Marc Junyent, Monica Alfaro, Cesc Guitart at Disney. Disney’s Content Genome: The long and incremental evolution of the media industry, from a traditional broadcast and home video model, to a more mixed model with increasingly digitally-accessible content, has accelerated the use of machine learning and artificial intelligence (AI). Advancing the implemen
  • DeNA, MoT AI study group presentation slides: "Face recognition and recent ArcFace-related work" / Face Recognition & ArcFace papers

    Slides presented at the DeNA and Mobility Technologies AI study group. Topics: what the face recognition field looks like and, in particular, how recent ArcFace-related methods have been developing. Papers covered: AdaptiveFace (CVPR’19) AdaCos (CVPR’19) (MV-ArcFace (AAAI’20)) CurricularFace (CVPR’20) GroupFace (CVPR’20) Sub-center ArcFace (ECCV’20) MagFace (CVPR’21) ElasticFace (CVPRW’22) AdaFace (CVPR’22)
  • Quickly building an application that reads text from images with Amplify+Angular+Recognition | DevelopersIO

    Hi! This is 西村祐二 from the Osaka office. This time I want to build an application that reads text from images using Angular, Amplify, and Recognition. The finished application is shown below. When you upload an image that contains text, it calls the Recognition API on the backend, extracts the text from the image, and displays it. Building it. Environment: aws-amplify: 2.2.2 amplify cli: 4.12.0 Angular CLI: 8.3.23 Node: 12.13.0 OS: darwin x64 Angular: 8.2.14 ... animations, common, compiler,
  • GitHub - facebookresearch/detectron2: Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.

  • GitHub - pfliu-nlp/Named-Entity-Recognition-NER-Papers: An elaborate and exhaustive paper list for Named Entity Recognition (NER)

  • Speech Recognition For Mac - rulesland

    This video shows how to set up Dictation and how to use it with ease on your Mac for the purposes of speech recognition. I updated this post about speech to text software and dictation on January 2, 2018. When I was employed as a journalist, I spent a lot of time interviewing people. One of the most painful things I had to
  • GitHub - julius-speech/julius: Open-Source Large Vocabulary Continuous Speech Recognition Engine

  • Using Web Speech API Speech Recognition in Noodl! Also explains how to use the Noodl Javascript node - Qiita

    As shown here, it seems you can add more than one. In Noodl 1.3 this kind of processing was written with if statements; 2.0 looks like it lets you write it more cleanly. change:function runs when any of the input values changes. Contents of the Javascript node in this project: in the sample, tapping the ramen sends a true signal to mySignal, which triggers speech recognition. define({ // The input ports of the Javascript node, name of input and type inputs:{ // ExampleInput:'number', // Available types are 'number', 'string', 'boolean', 'color' and 'signal', mySignal:'signal'
  • Suppression of RNA recognition by Toll-like receptors: the impact of nucleoside modification and the evolutionary origin of RNA - PubMed

  • Face Recognition @ ECCV2022

    Slides presented at the DeNA and Mobility Technologies AI study group. A catch-up on the latest papers in the face recognition field from ECCV 2022. Papers covered: Teaching Where to Look: Attention Similarity Knowledge Distillation for Low Resolution Face Recognition; CoupleFace: Relation Matters for Face Recognition Distillation; BoundaryFace: A mining framework with noise label self-correction for Face Recognition; Towards Robust Face Recognition with Comprehensive
  • A Visual History of Interpretation for Image Recognition

    Image recognition (i.e. classifying what object is shown in an image) is a core task in computer vision, as it enables various downstream applications (automatically tagging photos, assisting visually impaired people, etc.), and has become a standard task on which to benchmark machine learning (ML) algorithms. Deep learning (DL) algorithms have, over the past decade, emerged as the most competitiv
  • Pokemon recognition from screenshots in browser using web worker

    I started out building a universal Worker, but the punchline is that worker_threads turned out to be unnecessary in the end and I moved entirely to Web Worker only. The internal presentation title was "I want to analyze Pokemon images in the browser!" These are the slides from a lightning-talk session held inside LINE on 2020/05/11 to present Golden Week independent-research projects. Some internal details have been removed and a few slides added for an external audience. Five minutes was barely enough to explain the product, so the technical discussion of SSR/SPA will have to wait for another occasion.
  • GitHub - sdkcarlos/artyom.js: A voice control - voice commands - speech recognition and speech synthesis javascript library. Create your own siri,google now or cortana with Google Chrome within your website.

    Due to abuse of users with the Speech Synthesis API (ADS, Fake system warnings), Google decided to remove the usage of the API in the browser when it's not triggered by a user gesture (click, touch etc.). This means that calling for example artyom.say("Hello") if it's not wrapped inside a user event won't work. So on every page load, the user will need to click at least once per page to all
  • High-Performance Large-Scale Image Recognition Without Normalization

    Batch normalization is a key component of most image classification models, but it has many undesirable properties stemming from its dependence on the batch size and interactions between examples. Although recent work has succeeded in training deep ResNets without normalization layers, these models do not match the test accuracies of the best batch-normalized networks, and are often unstable for l
  • GitHub - NVIDIA/NeMo: A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

    Large Language Models and Multimodal. Accelerate your generative AI journey with NVIDIA NeMo Framework on GKE (2024/03/16): An end-to-end walkthrough to train generative AI models on the Google Kubernetes Engine (GKE) using the NVIDIA NeMo Framework is available at https://github.com/GoogleCloudPlatform/nvidia-nemo-on-gke. The walkthrough includes detailed instructions on how to set up a Google Clou
  • Clearview AI | Facial Recognition

    Clearview AI’s investigative platform allows law enforcement to rapidly generate leads to help identify suspects, witnesses and victims to close cases faster and keep communities safe.
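  • Features usable for record linkage (entity recognition, deduplication) - Qiita

    When you perform record linkage (Entity Recognition / Deduplication) on records or objects with supervised learning, unsupervised learning, or a search engine, you need to extract features from each field. Surprisingly few references cover this in one place, so I have listed the features that are commonly used, especially for string fields. Some of them are also used for database blocking. Types of features (the classification follows my own criteria): Token; named entities; phonemes; distributed representations / dimensionality reduction; search scores; distances and pseudo-distances (for record pairs). Overview of each feature. 1. Token: extract smaller constituent units from a string. This yields a high-dimensional sparse matrix, however, so using it for machine learning or clustering requires either a large amount of data relative to the dimensionality or some ingenuity. character ngram: as you know,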
  • GitHub - open-mmlab/mmocr: OpenMMLab Text Detection, Recognition and Understanding Toolbox

  • An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

    While the Transformer architecture has become the de-facto standard for natural language processing tasks, its applications to computer vision remain limited. In vision, attention is either applied in conjunction with convolutional networks, or used to replace certain components of convolutional networks while keeping their overall structure in place. We show that this reliance on CNNs is not nece
  • GitHub - m-bain/whisperX: WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

    This repository provides fast automatic speech recognition (70x realtime with large-v2) with word-level timestamps and speaker diarization.
    ⚡️ Batched inference for 70x realtime transcription using whisper large-v2
    🪶 faster-whisper backend, requires <8GB gpu memory for large-v2 with beam_size=5
    🎯 Accurate word-level timestamps using wav2vec2 alignment
    👯‍♂️ Multispeaker ASR using speaker diariza
  • The sound recognition feature (Sound Recognition) added in iOS 14 is reportedly a little scary: "I'll never turn it on," "Way too creepy" | ガジェット通信 GetNews

    iOS 14 comes with support for Sound Recognition in Accessibility. Your phone can now listen for specific sounds – a baby crying, smoke alarm, water running, etc. – and notify you. Amazing feature for all kinds of users – inclusivity at its best. #WWDC2020 pic.twitter.com/3hIL8JuTyB (Federico Viticci, @viticci, June 23, 2020)
  • Facial recognition identifies extremists storming the Capitol

    Correction: An earlier version of this story incorrectly stated that XRVision facial recognition software identified Antifa members among rioters who stormed the Capitol Wednesday. XRVision did not identify any Antifa members. The Washington Times apologizes to XRVision for the error. Facial recognition software has identified neo-Nazis and other extremists as participants in Wednesday’s assault o
  • Simple Transformers — Named Entity Recognition with Transformer Models

    Simple Transformers is the “it just works” Transformer library. Use Transformer models for Named Entity Recognition with just 3 lines of code. Yes, really.
  • Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition

    Event: 第六回 全日本コンピュータビジョン勉強会 (6th All-Japan Computer Vision Study Group) https://kantocv.connpass.com/event/205271/
    Talk title: Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition
    Speaker: 内田 奏, DSOC R&D researcher
    Twitter: https://twitter.com/SansanRandD
  • Recognition of aerosol transmission of infectious agents: a commentary - BMC Infectious Diseases

    Review, Open Access. Published: 31 January 2019. Raymond Tellier, Yuguo Li, Benjamin J. Cowling & …Julian W. Tang. BMC Infectious Diseases volume 19, Article number: 101 (2019). Although short-range large-droplet transmission is possible for most respiratory infectious agents, deciding on whether
  • GitHub - exadel-inc/CompreFace: Leading free and open-source face recognition system

  • GitHub - DigitalNatureGroup/Remote_Voice_Recognition: Use cases for speech recognition in remote meetings

  • Wav2vec: Semi-supervised and Unsupervised Speech Recognition

    Word2vec for audio quantizes phonemes, transforms, GAN trains on text and audio from Facebook AI. Wav2vec is fascinating in that it combines several neural network architectures and methods: CNN, transformer, quantization, and GAN tra
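  • Trying out easy face recognition in Python (face-recognition)

    Nice to meet you! I'm Uema, an engineer. In recent years, face recognition technology has been used for many purposes, such as unlocking smartphones and authenticating people entering buildings. Using face recognition requires knowledge of machine learning, image processing, mathematics, and more, so the learning cost is high, and people who would like to build an application with face recognition often find it hard to get started. Good news for them: there is a Python library called face-recognition that makes face recognition easy! In this post, as an entry point to face recognition, I will actually try out face-recognition. What is face-recognition? It is a library for easily detecting and recognizing faces from Python code or the command line. The face-recognition model reportedly achieves 99% accuracy. Installation (mac): Python and homebrew in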
  • GitHub - ccoreilly/vosk-browser: A speech recognition library running in the browser thanks to a WebAssembly build of Vosk


New articles