
Search results for "recognition": 201 - 240 of 1158

  • Optical character recognition

    Optical character recognition or optical character reader (OCR) is the electronic or mechanical conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene photo (for example the text on signs and billbo

    • GitHub - worldveil/dejavu: Audio fingerprinting and recognition in Python

      • PhotoSketch: Photoshop + Image Recognition = Awesome

        This technology is just mind-boggling. PhotoSketch may be the coolest program we've seen or written about since the invisible speakers. PhotoSketch is an "Internet Image Montage" project from five Chinese Computer Science and Technology students at Tsinghua University and the National University of Singapore. The basic premise, which they present in the form of a research paper [pdf], works like t

        • [PDF] Pattern Recognition and Machine Learning

          • ImageNet Large Scale Visual Recognition Challenge

            The ImageNet Large Scale Visual Recognition Challenge is a benchmark in object category classification and detection on hundreds of object categories and millions of images. The challenge has been run annually from 2010 to present, attracting participation from more than fifty institutions. This paper describes the creation of this benchmark dataset and the advances in object recognition that have

            • GitHub - guillaume-chevalier/LSTM-Human-Activity-Recognition: Human Activity Recognition example using TensorFlow on smartphone sensors dataset and an LSTM RNN. Classifying the type of movement amongst six activity categories - Guillaume Chevalier

              • Object recognition for free

                Caption: The first layers (1 and 2) of a neural network trained to classify scenes seem to be tuned to geometric patterns of increasing complexity, but the higher layers (3 and 4) appear to be picking out particular classes of objects. *Terms of Use: Images for download on the MIT News office website are made available to non-commercial entities, press and the general public under a Creative Commo

                • Speech Recognition Grammar Specification Version 1.0

                  W3C Recommendation 16 March 2004 This version: http://www.w3.org/TR/2004/REC-speech-grammar-20040316/ Latest version: http://www.w3.org/TR/speech-grammar/ Previous version: http://www.w3.org/TR/2003/PR-speech-grammar-20031218/ Editors: Andrew Hunt, ScanSoft Scott McGlashan, Hewlett-Packard Contributors: See Acknowledgements Please refer to the errata for this document, which may include some norma

                  • Advancing state-of-the-art image recognition with deep learning on hashtags

                    Image recognition is one of the pillars of AI research and an area of focus for Facebook. Our researchers and engineers aim to push the boundaries of computer vision and then apply that work to benefit people in the real world, for example, using AI to generate audio captions of photos for visually impaired users. In orde

                    • International Association for Pattern Recognition – An association of non-profit, scientific, and professional organizations concerned with pattern recognition, computer vision, and image processing in a broad sense.

                      The International Association for Pattern Recognition is an association of non-profit, scientific, and professional organizations concerned with pattern recognition, computer vision, and image processing in a broad sense.

                      • Hananona - Flower Recognition Service - STAIR Lab.

                        Select a photo by touching the camera icon. You can take a photo if your device is a smartphone. Check the preview, and touch “send” if it looks OK. Wait a moment, and the AI will tell you the name of the flower (sometimes with multiple candidates).

                        • Augur · Recognition

                          Augur is a suite of APIs that powers device recognition for businesses of all sizes.

                          • GitHub - fljot/Gestouch: Gestouch: multitouch gesture recognition library for Flash (ActionScript) development.

                            • High-Performance Large-Scale Image Recognition Without Normalization

                              Batch normalization is a key component of most image classification models, but it has many undesirable properties stemming from its dependence on the batch size and interactions between examples. Although recent work has succeeded in training deep ResNets without normalization layers, these models do not match the test accuracies of the best batch-normalized networks, and are often unstable for l

                              • Free Automated Number Plate Recognition Software | ANPR news

                                PIXELCASE | Automatic number plate recognition software for cars, drones, phones and CCTV | USA | Australia | New Zealand | UK Automatic Licence plate recognition software for Police, rangers, officers and security | Vehicle ANPR | Drone ALPR | Mobile ANPR

                                • Hackszine.com: Gesture recognition for Javascript and Flash

                                  The "$1 Recognizer" is a simple gesture recognition algorithm created by Andy Wilson from Microsoft Research and Jacob Wobbrock and Yang Li from the University of Washington. By simple, I mean that it's under 100 lines of code that you can quickly add to your application to give it gestur

                                  • GitHub - belltailjp/selective_search_py: Python-based implementation of the Selective Search for Object Recognition.

                                    • GitHub - TalAter/annyang: :speech_balloon: Speech recognition for your site

                                      • The PASCAL Object Recognition Database Collection

                                        News 04-Apr-07: The VOC2007 challenge development kit is now available. Objectives To compile a standardised collection of object recognition databases To provide standardised ground truth object annotations across all databases To provide a common set of tools for accessing and managing the database annotations Standardising Each Database All images available in PNG format Annotated wi

                                        • Pinterest Acquires Image Recognition And Visual Search Startup VisualGraph | TechCrunch

                                          Pinterest has just acquired two-man startup VisualGraph, which creates machine vision, image recognition, and visual search technologies. The company’s founder Kevin Jing and his partner David Liu are joining the Pinterest engineering team today. Pinterest says “the acquisition of VisualGraph will help us build tec

                                          • Book introduction: Visual Object Recognition - n_hidekey's diary

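                                            Visual Object Recognition (Synthesis Lectures on Artificial Intelligence and Machine Learning). Authors: Kristen Grauman, Bastian Leibe. Publisher: Morgan & Claypool Publishers. Release date: 2011/02/28. Format: paperback. I would like to introduce a book I read recently. As the name suggests, it is a tutorial book on image recognition, published on February 28, 2011. Because the two authors are Kristen Grauman and Bastian Leibe, I bought it on impulse, and the content was as good as I had hoped. The first and second halves split broadly into specific object recognition and generic object recognition, with each ...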

                                            • Random Forest for kazoo04 recognition - Stimulator

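                                              - Greeting - Hello, everyone. I'm @vaaaaanquish, also known as Bankushi, and I'm in charge of day 7 of the Kazoo04 Advent Calendar. Nice to meet you. This is sudden, but: "Everyone, do you like Kazoo?" ...Right. So-so. This time I'd like to write an article dedicated to Kazoo04. Nothing technically difficult or specialized comes up, so feel free to read on. - Background - "I want to meet Kazoo04." Over the past few years, phrases like this have come to roam the internet. Even in light of human history and the history of the internet, this is probably the first era in which Kazoo04 has been so longed for. This can be called proof that the words and clusters created by Kazoo04 have had a tremendous influence on the world. However, Kazoo04 is a single individual, a mere human, not a god ...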

                                              • GitHub - NVIDIA/NeMo: A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

                                                Large Language Models and Multimodal Accelerate your generative AI journey with NVIDIA NeMo Framework on GKE (2024/03/16) An end-to-end walkthrough to train generative AI models on the Google Kubernetes Engine (GKE) using the NVIDIA NeMo Framework is available at https://github.com/GoogleCloudPlatform/nvidia-nemo-on-gke. The walkthrough includes detailed instructions on how to set up a Google Clou

                                                • Paper reading group material: "FaceNet: A Unified Embedding for Face Recognition and Clustering"

                                                  • libface - Face Recognition Library

                                                    Feb 2011: 0.1 RELEASE!!!! Download it here. Work on 0.2 has already begun and will bring improvements to the detection and recognition. The current trunk version is already better at detection. Dec 2010: We have implemented most of the features now. A lot of bugs have been ironed out. Hopefully version 0.1 will be out there very soon. In Apr 2010, Google Summer Of Code has announced that Aditya Bh

                                                    • Clearview AI | Facial Recognition

                                                      Clearview AI’s investigative platform allows law enforcement to rapidly generate leads to help identify suspects, witnesses and victims to close cases faster and keep communities safe. Learn More >

                                                      • Controlling the browser by voice (Speech Recognition API) - Qiita

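                                                        Voice commands, as in Apple's Siri or Google's OK Google, can now be used to operate a variety of functions. I had long wondered whether something similar could be done in the browser. So I used HTML5's speech recognition API, the Speech Recognition API, to operate browser elements by voice. At present the API is supported only in Chrome and Chrome for Android (caniuse.com - Speech Recognition API). Basic speech recognition operations: the W3C document, Web Speech API Specification - W3C, includes the following sample. <textarea id="textarea" rows=10 cols=80></textarea> <button id="button" onclick=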

                                                        • Chinese police are using facial recognition sunglasses to track citizens

                                                          China’s police have a new weapon in their surveillance arsenal: sunglasses with built-in facial recognition. According to reports from local media, the glasses are being tested at train stations in the “emerging megacity” of Zhengzhou, where they’ll be used to scan travelers during the upcoming Lunar New Year migration. This is a period of extremely busy holiday travel, often described as the larg

                                                          • Features useful for name matching (entity recognition, deduplication) - Qiita

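                                                            When performing name matching (entity recognition / deduplication) on records or objects with supervised learning, unsupervised learning, or a search engine, you need to extract features from each field. Surprisingly few references cover these in one place, so I have listed the features commonly used for string fields in particular. Some of them are also used for database blocking. Types of features (the classification follows my own criteria): token; named entity; phoneme; distributed representation / dimensionality reduction; search score; distance / pseudo-distance (for record pairs). Overview of each feature: 1. Token: extract smaller constituent units from the string. However, because this produces a high-dimensional sparse matrix, using it for machine learning or clustering requires either a large amount of data relative to the dimensionality or some ingenuity. character ngram: as you know ...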

                                                            • Stanford University CS231n: Convolutional Neural Networks for Visual Recognition

                                                              Schedule and Syllabus Unless otherwise specified the course lectures and meeting times are: Monday, Wednesday 2:15-3:30 Bishop Auditorium in Lathrop Building ( map )

                                                              • Pattern Recognition and Machine Learning: Data Sets

                                                                • Course in Speech Recognition

                                                                  Course in Speech and Speaker Recognition Spring Semester 2007 Purpose The purpose of this 5 p doctoral course is to give students with basic knowledge of speech technology a deeper understanding of techniques for speech and speaker recognition. Contents The course consists of lectures, practical assignments, exercises and the writing of a term paper on an individually selected topic. The following

                                                                  • Let's build an Android app that searches for products from a photographed image with Visual Recognition + Kotlin – IBM Developer

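                                                                    2.1 Overview of Kotlin: Kotlin is a programming language developed by JetBrains that runs on the Java Virtual Machine. Because it is highly compatible with Java and highly readable, it has come to be used for Android app development and has been growing in popularity in recent years. The sample code in this tutorial also packs in plenty of coding know-how so that readers at any level who are interested in Kotlin programming or Android app programming can get as much out of it as possible. We cannot explain the Kotlin language features and basic Android development concepts used in the code from scratch, so some parts may feel somewhat difficult, but we think this tutorial will be especially worthwhile for readers who feel that way. For the parts that feel difficult, other books and ...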

                                                                    • GitHub - open-mmlab/mmocr: OpenMMLab Text Detection, Recognition and Understanding Toolbox

                                                                      • GitHub - mobimeo/node-yolo: Node bindings for YOLO/Darknet image recognition library

                                                                        • Object and Concept Recognition for Content-Based Image Retrieval

                                                                          Department of Computer Science and Engineering, University of Washington. This research is funded by the National Science Foundation under grant no. IIS-0097329. Project Summary: With the advent of powerful but inexpensive computers and storage devices and with the availability of the World Wide Web, image databases have moved from research

                                                                          • OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks

                                                                            We present an integrated framework for using Convolutional Networks for classification, localization and detection. We show how a multiscale and sliding window approach can be efficiently implemented within a ConvNet. We also introduce a novel deep learning approach to localization by learning to predict object boundaries. Bounding boxes are then accumulated rather than suppressed in order to incr

                                                                            • Betaface | Advanced face recognition

                                                                              We offer ready components, such as face recognition SDKs, as well as custom software development services and hosted web services with a focus on image and video analysis, faces and objects recognition. Our technology is used by video and images archives, web advertising and entertainment projects, media content producers, video surveillance and security software solutions, end user and b2b softwa

                                                                              • Introducing the approaches of the top finishers in the Kaggle TensorFlow Speech Recognition Challenge (part 2) - Qiita

                                                                                INTRODUCTION: Continuing from the previous article, this post introduces the approaches of the top finishers in Kaggle's TensorFlow Speech Recognition Challenge. It is the sequel to that article, so please read that one first. This time: 1. Network architecture 2. Optimizer 3. Resampling 4. Normalization / standardization 5. Data augmentation 6. Handling the silence class 7. Handling unseen unknown words 8. Techniques for lighter, faster models 9. The LB data ...

                                                                                • Microsoft researchers achieve new conversational speech recognition milestone - Microsoft Research

                                                                                  Last year, Microsoft’s speech and dialog research group announced a milestone in reaching human parity on the Switchboard conversational speech recognition task, meaning we had created technology that recognized words in a conversation as well as professional human transcribers. After our transcription system reached the 5.9 percent word error rate that we had
