LaMDA: our breakthrough conversation technology
"Induction heads" are attention heads that implement a simple algorithm to complete token sequences like [A][B] ... [A] -> [B]. In this work, we present preliminary and indirect evidence for a hypothesis that induction heads might constitute the mechanism for the majority of all "in-context learning" in large transformer models (i.e. decreasing loss at increasing token indices). We find that induc
What is the computational model behind a Transformer? Where recurrent neural networks have direct parallels in finite state machines, allowing clear discussion and thought around architecture variants or trained models, Transformers have no such familiar parallel. In this paper we aim to change that, proposing a computational model for the transformer-encoder in the form of a programming language.
Thinking Like Transformers
Paper by Gail Weiss, Yoav Goldberg, and Eran Yahav. Blog by Sasha Rush and Gail Weiss. Library and interactive notebook: srush/raspy.

Transformer models are foundational to AI systems. There are now countless explanations of "how transformers work" in the sense of the architecture diagram at the heart of transformers. However, this diagram does not provide any intuition into the computational model of the framework.
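The computational model the paper proposes is RASP. For intuition, the sketch below re-implements its two core primitives, select (build an attention pattern from a pairwise predicate) and aggregate (average values over each row's selected positions), in plain Python; this is an illustration, not the raspy API:

```python
# select: boolean "attention matrix" from a predicate over (key, query) pairs.
def select(keys, queries, predicate):
    return [[predicate(k, q) for k in keys] for q in queries]

# aggregate: average the values at each row's selected positions.
def aggregate(selector, values):
    out = []
    for row in selector:
        picked = [v for v, sel in zip(values, row) if sel]
        out.append(sum(picked) / len(picked) if picked else 0)
    return out

# Example: "shift right by one", a building block RASP programs use to
# read the previous position's value.
indices = [0, 1, 2, 3]
values = [10.0, 20.0, 30.0, 40.0]
prev = select(indices, indices, lambda k, q: k == q - 1)
print(aggregate(prev, values))  # [0, 10.0, 20.0, 30.0]
```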
VQA is a new dataset containing open-ended questions about images. These questions require an understanding of vision, language and commonsense knowledge to answer.

- 265,016 images (COCO and abstract scenes)
- At least 3 questions (5.4 questions on average) per image
- 10 ground truth answers per question
- 3 plausible (but likely incorrect) answers per question
- Automatic evaluation metric (sketched below)
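The automatic metric is the consensus-based VQA accuracy: an answer counts as fully correct when at least 3 of the 10 human annotators gave it. A simplified Python sketch (omitting the official averaging over all 10-choose-9 annotator subsets):

```python
# Simplified consensus-based VQA accuracy: min(#matching humans / 3, 1).
def vqa_accuracy(predicted, human_answers):
    matches = sum(1 for a in human_answers if a == predicted)
    return min(matches / 3.0, 1.0)

humans = ["red", "red", "red", "dark red", "red", "maroon",
          "red", "red", "crimson", "red"]
print(vqa_accuracy("red", humans))       # 1.0  (at least 3 humans agree)
print(vqa_accuracy("dark red", humans))  # 0.33 (only 1 human gave it)
```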
Despite rapid progress in the recent past, current speech recognition systems still require labeled training data, which limits this technology to a small fraction of the languages spoken around the globe. This paper describes wav2vec-U, short for wav2vec Unsupervised, a method to train speech recognition models without any labeled data. We leverage self-supervised speech representations to segment unlabeled audio and learn a mapping from these representations to phonemes via adversarial training.
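A heavily simplified sketch of that adversarial setup (illustrative sizes and modules, not fairseq's implementation): a generator maps segment representations to phoneme distributions, and a discriminator tries to tell those outputs apart from phoneme sequences drawn from unpaired text.

```python
import torch
import torch.nn as nn

N_PHONEMES, DIM = 40, 512  # assumed sizes, for illustration only

generator = nn.Linear(DIM, N_PHONEMES)        # speech segments -> phoneme logits
discriminator = nn.Sequential(                # phoneme distributions -> real/fake score
    nn.Linear(N_PHONEMES, 128), nn.ReLU(), nn.Linear(128, 1))

segments = torch.randn(8, DIM)                     # stand-in for segment representations
fake = torch.softmax(generator(segments), dim=-1)  # predicted phoneme distributions
real = torch.eye(N_PHONEMES)[torch.randint(0, N_PHONEMES, (8,))]  # one-hot phonemes from text

bce = nn.BCEWithLogitsLoss()
d_loss = bce(discriminator(real), torch.ones(8, 1)) + \
         bce(discriminator(fake.detach()), torch.zeros(8, 1))
g_loss = bce(discriminator(fake), torch.ones(8, 1))  # generator tries to fool the discriminator
```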
We show for the first time that learning powerful representations from speech audio alone followed by fine-tuning on transcribed speech can outperform the best semi-supervised methods while being conceptually simpler. wav2vec 2.0 masks the speech input in the latent space and solves a contrastive task defined over a quantization of the latent representations which are jointly learned. Experiments using all labeled data of Librispeech achieve 1.8/3.3 WER on the clean/other test sets.
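A minimal sketch of that contrastive task (illustrative shapes; the actual model samples a subset of distractors rather than scoring every timestep): for each masked position, the context representation must pick out its own quantized latent among the others.

```python
import torch
import torch.nn.functional as F

T, D, KAPPA = 50, 256, 0.1           # timesteps, dim, temperature (assumed)
context = torch.randn(T, D)          # Transformer outputs at masked steps
quantized = torch.randn(T, D)        # quantized latent targets

# Cosine similarity between every context/target pair, scaled by temperature.
c = F.normalize(context, dim=-1)
q = F.normalize(quantized, dim=-1)
sim = (c @ q.T) / KAPPA

# Row t must score its own target q_t above all other (distractor) targets.
loss = F.cross_entropy(sim, torch.arange(T))
```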