[DL Reading Group] A Survey of Transformers Beyond Language (ViT, Perceiver, Frozen Pretrained Transformer, etc.)
This document summarizes recent research on applying the Transformer's self-attention mechanism to domains other than language, such as computer vision. It discusses models that use self-attention for images, including ViT, DeiT, and T2T, which split an image into patches and apply a Transformer to them. It also covers more general