3 key points
✔️ Dramatically reduces the computational cost of attention from O(n^2) to O(n log n)
✔️ Greatly reduces memory usage for activations and other intermediates
✔️ Substantially improves both speed and memory efficiency while preserving Transformer-level performance

Reformer: The Efficient Transformer
written by Anonymous
(Submitted on 13 Jan 2020 (v1), last revised 18 Feb 2020 (this version, v2))
Comments: Accepted at ICLR 2020
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Machine Learning (stat.ML)

大規
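The O(n log n) attention in the first point comes from locality-sensitive hashing (LSH): queries and keys are hashed into buckets so that each position only attends within its bucket, rather than over all n positions. The following is a minimal toy sketch of that idea, not the paper's implementation; it assumes shared query/key vectors (as Reformer does) and uses simple random-hyperplane hashing with a single hash round, omitting the paper's sorting, chunking, and multi-round hashing.

```python
import numpy as np

def lsh_attention(x, v, n_bits=3, seed=0):
    """Toy LSH attention sketch: x serves as both queries and keys
    (shared-QK), v holds the values. Positions are hashed into
    2**n_bits buckets via random hyperplanes, and full softmax
    attention is computed only within each bucket, so the cost is
    roughly O(n * bucket_size) instead of O(n^2)."""
    rng = np.random.default_rng(seed)
    d = x.shape[-1]
    # Random hyperplanes define an angular LSH: nearby vectors tend
    # to fall on the same side of each plane, hence the same bucket.
    planes = rng.normal(size=(d, n_bits))
    codes = (x @ planes > 0) @ (2 ** np.arange(n_bits))
    out = np.zeros_like(v)
    for b in np.unique(codes):
        idx = np.where(codes == b)[0]
        # Standard softmax attention, restricted to this bucket.
        scores = x[idx] @ x[idx].T / np.sqrt(d)
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)
        out[idx] = w @ v[idx]
    return out
```

With roughly balanced buckets of size n / 2**n_bits, the per-bucket attention matrices stay small, which is the source of the sub-quadratic cost; the actual Reformer additionally sorts by bucket and chunks for hardware efficiency.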