Before the Transformer, sequence models were built on RNNs (recurrent structures), which process tokens one step at a time. The Transformer instead uses self-attention, letting every token attend to every other token directly.
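A minimal sketch of what self-attention computes (the function and weight names here are illustrative, not from the post): instead of an RNN's sequential recurrence, one matrix operation scores every token pair, and each token's output is a softmax-weighted sum of all value vectors.

```python
import numpy as np

def self_attention(x, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence x of shape (seq_len, d_model)."""
    q = x @ Wq  # queries
    k = x @ Wk  # keys
    v = x @ Wv  # values
    d_k = q.shape[-1]
    # Pairwise similarity: every token attends to every token, no recurrence needed
    scores = q @ k.T / np.sqrt(d_k)
    # Softmax over the key dimension (numerically stabilized)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output row is a weighted average of all value vectors
    return weights @ v
```

Because the whole sequence is handled in a few matrix multiplies rather than a step-by-step loop, this is also far more parallelizable than an RNN.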

Source: 2023-02-10 20:37:43 [Blog]