Today's large vision models, such as SAM, SAM2, and CLIP, are all based on the Vision Transformer (ViT).

Source: 丁丁在美洲, 2024-09-01 15:26:45
