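R1 is a model that excels at reasoning; it leads O1, but not by a crushing margin. What stands out most is that it uses RL in place of human labor for fine-tuning, proving once again that AI beats human effort.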

Source: uptrend, 2025-01-27 15:03:26
