苦，是进化过程中形成的防护算法。甜，是进化过程中形成的诱惑算法。

来源: 宇之道于 2023-05-09 11:06:23 [博客] [旧帖] [给我悄悄话] 本文已被阅读：次

In Reinforcement Learning (RL), agents are trained on a reward and punishment mechanism.

WENXUECITY.COM does not represent or guarantee the truthfulness, accuracy, or reliability of any of communications posted by other users.