Strange, I asked the same question and got the opposite answer.

Q: Is DeepSeek's R1 model distilled from ChatGPT?

A:

No, DeepSeek’s R1 model is not a distillation of ChatGPT. It is an independently developed model with its own architecture and training approach. While both are large language models, DeepSeek’s R1 and ChatGPT are distinct creations by different organizations.

Distillation typically means training a smaller model to mimic the outputs of a larger one. DeepSeek’s R1 is not known to be distilled from ChatGPT. Instead, it uses a Mixture-of-Experts (MoE) architecture, in contrast to the dense transformer architecture used in ChatGPT. The MoE design improves computational efficiency by activating only a subset of the model’s parameters during inference rather than the entire network.

While both models aim to excel at language tasks, including understanding and generation, they differ in technical design and training methodology.
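
The answer above leans on two terms, distillation and Mixture-of-Experts. As a minimal sketch of what they mean (toy sizes, a made-up top-2 routing rule, and illustrative names throughout; this is not DeepSeek's or OpenAI's actual code), here is how a distillation loss and a sparsely gated MoE layer might look in Python:

# Illustrative sketch only: assumed toy shapes and routing rule,
# not actual DeepSeek or OpenAI code.
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Classic knowledge distillation: KL divergence between the
    temperature-softened teacher and student output distributions."""
    p_t = softmax(teacher_logits / temperature)
    p_s = softmax(student_logits / temperature)
    return float(np.sum(p_t * (np.log(p_t) - np.log(p_s)), axis=-1).mean())

class ToyMoELayer:
    """Top-k gated Mixture-of-Experts: a router scores all experts for each
    input vector, but only the k best-scoring experts are evaluated, so most
    parameters stay inactive on any given forward pass."""
    def __init__(self, dim, num_experts=8, k=2, seed=0):
        rng = np.random.default_rng(seed)
        self.router = rng.normal(size=(dim, num_experts)) * 0.1   # gating weights
        self.experts = rng.normal(size=(num_experts, dim, dim)) * 0.1  # one matrix per expert
        self.k = k

    def __call__(self, x):
        scores = softmax(x @ self.router)      # routing probabilities over experts
        chosen = np.argsort(scores)[-self.k:]  # indices of the top-k experts
        out = np.zeros_like(x)
        for i in chosen:                       # only k of num_experts actually run
            out += scores[i] * (x @ self.experts[i])
        return out

# Usage: a 16-dim "token" passes through only 2 of the 8 experts.
layer = ToyMoELayer(dim=16)
token = np.random.default_rng(1).normal(size=16)
print(layer(token).shape)                      # (16,)
print(distillation_loss(np.array([1.5, 1.2, 0.3]),
                        np.array([2.0, 1.0, 0.1])))  # small positive KL

In a production MoE transformer the experts are feed-forward sublayers and routing happens per token at every MoE layer; the sketch only shows the core point from the answer: the router picks k of num_experts, so most parameters are never touched for a given input.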


All replies:

Was it 4o? -QualityWithoutName- 01/30/2025 11:39:49

I just downloaded the latest ChatGPT, so I don't know which version it is. -方外居士- 01/30/2025 11:44:50

I just asked it; it says GPT-4. -方外居士- 01/30/2025 11:46:22
