"I came up with this whole idea while optimizing wllama to run deepseek-r1-distilled-qwen-1.5B faster. So the bigger deepseek helping optimize code to run the smaller deepseek."
That's what he said himself.
All replies:
• This has nothing to do with GPU instruction optimization. -BeyondWind- 01/29/2025 17:27:18
• the bigger deepseek helping optimize code to run the smaller -cn_abcd- 01/29/2025 17:34:54