There is a comparison; these are per-chip figures

This post was last edited by user 胡雪盐8 at 2025-11-03 23:05:40.

https://www.tomshardware.com/tech-industry/artificial-intelligence/deepseek-research-suggests-huaweis-ascend-910c-delivers-60-percent-nvidia-h100-inference-performance?utm_source=chatgpt.com

Ascend 910C vs NVIDIA H100 vs AMD MI300X
| Specification | Huawei Ascend 910C | NVIDIA H100 (SXM5) | AMD MI300X |
| --- | --- | --- | --- |
| FP16 performance | 800 TFLOPS | 989 TFLOPS (sparsity: 1,979) | 1,307.4 TFLOPS (sparsity: 2,614.8) |
| INT8 performance | ~1,600 TOPS | ~1,979 TOPS (sparsity: 3,958) | 2,614.9 TOPS (sparsity: 5,229.8) |
| Memory | 128 GB HBM3 | 80 GB HBM3 | 192 GB HBM3e |
| Memory bandwidth | 3.2 TB/s | 3.35 TB/s | 5.3 TB/s |
| Power (TDP) | ~310 W (potentially higher) | up to 700 W | 750 W |
| Software ecosystem | CANN, MindSpore, PyTorch, TensorFlow | CUDA, cuDNN, TensorRT | ROCm, HIP |

Note: NVIDIA and AMD often quote performance with sparsity features; dense compute figures are used for a more direct comparison where possible.
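The linked article reports DeepSeek research putting the 910C at roughly 60% of H100 inference performance, and the table suggests why the gap is smaller than the headline FLOPS imply: LLM decode is typically memory-bandwidth-bound, and on bandwidth the 910C is nearly at parity with the H100. Here is a roofline-style sketch (my own illustration; only the dense spec numbers come from the table, the arithmetic-intensity values are assumptions):

```python
# A roofline-style sketch (my own illustration): attainable throughput is
# capped by either the compute roof or by memory bandwidth times arithmetic
# intensity (FLOPs performed per byte moved). Spec numbers are the dense
# figures from the table above; the intensity values are assumptions.

chips = {
    # name: (dense FP16 TFLOPS, memory bandwidth in TB/s)
    "Ascend 910C": (800.0, 3.2),
    "NVIDIA H100": (989.0, 3.35),
    "AMD MI300X": (1307.4, 5.3),
}

def attainable_tflops(peak_tflops: float, bw_tbs: float, flops_per_byte: float) -> float:
    # TB/s * FLOP/byte = TFLOP/s, so the units line up directly.
    return min(peak_tflops, bw_tbs * flops_per_byte)

# LLM decode reads every weight roughly once per token (~1-2 FLOPs/byte,
# memory-bound); prefill batches many tokens per weight read (hundreds of
# FLOPs/byte, compute-bound).
for intensity, phase in [(2.0, "decode"), (512.0, "prefill")]:
    h100 = attainable_tflops(*chips["NVIDIA H100"], intensity)
    for name, (tflops, bw) in chips.items():
        a = attainable_tflops(tflops, bw, intensity)
        print(f"{phase:8s} {name:12s} {a:8.1f} TFLOPS attainable ({a / h100:4.0%} of H100)")
```

Under these assumptions the 910C lands at ~96% of H100 in the bandwidth-bound decode regime and ~81% in the compute-bound prefill regime; the remaining drop to the reported ~60% is plausibly kernel and software maturity, which spec sheets don't capture.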


Ascend 920 (claimed/projected) vs NVIDIA H100

| Feature | Huawei Ascend 920 (claimed/projected) | NVIDIA H100 (SXM/PCIe) |
| --- | --- | --- |
| Architecture | Huawei Da Vinci architecture (chiplet-based) | NVIDIA Hopper |
| Process node | SMIC 6 nm (projected) | TSMC 4 nm (custom) |
| FP16/BF16 compute | 900 TFLOPS (BF16, per card) | 989.5 TFLOPS dense on SXM / 756.5 on PCIe (the often-quoted 1,513 TFLOPS is the PCIe figure with sparsity) |
| FP8 compute | Not widely published for the 920; its predecessor (910C) is lower | 1,979 TFLOPS dense on SXM / 1,513 on PCIe (3,026 TFLOPS is the PCIe figure with sparsity) |
| Memory bandwidth | 4.0 TB/s (HBM3) | 3.35 TB/s (SXM, HBM3); 2.0 TB/s (PCIe, HBM2e); up to 3.9 TB/s on the NVL variant |
| GPU memory (VRAM) | Likely high (the predecessor 910C has 128 GB HBM3) | 80 GB |
| Software ecosystem | CANN (requires porting; less mature) | CUDA (industry standard, highly mature) |
| Primary market | China (strong domestic focus) | Global |
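A recurring pitfall with these spec sheets: NVIDIA's 2:4 structured-sparsity figures are exactly double the dense ones, and sources often mix the two (as the original version of this table did, labeling the PCIe sparsity numbers "without sparsity"). A minimal normalization sketch, with a helper name of my own:

```python
# A minimal sketch (my own helper, not from the post): NVIDIA's 2:4
# structured-sparsity Tensor Core figures are exactly 2x the dense ones,
# so halve sparsity-quoted numbers before comparing against vendors that
# publish dense throughput.

def dense_tflops(quoted: float, with_sparsity: bool) -> float:
    """Normalize a quoted Tensor Core figure to dense throughput."""
    return quoted / 2.0 if with_sparsity else quoted

h100_sxm_bf16 = dense_tflops(1979.0, with_sparsity=True)    # -> 989.5
h100_pcie_bf16 = dense_tflops(1513.0, with_sparsity=True)   # -> 756.5
ascend_920_bf16 = dense_tflops(900.0, with_sparsity=False)  # claimed dense

print(f"Ascend 920 / H100 SXM (dense BF16): {ascend_920_bf16 / h100_sxm_bf16:.0%}")
# -> 91%
```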

All replies:

The main issue is the ecosystem: dependence on CUDA runs too deep, and the switching cost is high. -霸天虎- (186 bytes) 11/03/2025 23:18:01
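On that point, a hedged illustration (assuming Huawei's torch_npu plugin, the PyTorch adapter that ships with the CANN toolkit, is installed): plain PyTorch model code ports to Ascend with a device-string change, so the switching cost is concentrated in custom CUDA kernels, Triton code, and CUDA-only libraries rather than in everyday training and inference scripts.

```python
# A hedged sketch (assumes Huawei's torch_npu plugin is installed alongside
# the CANN toolkit). Device-agnostic PyTorch ports with a device-string
# change; custom CUDA kernels and CUDA-only libraries are the real cost.

import torch

try:
    import torch_npu  # noqa: F401 -- patches torch with the "npu" device
    device = torch.device("npu:0")
except ImportError:
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# The same model code runs unchanged: dispatch goes to CANN kernels on npu,
# cuBLAS/cuDNN on cuda, and the reference kernels on cpu.
model = torch.nn.Linear(4096, 4096).to(device)
x = torch.randn(8, 4096, device=device)
print(model(x).shape, "on", device)
```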

The CCP should provide free domestic versions at universities and get students used to China's AI ecosystem; young people adapt quickly. -硬码工- (0 bytes) 11/04/2025 00:30:34

It's not that easy. -霸天虎- (253 bytes) 11/04/2025 01:17:38
