There is a comparison; these are per-chip (single-card) figures.

Source: 2025-11-03 22:53:55 [blog]

https://www.tomshardware.com/tech-industry/artificial-intelligence/deepseek-research-suggests-huaweis-ascend-910c-delivers-60-percent-nvidia-h100-inference-performance?utm_source=chatgpt.com

Ascend 910C vs NVIDIA H100 vs AMD MI300X

| Specification | Huawei Ascend 910C | NVIDIA H100 (SXM5) | AMD MI300X |
|---|---|---|---|
| FP16 Performance | 800 TFLOPS | 989 TFLOPS (Sparsity: 1979) | 1307.4 TFLOPS (Sparsity: 2614.8) |
| INT8 Performance | ~1600 TOPS | ~1979 TOPS (Sparsity: 3958) | 2614.9 TOPS (Sparsity: 5229.8) |
| Memory | 128 GB HBM3 | 80 GB HBM3 | 192 GB HBM3e |
| Memory Bandwidth | 3.2 TB/s | 3.35 TB/s | 5.3 TB/s |
| Power Consumption (TDP) | ~310 W (potentially higher) | Up to 700 W | 750 W |
| Software Ecosystem | CANN, MindSpore, PyTorch, TensorFlow | CUDA, cuDNN, TensorRT | ROCm, HIP |

Note: NVIDIA and AMD often quote performance with sparsity features; dense compute figures are used for a more direct comparison where possible.
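
The linked article's headline figure (roughly 60% of H100 inference performance for the 910C) is a measured, workload-level result; the raw spec ratios in the table above come out higher. A minimal Python sketch of that arithmetic, using only the dense figures from the table:

```python
# Dense (non-sparsity) spec ratios relative to the NVIDIA H100 (SXM5),
# taken from the comparison table above.
h100 = {"fp16_tflops": 989, "int8_tops": 1979, "mem_gb": 80, "bw_tb_s": 3.35}
chips = {
    "Ascend 910C": {"fp16_tflops": 800, "int8_tops": 1600, "mem_gb": 128, "bw_tb_s": 3.2},
    "MI300X": {"fp16_tflops": 1307.4, "int8_tops": 2614.9, "mem_gb": 192, "bw_tb_s": 5.3},
}

for name, spec in chips.items():
    for key, h100_value in h100.items():
        print(f"{name:12s} {key:12s} {spec[key] / h100_value:5.2f}x of H100")

# On paper the 910C lands around 0.81x of the H100's FP16 throughput and
# 0.96x of its memory bandwidth; the ~0.60x figure cited by the article
# reflects end-to-end inference throughput, not these peak specs.
```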

| Feature | Huawei Ascend 920 (Claimed/Projected) | NVIDIA H100 (SXM/PCIe) |
|---|---|---|
| Architecture | Huawei Da Vinci architecture (chiplet-based) | NVIDIA Hopper |
| Process Node | SMIC 6nm (projected) | TSMC 4nm (custom) |
| FP16/BF16 Compute | 900 TFLOPS (BF16, per card) | 1,513 TFLOPS (BF16, without sparsity) |
| FP8 Compute | Not widely published/clear for the 920, but its predecessor (910C) is lower | 3,026 TFLOPS (FP8, without sparsity) |
| Memory Bandwidth | 4.0 TB/s (HBM3) | 3.35 - 3.9 TB/s (HBM3) |
| GPU Memory (VRAM) | Likely high (the predecessor, 910C, has 128 GB HBM3) | 80 GB (HBM3) |
| Software Ecosystem | CANN (requires porting, less mature) | CUDA (industry standard, highly mature) |
| Primary Market | China (strong domestic focus) | Global |
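
For completeness, the same ratio arithmetic applied to the claimed/projected Ascend 920 numbers in this second table. These are the table's own figures, including its H100 values, not independently verified:

```python
# Claimed/projected Ascend 920 vs. the H100 figures listed in the table above.
ascend_920 = {"bf16_tflops": 900, "bw_tb_s": 4.0}
h100 = {"bf16_tflops": 1513, "bw_tb_s": 3.35}

for key in ascend_920:
    print(f"{key}: {ascend_920[key] / h100[key]:.2f}x of H100")

# Per these numbers the 920 would sit near 0.59x of the H100 on BF16 compute
# while exceeding it (~1.19x) on memory bandwidth.
```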