https://www.tomshardware.com/tech-industry/artificial-intelligence/deepseek-research-suggests-huaweis-ascend-910c-delivers-60-percent-nvidia-h100-inference-performance
Ascend 910C vs NVIDIA H100 vs AMD MI300X

| Specification | Huawei Ascend 910C | NVIDIA H100 (SXM5) | AMD MI300X |
| --- | --- | --- | --- |
| FP16 Performance | 800 TFLOPS | 989 TFLOPS (1,979 with sparsity) | 1,307.4 TFLOPS (2,614.8 with sparsity) |
| INT8 Performance | ~1,600 TOPS | ~1,979 TOPS (3,958 with sparsity) | 2,614.9 TOPS (5,229.8 with sparsity) |
| Memory | 128 GB HBM3 | 80 GB HBM3 | 192 GB HBM3e |
| Memory Bandwidth | 3.2 TB/s | 3.35 TB/s | 5.3 TB/s |
| Power Consumption (TDP) | ~310 W (potentially higher) | Up to 700 W | 750 W |
| Software Ecosystem | CANN, MindSpore, PyTorch, TensorFlow | CUDA, cuDNN, TensorRT | ROCm, HIP |
Note: NVIDIA and AMD often quote peak performance with sparsity enabled; dense figures are listed first for a more direct comparison where possible.
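One way to read the table above is as a compute-to-bandwidth ratio: dividing dense FP16 throughput by memory bandwidth gives the arithmetic intensity (FLOP per byte) a workload needs before the chip becomes compute-bound rather than memory-bound. A minimal sketch using the table's own figures:

```python
# Rough compute-to-bandwidth ratio derived from the dense FP16 TFLOPS
# and memory bandwidth (TB/s) figures in the table above. A higher
# ratio means a workload needs more arithmetic per byte moved before
# the chip's compute units, rather than memory, become the bottleneck.
specs = {
    # name: (dense FP16 TFLOPS, memory bandwidth in TB/s)
    "Ascend 910C": (800.0, 3.2),
    "H100 SXM5": (989.0, 3.35),
    "MI300X": (1307.4, 5.3),
}

for name, (tflops, tb_per_s) in specs.items():
    # TFLOPS divided by TB/s reduces to FLOP per byte.
    ratio = tflops / tb_per_s
    print(f"{name}: ~{ratio:.0f} FLOP/byte")
```

By this rough measure the three chips sit in a similar band (roughly 250 to 300 FLOP/byte), which is why bandwidth-bound inference workloads tend to narrow the headline-TFLOPS gap between them.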
Ascend 920 vs NVIDIA H100

| Feature | Huawei Ascend 920 (Claimed/Projected) | NVIDIA H100 (SXM/PCIe) |
| --- | --- | --- |
| Architecture | Huawei Da Vinci architecture (chiplet-based) | NVIDIA Hopper |
| Process Node | SMIC 6nm (projected) | TSMC 4nm (custom) |
| FP16/BF16 Compute | 900 TFLOPS (BF16, per card) | 1,513 TFLOPS (BF16, PCIe, with sparsity) |
| FP8 Compute | Not widely published for the 920; its predecessor (910C) is lower. | 3,026 TFLOPS (FP8, PCIe, with sparsity) |
| Memory Bandwidth | 4.0 TB/s (HBM3) | 3.35 - 3.9 TB/s (HBM3) |
| GPU Memory (VRAM) | Likely high (the predecessor, 910C, has 128 GB HBM3) | 80 GB (HBM3) |
| Software Ecosystem | CANN (requires porting, less mature) | CUDA (industry standard, highly mature) |
| Primary Market | China (strong domestic focus) | Global |
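The bandwidth row matters most for single-batch LLM decoding, where each generated token must stream the full set of weights from memory. A back-of-envelope sketch, assuming a hypothetical 13B-parameter FP16 model and ignoring KV-cache traffic, parallelism, and real-world efficiency:

```python
# Back-of-envelope decode ceiling: at batch size 1, every generated token
# streams all model weights once, so tokens/s <= bandwidth / model_bytes.
# The 13B-parameter FP16 model here is a hypothetical example, not a
# figure from the comparison above.
PARAMS = 13e9          # hypothetical 13B-parameter model
BYTES_PER_PARAM = 2    # FP16/BF16 weights
model_bytes = PARAMS * BYTES_PER_PARAM  # 26 GB of weights per token

bandwidth_bytes_per_s = {
    "Ascend 920 (claimed)": 4.0e12,  # 4.0 TB/s, from the table
    "H100 SXM": 3.35e12,             # 3.35 TB/s, from the table
}

for name, bw in bandwidth_bytes_per_s.items():
    tokens_per_s = bw / model_bytes
    print(f"{name}: ~{tokens_per_s:.0f} tokens/s (theoretical upper bound)")
```

Taken at face value, the claimed 4.0 TB/s would give the Ascend 920 a modest edge over the H100 SXM on this memory-bound metric, though the claimed figure is unverified and real throughput depends heavily on software maturity.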