But Blackwell was not created for such tasks
Nvidia has revealed new details about the first Blackwell-generation GPU.
In particular, we were finally told how much faster the new-generation GPU is than the old one in double-precision calculations (FP64). The gain turned out to be significant but not colossal: 30%. Next to the several-fold difference in INT8 calculations, this is not particularly impressive.
To be more precise, the B100 GPU delivers about 45 TFLOPS in FP64, though at the moment this matters less to the market than performance in AI tasks. For comparison, the AMD Instinct MI300X reaches 81.7 TFLOPS.
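To put these figures side by side, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted in this article (about 45 TFLOPS for the B100, a 30% generational gain, 81.7 TFLOPS for the MI300X). The function names are hypothetical, and the implied previous-generation figure is derived from the stated 30%, not taken from an official specification.

```python
# Figures quoted in the article (FP64, TFLOPS).
B100_FP64_TFLOPS = 45.0      # approximate B100 figure
MI300X_FP64_TFLOPS = 81.7    # AMD Instinct MI300X figure
GENERATIONAL_GAIN = 0.30     # B100 is stated to be 30% faster than its predecessor


def implied_baseline(new_tflops: float, gain: float) -> float:
    """Throughput the previous generation would need for `gain` to hold.

    This is an inference from the article's numbers, not a datasheet value.
    """
    return new_tflops / (1.0 + gain)


def ratio(a: float, b: float) -> float:
    """How many times larger `a` is than `b`."""
    return a / b


if __name__ == "__main__":
    prev = implied_baseline(B100_FP64_TFLOPS, GENERATIONAL_GAIN)
    lead = ratio(MI300X_FP64_TFLOPS, B100_FP64_TFLOPS)
    print(f"Implied previous-generation FP64: {prev:.1f} TFLOPS")
    print(f"MI300X vs B100 in FP64: {lead:.2f}x")
```

By these numbers, the previous generation would sit at roughly 35 TFLOPS, and the MI300X would lead the B100 in FP64 by a factor of about 1.8.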
But in the tasks the Blackwell architecture was developed for, the new solution shows itself in all its glory. According to Nvidia, in inference on a model with 1.8 trillion parameters, the GB200 outperforms the H100 by up to 30 times, while cutting energy consumption and total cost of ownership by up to 25 times.