NVIDIA has moved its GPU business far beyond powering gaming computers and has announced the NVIDIA DGX-1, the first supercomputer for deep learning. It packs no fewer than eight Tesla P100 graphics cards, which will make it the fastest supercomputer for deep learning in the world. Its performance reaches 170 TFLOPs (FP16), and it is also the first supercomputer to use the NVLink interconnect technology, which connects the eight Tesla P100 graphics cards. Each graphics card has 16 GB of HBM2 memory; the system consumes 3200W and has 7TB of SSD storage.

NVIDIA DGX-1: The Supercomputer For Deep-learning


To get an idea of the power: a system announced at GTC 2015, with dual Xeon processors and four Maxwell-based graphics cards, delivered 3 TFLOPs with a bandwidth of 76 GB/s and completed the “AlexNet” training in 150 hours. With its 170 TFLOPs and a bandwidth of 768 GB/s, the NVIDIA DGX-1 reduces that training time to just 2 hours using only one node (vs. 250 nodes).
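The ratios behind that comparison are easy to work out. The following is a minimal Python sketch that uses only the figures quoted above; the dictionary and variable names are illustrative assumptions, and the article does not state the numeric precision of the 3 TFLOPs figure, so the compute ratio may not be a like-for-like comparison.

```python
# Figures quoted in the article; dictionary and variable names are hypothetical.
baseline = {"tflops": 3.0, "bandwidth_gb_s": 76.0, "train_hours": 150.0}
dgx1 = {"tflops": 170.0, "bandwidth_gb_s": 768.0, "train_hours": 2.0}

# How many times more peak compute and bandwidth the DGX-1 figures represent.
compute_ratio = dgx1["tflops"] / baseline["tflops"]
bandwidth_ratio = dgx1["bandwidth_gb_s"] / baseline["bandwidth_gb_s"]

# How many times shorter the quoted AlexNet training time is.
training_speedup = baseline["train_hours"] / dgx1["train_hours"]

print(f"Compute ratio:    {compute_ratio:.1f}x")
print(f"Bandwidth ratio:  {bandwidth_ratio:.1f}x")
print(f"Training speedup: {training_speedup:.0f}x")
```

On these numbers the quoted training speedup (75x) is larger than the peak compute ratio (about 57x), which suggests that the higher bandwidth and other improvements contribute as well as raw throughput.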

“Data scientists and AI researchers today spend far too much time on home-brewed high performance computing solutions,” Huang said in a press release. “The DGX-1 is easy to deploy and was created for one purpose: to unlock the powers of superhuman capabilities and apply them to problems that were once unsolvable.”

NVIDIA DGX-1 System Specifications:

The NVIDIA DGX-1 system specifications include:

  • Up to 170 teraflops of half-precision (FP16) peak performance
  • Eight Tesla P100 GPU accelerators, 16GB memory per GPU
  • NVLink Hybrid Cube Mesh
  • 7TB SSD DL Cache
  • Dual 10GbE, Quad InfiniBand 100Gb networking
  • 3U – 3200W
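
A few derived figures follow from the list above. This is a rough Python sketch under two stated assumptions: that the peak FP16 throughput divides evenly across the eight GPUs, and that 3200W is the power draw of the whole system at peak. Both are simplifications, and the constant names are illustrative.

```python
# Values taken from the specification list; constant names are hypothetical.
NUM_GPUS = 8
MEM_PER_GPU_GB = 16
PEAK_FP16_TFLOPS = 170.0
SYSTEM_POWER_W = 3200.0

# Total GPU memory across all eight accelerators.
total_gpu_memory_gb = NUM_GPUS * MEM_PER_GPU_GB

# Assumption: peak throughput is shared evenly by the GPUs.
fp16_tflops_per_gpu = PEAK_FP16_TFLOPS / NUM_GPUS

# Assumption: 3200W is the full system draw at peak; 1 TFLOP = 1000 GFLOPs.
gflops_per_watt = PEAK_FP16_TFLOPS * 1000 / SYSTEM_POWER_W

print(f"Total GPU memory:   {total_gpu_memory_gb} GB")
print(f"Peak FP16 per GPU:  {fp16_tflops_per_gpu:.2f} TFLOPs")
print(f"Peak FP16 per watt: {gflops_per_watt:.1f} GFLOPs/W")
```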