Together with the RDNA2 graphics architecture, AMD announced its CDNA architecture, which targets high-performance computing and data centers running Artificial Intelligence and Machine Learning workloads, in the form of Radeon Instinct accelerators.
To make these accelerators much more efficient at their job, AMD is removing components at the silicon level that such cards do not need, like the display engine, the multimedia engine, and other graphics-related blocks on the die. AMD uses the freed die area to add fixed-function tensor hardware, similar to the Tensor cores in Nvidia’s most recent graphics cards based on the Turing architecture.
Thanks to this, AMD will be able to offer two architectures: one focused on games and another on computing, which should significantly improve performance in each specific area.
CDNA is built on TSMC's 7nm process (probably 7nm EUV) and will be paired with the latest-generation memory, HBM2E, and the Infinity Fabric interconnect. Between 2021 and 2022 we will have its successor, CDNA2, which will use an “advanced” manufacturing process that could be 5nm.
At the software level, AMD will use the latest version of ROCm, its open-source software stack developed in-house, which will link CDNA GPUs with AMD's EPYC CPUs by providing a unified programming model.
Much like Intel’s Compute eXpress Link (CXL) and PCI-Express gen 5.0, Infinity Fabric 3.0 will support shared memory pools between CPUs and GPUs, enabling scalability of the kind required by exascale supercomputers such as the US-DoE’s upcoming “El Capitan” and “Frontier.”
Cache-coherent unified memory reduces unnecessary data transfers between the CPU-attached DRAM and the GPU-attached HBM. CPU cores will be able to process the serial stages of a GPU compute operation by accessing the GPU-attached HBM directly, instead of first pulling the data into their own main memory. This greatly reduces I/O stress.
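To make the data-transfer argument concrete, here is a minimal sketch in Python. It is a toy accounting model, not a measurement or a description of how Infinity Fabric actually works: the `Stage` class, both function names, and the buffer sizes are hypothetical and chosen only for illustration. It assumes the working buffer starts in GPU-attached HBM and that a pipeline alternates between GPU stages and short serial CPU stages.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    device: str         # "cpu" or "gpu": where this stage runs
    touched_bytes: int  # bytes this stage actually reads or writes


def bytes_moved_discrete(stages, buffer_bytes, start="gpu"):
    """Separate memories: assume the whole buffer is copied across the
    link every time execution switches from one device to the other."""
    moved = 0
    location = start
    for stage in stages:
        if stage.device != location:
            moved += buffer_bytes
            location = stage.device
    return moved


def bytes_moved_coherent(stages, home="gpu"):
    """Coherent shared memory: the buffer stays in its home memory (HBM),
    and only the bytes a remote stage touches cross the link."""
    return sum(s.touched_bytes for s in stages if s.device != home)


GIB = 2**30
MIB = 2**20

# Hypothetical pipeline: three GPU stages separated by two short CPU stages.
pipeline = [
    Stage("gpu", GIB),
    Stage("cpu", MIB),
    Stage("gpu", GIB),
    Stage("cpu", MIB),
    Stage("gpu", GIB),
]

discrete = bytes_moved_discrete(pipeline, buffer_bytes=GIB)
coherent = bytes_moved_coherent(pipeline)
print(f"discrete copies: {discrete / GIB:.0f} GiB over the link")
print(f"coherent access: {coherent / MIB:.0f} MiB over the link")
```

Under these assumptions the discrete model moves the full 1 GiB buffer four times, while the coherent model moves only the 2 MiB the CPU stages touch. Real savings depend on access patterns, cache behavior, and link latency, none of which this sketch models.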
“El Capitan” is an “all-AMD” supercomputer with up to 2 exaflops (that’s 2,000 petaflops) peak throughput. It combines AMD EPYC “Genoa” CPUs based on the “Zen4” microarchitecture, with GPUs likely based on CDNA2, and Infinity Fabric 3.0 handling I/O.
AMD appears to have done its homework in the graphics card segment, and now we have to wait to see its first products this year.