After the DirectX 12 benchmark results for Ashes of the Singularity were published, questions arose about the outcome, because AMD’s GPU outperformed NVIDIA’s. Many believe the difference comes down to DirectX 12, under which AMD’s graphics cards perform better. The benchmark results were a disappointment for NVIDIA.

The disappointment could be due to various factors related to drivers and hardware. However, a forum member tried to explain the reason for the difference in fps seen in the benchmark results. He also posted some figures to support his claim.

Maxwell’s Asynchronous Thread Warp can queue up 31 Compute tasks and 1 Graphics task, whilst AMD’s GCN 1.1/1.2 is composed of 8 Asynchronous Compute Engines, each able to queue 8 Compute tasks, for a total of 64, coupled with 1 Graphics task from the Graphics Command Processor. He further posted this image.

AMD's DirectX 12 Advantage - GCN Architecture More Friendly To Parallelism Than Maxwell
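
The queue figures quoted above are easy to check with a little arithmetic. The snippet below is only an illustrative sketch: the `QueueLayout` class and its field names are made up for this article, and the numbers are the ones claimed in the post, not measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueLayout:
    """Hypothetical container for the queue counts claimed in the post."""
    name: str
    engines: int              # number of units that can queue compute work
    compute_per_engine: int   # compute tasks each unit can queue
    graphics_tasks: int       # graphics tasks queued alongside

    @property
    def compute_total(self) -> int:
        # Total compute tasks that can be queued across all units.
        return self.engines * self.compute_per_engine

    @property
    def total(self) -> int:
        # Compute plus graphics tasks.
        return self.compute_total + self.graphics_tasks


# Figures as stated in the forum post (assumed, not verified here).
maxwell = QueueLayout("Maxwell 2", engines=1, compute_per_engine=31, graphics_tasks=1)
gcn = QueueLayout("GCN 1.1/1.2", engines=8, compute_per_engine=8, graphics_tasks=1)

for gpu in (maxwell, gcn):
    print(f"{gpu.name}: {gpu.compute_total} compute + {gpu.graphics_tasks} graphics")
```

On these figures, GCN can queue 64 compute tasks against Maxwell’s 31, which is the gap the post is pointing at.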

If we look at the Ashes of the Singularity benchmark results, NVIDIA’s GPU performed well with DirectX 11 enabled but lost out under DirectX 12. According to the member, Mahigan, this happened because NVIDIA’s graphics cards handle Serial Scheduling better than Parallel Scheduling. “DirectX 11 is suited for Serial Scheduling therefore naturally nVIDIA has an advantage under DirectX 11.”

This means that AMD’s GCN 1.1/1.2 is best adapted to handling the increase in Draw Calls now being made by the multi-core CPU under DirectX 12.
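
To make the serial versus parallel distinction concrete, here is a toy sketch of the two submission styles. It does not use any real graphics API: `record_command_list`, `submit_serial` and `submit_parallel` are hypothetical helpers, and Python threads merely stand in for CPU cores recording draw calls.

```python
from concurrent.futures import ThreadPoolExecutor


def record_command_list(draw_ids):
    """Stand-in for one CPU thread recording draw calls into its own list."""
    return [f"draw({i})" for i in draw_ids]


def split(items, parts):
    """Split items into `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for p in range(parts):
        end = start + size + (1 if p < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def submit_serial(draw_ids):
    """DX11-style sketch: a single thread records every draw call."""
    return [record_command_list(draw_ids)]


def submit_parallel(draw_ids, workers):
    """DX12-style sketch: each worker records its own command list.

    pool.map returns results in input order, so the lists can be
    handed to the GPU in a well-defined order afterwards.
    """
    chunks = split(draw_ids, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(record_command_list, chunks))


draws = list(range(10))
serial_lists = submit_serial(draws)
parallel_lists = submit_parallel(draws, workers=4)
print(len(serial_lists), "list(s) recorded serially")
print(len(parallel_lists), "list(s) recorded in parallel")
```

Both paths record the same draw calls; the parallel one simply spreads the recording work over several workers, which is the kind of workload the post argues GCN is built to absorb.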

Therefore in game titles which rely heavily on Parallelism, likely most DirectX 12 titles, AMD GCN 1.1/1.2 should do very well provided they do not hit a Geometry or Rasterizer Operator bottleneck before nVIDIA hits their Draw Call/Parallelism bottleneck. The picture below highlights the Draw Call/Parallelism superiority of GCN 1.1/1.2 over Maxwell 2:

AMD's DirectX 12 Advantage - GCN Architecture More Friendly To Parallelism Than Maxwell


He further said,

“People wondering why Nvidia is doing a bit better in DX11 than DX12. That’s because Nvidia optimized their DX11 path in their drivers for Ashes of the Singularity. With DX12 there are no tangible driver optimizations because the Game Engine speaks almost directly to the Graphics Hardware. So none were made. Nvidia is at the mercy of the programmers talents as well as their own Maxwell architectures thread parallelism performance under DX12. The Developers programmed for thread parallelism in Ashes of the Singularity in order to be able to better draw all those objects on the screen. Therefore what we’re seeing with the Nvidia numbers is the Nvidia draw call bottleneck showing up under DX12. Nvidia works around this with its own optimizations in DX11 by prioritizing workloads and replacing shaders. Yes, the nVIDIA driver contains a compiler which re-compiles and replaces shaders which are not fine tuned to their architecture on a per game basis. NVidia’s driver is also Multi-Threaded, making use of the idling CPU cores in order to recompile/replace shaders. The work nVIDIA does in software, under DX11, is the work AMD do in Hardware, under DX12, with their Asynchronous Compute Engines.”

He thinks AMD’s GPUs score lower than NVIDIA’s in the DirectX 11 benchmarks because DirectX 11 is limited to 1-2 CPU cores for the graphics pipeline, whereas GCN 1.1/1.2 is better suited to parallelism. Here’s what he said,

“AMD’s GCN 1.1/1.2 architecture is suited towards Parallelism. It requires the CPU to feed the graphics card work. This creates a CPU bottleneck, on AMD hardware, under DX11 and low resolutions (say 1080p and even 1600p for Fury-X), as DX11 is limited to 1-2 cores for the Graphics pipeline (which also needs to take care of AI, Physics etc). Replacing shaders or re-compiling shaders is not a solution for GCN 1.1/1.2 because AMD’s Asynchronous Compute Engines are built to break down complex workloads into smaller, easier to work, workloads. The only way around this issue, if you want to maximize the use of all available compute resources under GCN 1.1/1.2, is to feed the GPU in Parallel… in comes Mantle, Vulkan and DirectX 12.”
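
The CPU bottleneck described in the quote can be pictured with a very rough model: a frame takes as long as the slower of the CPU submission work and the GPU rendering work. The function below is a simplified sketch under that assumption. The name `frame_time_ms` and every number in it are hypothetical, chosen only to show the shape of the argument, and real drivers do not scale this cleanly with core count.

```python
def frame_time_ms(draw_calls, cpu_us_per_call, cpu_cores, gpu_ms):
    """Toy model: frame time is the larger of CPU submission and GPU time.

    Assumes draw-call submission cost divides evenly across the CPU cores
    the API can use, which is an idealisation.
    """
    cpu_ms = draw_calls * cpu_us_per_call / 1000.0 / cpu_cores
    limiter = "CPU" if cpu_ms > gpu_ms else "GPU"
    return max(cpu_ms, gpu_ms), limiter


# Hypothetical workload: 24,000 draw calls at 1 microsecond each,
# with the GPU needing 10 ms to render the frame.
results = {
    cores: frame_time_ms(24000, 1.0, cores, gpu_ms=10.0)
    for cores in (1, 2, 6)
}
for cores, (ms, limiter) in results.items():
    print(f"{cores} core(s): {ms:.1f} ms per frame, {limiter}-bound")
```

With one or two cores feeding the GPU, the made-up workload stays CPU-bound. Once submission is spread over six cores, the GPU becomes the limit, which is the situation the post says Mantle, Vulkan and DirectX 12 are meant to reach.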