As the controversy over DirectX 12 performance continues, a developer at Oxide Games has explained the unexpected benchmark results from the upcoming Ashes of the Singularity (the game has not been released yet).

In his response, he pointed to Asynchronous Shaders as the main reason behind AMD’s win over Nvidia in DirectX 12 gaming. As a reminder, the controversy arose because AMD’s older, lower-priced graphics card, specifically the R9 290X, managed to outperform Nvidia’s newly launched, up-to-date (Maxwell 2.0 based), higher-priced graphics card in DirectX 12 tests.


Most publications ran the performance tests themselves once the benchmark tool became available, and posted results for a variety of graphics cards from both companies; the results appear to be the same. In several posts, one Oxide Games developer tried to explain Asynchronous Shaders and the advantage they give AMD’s hardware.


The member, Kollock, posted:

In regards to the purpose of Async compute, there are really 2 main reasons for it:

1) It allows jobs to be cycled into the GPU during dormant phases. It can vaguely be thought of as the GPU equivalent of hyper threading. Like hyper threading, it really depends on the workload and GPU architecture as to how important this is. In this case, it is used for performance. I can’t divulge too many details, but GCN can cycle in work from an ACE incredibly efficiently. Maxwell’s scheduler has no analog, just as a non hyper-threaded CPU has no analog feature to a hyper-threaded one.

2) It allows jobs to be cycled in completely out of band with the rendering loop. This is potentially the more interesting case since it can allow gameplay to offload work onto the GPU as the latency of work is greatly reduced. I’m not sure of the background of Async Compute, but it’s quite possible that it is intended for use on a console as sort of a replacement for the Cell Processors on a PS3. In a console environment, you really can use them in a very similar way. This could mean that jobs could even span frames, which is useful for longer, optional computational tasks.

It didn’t look like there was a hardware defect to me on Maxwell, just some unfortunate complex interaction between software scheduling trying to emulate it, which appeared to incur some heavy CPU costs. Since we were trying to use it for #1, not #2, it made little sense to bother. I don’t believe there is any specific requirement that Async Compute be required for D3D12, but perhaps I misread the spec.

AFAIK, Maxwell doesn’t support Async Compute, at least not natively. We disabled it at the request of Nvidia, as it was much slower to try to use it than to not.

Whether or not Async Compute is better is subjective, but it definitely does buy some performance on AMD’s hardware. Whether it is the right architectural decision for Maxwell, or is even relevant to its scheduler, is hard to say.
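
To make the first point in the quoted post more concrete, here is a toy model in Python. It is only an illustration, not Oxide's code and not a real GPU or DirectX 12 API: the function names, the timeline string, and the slot counts are all hypothetical. It treats the GPU's graphics work as a sequence of time slots, some of them idle, and compares running extra compute jobs after the graphics work against letting them fill the idle slots.

```python
# Toy model only: 'G' marks a slot busy with graphics work,
# '.' marks an idle slot (a "dormant phase"). All names are hypothetical.

def serial_time(graphics_timeline, compute_slots):
    """Total slots when compute runs only after all graphics work is done."""
    return len(graphics_timeline) + compute_slots


def async_time(graphics_timeline, compute_slots):
    """Total slots when compute jobs may be cycled into idle slots."""
    idle = graphics_timeline.count(".")
    # Any compute work that does not fit into idle slots runs afterwards.
    leftover = max(0, compute_slots - idle)
    return len(graphics_timeline) + leftover


timeline = "GG.GG..G.G"  # 10 slots, 4 of them idle
print(serial_time(timeline, 3))  # 13
print(async_time(timeline, 3))   # 10
```

In this simplified picture the benefit depends entirely on how many idle slots the workload leaves and on how cheaply the hardware can cycle work into them, which matches the post's remark that the gain depends on the workload and GPU architecture.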