50% better performance per watt, using chiplets for the first time

Continuing our coverage of AMD’s 2022 Financial Analyst Day, we have the matter of AMD’s upcoming RDNA 3 GPU architecture and the Navi 3X GPUs that will be built upon it. So far, AMD has been pretty quiet about what to expect with RDNA 3, but with RDNA 2 approaching its second anniversary and the first RDNA 3 products launching this year, AMD is offering some of the first significant details about the GPU architecture.

Let’s talk about performance first and foremost. The Navi 3X family, which will be built on a 5nm process (no doubt TSMC’s), aims for a performance-per-watt uplift of more than 50% versus RDNA 2. This is a significant improvement, comparable to what AMD saw in moving from RDNA (1) to RDNA 2. And while such a claim from AMD would have seemed boastful two years ago, RDNA 2 has given AMD’s GPU teams a significant amount of newfound credibility.
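To put the headline figure in concrete terms, here is a minimal Python sketch of what a 1.5x performance-per-watt gain means at the two extremes: spending it all on performance at the same power, or all on power savings at the same performance. The function names are hypothetical, and 1.5 is simply the floor of AMD's "more than 50%" claim; real products will land somewhere between the two cases.

```python
from math import prod


def perf_at_same_power(ppw_gain: float) -> float:
    """Relative performance if board power is held constant."""
    return ppw_gain


def power_at_same_perf(ppw_gain: float) -> float:
    """Relative power draw if performance is held constant."""
    return 1.0 / ppw_gain


def compound_gain(generation_gains: list[float]) -> float:
    """Cumulative perf-per-watt gain across successive generations."""
    return prod(generation_gains)


if __name__ == "__main__":
    gain = 1.5  # assumption: exactly the 50% floor AMD quotes
    print(f"Same power:       {perf_at_same_power(gain):.2f}x performance")
    print(f"Same performance: {power_at_same_perf(gain):.2f}x power")
    # Two back-to-back ~50% steps (RDNA -> RDNA 2 -> RDNA 3)
    print(f"Two generations:  {compound_gain([gain, gain]):.2f}x perf per watt")
```

Read this way, two consecutive 50% steps would put RDNA 3 at roughly 2.25x the performance per watt of the original RDNA, assuming both claims hold at their stated floor.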

Fortunately for AMD, unlike the 1-to-2 transition, they don’t have to find a way to deliver a 50% increase from architecture and DVFS optimizations alone. The 5nm process means Navi 3X gets a full node improvement over the TSMC N7/N6-based Navi 2X GPU family. As a result, AMD will see significant efficiency gains from the process change alone.

That said, nowadays a single node jump cannot deliver a 50% performance-per-watt improvement on its own (RIP Dennard scaling). So there are several architectural improvements planned for RDNA 3. These include the next generation of AMD’s on-die Infinity Cache and what AMD calls an optimized graphics pipeline. According to the company, the GPU compute unit (CU) is also being reworked, but to what extent remains to be seen.
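The split between process and architecture can be sketched with simple arithmetic. If the gains multiply, then whatever the node does not deliver has to come from the architecture. The snippet below is an illustration only: the node gain values are hypothetical placeholders, not figures from AMD or TSMC, and treating the two contributions as cleanly multiplicative is itself a simplifying assumption.

```python
def required_arch_gain(target_gain: float, node_gain: float) -> float:
    """Architecture-side perf-per-watt gain needed to hit target_gain,
    assuming node and architecture gains multiply."""
    if node_gain <= 0:
        raise ValueError("node_gain must be positive")
    return target_gain / node_gain


if __name__ == "__main__":
    target = 1.5  # AMD's stated floor: more than 50% over RDNA 2
    # Hypothetical node-only gains, chosen purely for illustration
    for node_gain in (1.10, 1.25, 1.40):
        arch = required_arch_gain(target, node_gain)
        print(f"node {node_gain:.2f}x -> architecture must add {arch:.2f}x")
```

Under these assumptions, even a generous node contribution leaves a meaningful share of the target to the cache, pipeline, and CU changes.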

But the biggest news in this area is that AMD, confirming a year of rumors and several patent applications, will be using chiplets with RDNA 3. The GPU as we know it moves from a monolithic die to a chiplet-style design built from multiple smaller chips.

Chiplets are in some ways the holy grail of GPU engineering, as they give GPU designers options for scaling GPUs beyond current die size (reticle) and yield limits. That said, they remain a holy grail because moving the sheer amount of data that has to pass between different parts of a GPU (on the order of terabytes per second) is very difficult to do, and very necessary if you want a multi-chip GPU to present itself as a single device. We’ve seen Apple tackle the task by essentially merging two M1 SoCs together, but it has never been done before with a high-performance GPU.

AMD specifically calls this an “advanced” chiplet design. That label is typically applied when a chip is packaged using some form of advanced, high-density interconnect such as EMIB, which sets it apart from simpler designs like the Zen 2/3 chiplets, which just route their signals through the organic package without any advanced packaging technology. So while we eagerly await further details of what AMD is doing here, it wouldn’t be at all surprising to find that AMD is using some form of Local Si Interconnect (LSI) technology (such as the Elevated Fanout Bridge used for the MI200 family of accelerators) to bridge two RDNA 3 chiplets directly and closely.

At this point, AMD isn’t going into more detail about the architecture or the Navi 3X GPUs. Today is a teaser and roadmap update for analysts and investors, not an announcement of what we can only assume will be the Radeon RX 7000 family of video cards. Nevertheless, with the first RDNA 3 products launching later this year, a more formal announcement may not be far off. So we’re looking forward to hearing more about what will be a big change in the nature of GPU design and manufacturing.