Combining CDNA 3 and Zen 4 for the MI300 Data Center APU in 2023

Along with updates to its Zen CPU architecture and RDNA client GPU architecture, AMD is also updating its roadmap this afternoon for its CDNA server GPU architecture and related Instinct products. And while the client CPUs and GPUs are arguably on a fairly straightforward path for the next couple of years, AMD intends to significantly shake up its server GPU offerings.

Let’s start with AMD’s server GPU architectural roadmap. Following AMD’s current CDNA 2 architecture, which is used in the Instinct MI200-series accelerators, will be CDNA 3. And unlike AMD’s other roadmaps, the company isn’t offering a two-year view here. Instead, the server GPU roadmap only runs a year out, to 2023, with AMD’s next server GPU architecture slated to launch next year.

Our first preview of CDNA 3 comes with quite a bit of detail. With a 2023 launch, AMD isn’t holding back information as much as it is elsewhere on its roadmaps. As a result, the company is revealing information on everything from the architecture itself to basic details on one of the products CDNA 3 will go into: a data center APU made up of CPU and GPU chiplets.

Starting from the top, GPUs based on the CDNA 3 architecture will be built on a 5nm process. And like the CDNA 2-based MI200 accelerators before it, CDNA 3 will rely on chiplets to combine memory, cache, and processor cores into a single package. Notably, AMD calls this a “3D chiplet” design, implying that not only are chiplets placed side-by-side on a substrate, but that some chiplets will also be stacked on top of others, a la AMD’s V-Cache for Zen 3 processors.

This comparison is particularly relevant here because AMD is going to introduce its Infinity Cache technology into the CDNA 3 architecture. And, like the V-Cache example above, judging by AMD’s artwork, it looks like they will stack the cache and logic as separate dies, rather than integrating the cache into a monolithic die as on their client GPUs. Due to this stacked nature, the Infinity Cache chiplets for CDNA 3 will go underneath the processor chiplets, with AMD apparently putting the very power-hungry logic dies at the top of the stack so that they can be cooled effectively.

CDNA 3 will also use AMD’s 4th generation Infinity Architecture. We’ll talk more about that in a separate article, but the short version is that, for GPUs, IA4 goes hand-in-hand with AMD’s chiplet innovations. Specifically, IA4 will enable the use of 2.5D/3D stacked chips, allowing all of the chiplets in a package to share a unified and fully coherent memory subsystem. This is a big step up from IA3 and the current MI200 accelerators, which although offering memory coherency, do not have a unified memory address space. So while the MI200 accelerators essentially operate as two GPUs on a single package, IA4 will allow the CDNA 3/MI300 accelerators to behave as a single chip, despite the disaggregated nature of the chiplets.
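
To make the software-visible difference concrete, here’s a minimal C++/HIP sketch. This is purely illustrative, our assumption about how the split manifests to software rather than anything from AMD’s MI300 documentation: an MI200-class package enumerates as two separate GPU devices, and sharing data between them requires explicit peer access, whereas a fully unified package would simply appear as one device.

```cpp
// Illustrative sketch (our assumption, not AMD-published MI300 code).
// On an MI200-class package the two compute dies enumerate as two
// separate GPU devices with separate address spaces; sharing data
// between them requires explicitly enabling peer access.
#include <hip/hip_runtime.h>
#include <cstdio>

int main() {
    int count = 0;
    hipGetDeviceCount(&count);  // an MI250X package shows up as 2 devices
    std::printf("Visible GPU devices: %d\n", count);

    if (count >= 2) {
        int canAccess = 0;
        // Can device 0 directly address device 1's memory?
        hipDeviceCanAccessPeer(&canAccess, 0, 1);
        if (canAccess) {
            hipSetDevice(0);
            hipDeviceEnablePeerAccess(1, 0);  // flags must be 0
        }
        // Even with peer access enabled, the two dies remain distinct
        // devices with distinct memory pools under IA3. IA4's unified
        // address space is what would let an MI300 present as one device.
    }
    return 0;
}
```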

AMD’s diagrams also show that HBM memory is in use here once again. AMD isn’t saying which version of HBM, but given the 2023 timeframe, it’s a safe bet that it will be HBM3.

Architecturally, AMD will also take several steps to improve the AI performance of its high-performance accelerator. According to the company, they are adding support for new mixed precision math formats. And while it isn’t being explicitly stated today, AMD’s claimed >5x improvement in performance-per-watt in AI workloads strongly implies that AMD is overhauling and significantly expanding its GPU compute cores for CDNA 3, because a 5x gain is far beyond what a new fab process alone can deliver.
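
AMD hasn’t said which mixed precision formats it is adding, but the general pattern such formats follow is well established: multiply in a narrow type while accumulating in a wider one. Below is a minimal, self-contained C++ sketch of that idea; the choice of bfloat16 (emulated here by truncation) is our assumption for illustration, not a format AMD has confirmed for CDNA 3.

```cpp
// Illustrative only: multiply in a narrow type (emulated bfloat16)
// while accumulating in full FP32, the pattern matrix engines use to
// trade per-element precision for throughput without losing the sum.
#include <cstdint>
#include <cstring>
#include <cstdio>

// Emulate rounding a float to bfloat16 by truncating the low 16 bits.
static float to_bf16(float x) {
    uint32_t bits;
    std::memcpy(&bits, &x, sizeof(bits));
    bits &= 0xFFFF0000u;  // keep sign, exponent, top 7 mantissa bits
    std::memcpy(&x, &bits, sizeof(bits));
    return x;
}

int main() {
    const int n = 4;
    float a[n] = {1.001f, 2.5f, -3.25f, 0.125f};
    float b[n] = {0.999f, 4.0f, 2.0f, 8.0f};

    float acc = 0.0f;  // wide (FP32) accumulator
    for (int i = 0; i < n; ++i)
        acc += to_bf16(a[i]) * to_bf16(b[i]);  // narrow multiplies

    std::printf("mixed-precision dot product: %f\n", acc);
    return 0;
}
```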

MI300: AMD’s first disaggregated data center APU

But AMD isn’t stopping at just building a bigger GPU, or unifying the memory pool of a multi-chip architecture so that its GPUs can run out of shared memory. Instead, AMD’s ambitions are much bigger than that. With high-performance CPU and GPU cores at its disposal, AMD is taking integration a step further and building a disaggregated data center APU: a chip that combines CPU and GPU cores in a single package.

The data center APU, currently codenamed MI300, is something AMD has been building towards for a while now. With MI200 and Infinity Architecture 3 allowing AMD CPUs and GPUs to work together with a coherent memory architecture, the next step has for some time been to bring the CPU and GPU closer together, both in terms of packaging and memory architecture.

For memory in particular, a unified architecture gives the MI300 some major advantages. From a performance perspective, it improves matters by eliminating redundant copies of data; processors no longer need to copy data over to their own dedicated memory pool in order to access and modify it. The unified memory pool also means that a second pool of memory chips is not required, in this case the DRAM that would normally be attached to the CPU.
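
As a rough illustration of the copies a unified pool eliminates, here’s a short C++/HIP sketch of the conventional discrete-GPU flow. The kernel and buffer names are hypothetical, and the comments mark what an MI300-style shared memory pool would make unnecessary; this is a sketch of the concept, not AMD’s programming model.

```cpp
// Sketch of the staging copies a unified CPU+GPU memory pool removes.
#include <hip/hip_runtime.h>
#include <vector>

// Hypothetical kernel: double every element in place.
__global__ void scale(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

int main() {
    const int n = 1 << 20;
    std::vector<float> host(n, 1.0f);

    // Discrete-GPU path: allocate in the GPU's own DRAM pool, copy the
    // data in, run the kernel, and copy the result back out.
    float* dev = nullptr;
    hipMalloc((void**)&dev, n * sizeof(float));
    hipMemcpy(dev, host.data(), n * sizeof(float), hipMemcpyHostToDevice);
    hipLaunchKernelGGL(scale, dim3((n + 255) / 256), dim3(256), 0, 0, dev, n);
    hipMemcpy(host.data(), dev, n * sizeof(float), hipMemcpyDeviceToHost);
    hipFree(dev);

    // On an MI300-style APU with one coherent HBM pool shared by the CPU
    // and GPU, both hipMemcpy staging steps (and the separate CPU DRAM
    // pool they imply) would be unnecessary: CPU and GPU would operate
    // on the same physical memory directly.
    return 0;
}
```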

The MI300 will combine CDNA 3 GPU chiplets and Zen 4 CPU chiplets in a single package. These two processor pools will in turn share the onboard HBM memory. And, presumably, the Infinity Cache as well.

As mentioned earlier, AMD is going to heavily leverage chiplets to accomplish this. The CPU cores, GPU cores, Infinity Cache, and HBM will all be separate dies, some of which will be stacked on top of one another. So it will be a chip unlike anything AMD has built before, and it will be AMD’s most involved effort yet at incorporating chiplets into their product designs.

Meanwhile, AMD is very explicit that they are aiming for market leadership in memory bandwidth and application latency. Which, if AMD can pull it off, would be a significant achievement for the company. That said, they’re not the first company to pair HBM with CPU cores – Intel’s Sapphire Rapids Xeon CPUs will claim that accomplishment – so it’ll be interesting to see how well the MI300 performs in that regard.

On the more specific question of AI performance, AMD claims that the APU will deliver over 8x the AI training performance of the MI250X accelerator. Which is further evidence that AMD is going to make big improvements to its GPU compute cores over the MI200 series.
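
For a rough sense of scale (our arithmetic, not a figure AMD has given): the MI250X’s peak FP16/BF16 matrix throughput is 383 TFLOPS per its spec sheet, so an 8x uplift would put the MI300 somewhere on the order of 383 × 8 ≈ 3,000 TFLOPS of low-precision throughput. That is a level that’s hard to reach through a process shrink alone, which is why expanded compute cores and denser math formats are the likely route.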

Overall, AMD’s server GPU trajectory is quite similar to what we’ve seen Intel and NVIDIA announce over the past few months. All three companies are moving towards combined CPU + GPU products; NVIDIA with Grace Hopper (Grace + H100), Intel with Falcon Shores XPUs (mix & match CPU + GPU), and now AMD with the MI300 and its use of CPU and GPU chiplets in a single package. In all three cases, these technologies aim to combine the best of CPUs with the best of GPUs for workloads that aren’t purely bound to one or the other – and in AMD’s case, the company believes they have the best CPUs and the best GPUs for the task.

Expect to see a lot more from CDNA 3 and MI300 over the next few months.
