As we head into summer, more details about AMD’s upcoming GPU architecture are finally coming to light. So far there hasn’t been much information to peruse, despite a deluge of leaks about Nvidia’s plans. To rectify this imbalance, internet sleuths have been poring over the company’s drivers looking for any morsel of data they can find. They recently hit pay dirt courtesy of AMD’s Linux GPU drivers. AMD has since patched the information out of the driver, seemingly verifying it by doing so.
A Twitter user named Kepler (ironically) was first to spot the detail. The driver had a line labeled MCD_INSTANCE_NUM with the number six. This seems to confirm six memory controllers. If you assume each one is 64 bits wide, that adds up to a 384-bit memory bus. That is an upgrade over the 256-bit bus on AMD’s flagship RDNA2 GPUs, the RX 6900/6950 XT. What’s interesting is AMD put those GPUs up against Nvidia’s RTX 3090, which has a 384-bit bus. AMD explained a wider bus wasn’t necessary, as it had a trick up its sleeve: Infinity Cache. Overall, AMD was right. It was able to go toe-to-toe in rasterization with Nvidia this round. Despite achieving parity with its rival, it seems AMD isn’t taking any chances with RDNA3. AMD also replaced this line of code with different text a week later, according to Videocardz. As always, deleting the offending text just heightens the intrigue.
This leak seems to confirm the previous speculation about the design of the chip as well. As shown above, it’s long been rumored to be a seven-chiplet GPU. That means a main graphics chiplet and six multi-cache dies, or MCDs. This could mean it will sport as much as 192MB of Infinity Cache, assuming 32MB per die. Kepler also predicts AMD could use 3D stacking on its flagship GPU, doubling that amount to 384MB. If so, that would mark a radical boost in the amount of Infinity Cache it’s using. The current RX 6950 XT has just 128MB.
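The arithmetic behind those figures can be sketched in a few lines of Python. The 64-bit width and 32MB of cache per MCD are the rumored values quoted above, not confirmed specifications, and the function name is purely illustrative.

```python
def mcd_totals(mcd_count, bus_bits_per_mcd=64, cache_mb_per_mcd=32, stacked=False):
    """Return (bus width in bits, Infinity Cache in MB) for a rumored MCD layout.

    All defaults are rumored values, not confirmed AMD specifications.
    """
    bus_width_bits = mcd_count * bus_bits_per_mcd
    cache_mb = mcd_count * cache_mb_per_mcd
    if stacked:
        # Rumored 3D stacking would double the cache on each MCD.
        cache_mb *= 2
    return bus_width_bits, cache_mb


print(mcd_totals(6))                # (384, 192)
print(mcd_totals(6, stacked=True))  # (384, 384)
```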
Using the 6950 XT as a benchmark, we can also expect raw memory bandwidth to be about 50 percent higher for RDNA3. If it uses the same 18Gbps GDDR6 as the current GPU, it would be capable of 864GB/s. That’s compared to the 6950 XT’s 576GB/s maximum. That figure doesn’t factor in the benefits of Infinity Cache, either, which could easily allow an RDNA3 GPU to achieve an effective 1TB/s of memory bandwidth. This would match the memory bandwidth of Nvidia’s RTX 3090 Ti.
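For readers who want to check the math, peak GDDR6 bandwidth is just bus width times per-pin data rate, divided by eight to convert bits to bytes. The sketch below assumes the rumored 384-bit bus and the same 18Gbps memory; the helper name is hypothetical.

```python
def peak_bandwidth_gbs(bus_width_bits, data_rate_gbps):
    """Theoretical peak memory bandwidth in GB/s.

    bus_width_bits: total memory bus width in bits
    data_rate_gbps: per-pin data rate in Gbit/s
    """
    return bus_width_bits * data_rate_gbps / 8


rdna3_rumored = peak_bandwidth_gbs(384, 18)  # rumored 384-bit bus
rx_6950_xt = peak_bandwidth_gbs(256, 18)     # current 256-bit bus

print(rdna3_rumored)               # 864.0
print(rx_6950_xt)                  # 576.0
print(rdna3_rumored / rx_6950_xt)  # 1.5
```

Note that this covers raw DRAM bandwidth only; any uplift from Infinity Cache hits is not modeled here.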
One possible explanation for AMD’s bandwidth boost lies in the overall size of the card. Top-end RDNA3 cards have been rumored to field up to 12,288 cores. The top-end Radeon 6950 XT fielded 128MB of L3 cache to back up 5,120 GPU cores. If AMD bumps core counts this high, even a 192MB L3 cache might not be sufficient. A 384MB L3 would actually increase the total amount of L3 relative to the number of cores, while a 192MB L3 would still represent a modest decrease.
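The cache-per-core comparison works out as follows. This is a rough sketch using the rumored 12,288-core count and the two rumored cache sizes; none of the RDNA3 numbers are confirmed, and the function name is made up for illustration.

```python
def l3_kb_per_core(cache_mb, cores):
    """L3 (Infinity Cache) per GPU core, in KB."""
    return cache_mb * 1024 / cores


print(l3_kb_per_core(128, 5120))   # RX 6950 XT: 25.6
print(l3_kb_per_core(192, 12288))  # rumored RDNA3, 192MB: 16.0
print(l3_kb_per_core(384, 12288))  # rumored RDNA3, 384MB: 32.0
```

By this measure, 192MB would leave each core with less cache than the 6950 XT has today, while 384MB would leave it with more.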
Tests of AMD’s memory bandwidth have consistently shown that Infinity Cache does reduce pressure on memory bandwidth, so regardless of how much cache AMD fields, one thing is clear: If these rumors are true, the company decided it needed to use both memory bandwidth and Infinity Cache to catch up with Nvidia’s overall performance rather than substituting one for the other.
For its part, Nvidia is also rumored to be increasing the cache sizes on its upcoming Ada Lovelace GPUs. Earlier reports indicated Nvidia will be bumping L2 amounts by 16x, at least on some models. It’s supposed to be adding 16MB of L2 per 64-bit memory controller, for a total of 96MB. It currently uses just 512KB of L2 per 32-bit memory controller on its GA102 die, for a total of 6MB. This would mark a significant increase in L2 amounts, as Nvidia attempts to blunt AMD’s cache offensive.
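The 16x figure can be reproduced with a quick calculation. The sketch below assumes a 384-bit bus on both chips (that is GA102’s bus width, and it is what the rumored 96MB total implies for Ada); the helper is illustrative only.

```python
def total_l2_mb(bus_width_bits, controller_bits, l2_kb_per_controller):
    """Total L2 in MB, given the L2 slice attached to each memory controller."""
    controllers = bus_width_bits // controller_bits
    return controllers * l2_kb_per_controller / 1024


ga102_l2 = total_l2_mb(384, 32, 512)      # 12 controllers x 512KB
ada_l2 = total_l2_mb(384, 64, 16 * 1024)  # rumored: 6 controllers x 16MB

print(ga102_l2)           # 6.0
print(ada_l2)             # 96.0
print(ada_l2 / ga102_l2)  # 16.0
```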
As always, we will have to wait and see where the chips fall when these two titanic GPUs go head-to-head later this year. What’s especially interesting this time around is both companies are using the same TSMC N5 process. This will make for an unprecedented battle of MCM versus monolithic designs on the same fabrication node. One concern was raised recently, though: TSMC customers have been looking to reduce their existing orders. This has been in response to the recent GPU dump, as well as global economic jitters. However, that report stated AMD wasn’t asking to cut its order of 5nm products, but Nvidia was. This could lead to a delay for the RTX 40-series launch. TSMC reportedly told Nvidia it can’t reduce its order, but it can push it back a bit.