Since at least 2010 we’ve had laptops with GPUs integrated into the chipset. These GPUs have historically been very lacking (I’d say extremely so, to the point that a Tiger Lake i5-11400H CPU, which is quite powerful, wouldn’t reach 60fps in CS:GO at 1080p with its iGPU). AMD SoCs fared better in that respect, but are still very lacking, even in their most recent RDNA3-based iterations, due to the poor bandwidth these laptops usually ship with (at best dual-channel DDR5 RAM, but mostly dual-channel DDR4). As such, dedicated GPUs with their own GDDR6 RAM and big dies have been necessary for both laptops and desktops whenever performance is a requirement, and low-end dedicated GPUs have been the pick for manufacturers that want slim, performant laptops with decent battery life.
At the same time, there have been four important milestones for the APU market:
- Around 2010, the Xbox 360 (in its slim revision) shifted from a dedicated GPU + CPU combo to a single chip combining both on the same die. The PS3 kept the usual architecture of separate GPU and CPU.
- In 2013, both Sony and Microsoft released the PS4 and Xbox One (and later their successors) with an APU combining both. The Xbox One is fed with DDR3 RAM (I don’t know how many channels) plus a small pool of fast ESRAM, and it seems bandwidth was a huge problem for it and part of the reason it performed worse than the PS4.
- Apple released the Apple silicon MacBooks, shipping powerful GPUs inside laptops on a single die. Powerful at the expense of extremely big chips (see the M2 Max and Ultra), and maybe not as powerful as a mobile 3070 in most cases, but still quite powerful (and pricey, though I wonder whether that’s because of Apple or because APUs are, for the same performance level, more expensive; we’ll get to that).
- The Steam Deck was released, featuring a 4-core/8-thread CPU + RDNA2 GPU paired with quad-channel LPDDR5 RAM at 5500MHz, totalling 88GB/s.
Now, for price-sensitive products (such as the Steam Deck, or the other game consoles), APUs seem to be the way to go. You can even make powerful ones, as long as they have enough bandwidth. It seems clear to me that APUs provide a much better bang for the buck for both manufacturers and consumers, as long as they’re paired with a nice memory architecture. I understand desktop PCs care about upgradability and modularity, but why aren’t gaming APUs a thing in laptops and cheap gaming OEM desktops? With 16GB of quad-channel DDR5 or even GDDR6 RAM, those things would compete really well against game consoles, while avoiding all the duplicated costs incurred when pairing a laptop with a dGPU. And in the end, laptops already have custom motherboards, so what’s the issue at all? What are the reasons why even cheap gaming laptops pick RTX 3050s instead of getting some love from AMD?
Bonus question: how come the DDR3 RAM at 1066MHz in the Xbox One reaches 68.3GB/s, while the Steam Deck, with much newer 5500MHz quad-channel RAM, is able to provide just 88GB/s?
Simple reason. An entry-level gaming laptop with a decent CPU and an acceptable NVIDIA dGPU for 1080p 60Hz gaming will typically cost less than $1000.
A premium AMD ultrabook with the fastest LPDDR5/X RAM, at least 16 GB of memory (enough to actually not limit the iGPU), and acceptable power limits will be closer to $1500, if not more.
The dGPU laptop will run circles around the fastest laptop of the second type you can get.
I mean, comparing an entry-level gaming laptop to a premium non-gaming laptop on price and performance is WTF?
A cheap non-gaming laptop with a top-tier APU like the 5800H/6800HS is going to be $600-800. Sure, it will still be weaker than a laptop with a dGPU, but price is not the reason. AMD and Intel simply don’t make chips with a powerful iGPU included because it doesn’t make sense: too little memory bandwidth, and monolithic dies cost way more the bigger they are.
People might think: so why don’t they use GDDR6 like the consoles? Well, because GDDR6 provides a lot of bandwidth but has much worse latency, which makes the CPU part slower.
The ultimate move would be to use 64-bit (i.e. single-channel) LP/DDR5 for the CPU and also put a GDDR6 memory controller on the chip for the GPU. No need to unify memory (we’re still not there; shared memory is a different thing).
That would give the best of both worlds. Though it would only work on laptops, where everything is soldered down. On PCs it would be hell: either motherboards would include GDDR6 memory soldered down even when no chip that uses it is installed, which makes no sense, or there would be variants (probably some with 4/8GB of GDDR6 and some without any), which is hell for customers shopping for motherboards.
The bandwidth issues of these APUs can be solved with a large on-die cache. But the problem with SRAM is that it takes up a lot of die space, and with the CPU + GPU already on the die, there isn’t much space left for cache.
However, this problem can be solved with chiplets. Though I’m not sure if there’s a market for ‘high-end’ APUs!
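To put rough numbers on how a big cache stretches limited DRAM bandwidth, here’s a toy model in Python (the bandwidth figures and hit rates below are made-up illustrations, not real hardware specs):

```python
# Toy model: effective bandwidth when part of the traffic hits an on-die cache.
# All numbers here are illustrative assumptions, not measured hardware specs.

def effective_bandwidth(dram_gbs: float, cache_gbs: float, hit_rate: float) -> float:
    """Naive weighted average: hits are served by the cache, misses by DRAM."""
    return hit_rate * cache_gbs + (1.0 - hit_rate) * dram_gbs

DRAM = 100.0    # GB/s, roughly dual-channel LPDDR5 territory (assumed)
CACHE = 1000.0  # GB/s, on-die SRAM is far faster than DRAM (assumed)

for hit in (0.0, 0.3, 0.5, 0.7):
    print(f"hit rate {hit:.0%} -> ~{effective_bandwidth(DRAM, CACHE, hit):.0f} GB/s effective")
```

Which is roughly the idea behind AMD’s Infinity Cache on RDNA2 dGPUs: a modest hit rate in a big cache multiplies the bandwidth the narrow memory bus can deliver.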
Such 3D cache also brings real issues with cooling, and laptops don’t handle that efficiently.
About the bonus question: it’s all about bus width.
The Steam Deck has a 128-bit memory bus, whereas the Xbox One has a 256-bit memory bus.
Channels are also mostly desktop terminology; they’re technically part of the DDR spec, but they change around, so it’s easier to just compare bus widths. The Steam Deck has 4 × 32-bit channels, while the Xbox One has 4 × 64-bit channels.
You’re also confusing the clock speed with the transfer rate: the memory in the Xbox One switches at 1.066GHz and transfers at 2.1GT/s, since it’s “DDR”, or “double data rate”, which means it does a transfer on both the rising and the falling edge of the clock.
End result:
- 256bit * 2.1GT/s / 8 = 67GB/s
- 128bit * 5.5GT/s / 8 = 88GB/s
(the division by 8 is there to go from bits to bytes)
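If you want to check the arithmetic yourself, here’s the same formula as a quick Python snippet (the ×2 on the Xbox One’s clock is the “double data rate” part):

```python
# Peak theoretical bandwidth = bus width in bits / 8 * transfer rate in GT/s.

def bandwidth_gbs(bus_bits: int, gt_per_s: float) -> float:
    return bus_bits / 8 * gt_per_s

# Xbox One: 4 x 64-bit DDR3 channels, 1.066 GHz clock, 2 transfers per clock (DDR)
print(bandwidth_gbs(256, 1.066 * 2))  # ~68.2 GB/s

# Steam Deck: 4 x 32-bit LPDDR5 channels at 5.5 GT/s
print(bandwidth_gbs(128, 5.5))        # 88.0 GB/s
```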
It’s GDDR6, not DDR1.
Clock in DDR memory is complicated… The GDDR5 used in the Xbox One X is DDR memory. It just means Double Data Rate.
The GDDR6 used in the Series X, however, isn’t. It’s technically quad data rate, but they decided to keep the naming scheme.
Fun little trivia: with an fps cap, a dGPU will have significantly lower power usage than an iGPU at the same cap.
On a laptop, a 4060 will get nearly 2× the battery life of a 780M if both are capped to 30 fps.
Surely that’s game dependent? Try it with something ancient like HL1 and I imagine the iGPU would use less power. Interesting that there’s a crossover somewhere, though.
Speaking of powerful APUs, the rumor mill points to a 40 CU (basically a 6700 XT) APU being in the works. I imagine AMD will be insisting on shared GDDR memory, because otherwise, it won’t have enough bandwidth to be worth it.
How can someone research so much but also completely miss basic specifications like bus width? Like really how does someone type this out and not ask themselves, ‘are these devices free to make or do they have a cost?’
These GPUs have historically been very lacking (I’d say extremely so, to the point that a Tiger Lake i5-11400H CPU, which is quite powerful, wouldn’t reach 60fps in CS:GO at 1080p with its iGPU).
Serious error in the second sentence.
The i5-11400H was intended for use with a discrete GPU.
It had a ‘MS Office’-grade IGPU.
It was part of the H45 series of CPUs, which consumed 45W.
Intel also made H35 (35W), 28W, and 15W CPUs.
For the i5-11400H you got 16 GPU execution units, while for the i5-11300H you got 80, and for the i5-11320H 96.
The i5-11400H had more cache and more watts for CPU tasks, but virtually no GPU.
When you say ‘historically’, that’s also not correct. Tiger Lake uses Intel’s current Xe GPU architecture; it’s actually modern in terms of GPU.
Since Alder Lake, Intel mobile chips have typically had 80 or 96 EUs.
Integrated graphics are still lacking. To get 12 CUs you need to buy a 5GHz 8-core, even though a CPU like that realistically wouldn’t be the bottleneck even with a 60 CU GPU. So they’re still really imbalanced in the CPU-to-GPU ratio.
That issue can be solved with Intel’s tile approach, where they no longer use a monolithic die for everything but put the GPU on a separate die. So if the demand from partners is there, they can do, say, a 4+8 CPU tile with a 128 EU GPU tile, instead of just making the expected 6+8 and 128 EU configuration. It allows far more freedom in product design than just binning a monolithic die, but it’s on partners and consumers to ask for such combinations.
It’s actually less complicated than you’re making it. The reason OEMs don’t build those systems is that AMD and Intel don’t make those chips. The reason AMD and Intel don’t make large monolithic CPU/GPU designs for laptops is that, up until now, the market just wasn’t there for such a product. What segments are you targeting with a strong CPU/GPU combination?
- Premium productivity products (Apple MBP segment)
- Gaming Laptops
- Workstation or Desktop replacement
- Steam Deck style portable gaming devices
What are the challenges with capturing any of these segments?
For segment 1, you’re competing with Apple, who is an entrenched player in the market and very popular with software devs and creative professionals already. It’s worth mentioning that Apple drives profitability on their devices via upsells on storage/memory and software services, not just margins on CPUs sold like AMD/Intel. It’s also worth pointing out that AMD and Intel have been competing in this segment to varying degrees of success for as long as it has existed. Meteor Lake in particular is very clearly targeted at bringing Intel designs up to speed vs Apple silicon in idle and low load scenarios.
For 2, the biggest problem is that Nvidia is the entrenched player in the gaming GPU market. It’s an uphill battle to convince buyers to pay a premium for an Intel/AMD only gaming laptop on the basis of improved battery life alone. Especially when an Nvidia equipped dGPU design will probably offer higher peak performance and most users will game plugged in anyways.
For segment 3, your users are already sacrificing battery life and portability for max performance. If they can get a faster product using separate CPU/GPU chips, they will take that option.
Segment 4 is the obvious one where such a design is already the best choice. I expect to see new products in this space consistently over the next couple years and for those chips to make their way into traditional laptops at some point.
I generally think that these large monolithic designs will see increased adoption in various segments over time, but it’s going to be contingent on Intel/AMD delivering a product that is good enough to compete and win against Apple or Nvidia’s offerings. I just don’t think that’s the case yet outside of niche markets.
Maybe chiplet tech allows a much better yield of GPU + CPU [+ NPU] on the same chip, with the resulting benefits of low latency / fast interconnect, shared cache, etc.
It also offers a cheaper, more flexible way to mix and match compute subunits/cores to suit market niches.
We’ll see a lot more of this in the future.
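On the yield point, a quick sketch with a simple Poisson defect model (the defect density and die areas are illustrative assumptions, not real fab numbers):

```python
import math

def poisson_yield(area_cm2: float, defects_per_cm2: float) -> float:
    """Fraction of dies that come out defect-free under a simple Poisson model."""
    return math.exp(-area_cm2 * defects_per_cm2)

D = 0.2  # defects per cm^2 -- assumed for illustration

# One 300 mm^2 monolithic CPU+GPU die: a single defect scraps the whole chip.
print(f"300 mm^2 monolithic die: {poisson_yield(3.0, D):.0%} good")  # ~55%

# Two 150 mm^2 chiplets: a defect only scraps half the silicon, and known-good
# dies can be tested and paired freely before packaging.
print(f"150 mm^2 chiplet:        {poisson_yield(1.5, D):.0%} good")  # ~74%
```

Same total silicon, but a defect now only costs you the smaller die it lands on rather than the whole combined chip.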
Laptops use the APU’s GPU to save power instead of keeping the larger dGPU active; having them separate also spreads out the heat.
For larger desktop chips, memory bandwidth/latency and yields become an issue.
By combining a GPU and CPU, each part can have its own defects, increasing your possible product stack for a small market segment.
CPUs like low latency, GPUs like high bandwidth. Quad-channel DDR5 is still only about 150GB/s, while a 6600 XT or 7600M has 256GB/s. You’d want 300-400GB/s of memory bandwidth while staying low latency, otherwise you’re losing performance for no reason.
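Rough numbers behind that comparison, assuming DDR5-4800 for the quad-channel figure and the 6600 XT’s 128-bit GDDR6 at 16 GT/s:

```python
# Peak bandwidth in GB/s = bus width in bits / 8 * transfer rate in GT/s
quad_ddr5 = 4 * 64 / 8 * 4.8   # quad-channel DDR5-4800 -> 153.6 GB/s
rx_6600xt = 128 / 8 * 16.0     # 128-bit GDDR6 at 16 GT/s -> 256.0 GB/s
print(quad_ddr5, rx_6600xt)
```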
APUs have come a long way, but bandwidth is still a bottleneck. They’re great for price-sensitive products like the Steam Deck, but dedicated GPUs are necessary for performance. Not sure why gaming APUs aren’t more common in laptops/desktops; maybe it’s due to upgradability concerns. The whole RAM comparison thing is confusing, but MT/s and bus width are better metrics than MHz and channel counts.
I think you already made the main argument clear. It is a cost cutting measure, and a very effective one at that!
Today’s hardware is powerful enough that an APU is “enough” and again the Steam Deck is the most impressive of all of the examples.
But for a more premium product, there are three main pluses for CPU + dGPU. First, disparity in needed performance: if I’m gaming, most of the power I need is GPU-based, but if I’m doing CAD or CFD I’m suddenly CPU-bound (single-threaded for CAD, multi-threaded for CFD), and if I’m buying a €1.5k system, I’d prefer it to be optimised for my use case; being able to choose the CPU and GPU separately allows for that, while for an APU it’d become a SKU nightmare! Second, upgradeability: as we’ve seen with things like 4th-gen Intel processors staying relevant for gaming until practically 3rd-gen Ryzen took over, the main upgrade needed was the GPU, and for a more premium product I do consider that important. And third, repairability: even if in some categories like laptops dGPU vs iGPU isn’t the main bottleneck there, I’ve rescued a good number of computers with a GPU swap, for sure.