Apple Thread - The most overrated technology brand?

What killed Steve Jobs?

  • Pancreatic Cancer

    Votes: 18 14.0%
  • AIDS from having gay sex with Tim Cook

    Votes: 111 86.0%

  • Total voters
    129

LegoTugboat

True & Honest Fan
kiwifarms.net
The ARM dev kits have been benchmarked, with interesting results so far.

The dev kits have four of the eight cores enabled, and are also slightly underclocked compared to the iPad Pro, running at 2.4 GHz instead of 2.5 GHz.

Performance is better than a 3 GHz 2019 Surface Pro X, even though Geekbench has only been loosely ported to run on Apple's ARM hardware.

They're unsurprisingly underperforming against an iPad Pro, although those numbers were obtained under emulation.
 

Gustav Schuchardt

Trans exclusionary radical feminazi.
True & Honest Fan
kiwifarms.net
I've been thinking about the whole transition to ARM thing. There is some evidence that instruction set architecture has little effect on performance - it's all about power budgets

https://www.extremetech.com/extreme...-or-mips-intrinsically-more-power-efficient/2

[attached chart: instructions per second vs. power for ARM, x86, and MIPS cores]


I.e. performance in billions of instructions per second is more or less proportional to power regardless of the architecture.

However, one thing ARM chipsets have traditionally been poor at is the memory subsystem - LPDDRn is optimized for power, not performance. Geekbench isn't all that sensitive to memory subsystem performance because, as Linus Torvalds pointed out:

https://www.realworldtech.com/forum/?threadid=136526&curpostid=136666
https://archive.vn/wip/iCwuj

And quite frankly, it's not even just the crypto ones. Looking at the other GB3 "benchmarks", they are mainly small kernels: not really much different from dhrystone. I suspect most of them have a code footprint that basically fits in a L1I cache.
So it seems like if you want to design an ARM chip with comparable performance outside of synthetic benchmarks with small kernels you need to up the power budget and really work on the memory subsystem.

Now it so happens that there is a standard for high-bandwidth memory called Wide IO. As the name suggests, it's basically a wider bus than you typically get with DDR.

https://www.extremetech.com/computi...es-between-wide-io-hbm-and-hybrid-memory-cube
https://archive.vn/TuCoq

Wide I/O is designed specifically to stack on top of SoCs and use vertical interconnects to minimize electrical interference and die footprint. This optimizes the package’s size, but also imposes certain thermal limitations, since heat radiated from the SoC has to pass through the entire memory die. Operating frequencies are lower, but a large number of I/O pins increases bandwidth by using a memory bus that’s up to 1024 bits wide.

Wide I/O is the first version of the standard, but it’s Wide I/O 2 that’s expected to actually reach the mass market — though some have argued that true adoption won’t come until Wide I/O 3, which should finally open a gap between itself and LPDDR4. The standard was ratified by JEDEC, but it’s often associated with Samsung due to that company’s extensive work on bringing it to market. Timing is unclear, but no major devices are expected to ship with Wide I/O in the first half of 2015. We may see some limited pickup in the back half of the year, possibly from Samsung’s own foundries.


Here's a nice comparison of DDR, LPDDR, Wide IO, HMC, and HBM:

[attached comparison table]


Wide IO is good for mobile because you can put the DRAM package on top of the SoC in a process called 'package-on-package' or PoP.
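To put rough numbers on that wide-but-slow trade-off, peak bandwidth is just bus width times per-pin transfer rate. A minimal sketch with illustrative figures (a 64-bit LPDDR4 interface at 3200 MT/s against a Wide I/O 2-style 512-bit bus at 800 MT/s - ballpark numbers, not vendor specs):

```python
def peak_bandwidth_gb_s(bus_bits: int, giga_transfers_per_s: float) -> float:
    """Peak theoretical bandwidth in GB/s: bytes per transfer * transfers/s."""
    return bus_bits / 8 * giga_transfers_per_s

# The wide, slow bus wins on bandwidth, and saves energy per bit because
# each pin toggles far less often.
lpddr4 = peak_bandwidth_gb_s(64, 3.2)     # 25.6 GB/s
wide_io2 = peak_bandwidth_gb_s(512, 0.8)  # 51.2 GB/s
```

Same simple formula applies to every interface in the comparison above; the standards just pick different points on the width-vs-clock curve.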

I reckon if Apple is serious they'll use some exotic DRAM interface in their high-end ARM parts. There is some evidence they're hiring people for just such a project

https://jobs.apple.com/en-us/details/200003767/dram-product-engineer
https://archive.vn/8nU7c

Knowledge with state of the art packaging technology (pop, Tsv, etc.) and their relationship to DRAM signal/power integrity
Apple already solders DRAM to the motherboard so the non-upgradeable nature of Wide IO or HBM shouldn't bother them. And if they make their own ARM SoC they can use any memory interface they want, provided they can get one DRAM manufacturer to support it. They could even roll their own completely proprietary interface with more performance than Wide IO or HBM and sell it at many times the price.

Also 'We've got the fastest main memory interface in the industry' is just the sort of thing Tim Cook would want in his keynote introducing the new ARM Macs.

All this seems pretty plausible to me - you can attack the traditional weakness of ARM systems, and do it in a way that isn't something Intel can easily match, because Intel has traditionally gone along with the industry standard (aka the 'herd') on DRAM technology - they just make sure their CPUs support the latest industry standard. Or sometimes the 'latest but one' - DDR5 has been a standard for a while and production is ramping up, but very few desktop systems support it (archive).

Mind you, Intel could put some HBM on a multichip module and use it as a massive 4th-level cache for off-module DDR4/DDR5 and still have something which works in this paradigm. They actually do something similar for some mobile parts which have eDRAM, though it's not clear how wide the interface to that is.

Another question about ARM Macs is whether they'll still have Thunderbolt ports. Thunderbolt is dependent on PCI Express, and there's something to be said for power-efficient systems not using PCI Express but rather some internal, and mostly on-chip, bus - this is how the Windows on ARM systems work. Still, you could probably bridge that to PCIe and then just turn the bridge off when the ports are not in use. Or maybe they'll just drop Thunderbolt and come up with a proprietary, performant and expensive replacement.
 
Last edited:

Beavis

Dilweed
kiwifarms.net
Their laptops and the next Mac mini will be the first computers to get the ARM chips. A MacBook Air with an 8-core A14X/Z-whatever chip that runs cooler, gets better battery life, and doesn't cause the fan to rev up from a YouTube video will be nice. I think they'll stick with Intel for the iMac update later this year, then give it ARM chips for the next update. The Mac Pro will be the last one to go ARM and complete the transition. Who the hell knows if they'll update the iMac Pro.
 

Sir Wesley Tailpipe

kiwifarms.net
Here's the thing, though. If Apple announced this, and the new ARMacs were merely on par, or god forbid slower, then they would look like fucking idiots, and would be endlessly mocked and scorned.

In order for it to work, they'd have to be a decisive amount faster, more efficient, or both, than an Intel CPU. Possibly something similar to the jump from the G4 Mini to the original Intel Core Duo Mini.

So hopefully, the dev Minis are basically as fast as stabbed rats.
The dev minis are a bit better than an iPad Pro, but devs have been told the production ARM Macs will be much beefier. With what the timeline is for the first production models, expect the silicon to be a big leap, especially since they won’t be dealing with the same thermal envelope in full-size laptops and desktop machines.

There are leaks indicating that Intel-compatible code only runs on 4 of the dev units' 8 cores, with all 8 available to native code, and the translated code already runs respectably.
 
  • Informative
Reactions: Kot Johansson

Smaug's Smokey Hole

no corona
kiwifarms.net
That's an interesting thought. Wouldn't that require creating a whole new instruction set? So we'd have to do yet another transition after that?
They focused a lot on Xcode in the presentation, pointing to Rosetta 2 as the transition point from old to new. If they ever wish to replace ARM - a CPU architecture that is probably very customized for their purposes right now - then, since they make the compiler like every other component of the system, they could chip away at what it compiles to over the years until they can do a seamless switch, sort of like they did with the GPU.
 

Smaug's Smokey Hole

no corona
kiwifarms.net
Apple already solders DRAM to the motherboard so the non-upgradeable nature of Wide IO or HBM shouldn't bother them. And if they make their own ARM SoC they can use any memory interface they want, provided they can get one DRAM manufacturer to support it. They could even roll their own completely proprietary interface with more performance than Wide IO or HBM and sell it at many times the price.

Also 'We've got the fastest main memory interface in the industry' is just the sort of thing Tim Cook would want in his keynote introducing the new ARM Macs.

All this seems pretty plausible to me - you can attack the traditional weakness of ARM systems and do it in a way that isn't something Intel can easily do because traditionally they've gone along with the industry standard (aka 'herd') on DRAM technology - they just make sure their CPUs have support for the latest industry standard. Or sometimes the 'latest but one' - DDR5 has been a standard for a while and production is ramping up but very few desktop systems support it (archive).

Mind you Intel could put some HBM on a multichip module and use it as a massive 4th level cache for off module DDR4/DDR5 and still have something which works in this paradigm. They actually do something similar this for some mobile parts which have eDRAM, though it's not clear how wide the interface to that is.
HBM is an interesting thought. If they make it part of the SoC they could reduce the complexity of the mainboard, but that's trading one cost/problem for an even bigger one. GPUs (and consoles) still get tremendous bandwidth by keeping DRAM chips on the board, one 32-bit bus per chip, and neither Joseph Hipster nor the AiRMac itself needs the terabyte of bandwidth that HBM could provide. GPU and CPU memory-access needs are different, though, so that's something to consider. It will be interesting to see what happens when Apple designs a computer like it was a console, except it won't play games.
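The console/GPU approach described here is just many narrow per-chip buses running in parallel. A quick sketch with illustrative GDDR6-class numbers (8 chips at 32 bits each, 14 Gbps per pin - not any specific product's spec, though it lands in the same ballpark as the 2020 consoles):

```python
def aggregate_bandwidth_gb_s(num_chips: int, bits_per_chip: int,
                             gbps_per_pin: float) -> float:
    """Total bandwidth across all chips, each on its own narrow bus."""
    total_bus_bits = num_chips * bits_per_chip
    return total_bus_bits / 8 * gbps_per_pin

# Eight 32-bit GDDR6 chips form a 256-bit bus; at 14 Gbps per pin
# that's 448 GB/s without any stacking or interposers.
console_like = aggregate_bandwidth_gb_s(8, 32, 14.0)  # 448.0 GB/s
```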
 

Gustav Schuchardt

Trans exclusionary radical feminazi.
True & Honest Fan
kiwifarms.net
HBM is an interesting thought. If they make it part of the SoC they could reduce the complexity of the mainboard, but that's trading one cost/problem for an even bigger one. GPUs (and consoles) still get tremendous bandwidth by keeping DRAM chips on the board, one 32-bit bus per chip, and neither Joseph Hipster nor the AiRMac itself needs the terabyte of bandwidth that HBM could provide. GPU and CPU memory-access needs are different, though, so that's something to consider. It will be interesting to see what happens when Apple designs a computer like it was a console, except it won't play games.
I could see them using Wide IO for the MacBook Air-targeted Apple Silicon and HBM for the MacBook Pro. Basically, build them like GPUs. As far as need goes, it's more like taking ARM chips from crappy LPDDR memory subsystems, inferior to a typical x86 one, to really high-end GPU-like ones which are far superior. I bet you'd see some impressive benchmarks from such a system.

HBM3 is supposed to be cheaper than HBM2. They've made it less wide but increased the clock

https://arstechnica.com/gadgets/2016/08/hbm3-details-price-bandwidth/
https://archive.vn/2QCN2

Meanwhile, Samsung has been working on making HBM cheaper by removing the buffer die, and reducing the number of TSVs and interposers. While these changes will have an impact on the overall bandwidth, Samsung has increased the individual pin speed from 2Gbps to 3Gbps, offsetting the reductions somewhat. HBM2 offers around 256GB/s bandwidth, while low cost HBM will feature approximately 200GB/s of bandwidth. Pricing is expected to be far less than that of HBM2, with Samsung targeting mass market products.
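Those quoted figures are easy to sanity-check: an HBM2 stack has a 1024-bit interface, and a cost-reduced stack would presumably drop roughly half the pins while raising pin speed to 3 Gbps (the 512-pin figure is my assumption, inferred from "reducing the number of TSVs", not from the article):

```python
def stack_bandwidth_gb_s(pins: int, gbps_per_pin: float) -> float:
    """Per-stack bandwidth: data pins * per-pin rate, in GB/s."""
    return pins / 8 * gbps_per_pin

hbm2 = stack_bandwidth_gb_s(1024, 2.0)     # 256.0 GB/s, matches the article
low_cost = stack_bandwidth_gb_s(512, 3.0)  # 192.0 GB/s, i.e. "approximately 200GB/s"
```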
Even with HBM2 prices it's not completely unaffordable

https://segmentnext.com/2017/05/24/amd-rx-vega-hbm2-memory-cost/
https://archive.vn/wip/t5530

AMD Vega GPUs will use 8 GB HBM2 memory and this means that there will be two stacks used so this means that the cost of producing this will be $160. Comparing this to GDDR5, it is very expensive indeed. AMD is also using HBM2 memory for their corporate customers and the GPUs are custom made according to needed requirements of the customer.
An i5 chip used in the 2019 13-inch MacBook Pro costs $320:

https://ark.intel.com/content/www/u...-8259u-processor-6m-cache-up-to-3-80-ghz.html

Recommended Customer Price $320.00
The A12Z is rumored to cost Apple about $30 to make. It's not completely implausible that an Apple ARM chip with 8GB of HBM3 might end up costing less than that i5-8259U while outperforming it on memory-intensive benchmarks. Also, an Apple ARM chip with Wide IO DRAM would outperform and be cheaper than the crippled dual-core Intel i3-1000NG4 chips in MacBook Airs, and unlike the i3 wouldn't have an issue with the MacBook Air's ridiculously minimalistic cooling solution.
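Putting the rumored and quoted numbers side by side (the $30 A12Z figure is only a rumor, and the memory price is AMD's Vega-era HBM2 cost rather than a projected HBM3 price, so treat this as back-of-envelope only):

```python
# Rumored / quoted unit costs in USD.
a12z_cost = 30        # rumored Apple manufacturing cost for the A12Z SoC
hbm2_8gb_cost = 160   # quoted cost of two 4GB HBM2 stacks (AMD Vega figure)
i5_8259u_price = 320  # Intel recommended customer price for the i5-8259U

# Even paying full Vega-era HBM2 prices, SoC + memory comes in under
# the Intel chip alone.
apple_total = a12z_cost + hbm2_8gb_cost  # 190
```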

Of course, they could just half-ass it and ship something like the A12Z with LPDDR4 and then just point to the Geekbench benchmarks and ignore that performance in memory-intensive applications will suck.
 
  • Like
Reactions: Smaug's Smokey Hole

Smaug's Smokey Hole

no corona
kiwifarms.net
I could see them using Wide IO for the MacBook Air-targeted Apple Silicon and HBM for the MacBook Pro. Basically, build them like GPUs. As far as need goes, it's more like taking ARM chips from crappy LPDDR memory subsystems, inferior to a typical x86 one, to really high-end GPU-like ones which are far superior. I bet you'd see some impressive benchmarks from such a system.

HBM3 is supposed to be cheaper than HBM2. They've made it less wide but increased the clock

https://arstechnica.com/gadgets/2016/08/hbm3-details-price-bandwidth/
https://archive.vn/2QCN2



Even with HBM2 prices it's not completely unaffordable

https://segmentnext.com/2017/05/24/amd-rx-vega-hbm2-memory-cost/
https://archive.vn/wip/t5530



An i5 chip used in the 2019 13-inch MacBook Pro costs $320:

https://ark.intel.com/content/www/u...-8259u-processor-6m-cache-up-to-3-80-ghz.html



The A12Z is rumored to cost Apple about $30 to make. It's not completely implausible that an Apple ARM chip with 8GB of HBM3 might end up costing less than that i5-8259U while outperforming it on memory-intensive benchmarks. Also, an Apple ARM chip with Wide IO DRAM would outperform and be cheaper than the crippled dual-core Intel i3-1000NG4 chips in MacBook Airs, and unlike the i3 wouldn't have an issue with the MacBook Air's ridiculously minimalistic cooling solution.

Of course, they could just half-ass it and ship something like the A12Z with LPDDR4 and then just point to the Geekbench benchmarks and ignore that performance in memory-intensive applications will suck.
HBM would be overkill for them. If they make their own IMC/northbridge for the ARM chips, they could create a wide bus of their own, 32 bits per chip, just like a GPU or a console. And if a new console can come out at $400 with that many layers to the PCB, then sure as hell a Mac can be made using Apple's own components and sell for $1800.
I'm not ragging on you, I'm just spitballing about the next few years.
 
  • Like
Reactions: Gustav Schuchardt

Gustav Schuchardt

Trans exclusionary radical feminazi.
True & Honest Fan
kiwifarms.net
HBM would be overkill for them. If they make their own IMC/northbridge for the ARM chips, they could create a wide bus of their own, 32 bits per chip, just like a GPU or a console. And if a new console can come out at $400 with that many layers to the PCB, then sure as hell a Mac can be made using Apple's own components and sell for $1800.
I'm not ragging on you, I'm just spitballing about the next few years.
I see what you mean and it's definitely a possibility - a console-like memory subsystem rather than an HBM, GPU-like one. I'm sort of hoping they try something a bit more high-tech though.

E.g. here's an article on TSMC and Apple collaborating on Wide IO-like techniques:

https://www.macrumors.com/2018/06/12/apple-3d-chip-packaging-patents/
https://archive.vn/wip/IWcN1

While versions of TSMC's InFO packaging have brought performance improvements to Apple devices, such as better thermal management and improved package height, it has largely not been a direct enabler of improved electrical performance. This is set to change with future packaging techniques and is already seen in some products that utilize interposers for higher density interconnects to on-package memory, such as High Bandwidth Memory (HBM).

The primary memory candidate for inclusion in such a package would be conforming to the Wide I/O set of standards described by JEDEC, and mentioned by name in several of the patents. This memory improves on LPDDR4 by increasing the number of channels and reducing the transfer speed per channel, thus increasing the overall bandwidth but lowering the energy required per bit.

Interposers do, however, pose several issues for mobile devices. Significantly, they introduce another vertical element to the package, increasing total height. Interposers must also be fabricated on silicon wafers just like active ICs, with their dimensions driven by the footprint of all devices that need to be included in the package. These solutions are typically termed as "2.5D" due to some components being placed laterally with respect to one another rather a true stacking of chips.

Rather than adopt interposers for its products as a next step in advanced packaging, the direction of Apple's focus, according to several patent applications [1][2][3][4], appears to be on true "3D" techniques, with logic die such as memory being placed directly on top of an active SoC. Additionally, a patent application from TSMC seems to suggest a level of coordination between Apple and TSMC in these efforts.
 
  • Like
Reactions: Smaug's Smokey Hole