For all the criticism (mine included) surrounding Apple’s delay of various Apple Intelligence features, I found this different perspective by Ben Thompson fascinating and worth considering:
What that means in practical terms is that Apple just shipped the best consumer-grade AI computer ever. A Mac Studio with an M3 Ultra chip and 512GB RAM can run a 4-bit quantized version of DeepSeek R1 — a state-of-the-art open-source reasoning model — right on your desktop. It’s not perfect — quantization reduces precision, and the memory bandwidth is a bottleneck that limits performance — but this is something you simply can’t do with a standalone Nvidia chip, pro or consumer. The former can, of course, be interconnected, giving you superior performance, but that costs hundreds of thousands of dollars all-in; the only real alternative for home use would be a server CPU and gobs of RAM, but that’s even slower, and you have to put it together yourself. Apple didn’t, of course, explicitly design the M3 Ultra for R1; the architectural decisions undergirding this chip were surely made years ago. In fact, if you want to include the critical decision to pursue a unified memory architecture, then your timeline has to extend back to the late 2000s, whenever the key architectural decisions were made for Apple’s first A4 chip, which debuted in the original iPad in 2010. Regardless, the fact of the matter is that you can make a strong case that Apple is the best consumer hardware company in AI, and this week affirmed that reality.
Anecdotally speaking, based on the people who cover AI that I follow these days, it seems there are largely two buckets of folks who are into local, on-device models: those who have set up pricey NVIDIA rigs at home for their CUDA cores (the vast minority); and – the undeniable majority – those who run a spectrum of local models on their Macs of different shapes and configurations (usually MacBook Pros). If you need to run high-end, performance-intensive local models for academic or scientific workflows on a desktop, the M3 Ultra Mac Studio sounds like an absolute winner.
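For a sense of how low the barrier to entry is on the Mac side, here’s a minimal sketch of running a 4-bit quantized model locally with Apple’s MLX framework via the mlx-lm Python package. The model name below is just one example of the community-quantized builds available on Hugging Face; swap in whatever fits your machine’s unified memory.

```python
# A minimal sketch of running a 4-bit quantized model locally on Apple
# Silicon with the mlx-lm package (pip install mlx-lm). The model repo
# below is just an example of the community 4-bit builds on Hugging Face;
# pick whichever quantized model fits your machine's unified memory.
from mlx_lm import load, generate

# Downloads the weights on first run and maps them into unified memory.
model, tokenizer = load("mlx-community/Meta-Llama-3.1-8B-Instruct-4bit")

prompt = "Explain the trade-offs of 4-bit quantization in one paragraph."

# Streams generated tokens to stdout as they're produced.
response = generate(model, tokenizer, prompt=prompt, max_tokens=256, verbose=True)
```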
However, I’d point out that – again, as far as local, on-device models are concerned – Apple is not shipping the best possible hardware on smartphones.
While the entire iPhone 16 lineup is stuck on 8 GB of RAM (and we know how memory-hungry these models can be), Android phones with at least 12 GB or 16 GB of RAM are becoming pretty much the norm now, especially in flagship territory. Even better, in Android land, phones advertised as “gaming phones” with a whopping 24 GB of RAM (such as the ASUS ROG Phone 9 Pro or the RedMagic 10 Pro) may actually make for compelling pocket computers for running smaller, distilled versions of DeepSeek, Llama, or Mistral with better performance than current iPhones.
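To put the 8 GB constraint in perspective, here’s some hypothetical back-of-the-envelope math (my assumptions, not anyone’s benchmarks): at 4-bit quantization, weights cost roughly half a byte per parameter, and the KV cache and runtime overhead eat further into whatever the OS lets a single app keep resident.

```python
# Back-of-the-envelope RAM math for on-device models. All numbers here
# are illustrative assumptions, not measurements: 4-bit weights cost
# ~0.5 bytes per parameter, and I budget a flat ~1.5 GB for the KV
# cache, activations, and runtime overhead.

def approx_model_ram_gb(params_billions: float,
                        bits_per_weight: int = 4,
                        overhead_gb: float = 1.5) -> float:
    # 1B params at 1 byte each is ~1 GB, so scale by bits/8.
    weight_gb = params_billions * (bits_per_weight / 8)
    return weight_gb + overhead_gb

# Assume, roughly, that an app can claim about half of a phone's RAM
# before the OS starts evicting it. A hypothetical budget, not a spec.
for phone_ram_gb in (8, 12, 16, 24):
    budget = phone_ram_gb / 2
    fits = [b for b in (1, 3, 7, 14) if approx_model_ram_gb(b) <= budget]
    largest = f"{max(fits)}B" if fits else "none"
    print(f"{phone_ram_gb} GB phone -> largest 4-bit model that fits: {largest}")
```

Under these (admittedly rough) assumptions, an 8 GB iPhone tops out around a 3B-parameter model, while a 24 GB Android phone could comfortably host a 14B-class distill: the gap the rest of this post is about.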
Interestingly, I keep going back to this quote from Mark Gurman’s latest report on Apple’s AI challenges:
There are also concerns internally that fixing Siri will require having more powerful AI models run on Apple’s devices. That could strain the hardware, meaning Apple either has to reduce its set of features or make the models run more slowly on current or older devices. It would also require upping the hardware capabilities of future products to make the features run at full strength.
Given Apple’s struggles, their preference for a hybrid on-device/server-based AI system, and how the Android market is evolving, I don’t think Apple can afford to ship 8 GB of RAM on iPhones for much longer if they’re serious about AI and positioning their hardware as the best consumer-grade AI computers.