The marketing name is "on-device intelligence." The filing name is more honest: "Kernel-level load balancing across neural engines." That is the title of Apple's granted patent US12277492B2, issued April 15, 2025, and it describes the part of on-device AI that no keynote slide bothers to show: how the operating system divides machine-learning work across a chip's dedicated neural-processing cores.
Here is what on-device AI actually is, before the adjectives. Instead of shipping your photo or your voice to a data center and waiting for an answer, the device runs the model itself, on a block of silicon — Apple calls its version the Neural Engine — built specifically for the matrix math that neural networks need. The privacy and latency benefits are real: the data never leaves the phone, and there is no round trip to a server. But a dedicated AI accelerator is only as good as the work you can keep feeding it.
That feeding problem is exactly what the grant addresses. The patent's framing (load balancing at the kernel level, the privileged core of the operating system) points to the OS itself deciding which neural-engine resources handle which slice of a model, so the cores stay busy rather than stalling. The named inventors (including Sundararaman Hariharasubramanian and Andrew Yanowitz) and the CPC classifications in G06N (computing based on specific computational models, the class that covers neural networks) and G06F 9/50 (allocation of resources) line up with that reading: this is scheduling IP, not model IP.
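To make the scheduling idea concrete, here is a minimal sketch of one textbook load-balancing heuristic: greedy least-loaded assignment, largest job first. It is not the method the patent claims, and nothing in it comes from the filing. The function name `balance_slices`, the slice names, and the cost numbers are all hypothetical, chosen only to show what "keeping the cores busy" looks like in code.

```python
import heapq


def balance_slices(slice_costs, num_engines):
    """Assign model slices to engines with a greedy least-loaded heuristic.

    Illustrative only: a generic textbook scheduler, NOT the patented method.
    slice_costs maps a hypothetical slice name to an estimated cost.
    Returns (assignment, loads), both keyed by engine index.
    """
    # Min-heap of (current_load, engine_index): the least-loaded engine pops first.
    heap = [(0, engine) for engine in range(num_engines)]
    heapq.heapify(heap)
    assignment = {engine: [] for engine in range(num_engines)}

    # Place the most expensive slices first so the small ones fill the gaps.
    for name, cost in sorted(slice_costs.items(), key=lambda kv: -kv[1]):
        load, engine = heapq.heappop(heap)
        assignment[engine].append(name)
        heapq.heappush(heap, (load + cost, engine))

    loads = {engine: load for load, engine in heap}
    return assignment, loads


# Hypothetical slices of one model, with made-up cost estimates.
costs = {"conv1": 8, "conv2": 7, "attn": 6, "mlp": 5, "softmax": 4}
assignment, loads = balance_slices(costs, num_engines=2)
print(assignment)
print(loads)
```

With these made-up costs, one engine ends up with 17 units of work and the other with 13, against a total of 30. A real kernel-level scheduler would also have to respect data dependencies between slices, memory limits, and thermal state, none of which this sketch models.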
Why does a scheduling patent matter more than another model announcement? Because the bottleneck in consumer AI has quietly moved. The models that run on a phone are largely commoditized techniques; the differentiator is whether the hardware can run them fast and cool enough that a user never notices. A grant on splitting work across accelerator cores is a claim on that differentiator — the efficiency layer, not the intelligence layer.
Apple's broader on-device posture shows up elsewhere in its portfolio too. The company holds US11893486B2 (issued February 6, 2024) on digital watermarking of machine-learning models — a method for marking a model so its provenance can be checked — and US12651020B2, a June 2026 grant titled "Digital assistant intelligence engine." Together they sketch a company building the plumbing around on-device models: how to run them, how to mark them, how to wire them into an assistant.
The discipline here, in this column's house style: a granted claim covers its specific language, not the entire idea of "AI on a phone." None of these patents proves Apple ships exactly the claimed method in a given product. But read together they answer the question the marketing dodges — what is actually novel about on-device intelligence? Less the model, more the scheduler that keeps the silicon fed.