A developer has found a way to push Apple’s M4 Neural Engine beyond its official job description, enabling local AI training on MacBook Air M4 systems instead of just inference. The workaround is still an unofficial hack, but it hints at a bigger truth: Apple’s chips may be more capable than the software stack is willing to admit.
The project, shared by security researcher and developer @0x0SojalSec, reportedly unlocks up to 15.8 TFLOPS of compute for full training workloads, including backpropagation on transformer models. That is the kind of number that makes local experimentation much less annoying for smaller AI models, especially if you are tired of renting cloud GPUs just to test an idea.
How the Neural Engine workaround works
Rather than going through Apple’s public Core ML or Metal APIs, the project reportedly uses custom programs written in Model Intermediate Language (MIL), the intermediate representation used by Apple’s Core ML toolchain, to talk directly to the Neural Engine. It keeps data in RAM to avoid slow storage writes, which helps the process stay fast and stable. When training gets stuck, an exec() restart trick is used to recover, checkpoint, and continue.
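The restart trick is easier to picture with a small sketch. The Python below is not the project’s code; it only illustrates the general checkpoint-then-exec pattern, in which a process saves its progress, replaces itself with a fresh copy, and resumes from the saved state. The checkpoint path, the function names, and the placeholder "training step" are all hypothetical, and writing the checkpoint to a file is an assumption made here so that state survives the exec call.

```python
import json
import os
import sys
import tempfile

# Hypothetical checkpoint location, chosen only for this sketch.
CHECKPOINT = os.path.join(tempfile.gettempdir(), "ane_sketch_checkpoint.json")


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {"step": 0, "loss": None}


def save_checkpoint(path, state):
    """Write to a temporary file, then rename, so a crash cannot leave a half-written file."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def restart_self():
    """Replace the current process with a fresh copy of the same script."""
    os.execv(sys.executable, [sys.executable] + sys.argv)


def train(total_steps, restart_every, path=CHECKPOINT, restart=restart_self):
    """Run a placeholder loop that checkpoints and restarts every `restart_every` steps."""
    state = load_checkpoint(path)  # resume where the previous process stopped
    while state["step"] < total_steps:
        state["step"] += 1
        state["loss"] = 1.0 / state["step"]  # stand-in for a real training step
        if state["step"] % restart_every == 0 and state["step"] < total_steps:
            save_checkpoint(path, state)
            restart()  # with the default, execution does not return from here
    save_checkpoint(path, state)
    return state
```

Calling train(1000, 100) from a script would run 100 steps, save, re-exec, and pick up at step 101 in the new process. Because exec discards everything held in memory, whatever must survive the restart has to be stored somewhere outside the process first.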
That approach is clever for another reason: it sidesteps the neat, polished path Apple prefers and goes straight at the hardware. Apple has positioned the Neural Engine mainly as an inference block, advertising up to 38 TOPS for that purpose, so this sort of reverse engineering is exactly the kind of thing Cupertino would rather keep in the lab notebooks.
What the early tests show
According to the shared project, early runs can complete transformer training steps in just milliseconds on M4 chips. The code is already on GitHub, and the appeal is obvious: local AI work becomes more private, cheaper, and less dependent on someone else’s server queue.
- Reported compute unlocked: 15.8 TFLOPS
- Workload type: training, including backpropagation on transformer models
- Apple’s official position: Neural Engine tuned for inference, not training
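For a rough sense of what the reported figure means in practice, a back-of-envelope estimate helps. The sketch below uses the common rule of thumb that one training step costs about 6 floating-point operations per parameter per token (forward plus backward pass). The model size, token count, and utilization values are hypothetical and do not come from the project; only the 15.8 TFLOPS figure is taken from the report.

```python
# Reported peak figure from the project, in floating-point operations per second.
PEAK_FLOPS = 15.8e12


def training_step_flops(params, tokens):
    """Approximate cost of one training step: about 6 FLOPs per parameter per token."""
    return 6 * params * tokens


def ideal_step_seconds(params, tokens, peak_flops=PEAK_FLOPS, utilization=1.0):
    """Estimated step time if the hardware sustains `utilization` of its peak rate."""
    return training_step_flops(params, tokens) / (peak_flops * utilization)


# Hypothetical example: a 10-million-parameter model processing 256 tokens per step.
best_case = ideal_step_seconds(10_000_000, 256)
at_ten_percent = ideal_step_seconds(10_000_000, 256, utilization=0.1)
print(f"best case: {best_case * 1e3:.2f} ms, at 10% utilization: {at_ten_percent * 1e3:.2f} ms")
```

Under these assumptions the step lands at roughly 1 ms at full utilization and roughly 10 ms at a tenth of it, which is at least consistent with the millisecond-scale steps described for small models. Real throughput depends on memory traffic and on how much of the peak rate is actually sustained.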
Why this matters for Apple users
Apple has spent years selling the idea that its silicon can do serious AI work, but under tightly managed software rules. This hack shows the control layer matters almost as much as the chip itself, and it raises a simple question: if hobbyists can get this far now, what happens once developers have had more time to poke at the M4 and beyond?
The near-term winner here is the local AI crowd, especially people building smaller models for Macs and iPads. The loser is the old assumption that Apple’s Neural Engine is useful only in the narrow lane Apple drew for it.

