Xiaomi has turned up the volume on its AI ambitions again, launching an UltraSpeed mode for MiMo-V2.5-Pro, the flagship model in its MiMo family. UltraSpeed pushes generation speed past 1,000 tokens per second on general-purpose GPUs, but the trade-off is simple: it costs 3 times the price of standard API access.

That is the kind of upgrade that makes sense only if you are actually using the model at scale. For casual experimentation, the bill will sting; for enterprises chasing lower latency, the speed jump is the selling point. Xiaomi is also keeping access tight for now, which suggests the company knows this is as much a capacity play as a product launch.

MiMo-V2.5-Pro UltraSpeed performance

Xiaomi says MiMo-V2.5-Pro was developed with TileRT and reaches the new pace through what it calls “joint design” of the model and its base system. The figure to beat inside Xiaomi’s own lineup is MiMo-V2-Flash, which the company said was generating 150 tokens per second when it launched in December 2025.

  • MiMo-V2.5-Pro UltraSpeed: more than 1,000 tokens per second
  • Speed: about 10 times faster than standard MiMo-V2.5-Pro API access
  • Price: 3 times higher than the standard API
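
To put those three figures side by side, here is a minimal back-of-the-envelope sketch. It treats 1,000 tokens per second as a lower bound, derives the standard API rate from the "about 10 times" claim (an implied number, not one Xiaomi has published), and uses a hypothetical 2,000-token response. It also assumes the 3x multiplier applies to the same number of tokens, which the announcement does not spell out.

```python
def generation_seconds(tokens: int, tokens_per_second: float) -> float:
    """Time to stream `tokens` output tokens at a steady generation rate."""
    return tokens / tokens_per_second


# Figures quoted in the article.
ULTRASPEED_TPS = 1000.0   # "more than 1,000 tokens per second" (lower bound)
SPEEDUP = 10.0            # "about 10 times faster" than the standard API
PRICE_MULTIPLIER = 3.0    # 3 times the standard API price

# Assumption: the standard rate is implied by the speedup, not published.
STANDARD_TPS = ULTRASPEED_TPS / SPEEDUP

# Assumption: a hypothetical response length chosen only for illustration.
RESPONSE_TOKENS = 2000

standard_s = generation_seconds(RESPONSE_TOKENS, STANDARD_TPS)
ultra_s = generation_seconds(RESPONSE_TOKENS, ULTRASPEED_TPS)
seconds_saved = standard_s - ultra_s

# Speed gained per unit of extra price, under the assumptions above.
speed_per_price = SPEEDUP / PRICE_MULTIPLIER

print(f"standard:   {standard_s:.1f} s")
print(f"UltraSpeed: {ultra_s:.1f} s (saves {seconds_saved:.1f} s)")
print(f"speedup per price multiple: {speed_per_price:.2f}x")
```

Under these assumptions the same response arrives in about a tenth of the time for three times the money, which is why the mode only pays off when waiting is the expensive part of the workload.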

For context, 150 tokens per second is already fast enough to outrun human reading speed by a wide margin. Xiaomi is now trying to make “fast enough” look quaint, which is exactly what AI vendors do when they want developers to notice the infrastructure story as much as the model itself.

Who can try Xiaomi MiMo-V2.5-Pro UltraSpeed from 9 to 23 June 2026

The UltraSpeed trial runs as an application-based test from 9 to 23 June 2026. Xiaomi says it will favor enterprises and professional developers with genuine high-speed needs, which is a polite way of saying random curiosity is not the target customer here.

Approved users get two weeks of free chat access, but with guardrails: 10 queued requests per account per day, sessions capped at 30 minutes, and automatic resource release after 5 minutes of inactivity. That kind of rationing usually signals scarce compute, and in AI that scarcity often matters more than the marketing slogan.
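
The trial rules above can be written down as a small set of checks. This is only an illustrative sketch: the class and function names are hypothetical, the timestamps are plain seconds, and whether Xiaomi treats each limit as inclusive or exclusive at the boundary is an assumption, since the announcement does not say how the limits are enforced.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrialLimits:
    """Limits quoted in the article; names are hypothetical."""
    daily_queued_requests: int = 10   # per account per day
    session_cap_s: int = 30 * 60      # sessions capped at 30 minutes
    idle_release_s: int = 5 * 60      # resources released after 5 idle minutes


def may_queue(requests_today: int, limits: TrialLimits = TrialLimits()) -> bool:
    """True if the account is still under its daily queued-request quota."""
    return requests_today < limits.daily_queued_requests


def session_end_reason(
    started_at: float,
    last_activity_at: float,
    now: float,
    limits: TrialLimits = TrialLimits(),
) -> Optional[str]:
    """Return why a session should end, or None if it may continue.

    Assumption: both limits trigger once the elapsed time reaches the cap.
    """
    if now - started_at >= limits.session_cap_s:
        return "session_cap"
    if now - last_activity_at >= limits.idle_release_s:
        return "idle_release"
    return None
```

For example, a session that started at second 0 and last saw activity at second 0 would be released at second 300, while one that stays active would still be cut off at second 1,800.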

Xiaomi’s MiMo lineup keeps expanding

MiMo is Xiaomi’s open-source family of multimodal large language models, designed to handle text, images, and audio while also mimicking human-style reasoning on complex tasks. The company previously made MiMo-V2-Flash openly available, and UltraSpeed now gives the lineup a more premium tier aimed at heavier users.

The bigger question is whether Xiaomi can keep enough high-speed capacity available once more developers start knocking. If the trial goes smoothly, UltraSpeed looks like the sort of feature that could become a serious differentiator for Xiaomi’s AI business; if it does not, it will be remembered as a very fast way to find out where the bottlenecks live.

Source: Ixbt
