Xiaomi’s MiMo model hits 1,000 tokens per second

Xiaomi has pushed its MiMo large language model family into eye-watering speed territory: MiMo-V2.5-Pro now has an UltraSpeed mode that the company says breaks the 1,000-tokens-per-second barrier. Built with TileRT and designed to run on general-purpose GPUs, the 1-trillion-parameter model is being pitched as a win for system-and-model co-design rather than just a raw model upgrade. […]