MiMo-V2.5-Pro-UltraSpeed: 1T model with 1000 tokens per second (mimo.xiaomi.com) AI

Xiaomi has released MiMo-V2.5-Pro-UltraSpeed, a 1-trillion-parameter model that it claims reaches decode speeds of up to roughly 1,000 tokens per second. The claimed speed is attributed to a collaboration with TileRT and relies on FP4 quantization and DFlash speculative decoding on commodity GPUs. The associated API is offered at a limited-time promotional price from June 9 to 23, 2026, with access granted by application; trial chat access is available during the same window.

June 08, 2026 15:45 Source: Hacker News