MTG Bench: Testing how well LLMs can play Magic (mtgautodeck.com) AI

The article presents “MTG Bench,” a benchmark that tests multiple LLMs on simulated Magic: The Gathering turns, using an MCP-based library for deck operations. It reports overall scores and cost per turn, with gpt-5.5 medium leading at 95.4, and discusses common failure modes such as illegal move simulations and tool-call mistakes.

June 12, 2026 00:15 Source: Hacker News