Benchmarks in Leipzig (arxiv.org) AI

“Benchmarks in Leipzig” presents a new dataset of 100 research-level mathematics questions, compiled by 49 mathematicians during a three-day workshop in Leipzig, and tracks outcomes across multiple stages of LLM evaluation.

June 06, 2026 14:10 Source: Hacker News