Show HN: Mdarena – Benchmark your Claude.md against your own PRs (github.com)
mdarena is an open-source tool that benchmarks CLAUDE.md instruction files against real merged PRs mined from your own codebase. For each PR, it runs the generated patch against the repo’s actual test suite and compares it with the gold diff, meaning the change that was actually merged. It reports test pass/fail, patch overlap with the gold diff, and token and cost metrics, and it uses history-isolated checkouts to keep information about the merged change from leaking into the run. The project also includes a SWE-bench-compatible workflow and reports mixed results from consolidating guidance rather than using per-directory instructions.
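To make the "patch overlap" idea concrete, here is a minimal sketch of one way such a metric could be computed: a Jaccard similarity over the added and removed lines of two unified diffs. This is an illustrative assumption, not mdarena's actual implementation; the function names `changed_lines` and `patch_overlap` and the sample diffs are hypothetical.

```python
from __future__ import annotations


def changed_lines(diff_text: str) -> set[tuple[str, str]]:
    """Collect (sign, content) pairs for added/removed lines in a unified diff."""
    changes = set()
    for line in diff_text.splitlines():
        # Skip file headers ("--- a/x", "+++ b/x"); they are not content changes.
        # Simplification: a changed line whose content itself starts with "--"
        # or "++" would also be skipped here.
        if line.startswith(("+++", "---")):
            continue
        if line.startswith(("+", "-")):
            changes.add((line[0], line[1:].strip()))
    return changes


def patch_overlap(generated: str, gold: str) -> float:
    """Jaccard similarity of changed lines (hypothetical metric, 0.0 to 1.0)."""
    gen, ref = changed_lines(generated), changed_lines(gold)
    if not gen and not ref:
        return 1.0  # two empty patches are treated as identical
    return len(gen & ref) / len(gen | ref)


# Hypothetical sample diffs: the generated patch makes the same fix as the
# gold diff and adds one extra comment line.
GOLD = """\
--- a/app.py
+++ b/app.py
@@ -1,2 +1,2 @@
-def add(a, b): return a - b
+def add(a, b): return a + b
"""

GENERATED = """\
--- a/app.py
+++ b/app.py
@@ -1,2 +1,3 @@
-def add(a, b): return a - b
+def add(a, b): return a + b
+# fixed sign
"""

if __name__ == "__main__":
    print(f"overlap: {patch_overlap(GENERATED, GOLD):.2f}")
```

A line-set comparison like this ignores ordering and file paths, so it is only a rough signal. That is one reason a benchmark would pair it with test pass/fail rather than rely on overlap alone.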
April 06, 2026 01:06
Source: Hacker News