Larger and more instructable language models become less reliable (pmc.ncbi.nlm.nih.gov)

The article reports that as large language models are scaled up and "shaped" with instruction tuning and human feedback, they become less reliably aligned with human expectations. In particular, the models increasingly produce plausible-sounding but incorrect answers, including on difficult questions where human supervisors are likely to miss the error, even though they have become more stable under minor rephrasings of the same prompt. The authors argue that AI design needs a stronger focus on predictable error behavior, especially in high-stakes applications.

April 08, 2026 02:47 Source: Hacker News