Simple self-distillation improves code generation (arxiv.org)

The paper proposes “simple self-distillation,” in which an LLM is fine-tuned on its own sampled code outputs via standard supervised training, with no separate teacher model or verifier. Experiments report that this boosts Qwen3-30B-Instruct’s LiveCodeBench v6 pass@1 from 42.4% to 55.3%, with larger improvements on harder tasks and gains that transfer across Qwen and Llama model sizes. The authors attribute the improvement to self-distillation reshaping token distributions in a way that reduces precision-related errors while preserving useful exploration diversity.
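The pipeline described above can be sketched as a short loop: sample candidate solutions from the model itself, pair them with their prompts, and reuse those pairs as an ordinary supervised fine-tuning dataset. The helpers below (`sample_completions`, the record format) are hypothetical stand-ins, not the paper's actual code, and the sampler is stubbed so the sketch is self-contained.

```python
# Minimal sketch of "simple self-distillation": build an SFT dataset
# from the model's OWN samples, with no teacher model and no verifier.
# `sample_completions` is a hypothetical stand-in for real LLM sampling.

def sample_completions(prompt: str, k: int = 4, temperature: float = 0.8) -> list[str]:
    # Stand-in for sampling k code completions from the model;
    # a real implementation would call the LLM with the given temperature.
    return [f"# candidate {i} for: {prompt}" for i in range(k)]

def build_self_distillation_set(prompts: list[str], k: int = 4) -> list[dict]:
    # No filtering or verification step: every sampled output is kept
    # and paired with its prompt as a supervised training example.
    dataset = []
    for prompt in prompts:
        for completion in sample_completions(prompt, k):
            dataset.append({"prompt": prompt, "target": completion})
    return dataset

prompts = ["reverse a linked list", "two-sum in O(n)"]
sft_data = build_self_distillation_set(prompts, k=2)
# Each record then feeds a standard next-token supervised training loop.
```

The key point the sketch captures is that the "teacher" signal is just the model's own sampling distribution, so the whole procedure reduces to ordinary supervised fine-tuning on self-generated data.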

April 04, 2026 17:18 Source: Hacker News