Running Claude Code Offline on an M3 Pro with Qwen3.6 (har-ki.github.io) AI

The article explains how to run Claude Code locally in an air-gapped setup on an Apple M3 Pro, using Ollama and a Qwen3.6 35B MoE model. It walks through the configuration step by step and describes four key fixes that prevent timeouts and make settings such as “no thinking” take effect on the MLX runner. Once configured, the article reports, performance is limited mainly by hardware-bound prefill time for a 32K context window: memory bandwidth and the amount of GPU-visible unified memory determine how fast sessions complete.
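
As a rough illustration of why prefill dominates, the time to ingest a full prompt is approximately the number of prompt tokens divided by the prefill throughput. The sketch below applies that arithmetic to a 32K window. The throughput values are hypothetical placeholders, not measurements from the article, and `prefill_seconds` is an illustrative helper rather than part of Ollama or Claude Code.

```python
def prefill_seconds(prompt_tokens: int, tokens_per_second: float) -> float:
    """Estimate the time to process a prompt before the first output token.

    Simple model: prefill time = prompt tokens / prefill throughput.
    It ignores caching, batching effects, and model load time.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return prompt_tokens / tokens_per_second


# 32K context window, as mentioned in the summary.
CONTEXT_TOKENS = 32 * 1024

# Hypothetical prefill throughputs in tokens/second (assumptions for
# illustration only; substitute figures measured on your own hardware).
HYPOTHETICAL_RATES = (100.0, 250.0, 500.0)

for rate in HYPOTHETICAL_RATES:
    seconds = prefill_seconds(CONTEXT_TOKENS, rate)
    print(f"{rate:>6.0f} tok/s -> {seconds:7.1f} s to prefill {CONTEXT_TOKENS} tokens")
```

Under this model, doubling prefill throughput halves the wait for a full window, which is consistent with the summary's point that hardware rather than configuration sets the ceiling once the setup works.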

June 11, 2026 21:37 Source: Hacker News