AI

Summary

Generated about 12 hours ago.

What stood out in June

  • Frontier access and regulation tightened. Multiple reports say U.S. actions led to Anthropic suspending access to Fable 5/Mythos 5 for foreign nationals; related coverage also highlighted export-control triggers tied to Amazon-linked discussions (e.g., The Verge, Axios). States also investigated OpenAI (e.g., Reuters).
  • Agentic AI, reliability, and cost pressures. Articles and tooling emphasized agent workflows (memory/knowledge formats, coding loops) while others warned about hidden costs, reliability drift, and governance/guardrail limits.
  • Health, education, and safety debates broadened. Coverage ranged from AI toys for kids to AI use in policing/courts and learning outcomes.

Model releases

Stories

Automating Myself Out of Development (thoughtfultechnologist.com) AI

Nune Isabekyan describes progressively automating her own Claude Code development workflow by moving from interactive local sessions to an EC2-based, scheduled, GitHub-issue “planning board” system that runs via a cron-driven daemon and stops at labeled checkpoints for her review.

Shepherd's Dog: A Game by the Most Dangerous AI Model (koenvangilst.nl) AI

Koen van Gilst reports testing an Anthropic AI model he calls “too dangerous,” saying that after a long reasoning session and significant token cost it generated a complete “Shepherd’s Dog” game as a single 2,319-line HTML file, which he says matches his vision and is the first time an AI produced it in one go.

TycoonLE: A Jax reinforcement learning environment for long-horizon planning (github.com) AI

TycoonLE is a JAX-based reinforcement learning environment for economically grounded, long-horizon planning in a simulated logistics/transport economy, where agents allocate capital, build routes, move cargo, manage debt, and optimize delayed rewards. The repo emphasizes action legality and uses a fixed-shape interface designed to work with JAX transformations (e.g., jit/vmap/scan), along with an audit/replay UI to inspect route choices, cargo flows, financing behavior, and profit over time. It also includes TycoonBench, a companion benchmark report for comparing performance on TycoonLE planning tasks.

Open Source AI Must Win (opensourceaimustwin.com) AI

The article argues that open-source AI is essential for “operational freedom,” warning that if intelligence is only available through a few closed institutions, society could lose the ability to study, deploy, audit, and adapt AI systems without permission.

US Government directive to suspend access to Fable 5 and Mythos 5 (anthropic.com) AI

Anthropic says a US government export control directive requires it to suspend access to its Fable 5 and Mythos 5 models for all users, including non-US users, after the government raised concerns about a potentially narrow “jailbreak” method that could allow reading a specific codebase to find software flaws. The company says its “defense in depth” safeguards make universal jailbreaks unlikely and that the reported capability appears widely available in other models, adding that it is working to restore access and plans to share more details soon.

New AI model tracked: Moonshot AI Kimi K2.7 Code (llm-stats.com) AI

LLM-stats.com reports that MoonshotAI released “Kimi K2.7 Code,” a June 12, 2026 multimodal, coding-focused model built on Kimi K2.6, with claims of improved long-horizon coding completion and instruction following plus lower “thinking-token” usage. The page lists pricing starting at $0.95 per million input tokens and $4.00 per million output tokens, and notes a Modified MIT license that restricts commercial use.

Can I Buy Your KV Cache? (arxiv.org) AI

The arXiv paper “Can I Buy Your KV Cache?” proposes that publishers precompute and sell a model’s key-value (KV) cache for documents so agents can skip repeated, compute-heavy “prefill,” while claiming token-exact reuse with no accuracy loss; it argues compute savings can outweigh KV shipping costs if the cache is hosted server-side like prompt-caching.

From AGI to ASI (arxiv.org) AI

The arXiv report “From AGI to ASI” examines how AI might progress after reaching human-level AGI, outlining the transition to artificial general superintelligence and four possible pathways (scaling, paradigm shifts, recursive improvement, and multi-agent collectives) along with potential frictions and bottlenecks.

Why the AI Renaissance Keeps Not Arriving (jamesfbaker.substack.com) AI

The newsletter argues that today’s AI systems cause “manifold collapse,” producing individually strong but increasingly similar outputs because post-training (e.g., RLHF) steers models toward a narrow set of high-reward behaviors, which can reduce idea diversity and synchronize mistakes across society, stalling any true “renaissance” of expanding frontiers.

How to Setup a Local Coding Agent on macOS (ikyle.me) AI

The article walks through setting up a local “coding agent” on macOS by running llama.cpp (with Metal acceleration) as an OpenAI-compatible /v1 server using Gemma 4 26B plus an MTP speculative-decoding draft model, then connecting it to Pi configured to accept both text and images via the Gemma 4 multimodal projector.

My Claude Code Setup (illuminatedcomputing.com) AI

The post describes how the author runs “Claude Code” with a dedicated Linux user account to avoid giving the model direct access to the author’s secrets, while still allowing typical developer workflows via separate SSH/Git/Postgres setups and tmux sessions; the author also discusses the trade-offs and remaining concerns around privilege-escalation and Docker, considering whether a VM might be safer in some cases.

AI Engineering the Acceleration Whiplash (faros.ai) AI

Faros Research’s AI Engineering Report 2026 argues that rapid AI adoption has created an “acceleration whiplash,” with code throughput rising but downstream reliability costs increasing, including a sharp rise in incidents, bugs, churn, review delays, and more PRs merged without review.

Maxproof (arxiv.org) AI

The arXiv paper “MaxProof” proposes a population-level test-time scaling approach for generating mathematical proofs, using an M3 model trained on proof generation, verification, and critique-conditioned repair. At test time, it treats the model as generator/verifier/refiner/ranker, searches over many candidate proofs, and selects the final result via tournament selection—reporting 35/42 on IMO 2025 and 36/42 on USAMO 2026.

AI Economics for Dummies (mcsweeneys.net) AI

McSweeney’s “AI Economics for Dummies” uses a set of exaggerated, humorous examples to satirize how AI companies’ business models and finance metrics—especially around IPO hype—are often presented to the public.

Kimi K2.7-Code: open-source coding model with better token efficiency (huggingface.co) AI

Moonshot AI’s Kimi K2.7-Code is an open-source, coding-focused agentic model built on Kimi K2.6, claiming improved token efficiency (about 30% fewer thinking tokens) and better performance on long-horizon software engineering tasks. The Hugging Face model card includes deployment and usage examples via Transformers, vLLM, and SGLang, along with reported benchmark results and specifications such as a 256K context length and native INT4 quantization.

The Future of Email (fastmail.com) AI

Fastmail argues that as AI assistants increasingly read, filter, and act on emails, verifying sender identity through email authentication standards—SPF, DKIM, and DMARC—will become crucial to prevent spoofing and phishing before messages reach users’ inboxes.