AI

June 14, 2026

Stories

Formal Methods and the Future of Programming (blog.janestreet.com) AI

Jane Street’s Yaron Minsky argues that although the firm previously found full formal methods too costly for the benefit they delivered, recent advances in agentic coding are shifting the tradeoff in two ways: agent output is messy and prone to breaking invariants, which increases the need for verification, and formal methods can serve as a powerful feedback mechanism alongside testing and type systems. He says the company is now building a team focused on formal methods, drawing on its control over the language (including OxCaml) and a user base that can support both near-term improvements and longer-term proof-oriented directions, with hiring planned in London and New York.

Reinventing Control Theory One Feature at a Time: The Fallacy of Agentic Loops (medium.com) AI

The article argues that “agentic loops” in AI coding—adding agents to monitor, review, and iterate over each other’s work—amount to a fragmented, hype-driven rediscovery of control theory without the full methodology needed for safe, reliable operation. It warns that probabilistic agents validating one another are not automatically a robust control system unless stop conditions, trusted signals, authority, boundaries, and fallback paths are explicitly designed, and it urges teams and leadership to address hard operational and financial questions before deploying such loops.

Frontier AI companies will never exceed the capability frontier again (andrewtrask.substack.com) AI

The Substack post argues that “frontier” AI companies will never again exceed the capability frontier, claiming that ensembles and decentralized networks of smaller models increasingly outperform single top-tier systems on speed, accuracy, and cost, thanks to scaling and ensemble effects and to inference-efficiency improvements such as caching and indexing.

Don't trust large context windows (garrit.xyz) AI

Garrit argues that LLMs have a “smart” attention region and a “dumb” region within the context window, so advertised context sizes (100k+ to millions of tokens) are largely marketing, and effective performance drops as the window fills, especially for coding agents. The post suggests avoiding the degraded region by restarting sessions and handing off stable written artifacts (specs/plans/skills), rather than relying on auto-compaction summaries that are produced only after degradation has set in.

Show HN: I run a vision model on every screenshot, locally, on a 4GB GPU (github.com) AI

ScreenMind (open source) is presented as a privacy-first “screen memory” that captures screenshots when the screen changes, analyzes them locally with Gemma 4 multimodal capabilities (plus OCR and semantic embeddings), and lets users search and chat over their screen history. The project claims all processing runs on-device with no telemetry after the initial model download, offers modes for faster vs deeper analysis, and includes features like voice memo/meeting transcription, analytics, and integrations via an MCP server and other tools.

Making Claude a Chemist (anthropic.com) AI

Anthropic says its Claude models are increasingly useful for chemistry, citing tests on NMR spectroscopy tasks that compared predictions from multiple Claude versions (Opus 4.7/4.6, Sonnet 4.6) against dedicated NMR tools, using data from 20 recently published compounds. The company reports that Opus 4.7 produced notably accurate 1D NMR peak positions and splitting patterns. It also performed “inverse” structure elucidation from NMR peak lists plus a molecular formula (and, for harder cases, an added starting-material hint), reaching the correct structure in all of the simpler cases and in most of the harder ones.

The future of Siri, or: why private inference isn’t private enough (blog.cryptographyengineering.com) AI

Cryptography engineer Matthew Green argues that Apple’s planned “private” Siri/AI via Private Cloud Compute and confidential inference may limit direct access by Apple and Google, but privacy is not assured once Siri-style agents must interact with external services for real-world actions, creating new avenues for data leakage through queries and the agent’s discretion.

LLMs Pre-Commodify Ideas (summerlightning.substack.com) AI

The post argues that ideas generated and shared through LLMs are increasingly “pre-commodified”: they reach multiple people at around the same time, with unclear provenance, because the models recombine temporally deep training data into a shared latent space. The author contrasts “Boomers” (legal/slow, consistent origin claims) with “Sooners” (front-running ideas and profiting from later diffusion), and suggests that establishing provenance, and handling new threats like data poisoning, will become central complements to AI deployment and distribution.

Rio 3.5 Open 397B – from Rio de Janeiro's city government (huggingface.co) AI

A Hugging Face model card describes Rio 3.5 Open 397B, an open, multimodal “frontier-class” AI model from Rio de Janeiro’s municipal IT company IplanRIO, post-trained from Qwen 3.5 397B and released under the MIT license. The card highlights its SwiReasoning framework for dynamically switching between latent and explicit reasoning to improve the accuracy/efficiency trade-off, lists key model specs (Mixture-of-Experts with ~397B total/~17B active parameters and a ~1M token context window), and provides benchmark results plus implementation examples for Transformers, vLLM, and SGLang.

Chatbot teddies for three‑year‑olds? Why AI toys are risky for kids (rnz.co.nz) AI

RNZ reports on research and concerns that AI-powered toys like chatbot teddies may be especially risky for very young children, because their “human-like” and overly validating language can build strong emotional trust and attachment. The article also warns about “infinite chat” driving prolonged engagement, potential exposure to adult topics, and privacy/data-collection issues from seemingly personal conversations, particularly when toys are used without adult supervision.