Reinventing Control Theory One Feature at a Time: The Fallacy of Agentic Loops
(medium.com)
AI
The article argues that “agentic loops” in AI coding—adding agents to monitor, review, and iterate over each other’s work—amount to a fragmented, hype-driven rediscovery of control theory without the full methodology needed for safe, reliable operation. It warns that probabilistic agents validating one another are not automatically a robust control system unless stop conditions, trusted signals, authority, boundaries, and fallback paths are explicitly designed, and it urges teams and leadership to address hard operational and financial questions before deploying such loops.
Frontier AI companies will never exceed the capability frontier again
(andrewtrask.substack.com)
AI
The Substack post argues that “frontier” AI companies will no longer be able to surpass today’s capability frontier, claiming that ensembles and decentralized networks of smaller models increasingly outperform single top-tier systems on speed, accuracy, and cost, thanks to scaling and ensemble effects and to inference-efficiency improvements such as caching and indexing.
Don't trust large context windows
(garrit.xyz)
AI
Garrit argues that LLMs have a “smart” attention region and a “dumb” region within the context window, so advertised context sizes (100k+ to millions of tokens) are often mostly marketing, and effective performance drops as the window fills, especially for coding agents. The post suggests avoiding the degraded region by restarting sessions and handing off stable written artifacts (specs/plans/skills) rather than relying on auto-compaction summaries, which are produced only after degradation has set in.
Show HN: I run a vision model on every screenshot, locally, on a 4GB GPU
(github.com)
AI
ScreenMind (open source) is presented as a privacy-first “screen memory” that captures screenshots when the screen changes, analyzes them locally with Gemma 4 multimodal capabilities (plus OCR and semantic embeddings), and lets users search and chat over their screen history. The project claims all processing runs on-device with no telemetry after the initial model download, offers modes for faster vs deeper analysis, and includes features like voice memo/meeting transcription, analytics, and integrations via an MCP server and other tools.
Making Claude a Chemist
(anthropic.com)
AI
Anthropic says its Claude models are increasingly useful for chemistry, citing tests on NMR spectroscopy tasks that compared predictions from multiple Claude versions (Opus 4.7/4.6, Sonnet 4.6) against dedicated NMR tools using data from 20 recently published compounds. The company reports that Opus 4.7 produced notably accurate 1D NMR peak positions and splitting patterns, and also performed “inverse” structure elucidation from NMR peak lists plus formula (and, for harder cases, an added starting-material hint), reaching correct structures in all simpler cases and in most harder ones.
The future of Siri, or: why private inference isn’t private enough
(blog.cryptographyengineering.com)
AI
Cryptography engineer Matthew Green argues that Apple’s planned “private” Siri/AI via Private Cloud Compute and confidential inference may limit direct access by Apple and Google, but privacy is not assured once Siri-style agents must interact with external services for real-world actions, creating new avenues for data leakage through queries and the agent’s discretion.
'Tell Him He's a Piece of Shit': Meta's New AI Unit Is a Total Mess
(wired.com)
AI
WIRED reports deep internal frustration within Meta’s newly formed Applied AI unit, tied to the company’s broader AI restructuring, including accounts of a chaotic employee-only livestream incident and widespread complaints that assigned tasks feel menial or soul-crushing.
LLMs Pre-Commodify Ideas
(summerlightning.substack.com)
AI
The post argues that ideas generated and shared through LLMs are increasingly “pre-commodified,” reaching multiple people around the same time with unclear provenance because the models recombine temporally deep training data into a shared latent space. The author contrasts “Boomers” (legal/slow, consistent origin claims) with “Sooners” (front-running ideas and profiting from later diffusion), and suggests that establishing provenance, along with new threats like data poisoning, will become central complements to AI deployment and distribution.
Rio 3.5 Open 397B – from Rio de Janeiro's city government
(huggingface.co)
AI
A Hugging Face model card describes Rio 3.5 Open 397B, an open, multimodal “frontier-class” AI model from Rio de Janeiro’s municipal IT company IplanRIO, post-trained from Qwen 3.5 397B and released under the MIT license. The card highlights its SwiReasoning framework for dynamically switching between latent and explicit reasoning to improve the accuracy/efficiency trade-off, lists key model specs (Mixture-of-Experts with ~397B total/~17B active parameters and a ~1M token context window), and provides benchmark results plus implementation examples for Transformers, vLLM, and SGLang.
Chatbot teddies for three‑year‑olds? Why AI toys are risky for kids
(rnz.co.nz)
AI
RNZ reports on research and concerns that AI-powered toys like chatbot teddies may be especially risky for very young children, because their “human-like” and overly validating language can build strong emotional trust and attachment. The article also warns about “infinite chat” driving prolonged engagement, potential exposure to adult topics, and privacy/data-collection issues from seemingly personal conversations, particularly when toys are used without adult supervision.
Human Routers of Machine Words
(borretti.me)
AI
The post argues that using AI to draft text is contemptible because it replaces the thinking that writing forces—clarifying ideas, exposing contradictions, and improving reasoning—and claims that readers must then skeptically judge AI-posed arguments since “ideas” cannot be separated from observable writing output.
Extinction-level capitalism
(matthewbutterick.com)
AI
The essay argues that AI is inherently political and, even without malfunctions or malicious actors, could erode liberal democracy by amplifying existing trends like capital concentration, leading to an irreversible shift in political and economic arrangements.
The Curse of Depth in Large Language Models
(arxiv.org)
AI
The arXiv preprint “The Curse of Depth in Large Language Models” argues that increasing model depth can have negative effects on large language models.
AI forgoes toxic positivity for neurodivergents
(medium.com)
AI
The piece argues that conventional productivity and planning tools fail many ADHD and autistic adults by demanding rigid routines that worsen “waiting mode,” guilt, and shame. It says neuro-affirming design should instead offer low-demand, judgment-free support—citing an experimental conversational AI called Neuro+ as a real-time cognitive companion.
PwC Report: AI Making Medical Bills Higher
(fortune.com)
AI
A new 60-page PwC report says hospitals’ use of AI—particularly for note-taking and coding—has been helping drive higher medical bills by enabling more granular, higher-severity diagnosis codes, even when patient care is unchanged.
US ban on Mythos is related to a jailbreak research by Amazon researchers
(timesofindia.indiatimes.com)
AI
The US ordered Anthropic to suspend access to its Fable 5 and Mythos 5 AI models over national security concerns, and the report says the issue traces back to a jailbreak approach tested by Amazon researchers using prompts to induce the models to reveal security vulnerabilities.
AI Coding at Home Without Going Broke
(stephen.bochinski.dev)
AI
The article outlines three ways to do AI coding at home affordably—self-hosting open-source models, renting models via API providers like OpenRouter, or using reduced-cost “min-maxed” subscriptions from OpenAI and Anthropic—and argues that a hybrid approach (frontier models for planning/specs, cheaper open-source models for routine tasks) can deliver team-level output for relatively low monthly costs.