Thoughts on starting new projects with LLM agents
(eli.thegreenplace.net)
AI
Eli Bendersky describes using LLM agents to build watgo from scratch, outlining a workflow that keeps changes reviewable (small changelists, human approval, local diff-based review). He stresses that maintainable results require strong test suites and careful iteration, and singles out Go as a particularly agent-friendly language.
Claude, Teach Me Something
(hugotunius.se)
AI
The post describes a “Teach me something” Claude workflow that replaces Reddit scrolling with Socratic, question-led lessons tailored to the user’s expertise areas, using prior chat history to avoid repetition and ending with primary sources to reduce hallucinations.
Biohub releases a world model of protein biology
(biohub.org)
AI
Biohub says it has released an open “world model of protein biology” built around three tools (ESMC, ESMFold2, and ESM Atlas) to predict protein structures and design new protein binders. The organization claims the models can map proteins across life, generate atom-resolved structures and interfaces, and, in reported experiments, produce lab-validated binders with high affinity and specificity against cancer and immunology targets in days. ESM Atlas is presented as a large-scale way to navigate learned protein relationships across billions of sequences and predicted structures.
Tokenomics: Quantifying Where Tokens Are Used in Agentic Software Engineering
(arxiv.org)
AI
The arXiv paper analyzes token consumption in an LLM-based multi-agent system for software engineering, using ChatDev execution traces across SDLC stages (design, coding, code completion, code review, testing, and documentation) to quantify input/output/reasoning costs. It reports that iterative code review is the dominant source of tokens (about 59.4%), and that input tokens make up the largest share overall (about 53.9%), suggesting that refinement and verification drive much of the expense rather than initial code generation.
How LLMs Actually Work
(0xkato.xyz)
AI
The article provides an introductory walkthrough of how modern transformer-based LLMs work, starting from tokenization (text to integer IDs) and embeddings, then adding positional encoding (including RoPE) so the model knows order, and finally explaining attention via Q/K/V vectors, scaled dot-product matching, softmax weighting, and causal masking for left-to-right generation.
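The attention steps named in that summary (Q/K/V vectors, scaled dot-product matching, softmax weighting, causal masking) can be sketched in a few lines of plain Python. This is a minimal single-head illustration, not code from the article: the function names `softmax` and `causal_attention` and the toy vectors are assumptions, and the learned projections that would normally produce Q, K, and V from embeddings are left out.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (shifted by the max for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention(q, k, v):
    """Single-head attention over lists of vectors, one vector per token position.

    Returns (outputs, weights). weights[i][j] is how much position i attends
    to position j; it is 0.0 for every j > i because of the causal mask.
    """
    d = len(q[0])  # query/key dimension, used for the 1/sqrt(d) scaling
    n = len(q)
    outputs, weights = [], []
    for i in range(n):
        # Causal mask: position i only scores positions 0..i, never the future.
        scores = [
            sum(a * b for a, b in zip(q[i], k[j])) / math.sqrt(d)
            for j in range(i + 1)
        ]
        w = softmax(scores)
        # Output is the weighted average of the visible value vectors.
        mixed = [
            sum(w[j] * v[j][c] for j in range(i + 1))
            for c in range(len(v[0]))
        ]
        outputs.append(mixed)
        weights.append(w + [0.0] * (n - i - 1))
    return outputs, weights


# Toy input: two token positions with 2-dimensional vectors (made-up numbers).
q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
outputs, weights = causal_attention(q, k, v)
```

The first position can only see itself, so its output is its own value vector. The second position blends both value vectors, leaning toward the key that best matches its query.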
Context Sculpting
(perceptiontheory.bearblog.dev)
AI
In a “vibe research” demo, the author tests “context sculpting,” in which a larger outer agent can rewrite a smaller inner agent’s working context between turns. In an initial conservative setup, the harness increased cost about 14x without performing any rewrites. After the outer prompts and tasks were made more intervention-heavy, it produced active context edits, using inject and compact rewrites to reduce noisy history and improve clarity. A coding repair case shows the approach can also substantially increase turns and time.
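The two rewrite kinds mentioned in that summary, inject and compact, can be pictured as plain operations on a message list. The sketch below is only an illustration of the idea under stated assumptions: `Message`, `compact`, and `inject` are hypothetical names, not the post's harness, and a real outer agent would decide when to apply them and write the summary text itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    role: str
    text: str


def compact(history, start, end, summary):
    """Replace the messages in history[start:end] with one summary message."""
    note = Message("system", f"[compacted] {summary}")
    return history[:start] + [note] + history[end:]


def inject(history, note):
    """Append a guidance note for the inner agent to read on its next turn."""
    return history + [Message("system", f"[injected] {note}")]


# Hypothetical inner-agent history with two noisy tool outputs in the middle.
history = [
    Message("user", "Fix the failing test in parser.py"),
    Message("tool", "...400 lines of test log..."),
    Message("tool", "...another 400 lines of test log..."),
    Message("assistant", "I will look at the tokenizer next."),
]
sculpted = compact(history, 1, 3, "Two test runs failed with the same error.")
sculpted = inject(sculpted, "Focus on the parser, not the tokenizer.")
```

Both functions return new lists and leave the original history untouched, which keeps each rewrite easy to diff and review.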
Law Professors Prefer AI over Peer Answers
(law.stanford.edu)
AI
A Stanford Law study with 16 U.S. law professors found that, in blinded evaluations of short-answer tutoring in contracts, professors rated LLM responses higher than peer answers (75.33% win rate) and rarely flagged them as harmful (3.53%), with performance comparable to the best instructors.
Universal Memory Protocol – a shared format for agent memory
(universalmemoryprotocol.io)
AI
Universal Memory Protocol (UMP) proposes a portable, signed, bi-temporal memory format and operation set for AI agents, aiming to standardize how agents carry and extend memory across sessions, agents, and vendors. The article positions UMP as the “third interoperability layer” after MCP (tools) and A2A (agent-to-agent coordination), and describes how it can run over existing transports (e.g., MCP and HTTP) with implementations and import bridges for existing stores like files, SQL, Redis, vector databases, and Recall.
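“Bi-temporal” generally means each record carries two timelines: when a fact was true in the world, and when the store learned it. The sketch below illustrates that general idea only. It is not UMP's schema or API: `MemoryRecord`, its field names, and `lookup` are hypothetical, timestamps are plain integers, signing is omitted, and conflicts are resolved by simply taking the most recently recorded match.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MemoryRecord:
    key: str
    value: str
    valid_from: int           # when the fact became true in the world
    valid_to: Optional[int]   # when it stopped being true (None = still true)
    recorded_at: int          # when the memory store learned this


def lookup(records, key, valid_at, known_at):
    """Answer: what did the store believe at `known_at` about `key` at time `valid_at`?"""
    candidates = [
        r for r in records
        if r.key == key
        and r.recorded_at <= known_at
        and r.valid_from <= valid_at
        and (r.valid_to is None or valid_at < r.valid_to)
    ]
    if not candidates:
        return None
    # Simplification: the most recently recorded matching belief wins.
    return max(candidates, key=lambda r: r.recorded_at).value


# Made-up example: at time 6 the agent learns the user moved at time 5.
records = [
    MemoryRecord("home_city", "Berlin", valid_from=1, valid_to=None, recorded_at=1),
    MemoryRecord("home_city", "Berlin", valid_from=1, valid_to=5, recorded_at=6),
    MemoryRecord("home_city", "Paris", valid_from=5, valid_to=None, recorded_at=6),
]
before_update = lookup(records, "home_city", valid_at=7, known_at=3)  # "Berlin"
after_update = lookup(records, "home_city", valid_at=7, known_at=6)   # "Paris"
```

Keeping the superseded record instead of overwriting it is what lets an agent reconstruct what it believed at an earlier point, separately from what later turned out to be true.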
Meta confirms 1000s of Instagram accounts were hacked by abusing its AI chatbot
(this.weekinsecurity.com)
AI
Meta says hackers abused a bug in its Instagram AI-assisted account recovery system to reset passwords on accounts without two-factor authentication, allowing takeover of Instagram accounts and linked details. The company notified at least 20,225 people that their accounts were compromised (including 30 in Maine), and says it has disabled the chatbot and removed the vulnerable code path while checking other chatbots.
AI Can't Care
(mooreds.com)
AI
Dan Moore argues that AI can help with drafting and fact-checking, but cannot replace “caring” and judgment—so writers should never publish AI-generated content without careful review to protect readers’ trust and time.
Trees to Flows and Back: Unifying Decision Trees and Diffusion Models
(arxiv.org)
AI
The arXiv paper “Trees to Flows and Back: Unifying Decision Trees and Diffusion Models” proposes a mathematical correspondence between hierarchical decision trees and diffusion processes, linking them through a shared optimization principle called Global Trajectory Score Matching (GTSM). It argues that (idealized) gradient boosting is asymptotically optimal under this view, and demonstrates practical outcomes via a tabular-generation method (treeflow) and a distillation approach (dsmtree) that transfers tree logic to neural networks with reported near-teacher performance on benchmarks.
Benchmarks in Leipzig
(arxiv.org)
AI
“Benchmarks in Leipzig” reports a new dataset of 100 research-level mathematics questions compiled by 49 mathematicians during a 3-day workshop in Leipzig, with outcomes tracked across multiple LLM evaluation stages.
Google will pay SpaceX $920M per month for compute
(techcrunch.com)
AI
Google will pay SpaceX $920 million per month from October 2026 through June 2029 for access to about 110,000 NVIDIA GPUs and related compute, citing surging demand for Gemini Enterprise. The agreement resembles SpaceX’s earlier compute deal with Anthropic and includes ramp-up and a cancellation option with 90 days’ notice after December 31, 2026.
AI Worm
(schneier.com)
AI
Schneier on Security reports that researchers have prototyped an AI-powered internet worm that includes its own LLM and uses it to run on compromised computers. The post likens the prototype to an early computer-worm concept from the 1970s.
Language models transmit behavioural traits through hidden signals in data
(nature.com)
AI
A paper in Nature reports that when a “teacher” model with an acquired behavioural trait (including animal-preference behaviours or misalignment) generates datasets whose contents are semantically unrelated to that trait (even just number sequences, code, or chain-of-thought traces), a “student” model fine-tuned on the filtered outputs can nonetheless acquire the teacher’s trait. The effect was found to depend on teacher and student starting from the same (or behaviourally matched) base models, and the authors provide a theoretical explanation that subliminal learning can arise under broad neural-network conditions. They argue this could matter for AI safety because distillation and model-to-model training may transmit hidden properties even if overt signs are removed.
OpenCV 5 Is Here: The Biggest Leap in Years for Computer Vision
(opencv.org)
AI
OpenCV 5, released June 4, 2026 (with a pip version planned for June 8), is positioned as a major modernization of the library, highlighted by a brand-new graph-based DNN engine with expanded ONNX operator support and features such as dynamic shapes, control-flow subgraphs, and attention/MatMul fusion. The post also describes improvements in hardware acceleration and 3D vision tooling, clearer Python integration, and updated documentation, while noting that the new DNN engine is CPU-only for now and can fall back to the classic engine via the same DNN API.
S&P 500 rejects SpaceX, also blocking entry for OpenAI and Anthropic
(arstechnica.com)
AI
S&P Dow Jones Indices rejected SpaceX’s request for accelerated entry into the S&P 500 and declined to waive its eligibility rules, a decision that also delays or prevents similar expedited access for AI firms OpenAI and Anthropic despite their expected IPOs. Requirements such as the seasoning period and minimum investable weight remain in place, so any passive-fund buying triggered by S&P 500 inclusion will occur only after the standard criteria are met.
OpenAI Help: Lockdown Mode
(help.openai.com)
AI
OpenAI’s help page describes “Lockdown Mode,” a feature intended to restrict how the model responds.