AI

Summary

Generated about 15 hours ago.

What stood out in June

  • Frontier access and regulation tightened. Multiple reports say U.S. actions led to Anthropic suspending access to Fable 5/Mythos 5 for foreign nationals; related coverage also highlighted export-control triggers tied to Amazon-linked discussions (e.g., The Verge, Axios). States also investigated OpenAI (e.g., Reuters).
  • Agentic AI, reliability, and cost pressures. Articles and tooling emphasized agent workflows (memory/knowledge formats, coding loops) while others warned about hidden costs, reliability drift, and governance/guardrail limits.
  • Health, education, and safety debates broadened. Coverage ranged from AI toys for kids to AI use in policing/courts and learning outcomes.

Model releases

Stories

MAI-Thinking-1 (microsoft.ai) AI

Microsoft introduced MAI-Thinking-1, describing it as a medium-sized AI reasoning model built on enterprise-grade, clean licensed data and trained without distillation from third-party models. The company says it performs strongly on software engineering benchmarks (matching Claude Opus 4.6 on SWE-Bench Pro) and shows high mathematical reasoning scores (including AIME results), and that the model is optimized for enterprise use with features like a 256k token context window. MAI-Thinking-1 is available in private preview on Microsoft Foundry, with public preview planned for Microsoft’s MAI Playground.

Gmail thinks I'm stupid, so I left (moddedbear.com) AI

A writer says Gmail’s unsolicited AI features—like auto-generated message summaries, draft replies, and constant “help me write” prompts—felt disrespectful enough to push them to start moving away from their 16-year Gmail account, switching to Fastmail for more control.

How we index images for RAG (kapa.ai) AI

Kapa.ai describes how it improves RAG over technical documentation images by generating and storing one-time text captions for each image at indexing time, then retrieving those captions with ordinary text at query time instead of using a multimodal model on every request. The company argues query-time vision is too expensive, payload-constrained, and often loses fine details needed for charts/tables, while ingestion-time transcription can preserve load-bearing information. In experiments across three customer projects, image captions were reported as measurably better than a text-only baseline with only a small per-query cost increase (about 1%–6%) and correct image citation 94%–99% of the time.

CLI tool that packages data science projects for LLM context windows (github.com) AI

The GitHub project data2prompt introduces a CLI tool that turns data-science project folders (including notebooks and data files like CSV/SQL/XLSX) into LLM-ready prompts sized for model context windows. It emphasizes token-aware formatting and LLM attention-friendly output by sampling and truncating large data files, extracting text/code from .ipynb while skipping heavy embedded content, and offering both markdown and XML output formats.

Bringing Up DeepSeek-V4-Flash on AMD MI300X (fergusfinn.com) AI

A Doubleword worklog details the process of getting DeepSeek-V4-Flash working on AMD MI300X, highlighting major blockers in FP8 “fnuz” vs OCP dialect support, missing/buggy AITER tuned-kernel paths for specific sparse attention shapes on the MI300X gfx942 core, and the need to use HIP graphs carefully to keep captured execution static. After addressing correctness issues and optimizing sparse MLA decode and MXFP4 MoE bookkeeping, they report a small performance uplift on a benchmark (+8.6% to 2699 output tokens/s per GPU) and argue MI300X can be cost-effective despite remaining software gaps that are expected to improve on newer AMD parts.

Hermes Desktop (hermes-agent.nousresearch.com) AI

Nous Research’s Hermes Desktop is an open-source MIT-licensed agent that connects to multiple messaging platforms, maintains persistent memory, schedules automations, and uses isolated subagents for tasks, with web browsing/vision tooling and multiple sandbox backends for experimentation.

GitHub Copilot App (github.com) AI

GitHub is rolling out a technical preview of the GitHub Copilot app, a new desktop experience for agent-driven development that lets users pick up issues or PRs, run agents, review diffs, and merge (or have agents close the loop). The preview also supports parallel agent sessions, agents tracked across repositories, and workflow extensions via MCP servers and custom skills, with access limited to existing Copilot Pro/Pro+/Max/Business/Enterprise users via a waitlist.

Anthropic scales Claude Mythos to critical infrastructure in 15 countries (techcrunch.com) AI

Anthropic is expanding Project Glasswing to around 150 new organizations in more than 15 countries by granting them access to its Claude Mythos model, which it says can help identify critical software vulnerabilities. The initiative targets sectors like power, water, healthcare, communications, and hardware, where successful attacks could have large-scale consequences, and includes partners such as Okta, several major South Korean firms, NATO, and the EU cybersecurity agency ENISA.

Testing Google's Gemini Spark AI agent: it's incredible, and creepy (theverge.com) AI

The Verge’s David Pierce reports testing Google’s Gemini Spark always-on AI agent for actions like managing Gmail and searching Docs, then using it to generate an unusually detailed, personalized trip plan for a Hershey, Pennsylvania family weekend. He says Spark’s ability to pull in information (including items he didn’t explicitly provide) and schedule nuanced details feels both impressive and “deeply creepy,” highlighting concerns about privacy and how much personal data users must share for such agents to be useful.

Rethinking Search as Code Generation (research.perplexity.ai) AI

Perplexity’s research article argues that traditional “monolithic” search interfaces are too rigid for code-capable AI agents, and proposes a new “Search as Code” (SaC) architecture where models access atomized search primitives via an SDK and assemble custom retrieval pipelines using generated Python code in a secure sandbox.

Microsoft launches Project Solara, device platform for AI (geekwire.com) AI

Microsoft says it has launched “Project Solara,” a new device platform built on an enterprise Android-based system (MDEP) that lets hardware run organization-specific AI agents instead of traditional apps, showcased with two early reference devices: a desk hub and a wearable badge. The company plans to have partners and hardware makers turn the designs into implementations for different industries, with pilots expected from AccuWeather, Best Buy, CVS Health, Levi’s, and Target, while the devices are managed and supported through Microsoft security and Azure cloud services.

Using wavelets and entropy coding to analyze code structure (yogthos.net) AI

The post introduces WaveScope, an MCP server that uses multi-scale wavelet transforms (plus simple keyword weighting) to score and locate structural “boundaries” in source code for LLM agents, enabling zoomed-in fine context and zoomed-out summaries without full-file context. It explains how sliding Ricker wavelets across a 1D signal of line scores yields peak positions across scales, assembles these into fine/medium/coarse zoom bands for the model, and can further quantify structural irregularity via entropy/complexity tools to help triage or refactor code. The author argues this sits between grep-based search and full AST parsing by being language-agnostic while still capturing hierarchical structure.

Show HN: Build Your Own AI Agent CLI in 150 Lines (go-micro.dev) AI

Go Micro’s blog post explains how its “micro chat” AI-agent CLI works in roughly 150 lines by breaking the system into four parts: discovering callable service endpoints as tools, wiring an LLM model to execute those tool calls, maintaining conversation history for follow-ups, and running a REPL loop that prints tool calls and final answers.

Intelligent Terminal 0.1 (devblogs.microsoft.com) AI

Microsoft announced Intelligent Terminal 0.1, an open-source experimental fork of Windows Terminal that adds native “agent” integration via an agent pane, automatic error detection, and an agent management panel for tracking sessions and background tasks (with GitHub Copilot CLI as the default ACP-compatible option).

Surface RTX Spark Dev Box (microsoft.com) AI

Microsoft has introduced the Surface RTX Spark Dev Box, a developer-focused desktop for AI work, shipping with a developer-optimized Windows 11 Pro setup preconfigured for tools like Visual Studio Code, GitHub Copilot, WSL, and PowerShell 7. The company says it pairs an NVIDIA RTX GPU with up to a petaflop of AI compute, includes 128GB of unified memory, and is built with a cooled aluminum chassis, along with security features such as Windows 11 Secured-core and support for BitLocker, Microsoft Defender, Entra ID, and Intune.

Promoting Advanced Artificial Intelligence Innovation and Security (whitehouse.gov) AI

The White House order (June 2, 2026) directs federal agencies to prioritize cybersecurity defenses for advanced AI and national security systems, establish guidance and a voluntary AI vulnerability-management “clearinghouse,” and create a classified benchmarking process for assessing and designating “covered frontier models.” It also calls for expanded hiring pathways for information cybersecurity specialists and instructs the Attorney General to prioritize enforcement of federal laws against AI-enabled unauthorized access or cybercrime.

Amazon joins Microsoft in sending message to employees (finance.yahoo.com) AI

Amazon has shut down an internal AI usage leaderboard (KiroRank) and told employees not to use AI just to rack up token consumption, following a backlash that inflated metrics without producing business value; it is instead moving to an “normalized deployments” measure focused on AI-assisted code that actually ships. The article says the shift mirrors moves by other Big Tech firms—including Microsoft, Meta, and Uber—after leaders found that heavy AI usage drove costs but not proportional outcomes.

Americans don't know how to fight AI. So they're fighting data centers (vox.com) AI

A Vox analysis argues that American opposition to AI data centers is less about the facilities’ local nuisances than about broader public dread of AI and the lack of credible federal policy for managing AI’s economic and social risks, driving people to use local moratoria as a stand-in lever. The piece also contends that environmental arguments for blocking data-center expansion are often overstated and says the real solution is a wider national debate and regulation that protects and expands human agency as AI spreads.