Agent Reading Test (agentreadingtest.com) AI

Agent Reading Test is a benchmark that scores how reliably AI coding agents read different kinds of documentation web pages, including cases where content is truncated, hidden by CSS, rendered only via JavaScript, or buried in tabs and navigation chrome. Each test page embeds hidden “canary” tokens and tasks based on real documentation failure modes; the benchmark then checks which tokens the agent reports after completing the work. Results are submitted for scoring out of a maximum of 20 points and are intended to highlight silent failure modes in agent web-fetch pipelines across platforms.
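The canary comparison can be pictured with a minimal sketch. The token names and the one-point-per-token scoring below are assumptions made for illustration; the summary does not say how the benchmark actually assigns its 20 points.

```python
# Hypothetical sketch: token names and scoring rule are illustrative assumptions,
# not the benchmark's real tokens or rubric.
def score_report(expected_tokens, agent_report):
    """Split expected canary tokens into those present in the report and those missed."""
    found = {token for token in expected_tokens if token in agent_report}
    missed = set(expected_tokens) - found
    return found, missed


expected = {"CANARY-TRUNC-01", "CANARY-CSS-02", "CANARY-JS-03"}
report = "I saw CANARY-TRUNC-01 and CANARY-JS-03 in the docs."

found, missed = score_report(expected, report)
print(f"score: {len(found)}/{len(expected)}; missed: {sorted(missed)}")
```

A missed token here would point to the failure mode its page was built around, for example content that exists only after JavaScript runs.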

Eighteen Years of Greytrapping – Is the Weirdness Finally Paying Off? (nxdomain.no)

The article recounts an 18-year experiment running “greytrapping” (automatically collecting and creating fake, bait email destinations) on the author’s mail infrastructure, reporting that by August 7, 2025 the number of spamtraps exceeded Norway’s population (about 5.62M vs. 5.60M). It describes how the setup evolved from early greylisting and PF/spamd-based filtering into a largely automated process for harvesting trapped addresses and logging trap history, alongside reflections on how email has become more centralized and vendors/cloud providers increasingly discourage self-hosting. While noting periodic “weird” developments in the wider anti-spam ecosystem, it argues that simple network/protocol techniques once offered substantial operational benefits and continues to frame greytrapping as an evidence-driven, low-drama part of running a mail server.

Show HN: Ghost Pepper – 100% local hold-to-talk speech-to-text for macOS (github.com) AI

Ghost Pepper is a macOS menu-bar app that provides hold-to-talk speech-to-text entirely on-device: press Control to record, release to transcribe, and paste the result. It uses WhisperKit for transcription and a local Qwen-based model to clean up filler words and self-corrections, with no cloud APIs and no data written to disk. The project also documents setup requirements (Microphone and Accessibility permissions) and an enterprise/MDM path to pre-approve Accessibility.

The cult of vibe coding is insane (bramcohen.com)

Bram Cohen argues that “vibe coding” is a misguided dogfooding-style practice where developers avoid examining code under the hood and instead rely on vague conversations with AI. He says this leads to obviously fixable redundancy and messes being left unaddressed, even when humans could easily review and classify parts of the codebase. Cohen contends that AI can help with refactoring and cleanup when guided by human inspection, and that poor software ultimately comes down to choices, not the tools themselves.

Battle for Wesnoth: open-source, turn-based strategy game (wesnoth.org)

The Battle for Wesnoth is a free, open-source, high-fantasy turn-based strategy game offering single-player campaigns and online or local multiplayer battles. The site highlights hundreds of units across multiple factions, extensive modding support using WML and Lua, and a library of player-made add-ons available via an official server. It also distinguishes between a stable release for general play and a development version aimed at experienced players and content creators.

Sky – an Elm-inspired language that compiles to Go (github.com)

Sky is an experimental Elm-inspired programming language that compiles to Go, aiming to let developers build fullstack apps from a single codebase and ship one portable binary. It combines Hindley-Milner type inference and Elm-style pattern matching with a server-driven UI model (Sky.Live) that renders via DOM diffing and SSE rather than needing a separate client framework. The project includes Go interop via type-safe FFI bindings and emphasizes that the compiler and tooling are self-hosted in Sky itself.

Why do Macs ask you to press random keys when connecting a new keyboard? (unsung.aresluna.org)

The article explains that when macOS sets up a new keyboard, it asks you to press specific keys to help determine the keyboard’s physical layout variant (e.g., US/ANSI vs European/ISO vs Japanese/JIS), since keyboards don’t reliably report their exact key positions to the computer. The “random key” prompts work across many regional layouts and differing legends, allowing macOS to place keys correctly and, in some cases like Japanese keyboards, ensure the right characters are produced. Apple’s own keyboards skip the prompt because the system can identify their model layout, while many third-party keyboards don’t provide trustworthy identification data.

Launch HN: Freestyle: Sandboxes for AI Coding Agents (freestyle.sh) AI

Freestyle, introduced in this Launch HN post, is a system for running AI coding agents inside full Linux VM sandboxes, including creating per-agent repos from templates, forking VMs, and executing build/test/review workflows. The post highlights fast VM startup, live forking and pause/resume (to reduce cost while idle), and features like bidirectional GitHub sync and configurable webhook triggers. Freestyle positions its approach as real VMs (not containers) with strong isolation and support for multiple virtualization layers.

Reducto releases Deep Extract (reducto.ai) AI

Reducto has launched “Deep Extract,” an agent-based structured document extraction update that repeatedly extracts, verifies against the source document, and re-extracts until accuracy thresholds are met. The company says it improves performance on long, complex documents—using verification criteria and optional citation bounding boxes—reporting up to 99–100% field accuracy in its production beta. Deep Extract is available via the Extract endpoint configuration (deep_extract: true).
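The extract, verify, re-extract cycle described above can be sketched as a small loop. Everything here is hypothetical: `extract_until_verified`, the `extract` and `verify` callables, the `threshold` and `max_rounds` parameters, and the toy document are illustrative stand-ins, not Reducto's API or algorithm.

```python
# Hypothetical sketch of an extract/verify/re-extract loop.
# None of these names come from Reducto; the callables are toy stand-ins.
def extract_until_verified(document, extract, verify, threshold=0.99, max_rounds=5):
    """Re-extract failing fields until the verified share reaches the threshold."""
    fields = extract(document, None)  # first pass: extract every field
    for round_no in range(1, max_rounds + 1):
        failing = verify(document, fields)  # names of fields that fail verification
        accuracy = 1 - len(failing) / len(fields) if fields else 0.0
        if accuracy >= threshold:
            return fields, round_no
        fields.update(extract(document, failing))  # redo only the failing fields
    return fields, max_rounds  # may still contain unverified fields


doc = {"invoice_id": "A-17", "total": "42.00"}


def toy_extract(document, only):
    if only is None:
        # Simulate a first-pass mistake on one field.
        return {"invoice_id": document["invoice_id"], "total": "4200"}
    return {name: document[name] for name in only}


def toy_verify(document, fields):
    return [name for name, value in fields.items() if document[name] != value]


fields, rounds = extract_until_verified(doc, toy_extract, toy_verify)
print(fields, rounds)
```

Capping the number of rounds is a design choice in this sketch so the loop terminates even when verification never passes.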

New fibre optic data transmission speed record of 450Tbps (ucl.ac.uk)

Researchers led by UCL have set a new fibre optic transmission record of 450 terabits per second over an existing, commercially installed optical fibre link connecting data infrastructure in London. The result uses additional wavelength bands beyond the standard C- and L-bands to expand the number of channels, with the team sending data over a 39-kilometre round trip. While it is not expected to change home internet speeds soon, the work is positioned as a proof of concept for upgrading capacity between data centres, with possible commercial adoption in several years.

U.S. Lawmakers Work on Unified Site-Blocking Bill to Counter Online Piracy (torrentfreak.com)

U.S. lawmakers led by Sen. Thom Tillis and Rep. Zoe Lofgren are reportedly working on a unified bill to expand court-ordered site blocking aimed at foreign piracy, building on earlier separate proposals. The effort is framed as more urgent after a Supreme Court decision limited ISPs’ liability for subscriber piracy. The draft is expected to target both ISPs and large DNS resolvers, and it may be considered for introduction before Tillis’s term ends in January 2027, potentially alongside or in combination with a separate House proposal from Rep. Darrell Issa.

I Replaced Kafka, Redis, and RabbitMQ with One Tool – A Deep Dive into NATS (medium.com)

The article argues that NATS can replace a stack of Kafka (streaming/durability), Redis (in-memory pub/sub), and RabbitMQ (messaging) by providing pub/sub, request/reply, queue groups, and JetStream within one system. It explains key behaviors such as Core NATS being ephemeral (dropping messages to slow/disconnected consumers) versus JetStream adding disk persistence, and highlights how subject naming (tokens, wildcards, and hierarchy) determines routing and stream/filter granularity. The author also compares NATS request/reply to gRPC/RPC patterns, emphasizing that it’s built from basic pub/sub using temporary inbox subjects and protocol-level “no responders” handling.
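To make the subject-routing point concrete, here is a minimal pure-Python sketch of NATS-style subject matching. It relies on the documented wildcard rules (`*` matches exactly one token, `>` matches one or more trailing tokens); the function name `subject_matches` and the example subjects are made up for illustration, and this is not the NATS client library.

```python
# Illustrative matcher for NATS-style subjects; not part of any NATS client.
def subject_matches(pattern: str, subject: str) -> bool:
    """Return True if a dot-separated subject matches a pattern with * and > wildcards."""
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, p in enumerate(p_tokens):
        if p == ">":
            # '>' is only valid as the last token and needs at least one token to match.
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False  # subject ran out of tokens before the pattern did
        if p != "*" and p != s_tokens[i]:
            return False  # literal token mismatch
    return len(p_tokens) == len(s_tokens)


print(subject_matches("orders.*", "orders.created"))     # single-token wildcard
print(subject_matches("orders.*", "orders.created.eu"))  # too many tokens for '*'
print(subject_matches("orders.>", "orders.created.eu"))  # tail wildcard
```

A hierarchy such as `orders.created.eu` lets one consumer subscribe narrowly (`orders.created.*`) while another takes everything under `orders.>`, which is the granularity trade-off the summary refers to.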

US-Iranian War: a best case scenario (kamilkazani.substack.com)

The author argues that a US attack on Iran is likely to become a long, costly war rather than a quick removal of leadership, because Iran is not a simple “autocracy” and has stronger-than-assumed popular and political cohesion. They suggest the “best case” for the US would be to rapidly pursue a Vietnam-style exit—minimizing losses and later attempting rapprochement based on mutual recognition and respect.

A Cryptography Engineer's Perspective on Quantum Computing Timelines (words.filippo.io)

A cryptography engineer argues that timelines for breaking widely used elliptic-curve cryptography with quantum computers have accelerated, citing recent papers from Google and Oratomic that reduce required qubits and could enable practical man-in-the-middle risks. The author says the risk window is now too short to rely on “hybrid” approaches or waiting, and recommends fast migration to post-quantum signatures and key exchange—while flagging that some non-PQ hardware trust anchors (like current TEEs) may not be quantum-safe. They conclude that organizations should start shipping quantum-resistant cryptography now, even if exact dates remain uncertain.

sc-im Spreadsheets in Your Terminal (github.com)

sc-im is an open-source, ncurses-based spreadsheet calculator for the terminal with a vim-like editing interface. It supports large grids, undo/redo, CSV/TAB/XLSX/ODS import and export, formatting, sorting/filtering, plotting via GNUPlot, and scripting through Lua plus external modules. The project is maintained by a single developer and encourages users to star or donate to support ongoing development.

The secretive plan for a Maine data center collapsed in 6 days (bangordailynews.com) AI

A proposed $300 million AI data center in Lewiston’s downtown Bates Mill began unraveling even before the public learned much about it. City councilors received a detailed proposal shortly before a vote, held two closed-door sessions, and released information to the public only six days before the Dec. 16 decision—prompting swift backlash over environmental concerns, transparency, and limited review time. The council voted unanimously to reject the plan, with officials pointing to the developer’s lack of early public engagement as a key factor, amid broader Maine debates and emerging state-level moratorium efforts.

Claude Code is unusable for complex engineering tasks with the Feb updates (github.com) AI

A GitHub issue on Anthropic’s Claude Code reports a quality regression for complex engineering work after February updates, with the reporter saying the model began ignoring instructions, making incorrect “simplest fixes,” and performing worse in long-session tool workflows. The author attributes the change to reduced “extended thinking” (including a staged rollout of thinking content redaction) and provides log-based metrics showing less code reading before edits and increased stop/“hook” violations. They say the behavior has made Claude Code “unusable” for their team and ask for transparency or configuration to ensure deeper reasoning for power users.