AI

Summary

TL;DR: The day’s AI coverage spanned model efficiency research, offline/open-model assistants, and the growing “agent/tooling” governance debate around MCP and provider routing.

Model research & tooling

  • Extreme low-bit quantization: The Salomi GitHub repo tests near-binary transformer quantization for GPT-2–class models and reports that strict 1.00 bpp (bits per parameter) post-hoc binary quantization doesn’t hold up; more credible results appear around ~1.2–1.35 bpp using approaches like Hessian-guided vector quantization and mixed precision.
  • Offline assistant on-device: AbodeLLM brings an Android AI assistant that runs open models (e.g., LLaMA, DeepSeek) locally, with optional multimodal inputs and expert controls.
  • Model marketplace/routing: OpenRouter lists Arcee AI’s “Trinity Large Thinking” with pricing and routing/fallback behavior across providers.

Agents, eval, and data governance

  • MCP trust gaps: A critique of Perplexity’s MCP stance argues the main issue isn’t just token overhead, but missing trust-aware controls for sensitive data after authorization (suggesting sensitivity metadata, trust-tier registries, and runtime enforcement).
  • Eval rigor: “The revenge of the data scientist” warns that LLM harnessing often repeats data-science pitfalls—weak experimental design, unreliable judges, and poor validation.

Politics and broader discourse

  • AI forecasts: Kronaxis claims a synthetic-voter method predicts UK local elections with ~75% winner accuracy on limited by-election validation, with council-level predictions.
  • Meta-critique of tech elites: The Nation argues Silicon Valley leaders promote an anti-intellectual narrative that dismisses deep thinking and learning while profiting from AI.

Stories

Salomi, a research repo on extreme low-bit transformer quantization (github.com) AI

Salomi is a GitHub research repo exploring extreme low-bit (near-binary) transformer quantization and inference for GPT-2–class models, with code, experiments, and evaluation tooling. It specifically tests whether strict 1.00 bpp post-hoc binary quantization can match or beat higher quantization baselines and concludes it does not hold up under rigorous evaluation. The repo instead reports more credible results around ~1.2–1.35 bpp using methods such as Hessian-guided vector quantization, mixed precision, and magnitude-recovery, and directs readers to curated assessment and validation documents over older drafts.
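
The bit budgets above can be made concrete with a generic sketch of sign-plus-scale binary quantization. This illustrates the arithmetic behind "near-binary" bpp figures, not Salomi's actual code; the group size and fp16 scale width are assumptions chosen so the example lands in the ~1.2–1.35 bpp range the repo reports.

```python
import numpy as np

def binary_quantize(w: np.ndarray, group_size: int = 64):
    """Post-hoc 1-bit quantization with one shared scale per weight group.

    A generic illustration, not the repo's method: each weight keeps only
    its sign, and each group of 64 weights shares one magnitude (the mean
    absolute value of the group).
    """
    w = w.reshape(-1, group_size)
    scales = np.abs(w).mean(axis=1, keepdims=True)  # one fp16 scale per group
    signs = np.sign(w)
    signs[signs == 0] = 1.0                         # map exact zeros to +1
    return signs, scales

def bits_per_parameter(group_size: int = 64, scale_bits: int = 16) -> float:
    # 1 sign bit per weight plus the amortized cost of the group scale.
    return 1.0 + scale_bits / group_size

w = np.random.randn(4, 64).astype(np.float32)
signs, scales = binary_quantize(w)
w_hat = signs * scales                              # dequantized approximation
print(f"{bits_per_parameter():.3f} bpp")            # 1.250 bpp at these settings
```

Strict 1.00 bpp would mean no per-group scales at all; as soon as scales, mixed-precision layers, or codebooks are amortized in, the effective rate climbs past 1 bit, which is where the ~1.2–1.35 bpp figures come from.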

Show HN: Mkdnsite – Markdown-native web server for humans (HTML) and agents (md) (github.com) AI

Mkdnsite is an open-source “Markdown-native” web server that serves a directory or GitHub repo of .md files without a static-site build step. It renders HTML for browsers and uses HTTP content negotiation to return raw Markdown for AI agents (e.g., via Accept: text/markdown), along with an auto-generated /llms.txt and an optional MCP endpoint. The project supports Bun/Node/Deno, runtime editing without redeploy, and includes features like search, theming, math (KaTeX), Mermaid, and syntax highlighting.
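
The dual-audience behavior rests on standard HTTP content negotiation. The sketch below shows the general idea of dispatching on the Accept header; it is not Mkdnsite's code, and a real server would also honor q-values per RFC 9110.

```python
def choose_representation(accept_header: str) -> str:
    """Pick a representation for a .md resource from the Accept header.

    A minimal sketch of the negotiation Mkdnsite is described as doing;
    quality weights (q=) are ignored for brevity.
    """
    types = [part.split(";")[0].strip() for part in accept_header.split(",")]
    for media_type in types:
        if media_type in ("text/markdown", "text/plain"):
            return "markdown"   # agents get the raw source
        if media_type in ("text/html", "*/*"):
            return "html"       # browsers get rendered HTML
    return "html"               # safe default for unknown clients

print(choose_representation("text/markdown"))                    # markdown
print(choose_representation("text/html,application/xhtml+xml"))  # html
```

An agent sending `Accept: text/markdown` gets the source file back unrendered, which is why no separate "API" is needed alongside the human-facing site.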

Show HN: Semantic atlas of 188 constitutions in 3D (30k articles, embeddings) (constitutionalmap.ai) AI

Constitutional Map AI is a web tool that builds a 3D semantic atlas of constitutional law by embedding thousands of constitutional articles from 188 constitutions. It clusters the text into thematic “neighborhoods” and lets users compare countries on a shared semantic space using keyword or semantic search, with metrics like coverage and entropy. The site’s data is sourced from the Constitute Project and the code is open source, with a note that AI clustering or segmentation errors are possible.
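
The shared-semantic-space comparison boils down to nearest-neighbor search over embeddings. The toy sketch below shows the mechanism with invented 4-dimensional vectors standing in for real sentence embeddings; the site's actual pipeline and models are not shown.

```python
import numpy as np

# Toy "articles" as embedding vectors; real systems would use a sentence
# encoder producing hundreds of dimensions. Values here are illustrative.
articles = {
    "free speech":     np.array([0.9, 0.1, 0.0, 0.1]),
    "judicial review": np.array([0.1, 0.9, 0.2, 0.0]),
    "press freedom":   np.array([0.8, 0.2, 0.1, 0.1]),
}

def search(query_vec: np.ndarray, k: int = 2) -> list[str]:
    """Rank articles by cosine similarity to a query embedding."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    ranked = sorted(articles, key=lambda n: cos(articles[n], query_vec),
                    reverse=True)
    return ranked[:k]

print(search(np.array([1.0, 0.0, 0.0, 0.0])))  # speech-related articles rank first
```

Clustering the same vectors (e.g. with k-means) is what produces the thematic "neighborhoods" the atlas visualizes in 3D after dimensionality reduction.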

The Anti-Intellectualism of the Silicon Valley Elite (thenation.com) AI

The article argues that Silicon Valley’s top figures, Peter Thiel and Marc Andreessen among them, promote an anti-intellectual worldview that treats deep thinking and learning as unnecessary, even while profiting from AI. It links this stance to attacks on higher education and the humanities, skepticism toward inquiry that could challenge the managerial class, and a broader desire for insulation from accountability. The piece also criticizes how AI and tech “shortcuts” can be used to replace thinking, while the same elite dismisses the people and disciplines that make that knowledge possible.

AbodeLLM – An offline AI assistant for Android devices, based on open models (github.com) AI

AbodeLLM is an Android app that runs an offline AI assistant using open-source models such as LLaMA and DeepSeek, with chat processed entirely on-device and no internet required. It supports optional multimodal inputs (vision and audio depending on models), context retention, and an “Expert Mode” for tuning generation and cache/token limits. The project includes installation steps and a list of supported model variants along with minimum hardware requirements.

The Claude Code Leak (build.ms) AI

An article argues that the alleged leak of Claude Code’s source code matters less than the broader lessons it highlights: product-market fit and seamless model-to-agent integration outweigh the quality or even the cleanliness of the underlying code. The writer also discusses how the code appears to be “bad” yet still supports a valuable product, why observability and automation may be more important than implementation details, and how the ensuing DMCA and clean-room rewrites reflect ongoing copyright tensions in AI development.

Trinity Large Thinking (openrouter.ai) AI

OpenRouter lists Arcee AI’s open-source “Trinity Large Thinking” model and its pricing on the platform, including per-token input/output costs and usage statistics. The page explains how OpenRouter routes requests to multiple providers with fallbacks to improve uptime, and how to enable reasoning output via a request parameter and the returned reasoning_details.
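
A request that opts into reasoning output can be sketched as below. Field names follow the page's description (a `reasoning` request parameter, a `reasoning_details` field in the reply), and the model slug is illustrative; consult OpenRouter's API reference for the authoritative schema.

```python
import json

# Hedged sketch of an OpenRouter chat-completions request body asking the
# provider to return its reasoning trace alongside the answer.
payload = {
    "model": "arcee-ai/trinity-large-thinking",  # slug assumed for illustration
    "messages": [{"role": "user", "content": "Why is the sky blue?"}],
    "reasoning": {"enabled": True},              # request reasoning output
}
body = json.dumps(payload)
# POST `body` to https://openrouter.ai/api/v1/chat/completions with an
# Authorization: Bearer <key> header; the assistant message in the response
# should then carry reasoning_details alongside its content.
```

Because OpenRouter routes across multiple providers with fallbacks, the same payload works regardless of which upstream host actually serves the request.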

Perplexity Says MCP Sucks (suthakamal.substack.com) AI

The author argues that Perplexity’s critique of MCP’s token overhead is directionally right but misses the bigger issue: MCP doesn’t provide trust-aware controls for where sensitive data goes after authorization, so different kinds of regulated data are treated identically. They propose adding sensitivity metadata to tool responses, a shared trust-tier registry for inference providers, and runtime enforcement (including redaction/blocking or attestation) to prevent unsafe routing. The piece also notes similar trust gaps in WebMCP and frames MCP’s performance debate as secondary to missing data-governance primitives.
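
The proposed runtime enforcement can be sketched as a gate between tagged tool responses and tiered inference providers. Every name, tier, and label below is invented for illustration; the post proposes the primitives, not this schema.

```python
# Hypothetical trust-tier registry: higher number = more trusted deployment.
TRUST_TIERS = {"on-prem": 3, "dedicated-cloud": 2, "shared-api": 1}

# Hypothetical sensitivity labels a tool response might carry, mapped to the
# minimum provider tier allowed to see that data (e.g. health data on-prem only).
REQUIRED_TIER = {"public": 1, "pii": 2, "phi": 3}

def may_route(sensitivity: str, provider: str) -> bool:
    """Runtime check: block routing tagged data to an under-trusted provider."""
    return TRUST_TIERS[provider] >= REQUIRED_TIER[sensitivity]

assert may_route("public", "shared-api")          # harmless data goes anywhere
assert not may_route("phi", "shared-api")         # regulated data is blocked
assert may_route("phi", "on-prem")                # ...unless the tier suffices
```

On a block, the post suggests enforcement could also redact the sensitive fields or demand an attestation from the provider rather than failing the call outright.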

Show HN: 65k AI voters predict UK local elections with 75% accuracy (kronaxis.co.uk) AI

Kronaxis reports a forecast for the 7 May 2026 UK local elections using 65,000 synthetic “voters” built from Census 2021 demographics plus a personality and political-history model. After testing the approach against 10 recent English by-elections and applying a calibration correction for consistent bias, the company claims about 75% winner accuracy on that limited validation set. For the first 20 councils in its release, it predicts Reform UK wins 18 of 20, with Labour narrowly holding Manchester and Greens winning Bristol, while predicting Conservatives take no council seats. The post emphasizes that calibration used the same by-elections as evaluation and will need to be validated by the actual election results.
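
The caveat about calibrating on the same by-elections used for evaluation can be demonstrated with toy numbers: a bias correction fitted on the very set you then score always looks at least as good as an honest leave-one-out correction. Everything below is synthetic and unrelated to Kronaxis's data.

```python
import numpy as np

rng = np.random.default_rng(0)
true = rng.normal(0.0, 1.0, 10)                  # 10 "by-election" outcomes
pred = true + 0.5 + rng.normal(0.0, 0.3, 10)     # forecasts with a consistent bias

# In-sample: estimate the bias on all 10 elections and score on the same 10.
in_sample_err = np.abs((pred - true) - (pred - true).mean()).mean()

# Leave-one-out: estimate the bias without the held-out election each time.
loo_errs = []
for i in range(10):
    mask = np.arange(10) != i
    bias = (pred[mask] - true[mask]).mean()
    loo_errs.append(abs(pred[i] - bias - true[i]))
loo_err = float(np.mean(loo_errs))

print(in_sample_err <= loo_err)  # True: the in-sample number is flattered
```

For a mean-bias correction the leave-one-out residuals are exactly n/(n-1) times the in-sample residuals, so the in-sample accuracy is always optimistic; the real test is the one the post defers to, the actual election results.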

Ukrainian drone holds position for 6 weeks (defenceleaders.com) AI

A Ukrainian remotely operated, machine-gun-armed UGV (TW 12.7) reportedly stayed on station at a contested crossroads for over six weeks, moving forward daily and withdrawing to cover at night. The system answered multiple calls for fire, helping suppress Russian activity and support infantry tasks, highlighting the growing maturity and reliability of Ukraine’s domestically produced strike ground robots. The article also stresses the need for operator training, protected recovery methods to avoid risking personnel, and manufacturer testing to improve sensors and turrets under realistic conditions.

The revenge of the data scientist (hamel.dev) AI

The post argues that much of “LLM harnessing” and evaluation is still traditional data science, despite claims that the field is declining or that engineering teams can rely on APIs and generic tooling. It highlights common eval pitfalls—such as using generic metrics, unverified LLM judges, weak experimental design, low-quality data/labels, and over-automation—and explains how data scientists would approach each with trace analysis, error breakdowns, proper validation, and domain-expert labeling.

Obfuscation is not security – AI can deobfuscate any minified JavaScript code (afterpack.dev) AI

The AfterPack blog argues the “Claude Code source leak” didn’t expose hidden code: Claude Code’s CLI JavaScript was already publicly accessible on npm, with only a source map accidentally revealing additional internal comments and file structure. It also contends the bundled code is minified rather than truly obfuscated, and that AI/AST parsing can extract large amounts of prompts, tool descriptions, and configuration strings directly from the minified bundle. Anthropic says the issue was a packaging mistake and not a security breach, noting similar source map exposure occurred before.
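
The "minified, not obfuscated" point is easy to demonstrate: minifiers rename identifiers but leave string literals intact, so prompts and tool descriptions can be pulled straight out of a bundle. The snippet below is a toy extractor over an invented bundle; a real tool would use a JavaScript AST parser and handle escape sequences.

```python
import re

# Illustrative "minified" bundle: identifiers are shortened, but the string
# literals (the interesting part) survive verbatim.
minified = ('var a="You are a helpful coding agent.";'
            'function b(c){return c+"!"}'
            'var d="Edit files in the workspace.";')

# Grab double-quoted string literals above a length threshold. Escaped quotes
# are not handled here, which a robust extractor (or AST walk) would need.
strings = re.findall(r'"([^"]{20,})"', minified)
print(strings)  # ['You are a helpful coding agent.', 'Edit files in the workspace.']
```

This is why the post argues the source map only added comments and file structure: the prompts themselves were already recoverable from the public npm package.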