LLM Digest

Live feed

AI news for platform & agent engineers

Ranked signal · finite reading

The AI brief that ends.

One shared ranking. Scan what changed, save what matters, and stop when the finish line appears.

Today's top signals

hamel.dev · 2026-06-29

“It’s Hard to Eval” Is a Product Smell

For the past 3 years, AI evals have been my professional focus. The most common objection I hear to evals is “our product is hard to eval”. This objection is a product smell. Artifacts that are hard for you to verif...

langchain.com · 2026-07-02

OpenWiki: Open Source Repo Documentation for Coding Agents

OpenWiki generates and maintains codebase documentation so coding agents can find the repo context they need without loading everything into one instruction file.

github.com · 2026-06-30

vllm v0.24.0

MiniMax-M3: Added support for the new MiniMax-M3 model, with a fast follow-on of BF16/FP8 indexer via MSA, MXFP4 support, FP8 sparse GQA, and extensive... · DeepSeek-V4 keeps maturing: Following its debut, DeepS...

simonwillison.net · 2026-06-29

Ornith-1.0: Self-Scaffolding LLMs for Agentic Coding

This is an interesting new open weights (MIT licensed) model, the first model release from DeepReinforce. [...] with variants including 9B Dense, 31B Dense, 35B MoE...

simonwillison.net · 2026-07-02

llm-coding-agent 0.1a0

Another Fable 5 experiment. Now that my LLM library has evolved into more of an agent framework it's time to see what a simple coding agent would look like built on it. I started a new...

infoq.com · 2026-06-30

Elastic Open-Sources Atlas Agent Memory Based on Cognitive Science

Elastic open-sourced Atlas, a system built on Elasticsearch that maintains three categories of memory for agents. Atlas integrates with agents via MCP and maintains per-user isolation of memories. When evaluated on qu...

langchain.com · 2026-06-30

Harbor x LangChain: A Unified Stack for Evaluating Agents

Evaluating long-running, stateful agents needs a new kind of runner. Here's how Deep Agents, LangSmith sandboxes, and observability plug into Harbor.

vllm.ai · 2026-06-29

Micro-Agent: Beat Frontier Models with Collaboration inside Model API

How vLLM Semantic Router turns vllm-sr/auto into a bounded micro-agent runtime for Confidence, Ratings, ReMoM, Fusion, Workflows, and benchmark-shaped collaboration.

magazine.sebastianraschka.com · 2026-06-27

Using Local Coding Agents

Using Open-Weight Models in Local Coding Harnesses as an Alternative to Claude Code and Codex Subscriptions

huggingface.co · 2026-06-30

ScarfBench: Benchmarking AI Agents for Enterprise Java Framework Migration


arxiv.org · 2026-07-02

AgenticSTS: A Bounded-Memory Testbed for Long-Horizon LLM Agents

Memory for a long-horizon LLM agent is a contract about what each future decision is allowed to see. The simplest contract appends past observations, tool calls, and reflections to every prompt, which makes prior cont...

cloud.google.com · 2026-07-01

AlloyDB AI Functions - now with revolutionary performance boosts and cost savings

AlloyDB is an AI-native database: it isn’t just a passive data store, it intelligently understands and processes your data. With AlloyDB, you get industry-leading vector and hybrid search, near 100% accurate natural la...

Prefer it summarized? Read the daily recap →

The finishable AI feed for platform & agent engineers

LLM Digest is a low-hype, ranked daily brief of AI news for platform and infrastructure engineers: model releases, frontier-lab research, inference and serving updates, agent tooling, and selected papers. One shared, transparent ranking for everyone; no engagement-optimized infinite scroll. Above is a static snapshot of the current top items; the live, filterable feed needs JavaScript. These pages are fully readable without it:

Daily recap — what changed in AI today, in 10 minutes.

Weekly recap — what you missed this week.

Storylines — follow a developing story day by day.

Playbook — actionable cards: the problem, what to apply, the expected result.

Knowledge map — agent-engineering obstacles mapped to solutions.

Foundations — evidence-tiered explanations behind agent-building practice.

Voices — influential AI engineers and their writing.

Email digest · JSON feed