LLM Digest

AI news for platform & agent engineers

Ranked signal · finite reading

The AI brief that ends.

One shared ranking. Scan what changed, save what matters, and stop when the finish line appears.

Today's top signals

langchain.com · 2026-07-02

OpenWiki: Open Source Repo Documentation for Coding Agents

OpenWiki generates and maintains codebase documentation so coding agents can find the repo context they need without loading everything into one instruction file.

github.com · 2026-06-30

vllm v0.24.0

MiniMax-M3: Added support for the new MiniMax-M3 model, with a fast follow-on of BF16/FP8 indexer via MSA, MXFP4 support, FP8 sparse GQA, and extensive... · DeepSeek-V4 keeps maturing: Following its debut, DeepS...

simonwillison.net · 2026-07-02

llm-coding-agent 0.1a0

Release: llm-coding-agent 0.1a0. Another Fable 5 experiment. Now that my LLM library has evolved into more of an agent framework it's time to see what a simple coding agent would look like built on it. I started a new...

infoq.com · 2026-06-30

Elastic Open-Sources Atlas Agent Memory Based on Cognitive Science

Elastic open-sourced Atlas, a system built on Elasticsearch that maintains three categories of memory for agents. Atlas integrates with agents via MCP and maintains per-user isolation of memories. When evaluated on qu...

langchain.com · 2026-06-30

Harbor x LangChain: A Unified Stack for Evaluating Agents

Evaluating long-running, stateful agents needs a new kind of runner. Here's how Deep Agents, LangSmith sandboxes, and observability plug into Harbor.

arxiv.org · 2026-07-01

QuasiMoTTo: Quasi-Monte Carlo Test-Time Scaling

Scaling inference compute, by generating many parallel attempts per problem, is a costly but reliable lever for improving language model capabilities. By default these attempts are generated independently, wasting inf...

simonwillison.net · 2026-06-30

Have your agent record video demos of its work with shot-scraper video

shot-scraper video is a new command introduced in today's shot-scraper 1.10 release which accepts a storyboard.yml file defining a routine to run against a web application and uses Playwright to record a video of that...

arxiv.org · 2026-07-01

Are Performance-Optimization Benchmarks Reliably Measuring Coding Agents?

Repository-level performance-optimization benchmarks such as GSO, SWE-Perf and SWE-fficiency evaluate coding agents by applying patches to real repositories and comparing runtime against unoptimized baselines and offi...

huggingface.co · 2026-06-30

ScarfBench: Benchmarking AI Agents for Enterprise Java Framework Migration

arxiv.org · 2026-07-01

TiRex-2: Generalizing TiRex to Multivariate Data and Streaming

We introduce TiRex-2, a recurrent xLSTM-based time series foundation model that generalizes the univariate TiRex to multivariate forecasting with both past and future covariates. Real-world forecasting is inherently s...

infoq.com · 2026-07-01

Presentation: Graph RAG: Building Smarter Retrieval Workflows with Knowledge Graphs

Cassie Shum discusses the architectural evolution of GraphRAG and why data foundations are critical for advanced AI workflows. She explains how traditional vector RAG falls short when addressing global context, multi-...

github.com · 2026-07-02

claude-code v2.1.199

Stacked slash-skill invocations like /skill-a /skill-b do XYZ now load all leading skills (up to 5), not just the first · Fixed SSL certificate errors (TLS-inspecting proxies, missing NODE_EXTRA_CA_CERTS, expired cer...

The finishable AI feed for platform & agent engineers

LLM Digest is a low-hype, ranked daily brief of AI news for platform and infrastructure engineers: model releases, frontier-lab research, inference and serving updates, agent tooling, and selected papers. One shared, transparent ranking for everyone; no engagement-optimized infinite scroll.

Above is a static snapshot of the current top items; the live, filterable feed needs JavaScript. These pages are fully readable without it:

Daily recap — what changed in AI today, in 10 minutes.

Weekly recap — what you missed this week.

Storylines — follow a developing story day by day.

Playbook — actionable cards: the problem, what to apply, the expected result.

Knowledge map — agent-engineering obstacles mapped to solutions.

Foundations — evidence-tiered explanations behind agent-building practice.

Voices — influential AI engineers and their writing.
