OpenWiki generates and maintains codebase documentation so coding agents can find the repo context they need without loading everything into one instruction file. Context & related coverage →
Software tests and code evolve together: a code change should be followed by new or updated tests that record the new software behavior. Yet existing test generation and update benchmarks often isolate the test from t... Context & related coverage →
Compiler missed optimizations refer to cases in which compilers failed to optimize certain code. It takes many compiler developers' efforts to implement or patch such missed optimizations. In this paper, we present a... Context & related coverage →
Release: llm-coding-agent 0.1a0 Another Fable 5 experiment. Now that my LLM library has evolved into more of an agent framework it's time to see what a simple coding agent would look like built on it. I started a new... Context & related coverage →
Evaluating long-running, stateful agents needs a new kind of runner. Here's how Deep Agents, LangSmith sandboxes, and observability plug into Harbor. Context & related coverage →
shot-scraper video is a new command introduced in today's shot-scraper 1.10 release which accepts a storyboard.yml file defining a routine to run against a web application and uses Playwright to record a video of that... Context & related coverage →
On-policy self-distillation (OPSD) has emerged as a practical method for training large language models (LLMs) to reason, where a single model acts as both the teacher and the student with different levels of informat... Context & related coverage →
Changed AskUserQuestion dialogs to no longer auto-continue by default; opt into an idle timeout via /config · Changed the "default" permission mode to "Manual" across the CLI, --help , VS Code, and JetBrains; --permis... Context & related coverage →
Cloudflare details Town Lake, an internal unified data platform, and Skipper, an AI analytics agent unifying access to operational, billing, security, and business data. The platform processed ~91K billing queries, wi... Context & related coverage →
The speakers discuss Agent RFT, OpenAI’s platform for fine-tuning reasoning models via real-time tool interactions and custom reward signals. They explain how reinforcement learning solves complex credit assignment ch... Context & related coverage →