LLM Digest

AI Daily Recap

14 articles · 4 categories

The finishable daily brief

What happened in AI — Jun 11, 2026

Thursday, Jun 11, 2026

read top to bottom · then stop

In 30 seconds

Xiaomi open-sourced MiMo Code, claiming it beats Claude Code on 200+ step agentic tasks.
OpenAI is acquiring Ona to give Codex persistent, secure cloud environments for long-running agents.
AWS reports frontier teams hitting 4.5x productivity gains, and in some cases 10x or more, and releases the open-source Agent-EvalKit.
Anthropic walks back a safeguard policy researchers feared could sabotage frontier work with Claude.
Anthropic also unveils a DXC enterprise alliance and launches Claude Corps.
Sarah Guo's essay reframes the open-models debate around model labs vs. agent labs and what's 'untrainable.'

Coding agents were the story of the day. Xiaomi open-sourced MiMo Code, an agentic harness it says outlasts Claude Code on ultra-long, 200-plus-step tasks, while OpenAI moved to acquire Ona to give Codex secure, persistent cloud environments for long-running agents. Underneath the headlines, AWS published hard numbers — frontier teams reporting 4.5x and occasionally 10x productivity gains — and shipped Agent-EvalKit, an Apache-2.0 toolkit for measuring whether those agents actually work.

Anthropic had a busy news cycle of its own: a new enterprise alliance with DXC, the launch of Claude Corps, and a notable retreat on a safeguard policy that researchers had warned could 'sabotage' frontier work done with Claude. The company says it will make Fable 5's frontier-development safeguards visible rather than silent.

Around the edges, OpenAI leaned into policy and science — backing the EU's content-transparency Code of Practice and showcasing an astrophysicist using Codex to simulate black holes — and Sarah Guo's widely shared essay reframed the open-models debate as a fight between model labs and agent labs over what's ultimately 'untrainable.'

Coding Agents & Developer Tooling · 5 items

The day's center of gravity: new agentic coding harnesses, infrastructure to run agents at length, and tools to evaluate and govern them.

Xiaomi's MiMo Code beats Claude Code at ultra-long, 200+ step tasks

search_agent_engineering_news

Xiaomi releases an open-source agentic coding harness it claims outperforms Claude Code on very long, multi-step tasks — a notable entrant from a non-Western lab.

OpenAI to acquire Ona

openai_blog

OpenAI plans to fold Ona into Codex to add secure, persistent cloud environments — the substrate long-running enterprise agents need.

How frontier teams are reinventing AI-native development

aws_ml_blog

AWS profiles teams redesigning how software gets built around AI, citing 4.5x productivity gains and, in some cases, more than 10x.

Evaluate AI agents systematically with Agent-EvalKit

aws_ml_blog

An Apache-2.0 toolkit that brings evaluation infrastructure to AI coding assistants including Claude Code, Kiro CLI, and Kilo Code.

Outpost — Capability-based API access for AI agents

hackernews_ai

An open-source project proposing capability-scoped API access as a safer way to hand agents real-world permissions.

Anthropic: Alliances, Hiring & Policy · 3 items

A three-front news day for Anthropic — a new enterprise partnership, a talent program, and a reversal on a contested safeguard.

DXC–Anthropic alliance

anthropic_newsroom · Jun 11

Anthropic partners with DXC to push Claude deeper into enterprise integration and services work.

Claude Corps

anthropic_newsroom · Jun 11

Anthropic launches Claude Corps, a new program framing how it deploys talent and Claude into the field.

Anthropic walks back policy that could have 'sabotaged' AI researchers using Claude

simon_willison · Jun 11

After a Wired scoop, Anthropic says it will make Fable 5's frontier-development safeguards visible instead of silent, easing fears the rules hampered legitimate research.

OpenAI's Wider Agenda · 2 items

Beyond the Ona deal, OpenAI pressed on policy and showcased Codex in frontier science.

Supporting Europe's work in ensuring a trustworthy AI ecosystem

openai_blog

OpenAI backs the EU Code of Practice on AI content transparency, advancing provenance standards for AI-generated content.

How an astrophysicist uses Codex to simulate black holes

openai_blog

Chi-kwan Chan uses Codex to build black hole simulations, testing extreme physics and Einstein's general relativity — a concrete scientific-computing case study.

Open Models, Essays & Releases · 3 items

The day's reading and tinkering: a sharp essay on the lab landscape, a steady library release, and a hands-on agent skill.

[AINews] Open Models, Model Labs vs Agent Labs, and What's Untrainable — Sarah Guo

Datasette 1.0a33

An agent skill for making HTML slides with consultant style