{"date":"2026-06-11","title":"What happened in AI — Jun 11, 2026","generated_at":"2026-06-12T00:03:39Z","intro":["Coding agents were the story of the day. Xiaomi open-sourced MiMo Code, an agentic harness it says outlasts Claude Code on ultra-long, 200-plus-step tasks, while OpenAI moved to acquire Ona to give Codex secure, persistent cloud environments for long-running agents. Underneath the headlines, AWS published hard numbers — frontier teams reporting 4.5x and occasionally 10x productivity gains — and shipped Agent-EvalKit, an Apache-2.0 toolkit for measuring whether those agents actually work.","Anthropic had a busy news cycle of its own: a new enterprise alliance with DXC, the launch of Claude Corps, and a notable retreat on a safeguard policy that researchers had warned could 'sabotage' frontier work done with Claude. The company says it will make Fable 5's frontier-development safeguards visible rather than silent.","Around the edges, OpenAI leaned into policy and science — backing the EU's content-transparency Code of Practice and showcasing an astrophysicist using Codex to simulate black holes — and Sarah Guo's widely shared essay reframed the open-models debate as a fight between model labs and agent labs over what's ultimately 'untrainable.'"],"highlights":["Xiaomi open-sourced MiMo Code, claiming it beats Claude Code on 200+ step agentic tasks.","OpenAI is acquiring Ona to give Codex persistent, secure cloud environments for long-running agents.","AWS reports frontier teams hitting 4.5x–10x productivity gains and releases the open-source Agent-EvalKit.","Anthropic walks back a safeguard policy researchers feared could sabotage frontier work with Claude.","Anthropic also unveils a DXC enterprise alliance and launches Claude Corps.","Sarah Guo's essay reframes the open-models debate around model labs vs. agent labs and what's 'untrainable.'"],"article_count":14,"categories":[{"name":"Coding Agents & Developer Tooling","slug":"coding-agents-developer-tooling","summary":"The day's center of gravity: new agentic coding harnesses, infrastructure to run agents at length, and tools to evaluate and govern them.","articles":[{"title":"Xiaomi's MiMo Code beats Claude Code at ultra-long, 200+ step tasks","summary":"Xiaomi releases an open-source agentic coding harness it claims outperforms Claude Code on very long, multi-step tasks — a notable entrant from a non-Western lab.","source":"search_agent_engineering_news","url":"https://news.google.com/rss/articles/CBMi2AFBVV95cUxOQU9vTnF6eFktY0ZJeDdib1M0TzVWY1hMUl9Ka3dMUGlERlo1UGNDYU5xazgxeWRxY0d3UHA1N1h3dzlSZnBFNzQ2ZjRTVDFBV0loZGdjRGNKS2tMNzNYNVZjZmFPOTFuYW9YOGtEUGItWm9NUmpEMTdKaG02eVRHaURJanNRMEtPekxOZFZKeGJfR1BKMUVMV1pUYmNUMDNwUW02VlpSeC12UGhmVlJGTUdqY0pEWmwtbENPeEdKYXUtcEZBZ0w2c3ViWnlydll6TWVncVNNOUk?oc=5","published":"Thu, 11 Jun 2026 23:14:00 GMT"},{"title":"OpenAI to acquire Ona","summary":"OpenAI plans to fold Ona into Codex to add secure, persistent cloud environments — the substrate long-running enterprise agents need.","source":"openai_blog","url":"https://openai.com/index/openai-to-acquire-ona","published":"Thu, 11 Jun 2026 00:00:00 GMT"},{"title":"How frontier teams are reinventing AI-native development","summary":"AWS profiles teams redesigning how software gets built around AI, citing 4.5x productivity gains and, in some cases, more than 10x.","source":"aws_ml_blog","url":"https://aws.amazon.com/blogs/machine-learning/how-frontier-teams-are-reinventing-ai-native-development/","published":"Thu, 11 Jun 2026 00:54:42 +0000"},{"title":"Evaluate AI agents systematically with Agent-EvalKit","summary":"An Apache-2.0 toolkit that brings evaluation infrastructure to AI coding assistants including Claude Code, Kiro CLI, and Kilo Code.","source":"aws_ml_blog","url":"https://aws.amazon.com/blogs/machine-learning/evaluate-ai-agents-systematically-with-agent-evalkit/","published":"Thu, 11 Jun 2026 15:49:47 +0000"},{"title":"Outpost — Capability-based API access for AI agents","summary":"An open-source project proposing capability-scoped API access as a safer way to hand agents real-world permissions.","source":"hackernews_ai","url":"https://github.com/sausin/outpost","published":"Thu, 11 Jun 2026 10:14:25 +0000"}]},{"name":"Anthropic: Alliances, Hiring & Policy","slug":"anthropic-alliances-hiring-policy","summary":"A three-front news day for Anthropic — a new enterprise partnership, a talent program, and a reversal on a contested safeguard.","articles":[{"title":"DXC–Anthropic alliance","summary":"Anthropic partners with DXC to push Claude deeper into enterprise integration and services work.","source":"anthropic_newsroom","url":"https://www.anthropic.com/news/dxc-anthropic-alliance","published":"2026-06-11T18:00:00+00:00"},{"title":"Claude Corps","summary":"Anthropic launches Claude Corps, a new program framing how it deploys talent and Claude into the field.","source":"anthropic_newsroom","url":"https://www.anthropic.com/news/claude-corps","published":"2026-06-11T13:00:00+00:00"},{"title":"Anthropic walks back policy that could have 'sabotaged' AI researchers using Claude","summary":"After a Wired scoop, Anthropic says it will make Fable 5's frontier-development safeguards visible instead of silent, easing fears the rules hampered legitimate research.","source":"simon_willison","url":"https://simonwillison.net/2026/Jun/11/anthropic-walks-back-policy/#atom-everything","published":"2026-06-11T03:45:49+00:00"}]},{"name":"OpenAI's Wider Agenda","slug":"openai-wider-agenda","summary":"Beyond the Ona deal, OpenAI pressed on policy and showcased Codex in frontier science.","articles":[{"title":"Supporting Europe's work in ensuring a trustworthy AI ecosystem","summary":"OpenAI backs the EU Code of Practice on AI content transparency, advancing provenance standards for AI-generated content.","source":"openai_blog","url":"https://openai.com/index/supporting-eu-trustworthy-ai-ecosystem","published":"Thu, 11 Jun 2026 00:00:00 GMT"},{"title":"How an astrophysicist uses Codex to simulate black holes","summary":"Chi-kwan Chan uses Codex to build black hole simulations, testing extreme physics and Einstein's general relativity — a concrete scientific-computing case study.","source":"openai_blog","url":"https://openai.com/index/using-codex-to-simulate-black-holes","published":"Thu, 11 Jun 2026 00:00:00 GMT"}]},{"name":"Open Models, Essays & Releases","slug":"open-models-essays-releases","summary":"The day's reading and tinkering: a sharp essay on the lab landscape, a steady library release, and a hands-on agent skill.","articles":[{"title":"[AINews] Open Models, Model Labs vs Agent Labs, and What's Untrainable — Sarah Guo","summary":"Latent Space spotlights Sarah Guo's essay reframing the open-models debate as a contest between model labs and agent labs over the limits of what can be trained.","source":"latent_space","url":"https://www.latent.space/p/ainews-open-models-model-labs-vs","published":"Thu, 11 Jun 2026 03:14:26 GMT"},{"title":"Datasette 1.0a33","summary":"Simon Willison's alpha extends the ?_extra= pattern to queries and rows, another step toward a stable Datasette 1.0.","source":"simon_willison","url":"https://simonwillison.net/2026/Jun/11/datasette/#atom-everything","published":"2026-06-11T15:26:49+00:00"},{"title":"An agent skill for making HTML slides with consultant style","summary":"A former consultant shares an agent skill that generates polished, consultant-grade HTML slide decks — sidestepping Office tooling.","source":"hackernews_ai","url":"https://news.ycombinator.com/item?id=48485885","published":"Thu, 11 Jun 2026 03:25:42 +0000"}]}]}