What happened in AI — Jun 22, 2026

15 articles · 5 categories

The finishable daily brief

Monday, Jun 22, 2026

read top to bottom · then stop

In 30 seconds

OpenAI launched Daybreak — Codex Security and GPT-5.5-Cyber — to find, validate, and patch vulnerabilities at scale, plus Patch the Planet to support open-source maintainers.
Evals were the recurring theme: why most evals miss real failures (the Linear sales-email case), three years of lessons building financial-agent evals, and a reproducible benchmark for poisoned agent memory.
Agent-memory tooling shipped: PMB (local-first memory for coding agents over MCP) and Headroom (a context-compression layer for agents).
Infra moved: AWS Graviton5 hit GA with 192 cores and formally verified VM isolation, NVIDIA's Vera CPU heads to Los Alamos supercomputers, and liquid cooling now runs at 45°C.
OpenAI's Codex-maxxing guide and an agent-friendly architecture writeup pushed long-running, autonomous coding workflows.

Security led the day: OpenAI unveiled Daybreak — Codex Security and a GPT-5.5-Cyber model built to find, validate, and patch vulnerabilities at scale — alongside Patch the Planet to back open-source maintainers. A Latent Space conversation with Gray Swan reinforced the framing that AI security is its own discipline, not cybersecurity with a model bolted on.

Underneath ran two engineering threads: evals (why they miss real failures, lessons from financial agents, and a benchmark for poisoned memory) and agent-memory tooling (PMB over MCP, Headroom for context compression). Infrastructure kept pace, with AWS Graviton5 reaching GA and NVIDIA's Vera CPU bound for Los Alamos.

AI Security & Red-Teaming · 3 items

OpenAI made security its headline of the day, shipping Daybreak tools to find and patch vulnerabilities at scale and an initiative to back open-source maintainers, while a Gray Swan conversation argued AI security is a discipline of its own.

Daybreak: Tools for securing every organization in the world

openai_blog

OpenAI introduces Daybreak — including Codex Security and GPT-5.5-Cyber — to help organizations find, validate, and patch vulnerabilities at scale.

Patch the Planet: a Daybreak initiative to support open source maintainers

openai_blog

A Daybreak program that helps open-source maintainers find, validate, and fix vulnerabilities with AI plus expert review.

Red-Teaming after Mythos — Zico Kolter & Matt Fredrikson, Gray Swan

latent_space

OpenAI board member Zico Kolter and Gray Swan CEO Matt Fredrikson explain why AI security is not just “cybersecurity with AI.”

Evals & Agent Reliability · 3 items

A strong day for the unglamorous work of measuring agents: why typical evals miss the failures that matter, hard lessons from years of financial-agent evals, and a reproducible benchmark for memory poisoning.

Why most AI evals would miss the Linear sales email failure

hackernews_ai

A case study in why pass/fail eval suites overlook the subtle, context-dependent failures that actually break agents in production.

Lessons from Building Evals for Financial AI Agents

hackernews_ai

Three years of hard-won lessons on designing evals where correctness, cost, and trust are non-negotiable.

Agent-memory systems admit poisoned facts — a reproducible benchmark

hackernews_ai

A benchmark showing agent-memory stores will accept and retrieve poisoned facts — a concrete reliability and security gap to test against.

Agent Memory & Context Engineering · 2 items

Two pieces of practical tooling for the perennial agent problems of remembering and fitting context: local-first memory over MCP and a dedicated context-compression layer.

PMB — local-first memory for AI coding agents over MCP

hackernews_ai

A single-file SQLite + LanceDB store with hybrid BM25/vector retrieval, exposed to coding agents over MCP — no server or API keys.

Headroom — the context compression layer for AI agents

hackernews_ai

A drop-in layer that compresses context so agents stay within window limits on long-running tasks.

Coding Agents & Multi-Agent Systems · 3 items

From keeping long-running coding work alive to architectures that read well for both humans and agents — plus Sakana packaging a multi-agent system as a single model.

Codex-maxxing for long-running work

openai_blog

How Jason Liu uses Codex to preserve context, manage complex projects, and keep work going beyond a single prompt.

Agile and Coding: An Agent- and Human-Friendly Architecture

hackernews_ai

Structuring codebases so the same architecture is legible to autonomous agents and the humans reviewing them.

Sakana Fugu multi-agent system delivered as one model

hackernews_ai

Sakana packages a multi-agent system into a single deployable model — a notable take on collapsing orchestration into inference.

AI Infrastructure & Hardware · 4 items

The platform layer had a busy day: a new general-purpose ARM server chip, NVIDIA silicon and cooling for AI factories and national labs, and a real-world multimodal retrieval architecture on AWS.

AWS Graviton5 Reaches General Availability with 192 Cores and Formally Verified VM Isolation

NVIDIA Vera CPU Opens the Way for Agentic Scientific AI at Los Alamos National Laboratory

Hotter Than a Hot Tub: The 45°C Breakthrough to Cool AI's Biggest Machines

Embed the world: Multimodal AI for searchable aerial imagery at scale

You are caught up for this edition