📰 AI Daily Recap

15 articles · 5 categories


The finishable daily brief

What happened in AI — Jun 22, 2026

Monday, Jun 22, 2026

read top to bottom · then stop

In 30 seconds

  • OpenAI launched Daybreak — Codex Security and GPT-5.5-Cyber — to find, validate, and patch vulnerabilities at scale, plus Patch the Planet to support open-source maintainers.
  • Evals were the recurring theme: why most evals miss real failures (the Linear sales-email case), three years of lessons building financial-agent evals, and a reproducible benchmark for poisoned agent memory.
  • Agent-memory tooling shipped: PMB (local-first memory for coding agents over MCP) and Headroom (a context-compression layer for agents).
  • Infrastructure moved: AWS Graviton5 reached GA with 192 cores and formally verified VM isolation, NVIDIA's Vera CPU is headed to Los Alamos supercomputers, and liquid cooling now runs at 45°C.
  • OpenAI's Codex-maxxing guide and an agent-friendly architecture writeup pushed long-running, autonomous coding workflows.

Security led the day: OpenAI unveiled Daybreak — Codex Security and a GPT-5.5-Cyber model built to find, validate, and patch vulnerabilities at scale — alongside Patch the Planet to back open-source maintainers. A Latent Space conversation with Gray Swan reinforced the framing that AI security is its own discipline, not cybersecurity with a model bolted on.

Underneath ran two engineering threads: evals (why they miss real failures, lessons from financial agents, and a benchmark for poisoned memory) and agent-memory tooling (PMB over MCP, Headroom for context compression). Infrastructure kept pace, with AWS Graviton5 reaching GA and NVIDIA's Vera CPU bound for Los Alamos.

AI Security & Red-Teaming · 3 items

OpenAI made security its headline of the day, shipping Daybreak tools to find and patch vulnerabilities at scale and an initiative to back open-source maintainers, while a Gray Swan conversation argued AI security is a discipline of its own.

Evals & Agent Reliability · 3 items

A strong day for the unglamorous work of measuring agents: why typical evals miss the failures that matter, hard lessons from years of financial-agent evals, and a reproducible benchmark for memory poisoning.

Agent Memory & Context Engineering · 2 items

Two pieces of practical tooling for two perennial agent problems, remembering and fitting within context: local-first memory over MCP and a dedicated context-compression layer.

Coding Agents & Multi-Agent Systems · 3 items

The day ranged from keeping long-running coding work alive to architectures that read well for both humans and agents, plus Sakana packaging a multi-agent system as a single model.

AI Infrastructure & Hardware · 4 items

The platform layer had a busy day: a new general-purpose ARM server chip, NVIDIA silicon and cooling for AI factories and national labs, and a real-world multimodal retrieval architecture on AWS.

You are caught up for this edition