Record a skill
5.5-cyber vs mythos
Hey folks,
I’ve been looking at a lot of design resources for the manual recently. Interface Craft by Josh is one of the best places I’ve found for learning how to actually feel and notice taste.
Let’s get into what happened over the weekend.
Ben’s Bites is brought to you by nexos.ai, the unified AI platform
Tired of using different AI models for different tasks? We were too, so we made nexos.ai. It gives you access to 100+ models in one platform. Oh, and no more Googling “which AI should I use for this?” Just let nexos.ai automatically pick the best one. Give it a try (50% off with code benbites).
Headlines
Codex Record & Replay - show Codex a recurring workflow once, like filing an expense report or submitting time off, and it turns the demo into an inspectable, editable skill you can reuse.
Claude Code has Artifacts now - HTML pages with some functionality that you can share with others (like a PR walkthrough or living project dashboard). Available in beta for Team and Enterprise plans.
OpenAI is expanding Daybreak, its cybersecurity program, with two additions: a new version of GPT-5.5-Cyber (trusted partners only) that can reproduce even more bugs than Mythos, and Patch the Planet, a push to help maintainers fix vulnerable open-source software faster.
Sakana AI released Fugu - an API that picks and coordinates several models for hard tasks. Fugu Ultra claims 73.7 on SWE-bench Pro and 82.1 on TerminalBench 2.1, roughly Fable-class, but you will feel gaps in real usage.
Build & ship at the Runpod Flash Hack Day! Join Runpod on June 30 at the SF Builder’s Collective for an in-person hackathon. Remote-friendly. Learn how to use Runpod Flash to turn Python functions into auto-scaling, serverless GPU endpoints without Docker. Demos, prizes & mentorship. Register here.*
My feed
The coming loop - this is the clearest writing I’ve read on the loop discourse.
Cursor /automate - describe an automation; Cursor sets triggers, tools and instructions.
Claude Code steering guide - where to put instructions, skills, hooks and subagents.
Gemini’s Interactions API is now generally available. It is a single API for both models and agents.
GPT-5.5 Instant got better at health questions - now on par with OpenAI’s best Thinking models, per OpenAI.
birdclaw.sh - local Twitter archive with search, inbox and ranked triage. (see more)
Lettera - native file-based Markdown editor for Mac.
Perplexity Computer has a new memory system - Brain.
You’re spending too much on AI. You’re also using it too little.
Stripe Directory - agents can search and pay Stripe businesses from the CLI.
Clips - open-source Loom alternative agents can inspect from a URL.
Can Claude get a robot to play ball? Not fully, but Opus 4.7 was about 20x faster than last year’s Opus 4.1 team.
Ports - Mac menu bar app that lists local dev servers and ports, with kill/open buttons.
Slackbot can now connect to 20+ apps like Linear, Replit, Canva, DocuSign, Zoom and more.
ElevenLabs Ads Engine - localise ads into 50+ languages and publish to ad accounts.
15 copy-paste loops people are actually running.
Joining a Software Factory - how Factory builds repos for agent PRs.
Post-Agent Companies - agents make labour cheap; context and trust stay scarce.
Afters
Read about me and Ben’s Bites
📷 thumbnail via @keshavatearth
* sponsors who make this newsletter possible :)
Wanna partner with us for the next quarter?
Email us at shanice@bensbites.com or k@bensbites.com