Apple quietly releasing a Python SDK for the Mac's on-device LLM while everyone watched the Perplexity and Notion announcements - that's the kind of news that ages well.
The shift from CLI-based agents to agents that use the computer like a human is obvious in retrospect, but it took longer than expected to reach actual products. What I keep wondering about the computer-use approach: it adds a layer of latency and failure modes that CLI avoids entirely. For tasks where the app has an API, going visual seems like the wrong abstraction. But for everything else, it might be the only path to general usefulness.
Your point about "Hey I’m Ben" resonates with what I've been seeing in production systems. The gap between the theory and what actually ships is where most teams struggle, especially around reliability.
Cursor's video demos are HUGE. Testing/validation is the big bottleneck for parallel agent orchestration. I hope Codex follows suit.
OpenClaw was the proof of concept no one asked for and everyone needed. All the pieces were already there, waiting for someone to stitch them together.
The feature race is visible. The governance readiness isn't.