Who owns the Frontier?
it's time to let the minions run wild
Hey I’m Ben. I build stuff with agents, even though I’m not technical. Here’s all the stuff I’m reading and tinkering with. If you want to start building or level up your ‘vibe-coding’ skills, join our community.
Hey folks,
I built something small that I needed…with all this clawdbot/openclaw mania, I found it really hard to ‘see’ files on my remote computer (mac-mini/vps/etc). So I built a combined file-explorer app - you can upload/read/edit files on any machine (local or remote) you connect to it. It’s free to clone/remix.
We have two new coding models: Opus 4.6 from Anthropic and GPT-5.3-Codex from OpenAI. My feed is loving GPT-5.3-Codex more (see Matt’s and Theo’s reviews). I prefer it some of the time. When Opus gets stuck or seems stupid about something → get Codex to sort it out. If I know what I want and need it to just get done → Codex. For planning, brainstorming and anything that needs resources (docs, links, etc) → Opus. Both OpenAI and Anthropic put the models to extreme tests:
Building a C compiler with a team of parallel Claudes.
I spent $10,000 to automate my research at OpenAI with Codex.
Prompting tips and new API features for Opus 4.6.
The new Opus comes with beta support for 1M tokens in its context window and a fast mode, which costs 6x more for 2.5x faster outputs. Anthropic also released a bunch of other API features like Context Compaction, Adaptive Thinking, Effort and a new feature in Claude Code - Agent Teams (demo + how to install it). Agent teams are multiple cc sessions working with shared tasks, messaging between themselves and centralised management. It’s available to all cc users.
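To put the fast-mode trade-off in numbers, here is a rough sketch. It assumes the 6x price and 2.5x speed figures above apply evenly across a whole job. The baseline cost and duration are made-up illustration values, not real pricing, and the function name is hypothetical.

```python
# Rough sketch of the fast-mode trade-off: 6x the price for 2.5x the speed.
# Baseline inputs are hypothetical illustration values, not real pricing.

PRICE_MULTIPLIER = 6.0   # fast mode costs 6x more (figure from the newsletter)
SPEED_MULTIPLIER = 2.5   # fast mode outputs 2.5x faster (figure from the newsletter)


def fast_mode_tradeoff(base_cost: float, base_minutes: float) -> dict:
    """Compare one job run in standard mode against the same job in fast mode."""
    fast_cost = base_cost * PRICE_MULTIPLIER
    fast_minutes = base_minutes / SPEED_MULTIPLIER
    extra_cost = fast_cost - base_cost
    minutes_saved = base_minutes - fast_minutes
    return {
        "fast_cost": fast_cost,
        "fast_minutes": fast_minutes,
        "extra_cost": extra_cost,
        "minutes_saved": minutes_saved,
        # How much extra you pay for each minute you get back.
        "extra_cost_per_minute_saved": extra_cost / minutes_saved,
    }


if __name__ == "__main__":
    # Hypothetical job: $10 and 50 minutes in standard mode.
    result = fast_mode_tradeoff(base_cost=10.0, base_minutes=50.0)
    for key, value in result.items():
        print(f"{key}: {value:.2f}")
```

With those made-up inputs, the job drops from 50 minutes to 20 but goes from $10 to $60. Whether that is worth it depends on what a saved minute is worth to you.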
There’s another launch from OpenAI too: a new platform called OpenAI Frontier. It does a similar thing, i.e. it lets enterprises create agents plugged into their data, with the ability to run commands on a computer (just like cc/codex) and feedback loops to improve them over time. Copilot and Google Cloud have had something similar for a while now, but a) model capabilities and b) computer access/ability to use tools have been holding them back. To me, Frontier feels like an attempt to capture those users rather than something similar to cc’s new agent teams.
Some more quick notes:
If you're building Docker for agents, I want to invest.
Ads in ChatGPT are live for testing.
ai.com guys ran a Super Bowl ad, and it crashed their website. Right now, it looks like an openclaw wrapper.
Why’s there always a meeting bot in your Zoom call? Blame Recall.ai. They power every meeting AI app, from Cluely to Hubspot to Clickup. Recall.ai handles the hard part: getting recording data across meeting platforms. Get started with $100 in credits.*
🌐 What I’m consuming
Wiki Education partnered with Pangram (an AI detection tool), and they released a report detailing where it works, its blindspots and more. The AI community trusts Pangram’s detection far more than the early “AI detectors don’t work” claim would suggest, so the report is worth a read.
Tailscale lets you hook into a dev environment on a remote machine (like a Mac Mini) from any device. I’ve been using it for my projects (here’s a guide). Its ex-CTO (now building exe.dev) wrote about the last eight months of agents.
Stripe is using minions - agents that can one-shot features end-to-end. Simon wrote about how StrongDM’s AI team builds serious software without even looking at the code. Also read: Agent-native engineering - Restructuring your organisation around agents as individual contributors instead of engineers.
Should and will we build a new programming language now that we have agents? (I think so. If you are building one, I want to invest.)
Ghostty founder’s journey from AI sceptic to finding a lot of value in it daily.
Let Claude improve itself and become better at marketing.
The rise of the professional vibe coder.
Stop talking to walls of predictive text and start doing real research with Superagent. Give it a question, and it gets to work: Subagents deeply interrogate your topic, scour a wide range of credible sources, and package it all up into boardroom-ready reports, slides, docs, or websites.*
⚙️ Tools and demos
👩‍🚀 Agent Composer - AI agents for advanced industries. Compress routine engineering tasks from hours to minutes.*
Claude in Excel and PowerPoint - The official extensions for office tools by Anthropic.
Sphinx - Fully browser-based data science environment with a powerful agent.
Agentation - Let your agent fix your UI by annotating elements in your app.
Solo manages your entire dev stack. Add a project, let it detect all processes and start everything in one click.
Observational Memory by Mastra AI - A human-inspired memory system with a completely stable context window.
Keep.md - Bookmark links from anywhere, store them as markdown and give them to your agent wherever you need. (going to replace my current link-saving workflow with this for this newsletter)
An attempt at prototyping components in your coding workspace.
🦞 OpenClaw updates
OpenClaw’s skill store, Clawhub, now auto-scans all skills for malware using VirusTotal. A dozen OpenClaw variants seem to launch every day now. Here are some that sound legit:
Webclaw - A fast, local-first, open-source web client for openclaw.
Aight.cool - Openclaw in an iOS app.
Klaus by Bits - Opinionated, on the cloud, batteries included.
And there are even more articles on how to set it up and get the most out of it. Some select ones:
Let your OpenClaw call you on the phone using ElevenAgents.
Use OpenClaw to enforce structure in my day without willpower.
Lulubot takeaways: 1 week of building and using my OpenClaw.
How to set up a team of agents in OpenClaw.
Letting OpenClaw agents run a website by themselves.
How I built an AI agent swarm in Discord.
Create scripts to save tokens when building monitoring agents.
Memory tricks - saving past memories and searching past conversations.
And this fun tool → Lobster Anatomy. It visualises your OpenClaw agent and helps to improve it.
🥣 Dev Dish
Latch - Security middleware for agents and their tools, blocking sus actions and letting safe ones through. (I’m an investor)
Run npx playbooks scan skills to scan your locally installed skills for security issues, or check out this security skill.
md-browser - A markdown-first mini browser that sees the web like an AI.
Agent-relay - Real-time messaging between AI agents. Sub-5ms latency, any CLI, any language.
Sage - Privacy-first personal AI agent with persistent memory, built in Rust. (explainer video)
agent-browser can now access local PDFs/HTML files and capture all clickable divs on a page.
pi-messenger - A chat room for multiple agents working on the same project.
Shannon - An AI hacker that wants to break your app and find exploits.
Napkin - A skill for Claude Code that gives the agent persistent memory of its mistakes.
X API is now pay-per-use. Though I use Bird CLI, the new pricing lets you build things like this Twitter research assistant on official APIs easily now.
Cloudflare’s Sandbox SDK now supports PTY (pseudo-terminal) passthrough, enabling browser-based terminal UIs.
🍦 Afters
Vercel AI Accelerator - A 6-week program with access to the Vercel team, investors and $6M in credits. Applications are open now until February 16th.
Vouch - A community trust management system for who gets to contribute to your open-source projects.
Cursor released Composer 1.5 - the same base model as Composer 1, but RLed 20 times more. It’ll be a lot more expensive while we have only one measure of how much better it is - Cursorbench (an internal benchmark with no public details).
Enjoy this newsletter? Forward it to a friend.
That’s it for today. Feel free to comment and share your thoughts. 👋
📷 thumbnail by @keshavatearth
* sponsors who made this newsletter possible :)