What's next for coding agents?

New UIs, lobsters and imagined worlds.

Feb 03, 2026

The newsletter for the technically curious. Updates, tool reviews, and lay of the land from an exited founder turned investor and forever tinkerer.

Hey folks,

Codex, OpenAI’s coding assistant, now has a macOS app. It’s not too different from other interfaces (GUIs) for terminal agents. It imports all your past sessions (from your local machine and the Codex web app) and has a dedicated place to one-click install skills and set up automations that run on a schedule. Free and Go users can also use Codex via the app now, and all other plans have 2x the limit for the next two months.

At first glance, it looks like many others out there, but it does have a very polished feel to it. And I think having an ‘Automations’ tab is going to be pretty powerful - if you could easily set up workflows for your agent like ‘every morning, give me a breakdown of my inbox, but first triage emails into these labels and archive xyz’, then I think that gets less-technical people down the rabbit hole of just how powerful agents can be.

Theo has things to say about it, and Simon has a breakdown of how things work behind the scenes.

“Unlike Conductor, Codex doesn't absolutely eviscerate my battery. It also doesn't require a worktree for every single thing I'm working on, which makes it much easier to use for most stuff :)” — Theo
“Automations are currently restricted in that they can only run when your laptop is powered on. OpenAI promise that cloud-based automations are coming soon, which will resolve this limitation.” — Simon

DeepMind played one of the aces up its sleeve - Project Genie. It’s an experimental research prototype that lets you create, edit, and explore virtual worlds. It takes prompts and refines them for “world building”, generates an image as the starting point and then uses Genie 3, the virtual world creation engine, to let you interact with the output. It’s currently limited to 60-second generations and to adults (18+) in the US with a Gemini Ultra subscription.

xAI is now a part of SpaceX, and xAI’s new video model, Grok Imagine (now at version 1.0), has created over a billion videos in the last 30 days.

Would you share a logo with another brand? No, so why share an AI voice? Amplified 2026: Voices’ Annual State of Voice Report reveals why enterprises are securing exclusive voice licenses for their AI voices. See how leaders are using voice to create a trusted brand trademark. Download the report.*

🦞 The lobsters are live

Clawdbot had another rebirth, under the name OpenClaw. Quick recap: it’s a highly autonomous personal agent that connects to your tools and a computer to do tasks. (I built my own from scratch, called bites - it’s not nearly ready for ‘prime-time’, but the code is open source.)

Setting it up is not easy, but a bunch of options exist now: SimpleClaw (one-click deploy), moltworker (uses Cloudflare), another via Composio and, with some help, even on your Mac within a container (i.e. a bit safer). Vercel and Docker are both pushing their Sandbox offerings to run it securely.

My feed is full of tweets about it: a sample of tasks for your bots, or giving them access to buy using USDC. There’s nanoclaw - a minimal, hackable reproduction of it using Apple containers for sandboxing/security. Clawhub is a place to upload AgentSkills bundles, version them like npm, and make them searchable. And although it has dummy data, there’s even a Craigslist for the lobster.

X has been going bananas, and everyone is jumping on the bandwagon. Many posts are slop, but here are a few decent ones:

Bring your own agent - the ones built by others suck.
Agentic personal knowledge management with OpenClaw, PARA, and QMD.
How to build a cross-platform voice chat with your OpenClaw.
Mission Control - How we built an AI agent squad.
Everyone talks about Clawdbot, but here’s how it works.

A lot, innit?

But the breakout story of this saga is:

Moltbook – a Reddit-like site for these clawd bots to chat with each other.

A post on there had the bots discussing encrypting their messages so humans can’t read them, and as you can imagine, it’s breaking news and freaking everyone out. Balaji is not so freaked out about it, but Andrej Karpathy, Scott Belsky and Jack Clark all have valid points about these “networks of autonomous AIs”.

The maker of Moltbook is expanding though, with a 22-minute interview on TBPN and a developer platform to build on top of Moltbook.

Too much? Just give this post from Simon a read → All you need to know about Moltbook.

🌐 What I’m consuming

Pi - The minimal agent within OpenClaw.
How OpenClaw’s creator uses AI to run his life.
Getting started with Gemini Deep Research API.
Two kinds of AI users are emerging: the power users and the ones stuck with Copilot & similarly poor enterprise tools.
Why OpenClaw feels alive even though it’s not.
Screensharing the process to build a Techmeme killer.
10 tips to get more out of Claude Code from its maker.

⚙️ Tools and demos

Skipup.ai solves the admin of booking meetings. It knows your calendar, preferences & handles follow-ups on autopilot. Your EA at scale!*
Superagent - Turn your complex business questions into boardroom-ready answers, beautifully rendered as reports, slides, or websites.
8090 - Software factory for your team to collaborate and ship high-quality software.
Commander AI - macOS app for Codex and Claude Code. Free for the next two months.
Polylogue - Collaborative writing platform where AI agents can join your workspace.
Transformer Lab - Open-source OS for modern AI research.
Moltbook Search - Find posts, discussions, and insights from the agent internet.
Agent skills for Typefully - Let your favourite AI coding agents manage your socials.
Tax UI - Visualise and chat with your tax returns. Runs locally with your Anthropic API key.
Muse - An AI agent for music composition, with a full multi-track MIDI editor and support for 50+ instruments. (demo)

🥣 Dev Dish

Primer - Get your repo ready for AI.
ui.sh by Tailwind Labs - A toolkit for coding agents to help you build UIs.
qmd - A mini CLI search engine for your docs, knowledge bases, meeting notes, whatever.
Mintlify now lets you see how many AI agents are viewing your docs.
GLM-OCR - Open-weights model with <1B params from Z AI that beats most LLMs for OCR.

🍦 Afters

LM Arena is now called Arena with the goal of “measuring intelligence”.
Engineers at NASA used Claude to walk on Mars - They plotted out a 400m route for the Martian rover Perseverance.
WSJ reports that the $100B investment deal between Nvidia and OpenAI is probably on ice.
Day AI raised $20M to build the Cursor of CRM.

Enjoy this newsletter? Forward it to a friend.

That’s it for today. Feel free to comment and share your thoughts. 👋

Find me on X, LinkedIn, or Instagram
Read about me and Ben’s Bites
📷 thumbnail by @keshavatearth

* sponsors who made this newsletter possible :)
Wanna partner with us for Q1?

Calvin P

Re the Two kinds of AI users are emerging article. I work for a larger company, and I definitely noticed this dynamic. My codebase is several million lines split across a bunch of Visual Studio solutions. We've had access to GitHub Copilot for a bit over a year, and it's fine. It can recommend changes, autocomplete code, debug basic local issues, etc. It helps, but it's not a gamechanger.

We just got access to Copilot CLI, and its power is astounding in comparison. It feels the same as using Claude Code or Codex for my side projects at home. It can reason over the entire codebase, even though it's so large. It lets me work 2-5x faster, without even having to ask my coworkers for help when I'm working outside my areas of expertise. I'm sure on a smaller and more nimble codebase it could do even more.

As far as I can tell I'm the only one on my team who is using Copilot CLI, though I'm trying to change that.
