how to steamroll agi 101

a new experiment for builders

Jul 01, 2025

I write a newsletter about startups and investing, for AI builders of all levels.

I record mini-tutorials, review tools I’m testing, and share my insights as an exited founder turned investor.

Hey folks,

I’m solo-parenting this week 😬 so excuse anything…

I’m also in a bit of a product focus, which still may not amount to anything useful except lessons for me, particularly around figuring out how to work more efficiently with Claude Code or Cursor. I’m still not quite there, but when I am, I’ll zoom out and try to distill some of the lessons.

Shoutout to Rison for the tip to use Claude Code Chat, an extension within Cursor. I’ve only used it for an hour, but I love it.

New experiment alert… Ben’s Bites Builders [working title]. It’s for you if you’re thinking about what you want to do next (even if you already have a job). Maybe you want to start or join an early AI startup, meet others in the same situation, or launch products here in the email. Enter your details here.

Zuck has built a superintelligence team at Meta. Alexandr Wang (Scale AI’s CEO until recently) and Nat Friedman (former GitHub CEO) will head this team of 11 world-class researchers poached from Google DeepMind, Anthropic, and mainly OpenAI. OpenAI feels like “someone has broken into their home” and is working on renegotiating comp.

OpenAI added two new features to its API: Deep Research and Webhooks. Deep Research comes via two models: o3-deep-research ($10/$40 per 1M input/output tokens) and o4-mini-deep-research ($2/$8). Webhooks let you set up notifications for long-running tasks (like when a deep research report is complete). There’s a good opportunity to make a deep research wrapper with this new API by focusing on a niche with a list of curated sources and a bespoke UI.
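If you’re weighing up that wrapper idea, here’s a quick back-of-the-envelope sketch of what a single report might cost on each model. It only uses the per-token prices quoted above and ignores any other charges. The helper name `estimateCostUsd` and the token counts are made up for illustration; they are not part of any OpenAI SDK.

```typescript
// Prices in USD per 1M tokens, as quoted in this issue.
// Assumption: check OpenAI's pricing page before relying on them.
type DeepResearchModel = "o3-deep-research" | "o4-mini-deep-research";

const PRICE_PER_MILLION: Record<DeepResearchModel, { input: number; output: number }> = {
  "o3-deep-research": { input: 10, output: 40 },
  "o4-mini-deep-research": { input: 2, output: 8 },
};

// Hypothetical helper: token cost only, no other charges included.
function estimateCostUsd(
  model: DeepResearchModel,
  inputTokens: number,
  outputTokens: number,
): number {
  const price = PRICE_PER_MILLION[model];
  return (
    (inputTokens / 1_000_000) * price.input +
    (outputTokens / 1_000_000) * price.output
  );
}

// Illustrative report: reads 500k tokens of sources, writes 250k tokens.
const o3Cost = estimateCostUsd("o3-deep-research", 500_000, 250_000);
const miniCost = estimateCostUsd("o4-mini-deep-research", 500_000, 250_000);

console.log(`o3-deep-research: $${o3Cost}`); // 0.5 * 10 + 0.25 * 40 = 15
console.log(`o4-mini-deep-research: $${miniCost}`); // 0.5 * 2 + 0.25 * 8 = 3
```

At these made-up volumes the mini model comes out 5x cheaper per report, which matters a lot if your niche wrapper is priced per report.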

OpenAI also announced the date for DevDay 2025: Oct 6, 2025 in San Francisco. We’re back to the livestreamed keynote + offline demos format, and one thing we can expect from DevDay is the open source model (which Twitter is hyped about).

Anthropic released Hooks. Hooks are user-defined shell commands that run at specific points in Claude Code’s workflow. You can use hooks for notifications, formatting, logging, feedback, and permissions.

Build, deploy, and pay autonomous AI agents—no code, open-source, free, full control. Shinkai lets anyone create, manage, and deploy AI agents effortlessly. Run remote or local AI models, import or expose MCPs, or connect decentralized AI agents using micropayments. Get ShinkAI (free)*

Cursor’s background agents are now available on the web and mobile as well. Start a task from anywhere (including the recently released Slackbot) and then come to the Cursor IDE for review or deeper edits. Ian, a few others, and I have already been using it on our phones. It’s cool!

Claude is an indie hacker, too. Anthropic gave Claude a small shop to run, and this one chart is enough to summarise what happened:

The net value of Claude's business over time. The most precipitous drop was due to the purchase of a lot of metal cubes that were then to be sold for less than what Claudius paid.

*sponsored

want to partner with us? Click here

🌐 What I’m consuming

Iconiq Capital’s “state of AI research” report, based on a survey of 300+ executives in April 2025.
This report from MenloVC on how AI is faring amongst consumers in the US.
Vercel’s CEO on the change in software engineering, MCP and GUI for AI.
How AI agents are reshaping enterprise work.
I came across this benchmark that evaluates Gemini models on Mermaid diagram syntax. Creating these specific evals is one of the best ways to get noticed as an AI engineer in this market.
Using Claude Code to build a GitHub Actions workflow.
Handbook for building the future of consumer AI.

⚙️ Tools I’m looking into

Signadot AI Smart Tests: Write simple API tests. AI shadow testing automatically finds regressions in microservices to stop bad deploys.*
Tidbit - Use AI in a Slack-like interface with individual channels for topics you return to most often.
Cora - The email triaging app I use is officially open to everyone.
Dino - An AI talking toy for kids.
Mastra Cloud - The easiest way to deploy AI agents (built w/ Mastra).
Terragon - Use Claude Code as a background agent.
Cline can now spawn multiple background coding agents, so you can choose the best output for your task.
npc.town - an AI-driven town simulator. (repo is open source)
MrBeast released and then pulled an AI thumbnail tool. Levels jumped at the chance and launched his own tool to do the same.

*sponsored

🥣 dev dish:

Local MCP servers can now be saved as a new file format: Desktop Extensions (.dxt files).
Workflows by LlamaIndex - A lightweight framework for orchestrating complex, multi-step AI systems.
Inworld TTS - Fast and high-quality voice cloning for consumer applications.
GitHub is introducing a term called Continuous AI - A broad term referring to any automated AI process in software collaboration. (inspired by CI/CD)
Gemma 3n is out of preview. It runs smoothly on high-end mobile devices and is the first <10B params model to get 1300+ Elo on LMArena. The time to build apps that use local LLMs is here.
This demo showcases how you can use a service worker (that intercepts network requests) to make a website fully offline.
and then…Make a worker offline-capable in just 4 lines of code

🍦 Afters

OpenAI acquired an 8-year-old personal AI assistant company, Crossing Minds.
ARIA is looking for a new CEO.
CHAI Research generated antibody molecules with 15% hit rate (100x better than current approaches).
Get a preview of ARC-AGI-3 if you’re in SF this July.

That’s it for today. Feel free to hit reply and share your thoughts. 👋

Enjoy this newsletter? Please forward to a friend.

Find me on X, LinkedIn, or Instagram
Read about me and Ben’s Bites
