The newsletter for AI builders of all levels. Mini-tutorials, tool reviews, and the lay of the land from an exited founder turned investor and forever tinkerer.
We’ve got a few open ad slots over the summer. Wanna partner with us?
Hey folks,
Windsurf’s deal with OpenAI is dead. Instead, the founders and key researchers have been poached by Google (for a reported $2.4B), and the company (whatever’s left of it) is being acquired by Cognition, the makers of Devin. The announcement from Cognition’s founder also mentions that Windsurf is an $82M business whose enterprise side is growing. Also, Meta acquired PlayAI a couple of days back. A wild few days in acquisition land.
Another Chinese lab, Moonshot AI, is taking over the open-model scene with Kimi K2. It’s a massive 1T-parameter MoE model with no reasoning yet (tech blog), and it easily beats Claude Sonnet 4 and pokes Opus 4 too. It’s great at coding and tool calling, but that’s not all. On this creative writing eval, it just blows everything out of the water, avoiding most of the telltale signs of AI writing. This is what Llama 4 Behemoth was supposed to be.
You can use it on the web. It also has an API (which makes it possible to run it in Claude Code), but that’s slow, and the folks at Groq are already serving it at superfast speeds.
Grok has a new feature: Companions. It’s built in partnership with Anichat. Grok powers the dialogue (with voices), and Anichat covers the generative animation. One of the avatars, Rudi, is a red panda who can be good or bad, and the other one is Ani, a borderline NSFW avatar based on an anime character. PlayAI, the company Meta just acquired, worked on cloning voices with AI, so expect something like this from Meta too.
Claude finally has a directory of pre-built MCP connectors to connect to Notion, Figma, Stripe and more. It’s also live on the Pro plan ($20/mo) and in the web app. The desktop app has some special ones, like allowing Claude to use Chrome, read your files, fill PDF forms, etc. You can also add a custom connector if you have a remote MCP URL, like one from Smithery, a portfolio company whose product I use several times a week.
Introducing one of my latest investments: Mirai just dropped an on-device AI SDK for iOS & macOS that lets any solo dev ship Llama-3, Qwen-3, Gemma-3 or DeepSeek-R1 (or bring your own model) locally, with zero latency, zero cloud cost, and zero privacy fuss. Grab the SDK, the Swift engine, the model-conversion toolkit or poke around the platform and docs.
Unwrap partners with DoorDash, Clay, Stripe, and others to power their customer intelligence. Its AI-powered platform aggregates and analyzes customer feedback across channels and proactively surfaces emerging issues to you before they snowball out of control. Check them out here.*
Hostinger is doing a special free session for Ben’s Bites, covering how to build your idea into an app even if you lack the technical chops. July 24th, RSVP here.
*sponsored
🌐 What I’m consuming
The rise of agentic commerce and Stripe’s role in it.
Stop saying RAG is dead.
Context Rot - How increasing input tokens impacts LLM performance.
Claude Code is all you need - Using CC for non-technical tasks. I’m doing this more and more now - I used it to help with a P&L and do research for an investment memo I am writing. It really is the agentic workflow they’ve built that is great at using tools.
⚙️ Tools I’m looking into
ART - Open RL framework for improving agent reliability by allowing LLMs to learn from experience. Integrate GRPO into any application.*
Cased - Interactive agents that do the busywork of DevOps for you.
Basic Memory - AI conversations that build lasting knowledge over time.
Consensus Deep Search - Conduct literature reviews across 200M academic papers in two minutes.
Terragon - Use Claude Code like the OpenAI Codex/Cursor web agents.
The AI() function in Google Sheets is now available to everyone.
BrowserOS - An open-source agentic web browser.
Factory - A top coding agent, offering 20M free tokens or 1 month free for new users.
Smithery Playground - Explore, test and debug MCPs.
Amazon has a new agentic IDE called Kiro. The focus is on spec-driven development, which could be really big for PMs and founders.
*sponsored
🥣 dev dish
Mistral has a bigger version of its developer-focused model, Devstral. Devstral Medium is available on the Mistral API, and an update to Devstral Small is on HuggingFace.
gemini-embedding-001 is now generally available. It’s text-only (not multimodal), but it has the best benchmark scores.
Diffs show the deleted lines and newly added lines when a code file is changed. Developers are familiar with them, but I think there are other ways to show that difference (aka the "diff"). This tool, diffsitter, focuses on creating semantic diffs that ignore basic formatting differences.
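To make the line-diff vs. semantic-diff distinction concrete, here’s a minimal sketch using only Python’s standard library. It is not how diffsitter works internally; the `ast` comparison is just a rough stand-in for the idea of comparing structure rather than text, and the `same_structure` helper and the sample snippets are made up for illustration.

```python
import ast
import difflib

# Two versions of the same function; only the spacing differs.
old = "def add(a, b):\n    return a + b\n"
new = "def add(a,b):\n    return a+b\n"

# Classic line-based diff: every line whose text changed is reported,
# even when the change is pure formatting.
line_diff = list(
    difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile="old.py",
        tofile="new.py",
        lineterm="",
    )
)


def same_structure(a: str, b: str) -> bool:
    """Hypothetical helper: True if both snippets parse to the same syntax tree.

    ast.dump() leaves out line/column positions by default, so whitespace-only
    edits produce identical dumps.
    """
    return ast.dump(ast.parse(a)) == ast.dump(ast.parse(b))


if __name__ == "__main__":
    print("\n".join(line_diff))          # noisy: both lines show up as changed
    print(same_structure(old, new))      # structure-level view: no real change
```

The line diff flags both lines as removed and re-added, while the structural comparison treats the two versions as the same code. A real semantic diff tool goes further and reports which nodes changed, not just whether anything did.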
Goose, the open-source AI dev agent by Twitter co-founder Jack Dorsey, is getting a dedicated team of engineers and designers. The GitHub repo already has 16k+ stars.
genai-processors - A new Python library by Google to build applications with AI. On the surface, it looks like the focus is on making it easy to build AI apps that deal with audio/video input and output.
Cactus - Ollama for smartphones.
asyncmcp - Give MCP servers the ability to queue up requests.
Reaper - An open-source SDK for finding dead code.
How to build a multi-agent research assistant with Gemini and LlamaIndex.
🍦 Afters
Product Talk with Chris Pedregal, CEO of Granola - Tues 12 August in London
H-Nets - new architecture in state space models to get better tokenisation.
The AI usage survey for the State of AI report 2025 is open for inputs. Takes <10 mins.
The US Department of Defense is giving up to $200M each to Anthropic, Google, OpenAI, and xAI to build AI workflows for the department.
That’s it for today. Feel free to comment and share your thoughts. 👋