The newsletter for the technically curious. Updates, tool reviews, and lay of the land from an exited founder turned investor and forever tinkerer.
Hey folks,
Anthropic just released Haiku 4.5 - a smaller model that performs the same as Sonnet 4 (a 5-month-old model) while being 3x cheaper than Sonnet. For context, until 2 weeks ago, the $20 plan in Claude Code only gave you access to Sonnet 4. If you were happy using it but kept hitting rate limits, you can now work 3x more (and faster). Haiku 4.5 also powers the “Explore tool” in Claude Code by default now.
This model is also great for agentic tasks like web searches, sifting through data, or browser automation. Basically, when you need some level of intelligence but not the smartest bloke out there.
Google upgraded its video generation models to Veo 3.1 and Veo 3.1 Fast. It’s still 50% pricier than Sora 2, but the new controllable features make up for it.
The “Ingredients to video” feature lets you upload up to three images and use them to guide the style of the video or add elements to it.
Scene extension takes in the final second of your generation and continues your video for even longer than a minute.
First to last frame lets you upload the starting and ending images and fills in the gap with a video.
ChatGPT can now automatically manage your saved memories, prioritising recent and frequent ones. Searching and deleting your memories in the app is also a bit smoother now.
My latest investment was in Good Start Labs - they make games to test models’ capabilities. They ran a Diplomacy contest with AIs working against one another that was streamed on Twitch and watched by tens of thousands of people. Here’s a podcast with Alex, one of the co-founders. He used to work at our friends, Every, which incubated the company and spun it out (I’ve invested in 2 out of 2 Every spinouts, thinking about it now… Lex, the AI writing tool, being the other). I met Alex at Dev Day last week after I invested and it was great meeting irl!
Reminder: if you’re building dev tools or anything in infra, I invest out of Ben’s Bites Fund. DM me on X - I can also include some early experiments here to get early users and test your positioning.
And if you’re interested in investing in the fund - first close will be end of Oct - you can enter your details here and I’ll reach out (I’m having to prioritise $100k+ checks first).
AmpCode added a free plan to their CLI coding tool, supported by ads. My first reaction was omg cringe BUT the more I sat with it, the more I thought… wait, it could actually work. I mean, it’s very targeted: if I’m coding up something that needs a database and a Supabase ad came on, that could work… Or it’s a way to try and grow. cto.new, another coding agent, is giving unlimited access to all models (ty VCs!).
Build smarter Voice AI Agents with the best ears in AI. Speechmatics captures who said what - even in noisy, heavily accented, multi-speaker conversations. 25% higher accuracy than competitors. Sub-second latency. 55+ languages. Start building with $200 free credits*
🌐 What I’m consuming
Haiku 4.5 vibe check by Every.
Optimising coding agent rules for improved accuracy. They improved Cline’s performance with GPT-4.1 from 18% to 34% on SWE-Bench Lite.
The case for vibe coding - personal software.
How fine-tuning a model reduced latency by 3x while improving reliability for Cal AI.
AI agent benchmark compendium - a high-level overview of over 50 modern benchmarks, grouped into four key categories.
Just talk to it - the no-bs way of agentic engineering.
⚙️ Tools and demos
Retool - AI app gen that meets your production requirements. Prompt to start, edit in context, and deploy on a platform your team trusts.*
Flint - Launch and optimise on-brand landing pages instantly. (raised $5M)
Apps by Runway - Use case specific workflows for creative editing (like upscaling an image, reshooting products, adding dialogue, etc.), ready for instant use.
Dedalus Labs - Connect any model to any MCP server with a single API. (raised $11M)
Decode.dev - Review your app with a browser and a whiteboard and send that feedback to Claude Code.
Flask - A video editor that feels like a mix of Notion and Loom. It’s got some AI features but isn’t primarily an AI product.
Fundable - Real-time funding signals from 10k+ sources.
Strawberry - A self-driving browser. Create “companions” for tasks you do in the browser and let them work.
🥣 Dev dish
Lexiconic - Words that don’t translate. Alana keeps building these fun, cool and open-source projects (while investing). Check out the repo for this one.
Sandbox SDK by Cloudflare - Let your agents run code, execute commands, manage files, run services and more securely.
SelfDB - Self-hosted, open-source alternative to Supabase and Firebase.
Spec Kit by GitHub - Get started with spec-driven development for your codebase.
Dexter - An open-source financial agent in ~200 lines of code. A Claude Code-like experience, but for finance.
GrayPane - The simplest way to search for the best flight and the best time to fly.
🍦 Afters
DeepMind and Yale built a new model on top of Gemma to identify new drugs that affect cancer cells. Classic Google stuff.
Sara is hiring a data adaptation research engineer for her new AI lab.
Sneha is hiring a founding designer in NYC to build agentic QA for e-commerce brands.
Karina Nguyen (ex-OpenAI & Anthropic, worked on Artifacts) is opening a fashion house with Ilya Sutskever as her first collab.
Nathan is turning his AI consultancy into an AI infra and compute company.
That’s it for today. Feel free to comment and share your thoughts. 👋
Read about me and Ben’s Bites
📷 thumbnail creds: @keshavatearth
The '3x cheaper' Haiku 4.5 for agents, as you mentioned, is a brilliant pivot, though I'm curious if the real-world performance matches the hype.