Google's secret kitchen

Mistral's new CLI and Shopify SimGym

Dec 11, 2025

The newsletter for the technically curious. Updates, tool reviews, and lay of the land from an exited founder turned investor and forever tinkerer.

Hey folks,

Google Labs products have been on a launch spree: Jules, their async coding agent, got Scheduled Tasks and proactive suggestions. Pomelli, their marketing asset creation tool, now lets you animate the assets it generates. And there are a few more updates in Stitch, their design exploration tool.

Mistral is joining the CLI squad with their open-source entry, Mistral Vibe. Its license allows you to fork/build on top of it, and it’s pretty minimal, so you can add your own take without fighting against the existing codebase (looking at you, Gemini CLI).

btw, did you see? droid is the top-performing agent on terminal bench, again. really cementing itself.

sorry, I digress; back to Mistral.

Along with the CLI, Mistral released two more open-source models: Devstral 2 and Devstral 2 Small. They perform better than GLM 4.6 and Kimi K2, and are on par with DeepSeek V3.2 while being much smaller. But these models are still chunky and meant for businesses with a few GPUs. You’ll have trouble running even the smaller one on a MacBook Air.

Shopify launched SimGym: digital customers that behave like real ones. They browse your site, complete tasks, and reveal spots to optimise. You can even run A/B tests with zero live traffic. It’s part of their winter launch, which includes more AI features.

OpenAI posted a blog claiming their new models are now reaching a “high” level of cybersecurity capabilities, with GPT-5.1-Codex-Max already scoring 76% on capture-the-flag challenges (vs 27% for GPT-5 in Aug).

Attio is the AI-native CRM for the next generation of teams. Sync your email and calendar, and Attio instantly builds your CRM—enriching every company, contact, and interaction with actionable insights in seconds. Join fast-growing teams like Granola, Flatfile, Modal, and more. Start for free today.*

🌐 What I’m consuming

Demonstrably safe AI for autonomous driving - A look into Waymo’s approach.
200k tokens is plenty - Software development using a dozen short threads vs a longer one.
Minimise successful test outputs and stop overloading your agent’s context.
What does AI think about Hacker News comments from 10 years ago?
Why AGI will not happen.
Clay’s approach to GTM for going from $1M to $100M ARR.
Useful patterns from building HTML tools.
Supermemory: Raising $3M at 19 and going from open source to funded startup. (I’m an investor)

⚙️ Tools and demos

DeepSky - AI superagent for founders, builders & operators. Designed to go beyond basic LLMs for strategy, competitive intel, & market research.*
Autopilot by Mintlify - Proactively identify when your documentation needs updates based on your codebase.
Scouts by Yutori - Track topics you care about across the web and get fast, clean summaries.
Orchids - An IDE built for vibe coding with built-in browser, Supabase, and Stripe all in a single tool.
Detail.dev - Deep scans of your codebase that find bugs you’ll be glad to know about.
Figma now lets you edit images with AI to make assets fit your UI: remove elements or backgrounds, expand images, and more.

🥣 Dev Dish

Gemini’s text-to-speech models got an upgrade in expression, pacing and dialogue quality.
Claude Code tasks can now spawn async subagents, and the Claude Agent SDK now lets you build with 1M context length for Sonnet 4.5.
Relace Search - Parallel tool calls to make agentic search 4x faster.
Dev Browser - A Claude Skill to let your agent close the loop without eating up tokens. (read more)
Claude Island - Dynamic island widget for Claude Code. Everything at a glance with quick approvals without switching windows. (repo)
Paper2Slides - Open-source repo to create stunning slides from your research papers.
Stirrup - A lightweight framework for building agents by Artificial Analysis.
A quick repo to create your Claude Code Wrapped.

🍦 Afters

Generative AI in the Enterprise by Menlo Ventures. It claims enterprises lean much more towards using ready-made AI products (76%) than building them in-house.
Anthropic, Block and OpenAI are co-founding the “Agentic AI Foundation” and donating MCP, Goose and AGENTS.md to it.
New suite of benchmarks from Google DeepMind quantifies how well LLMs know facts. (Leaderboard)
OpenAI is offering certificate courses on AI foundations inside ChatGPT as a pilot program.

Enjoy this newsletter? Forward it to a friend.

That’s it for today. Feel free to comment and share your thoughts. 👋

Find me on X, LinkedIn, or Instagram
Read about me and Ben’s Bites
📷 thumbnail creds: @keshavatearth

Thanks to today’s sponsors who made this newsletter possible :)
Wanna partner with us?
