Gemini 3 is worth the hype

but it's not a clear winner

Nov 20, 2025

The newsletter for the technically curious. Updates, tool reviews, and lay of the land from an exited founder turned investor and forever tinkerer.

Hey folks,

Gemini 3 Pro is out, beating GPT-5.1 and Sonnet 4.5 on all benchmarks but one (SWE-Bench Verified). It’s only slightly smarter than other models, but it is much better at vision. It scores 72.7% on a benchmark for screenshot understanding; the second best is 36.2%. It’s also faster and slightly more expensive than 2.5 Pro.

My vibe check: it’s a good model, it’s eager to commit changes (!!), it isn’t as good at following instructions as Codex (maybe some new prompting quirks are needed?) and it’s definitely good at frontend.

This launch comes with a new IDE from Google: Antigravity (courtesy of acquiring the Windsurf founders). It also has an Agent Manager and a browser for the agent to see/test what it has built. I tried it for a few hours: Tab complete feels very slow, and the agent is over-eager to implement a plan (even while in planning mode). The Agent Manager treats extra documents like plans and task lists as separate “artifacts”, so they don’t clutter your codebase. I liked that. It took me a while to get the browser integration working, but once it was set up, it was really nice, and fast compared to Atlas.

In other stuff, they released Gemini Agent (for Ultra subs only), introduced Dynamic UI in chat, teased Gemini 3 Deep Think and rolled Gemini 3 out in Search (via AI Mode) on day 1.

They are not done (potentially Nano Banana 2 today), but neither are OpenAI and xAI. OpenAI released two models, GPT-5.1 Pro and GPT-5.1-Codex-Max, to follow up on Gemini 3, and xAI has released Grok 4.1 Fast. These three models all look like they’d do better than Gemini 3 Pro, but only on specific tasks (hard academic problems, code generation and tool calls).

An underrated release from Meta: SAM 3. The SAM (Segment Anything Model) family of models can take an image or video and create an overlay of any individual object or group of objects in it. Meta is partnering with Roboflow to let people fine-tune SAM on their use cases, and it’ll use SAM in Instagram’s video editing app called “Edits”.

One API for All Your Voice AI Workflows. Stop wasting time juggling voice AI vendors. AssemblyAI combines multilingual Speech-to-Text, speaker diarization, speech understanding & LLMs in one developer-friendly API. Trusted by Granola, Dovetail & Ashby. Free to try, pay-as-you-go. Start building voice AI today.*

I’ve been talking with TELUS (one of Canada’s largest telecom companies) this year. They built a platform that lets 70k+ employees pick from over 30 LLMs to build copilots. I chatted with them and wrote about their story here.

🌐 What I’m consuming

Gemini 3 reviews: Matt Shumer — Simon Willison
GPT-5.1-Pro reviews: Matt Shumer — Simon Smith
How evals drive the next chapter in AI for businesses.
The impact of AI scams on elderly people.
Hyperproductivity - An astonishing, exhilarating, exhausting new style of work.
LLMs have distinct coding personalities. This research (free report, no form fill needed) from Sonar lays out each LLM’s unique habits, blind spots and risks – like hidden security flaws, messy code and severe bugs. Useful read for making smarter, safer decisions when coding with AI.*

⚙️ Tools and demos

Build chat, voice and multimodal conversational AI applications with NLX. No coding required, only great ideas. Try it for free today.*
Smithery Chat - Chat with artifacts, code mode and 2000+ integrations. (more)
Warp Agents 3.0 - Full terminal use, /plan, code review and more integrations.
emdash - Open-source tool to run multiple coding agents in parallel. (demo)
Zo - Your intelligent cloud computer with all the context.
Design mode in Replit - Prototype beautiful UIs fast with Gemini 3 Pro.

built with Replit Design by Chris Dunlop

🍦 Afters

xAI is raising $15B at a $230B valuation, and many others are raising too: OpenHands - $18.8M Series A, Luma Labs - $900M Series C, AlphaXiv - $7M seed round and Suno - $250M Series C.
Nvidia and Microsoft are investing up to $10B and $5B, respectively, in Anthropic.
Meta confirms Yann LeCun will leave to form a new AI startup at the end of the year.
ChatGPT will be free for teachers till June 2027.

Enjoy this newsletter? Forward it to a friend.

That’s it for today. Feel free to comment and share your thoughts. 👋

Find me on X, LinkedIn, or Instagram
Read about me and Ben’s Bites
📷 thumbnail creds: @keshavatearth

Thanks to today’s sponsors who made this newsletter possible :)
AssemblyAI, Sonar and NLX.
Wanna partner with us? Last few slots left for the rest of the year.

Leonidas Tam, PhD

Nov 22, 2025

We need a benchmark for LLMs that glaze you the least. Gemini still has high agreeableness in my testing on AI Studio.

Angie Spaw | Haven Point

Nov 29, 2025

I like this. Even with more powerful models, you still need good instructions and clear boundaries. In operations, structure is everything, and AI is no different. The people who know how to guide the model will always get the best results.
