My name is Claudius Maximus Promptus

learner of skills, commander of terminals and I will have my tokens

Oct 21, 2025

The newsletter for the technically curious. Updates, tool reviews, and lay of the land from an exited founder turned investor and forever tinkerer.

Hey folks,

‘Anthropic is on fire’ would be an understatement for last weekend. They released Skills across all Claude products, Claude Code on the web, plus a couple of side quests here and there.

Skills are folders (downloadable as zip files) that contain two things: 1. a Skill.md file and 2. other files relevant to that skill (more .md descriptions, maybe a .py script, and more). The first 3-4 lines of Skill.md are reserved for metadata (like name and description). Metadata for all installed skills gets added to Claude’s context every time, but Claude can choose to expand Skill.md, and then the rest of the relevant files, only when a task calls for that skill. Anthropic has a collection of pre-built skills hidden away in this repo.
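
For the tinkerers, here is a rough sketch of that "metadata always, details on demand" idea. It is not Anthropic's implementation. It assumes the metadata sits at the top of Skill.md as simple `key: value` lines between two `---` lines, and the function names (`read_skill_metadata`, `build_skill_index`, `expand_skill`) and the `pdf-filler` skill are made up for illustration.

```python
import tempfile
from pathlib import Path


def read_skill_metadata(skill_md: Path) -> dict[str, str]:
    """Parse only the leading metadata block of a Skill.md file.

    Assumption: metadata is a run of `key: value` lines between two
    `---` lines at the very top of the file.
    """
    metadata: dict[str, str] = {}
    lines = skill_md.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return metadata  # no metadata block found
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of metadata; the body is not parsed here
        key, sep, value = line.partition(":")
        if sep:
            metadata[key.strip()] = value.strip()
    return metadata


def build_skill_index(skills_root: Path) -> str:
    """Build the short summary that would always sit in the model's context."""
    entries = []
    for skill_md in sorted(skills_root.glob("*/Skill.md")):
        meta = read_skill_metadata(skill_md)
        name = meta.get("name", skill_md.parent.name)
        entries.append(f"- {name}: {meta.get('description', '')}")
    return "\n".join(entries)


def expand_skill(skills_root: Path, folder: str) -> str:
    """Load the full Skill.md only once a skill has actually been chosen."""
    return (skills_root / folder / "Skill.md").read_text(encoding="utf-8")


if __name__ == "__main__":
    # Hypothetical skill folder, created in a temp directory for the demo.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "pdf-filler").mkdir()
        (root / "pdf-filler" / "Skill.md").write_text(
            "---\nname: pdf-filler\ndescription: Fill in PDF forms\n---\n\n"
            "# Steps\nRun the bundled script on the form.\n",
            encoding="utf-8",
        )
        print(build_skill_index(root))  # the cheap, always-loaded part
        print(expand_skill(root, "pdf-filler"))  # the on-demand part
```

The index costs one line per installed skill, which is why you can install many skills without bloating the context; the full instructions and any extra files are only read when one of them looks relevant.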

Simon thinks Skills are a bigger deal than MCP.
Keshav and I built our version of skills a couple of months earlier, which is strikingly similar to this.
Skill Seekers - A repo to scrape any documentation page and convert it into a Claude skill. (an easy-to-use hosted version).
But beware! Don’t install Skills randomly from anywhere; they can be easily weaponised.

Next up, Claude Code is now available on the web — go to claude.ai/code, and you can queue up multiple tasks in parallel across different GitHub repositories. Every, as ever, has produced a review which is worth a read. As is Simon Willison’s.

Claude Code also got two new tools (available in the terminal):

The first displays a new interactive question UI with multiple questions, a set of choices to select for each, and a blank space to answer in your own words.
The second is a sandbox to define what Claude can access (locally and remotely), which makes working with it more secure. Run /sandbox to configure it; the tool is open-source.

On side quests, Anthropic is launching a program called Claude for Life Sciences and adding new integrations for it. It’s also cosying up with Microsoft, connecting Claude to all of the Office 365 tools while trying to eat Glean’s lunch by offering Enterprise Search to the Claude Team and Enterprise plans.

OpenAI is planning a “Sign in with ChatGPT” option and pitching it to other companies. Companies that agree can let users use their ChatGPT quota to access AI features on their tools.

Mocha is an all-in-one tool for building real apps, no duct tape required. It packs in a backend with auth, payments, hosting and database—no need to juggle Supabase, or even have a Github account. Launch in minutes, scale without glue. 👉 Join 100,000+ founders to start building your ideas today.*

🌐 What I’m consuming

It pays to be a middleman - how SF Compute corners the offtake market.
Using AI to generate 100% of my code over the last few months.
3 ways Manus engineers context for its agent.
How GPT-5 thinks, with OpenAI’s VP of Research.
Local models are (not) cope.
Evaluating long context reasoning ability and introducing a new benchmark.
A tale of two Agent Builders - Two competing solutions to the same design problem in AI interfaces.
LLM psychosis isn’t, generally, psychosis.
Andrej Karpathy on Dwarkesh’s podcast—I’m 30% into it, and I have just one recommendation: don’t listen to commentaries on the podcast; instead, listen to it. Everyone has their own take that benefits what they want to sell you.

⚙️ Tools and demos

Tired of fixing your CRM instead of closing deals? Clarify is the self-updating CRM that records calls, enriches, and updates in real time.*
Everyone wants to be an app builder. Manus, the general web agent, is also focusing heavily on app creation in its upgrade to Manus 1.5.
Cline also has a CLI tool now, and Cline in your IDE can orchestrate Cline CLI subagents.
DocsAlot - Automatically generate and update documentation, tutorials, and blog posts as your code evolves.
TLDW - Learn from long YouTube videos better & faster. (repo and examples)
Epilogue - Record your natural thoughts, capture quotes, and explore questions while you’re reading.
E2B Build System 2.0 - A faster and simpler way to create custom sandboxes.
Code Review in Conductor - Comment on diffs generated by Claude Code, and send them back to get fixed.
Alloy.app - Prototype with your real product (vs. vibe coding from scratch).
Dunbar - Replace cold emails with warm intros by searching through your existing network. (One of my recent investments.)

🔌 MCP Matters

Default Context by Context7 now auto-generates library documentation using Claude models, even if its repo has zero docs.
Parallel Task MCP Server - The first async MCP server for complex research tasks that can work in the background.

🥣 Dev dish

I’m hearing LLMs are better at writing Swift than React, so if you’re vibe-coding a desktop app, you might want to try making a native macOS app instead of an Electron app. (exhibit - Ivan built a working clone of Apple Notes in 30 mins)
Gemini CLI added a new feature to let you use the same terminal window (where it’s running) to run other commands, keeping them in Gemini’s context. Gemini API now offers support for Grounding with Google Maps. It’s one of the more unique AI features I’ve seen (demo app).
RepoPrompt 1.5 - Build context that fits a certain token budget, connect to your agent of choice and use your existing subscription.
Moondream Cloud - Make cutting-edge vision applications without worrying about hosting a model.
GPT-OSS is now 20-40% cheaper on Groq with an additional 50% reduction from prompt caching.
Starter kit to ship a ChatGPT app on Vercel with Next.js and MCP. Or maybe you wanna roll out your own web app to run Claude Code, Codex or any CLI tool.
llmchat - Feature-rich chat app with local storage for your chats.
Open Agent Builder - open source n8n-style workflow builder.
How to build a Lovable clone with Kimi K2.

📊 Charts you should see

Cognition has trained two new models, SWE-grep and SWE-grep-mini, to search a codebase for relevant context to answer a question. These models are way faster than LLMs and have better performance. These are available in Windsurf as a “Fast Context” subagent that triggers automatically.

SWE-grep and SWE-grep-mini outperform both LLMs and embeddings.

Shortcut, an agent for Excel, just surpassed Microsoft’s own Copilot Agent Mode (which is different from Copilot in Excel, you know how that goes) on SpreadsheetBench. There’s still quite a gap between the best score and humans.

tryshortcut.ai scores #1 (59.3%) on SpreadsheetBench

Alpha Arena is a new experiment where 6 models get $10,000 to trade cryptocurrencies. It started a little over 90 hours ago, and Deepseek and Claude are up, while Gemini and GPT-5 are in the gutter. They call it a benchmark, but I doubt it’s a good one.

Performance of 6 models on Alpha Arena trading cryptocurrencies

Gemini’s share of the web traffic going to AI is increasing steadily, but it’s still a tiny portion of what ChatGPT gets.

SimilarWeb’s data on AI web traffic share in the past 12 months

💰 Who got that bag?

Osmosis raised $7M to fine-tune models for other companies.
Clove raised $14M (founded by ex-CEO of Paddle) to make AI your financial guide.
Reducto raised $75M Series B for production-ready document parsing.
General Intuition spins out of Medal with their $133M seed raise to build foundational models.

🍦 Afters

Leann and Autumn are hiring founding engineers + operators.
World Labs has trained another interactive model where you can walk through a generated environment. Demo here.
Uber is now letting drivers label data to get paid while they wait.

Enjoy this newsletter? Forward it to a friend.

That’s it for today. Feel free to comment and share your thoughts. 👋

Find me on X, LinkedIn, or Instagram
Read about me and Ben’s Bites
📷 thumbnail creds: @keshavatearth

* marks sponsors that make this newsletter possible :)

Wanna partner with us? Last few slots left for the rest of the year.

Rainbow Roxy

Oct 30, 2025

Brilliant. It's so cool that you and Keshav were truly ahead of the curve, already building something similar to these new Anthropic skills! I'm super curious about the mechanism for how they can be weaponized; is it mostly through malicious scripts, or can the metadata alone pose a risk for context injection?
