<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[The AI Engineer]]></title><description><![CDATA[The practical AI newsletter for engineers who build. Every week: tutorials, frameworks, and real lessons to help you ship agents, automate your work, and become the engineer everyone asks for help.]]></description><link>https://newsletter.owainlewis.com</link><image><url>https://substackcdn.com/image/fetch/$s_!zVJN!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39c5885e-22ac-4938-93aa-43f6d04d3364_1080x1080.png</url><title>The AI Engineer</title><link>https://newsletter.owainlewis.com</link></image><generator>Substack</generator><lastBuildDate>Tue, 21 Apr 2026 00:09:18 GMT</lastBuildDate><atom:link href="https://newsletter.owainlewis.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Owain Lewis]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[owainlewis@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[owainlewis@substack.com]]></itunes:email><itunes:name><![CDATA[Owain Lewis]]></itunes:name></itunes:owner><itunes:author><![CDATA[Owain Lewis]]></itunes:author><googleplay:owner><![CDATA[owainlewis@substack.com]]></googleplay:owner><googleplay:email><![CDATA[owainlewis@substack.com]]></googleplay:email><googleplay:author><![CDATA[Owain Lewis]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Is Pi better than Claude Code?]]></title><description><![CDATA[I spent a week with the minimalist AI coding agent Pi. 
Here’s my honest take.]]></description><link>https://newsletter.owainlewis.com/p/is-pi-better-than-claude-code</link><guid isPermaLink="false">https://newsletter.owainlewis.com/p/is-pi-better-than-claude-code</guid><dc:creator><![CDATA[Owain Lewis]]></dc:creator><pubDate>Sat, 18 Apr 2026 17:00:15 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/43474ba5-4061-4243-97b6-e5752f679132_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I spent last week using Pi as my main coding agent, which was an interesting exercise because Pi is a radically different coding agent compared to Claude Code or Codex.</p><p>If you have not come across it, <a href="https://pi.dev/">Pi</a> is an open source coding agent that has a very minimalist philosophy. It ships with only 4 tools (read, edit, write, bash), supports ALL AI models, has a system prompt under a thousand tokens, no MCP, no permissions. The one feature that makes it really different is the ability to extend and customise the agent through extensions.</p><p>The minimalism is a feature, not a bug. As coding agents have more and more random (often unnecessary) features bolted on, they can start to feel like a big mess. You have no control over third-party coding agents, so if you want to change something you&#8217;re at the mercy of Anthropic or OpenAI.</p><p>A really interesting feature of Pi to me is that you can edit or completely replace the system prompt for the agent. This is useful if you want to use it for things that aren&#8217;t coding.</p><h2>How extensions work</h2><p>Extensions are just TypeScript files you drop in a folder, and on startup Pi picks them up. They can do almost anything. Change the UI. Register tools. Intercept calls. Gate permissions if you want that back.
Or anything else you can think of.</p><p>I built a workflow extension that takes a spec, writes code, reviews the code with a fresh context window, fixes the issues, runs the tests, and verifies. It took me about twenty minutes to put together because Pi can read its own source and documentation and write the extension for you. The agent is, in a fairly literal sense, able to extend itself.</p><p>I have wanted this for a long time. The loop I run with any coding agent is roughly: write, review, fix, test, verify. Doing it by hand means a lot of prompts that are always the same, and the fresh-context review step is the one I skip most often because it is annoying to set up. Putting it in code, deterministically, so the agent does not have to remember to do it, feels like a real step beyond relying on non-deterministic prompts.</p><h2>What the minimalism costs you</h2><p>The honest other side is that minimalism has a cost, and whether you care about the cost depends on how you work. Pi does not ship with MCP. You can add it through an extension, but the default answer to &#8220;how do I connect this to my other tools&#8221; is to use a command-line tool from bash. There is no to-do tracker. No sub-agents. No hooks. These are not accidental omissions; they are the design. But if your current workflow leans on any of them, you will feel it.</p><p>The bigger issue for me personally is the provider situation. Pi lets you bring any model: OpenAI via ChatGPT, Google, GitHub Copilot, OpenRouter, local models through Ollama. But Anthropic recently stopped allowing third-party agents to use the Claude subscription, which means if you want to use Claude models through Pi, you pay API rates. I use Claude heavily, I am on the subscription, and moving to API pricing for my daily agent is not something I want to do. That is not Pi&#8217;s fault. 
Thankfully, OpenAI lets you use your Codex subscription with other tools.</p><h2>Where I think this fits</h2><p>If I stripped out the Anthropic-subscription problem, I think Pi would probably be my default coding agent. I like the minimalism. I like that I can read and understand the system prompt. I like that extensions are code, not configuration. I like that the agent can improve itself by writing its own tools.</p><p>As it stands, I am going to keep Pi installed and keep using it for specific workflows where the extension system earns its keep. The multi-stage review loop is one. Anything where I want deterministic control over what the agent does between prompts is another. For everything else, I am still inside Claude Code, because that is where my subscription works and where the models I trust most live.</p><p>The broader thing I will take away is this. The interesting question for coding agents is no longer what features they ship, because the features have converged. The interesting question is what they let you change. Pi bets harder on user-defined behaviour than any other agent I have used, and that bet is, I think, the right one.</p><p>If you want to see what this looks like in practice, I recorded a full walkthrough where I install Pi, configure providers, customise the system prompt, and build an extension from scratch on camera. </p><p>You can watch it here: </p><div id="youtube2-BZ0w0JhPQ9o" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;BZ0w0JhPQ9o&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/BZ0w0JhPQ9o?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Thanks for reading. Feel free to comment on the video if I missed anything. 
</p>]]></content:encoded></item><item><title><![CDATA[Six RAG strategies, explained simply (with code).]]></title><description><![CDATA[Six Ways to Retrieve Data for an LLM]]></description><link>https://newsletter.owainlewis.com/p/six-rag-strategies-explained-simply</link><guid isPermaLink="false">https://newsletter.owainlewis.com/p/six-rag-strategies-explained-simply</guid><dc:creator><![CDATA[Owain Lewis]]></dc:creator><pubDate>Fri, 03 Apr 2026 08:36:55 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/b4b43934-5ca3-487e-8620-0a0e9677b26a_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Most RAG tutorials jump straight to vector embeddings. Half the time, that&#8217;s the wrong tool.</p><p>RAG means retrieval augmented generation. Retrieve information, add it to the prompt, let the model answer using it as context. The retrieval part is where it gets interesting; there are a lot more options than most people realise.</p><p>As a side note, I use Postgres for everything. You don&#8217;t need complex database infrastructure. Postgres can handle all of these strategies, making it a pragmatic choice for most situations. </p><p>Here are six approaches I use on real client projects, roughly in order of complexity.</p><p>PS: If you want a full walkthrough, I made a video <a href="https://www.youtube.com/watch?v=29PzjQ6myMU">here</a>.</p><h2>1. Document loading</h2><p>Almost everyone dismisses this one, because it&#8217;s simple and boring. </p><p>If you&#8217;re loading a step-by-step runbook, a checklist, or a recipe - you can&#8217;t just retrieve partial bits of information. You need the entire document or the answer won&#8217;t make sense. Partial retrieval of a setup guide produces partial answers.</p><p>Two ways to do it. 
The naive approach: read the file, stick it in the prompt.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">with open(path, "r") as f:
    document = f.read()
prompt = f"Answer using this document:\n\n{document}\n\nQuestion: {question}"</code></pre></div><p>The smarter approach: an index or lookup system that describes each document. Pass the index to the model, let it pick the right document first, then load it. Slower, but handles a larger document set.</p><p>The downside is tokens. The other downside is you need to know roughly where the information lives. If you have hundreds of documents and no idea which one is relevant, this won&#8217;t work. But for a focused document set, surprisingly reliable.</p><p>Don&#8217;t dismiss the simple option. </p><h2>2. Full text search</h2><p>This has been around forever. Search by keyword. Built into Postgres with tsvector and tsquery, no extra infrastructure needed.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;sql&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-sql">SELECT content, ts_rank(search_vector, query) AS rank
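-- Assumes document_chunks has a precomputed tsvector column (search_vector);
-- ts_rank scores how well each chunk matches the parsed query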
FROM document_chunks, plainto_tsquery('english', %s) query
WHERE search_vector @@ query
ORDER BY rank DESC
LIMIT %s</code></pre></div><p>If someone asks about &#8220;30-day returns&#8221; and your document says &#8220;30-day returns,&#8221; you&#8217;ll find it. Postgres also stems keywords, so a search for &#8220;running&#8221; matches &#8220;runner&#8221; and &#8220;runs&#8221; as well.</p><p>Where it breaks: meaning. &#8220;Comfortable shoes for long distance running&#8221; will only find product descriptions that contain the word comfortable. It&#8217;ll miss cushioned, supportive, anything like that. Use this when your users naturally reach for the same words your documents use.</p><h2>3. Vector search</h2><p>This is what most people mean by RAG. Take a document, break it into chunks, turn each chunk into a vector, store it. When a user asks a question, embed the query into the same space and find the closest matches.</p><p>PS: As well as the cliched chunk-the-documents approach, you can also turn database fields (product descriptions etc.) into vectors and search them as well (no one talks about this!). </p><p>The power: it understands what you meant, not just what you said. &#8220;Comfortable shoes for long distance running&#8221; finds a product described as &#8220;plush cushioned sole designed for marathon training.&#8221; The description never used the word comfortable. Vector search found it anyway.</p><p>Pgvector adds this directly to Postgres. No separate database.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;sql&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-sql">sql = """
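    -- &lt;=&gt; is pgvector's cosine distance operator; 1 minus the distance
    -- gives a similarity score, and ordering by distance puts the closest chunks first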
    SELECT content, 1 - (embedding &lt;=&gt; %s::vector) AS similarity
    FROM document_chunks
    ORDER BY embedding &lt;=&gt; %s::vector
    LIMIT %s
"""</code></pre></div><p>Where it breaks: exact filters. &#8220;Nike shoes under &#163;100&#8221; is a structured query. The embedding of &#8220;under &#163;100&#8221; does not reliably land near documents that contain &#163;99. It might return &#163;200 shoes because the description is semantically similar. Semantic similarity and numerical filtering are different problems.</p><h2>4. Hybrid search</h2><p>Combine keyword and vector search, merge the results. This is my default when I&#8217;m not sure which approach a dataset needs.</p><p>The merging uses Reciprocal Rank Fusion. Each document scores based on where it appeared in each ranked list.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">def reciprocal_rank_fusion(keyword_results, vector_results, k=60):
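    # Each doc scores the sum of 1 / (k + rank) across both ranked lists
    # (rank + 1 below because enumerate is 0-based); k=60 is the conventional default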
    scores = {}
    for rank, result in enumerate(keyword_results):
        scores[result["id"]] = scores.get(result["id"], 0) + 1 / (k + rank + 1)
    for rank, result in enumerate(vector_results):
        scores[result["id"]] = scores.get(result["id"], 0) + 1 / (k + rank + 1)
    return sorted(scores, key=lambda x: scores[x], reverse=True)</code></pre></div><p>&#8220;Nike running shoes, comfortable for long distance.&#8221; Keyword search finds anything with Nike, vector search finds anything semantically similar to comfortable and long distance. You get both.</p><p>A solid and pragmatic choice for many business applications.</p><h2>5. SQL RAG (Database RAG)</h2><p>This one is relatively underrated and under-discussed, and it&#8217;s one of my favourites.</p><p>Plot twist: Most business data isn&#8217;t in documents. Customer records, orders, inventory, product listings. None of that is in a PDF. It&#8217;s in a database. SQL RAG turns a natural language question into a database query and just goes and gets exactly what you need. Tends to be very reliable.</p><p>Two approaches with different risk profiles.</p><p><strong>Parameterised queries (safer).</strong> Pre-write the SQL. The model extracts parameters from the question and slots them in. The model never writes SQL.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;sql&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-sql">sql = """
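    -- Named placeholders are filled from model-extracted parameters, e.g.
    -- {"category": "running shoes", "max_price": 100, "min_rating": 4}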
    SELECT name, price, stock_quantity
    FROM products
    WHERE category = %(category)s
    AND price &lt; %(max_price)s
    AND rating &gt;= %(min_rating)s
"""</code></pre></div><p><strong>Dynamic query generation (more powerful, riskier).</strong> The model writes the actual SQL. LLMs are surprisingly good at this. The queries get complex fast, joining tables, applying multiple filters, and they&#8217;re usually correct.</p><p>I really like this approach for internal analytics tools or database querying tools where the cost of a bad query is low. I&#8217;d be very hesitant to use it on a customer-facing product. </p><p>Start with parameterised queries via regular tool calls.</p><h2>6. Agentic RAG</h2><p>Give the model access to all the retrieval tools above and let it decide which one to use.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">tools = [
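    # Illustrative definitions; the agent routes each question using these
    # names and descriptions, so say clearly when each tool applies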
    {"name": "search_documents", "description": "Search docs and FAQs"},
    {"name": "query_products", "description": "Search products by price, category, rating"},
    {"name": "get_order_status", "description": "Look up orders for a customer"}
]</code></pre></div><p>Where this shines is compound questions. &#8220;I want running shoes under &#163;150, and what&#8217;s the return policy if they don&#8217;t fit?&#8221; That needs a product database query AND a document lookup. One retrieval strategy can&#8217;t answer it. The agent looks at the question, looks at the tools it has, figures out which to call, and synthesises the answer.</p><p>The downside is latency. The agent has to make decisions and sometimes makes bad ones. Picks the wrong search term, retries. It&#8217;s also less deterministic. But this is the kind of strategy tools like Claude Code use: search in one file, realise it&#8217;s wrong, correct, search somewhere else. Very powerful.</p><h2>How to choose</h2><ul><li><p>A few well-defined or small documents, low query volume or cost sensitivity, users need full context: document loading.</p></li><li><p>Users search with specific keywords: full text search.</p></li><li><p>Semantic understanding matters: vector search.</p></li><li><p>Not sure which applies, or need both: hybrid search.</p></li><li><p>Data is in a database: SQL RAG, start with parameterised queries.</p></li><li><p>Compound questions or complex search requirements across multiple data sources: agentic RAG.</p></li></ul><p>Most production systems combine at least two. Agentic RAG is really just a routing layer over the others.</p><p>All six strategies are implemented against the same database in the video. Run the same questions through each one and watch where they fail. Seeing the failure modes side by side is more useful than any explanation.</p><p>If you want a full walkthrough, I made a video <a href="https://www.youtube.com/watch?v=29PzjQ6myMU">here</a>.</p><p>Thanks for reading. 
</p><p>Owain</p>]]></content:encoded></item><item><title><![CDATA[How I Use AI To Review AI Code]]></title><description><![CDATA[How to write better code when using AI agents]]></description><link>https://newsletter.owainlewis.com/p/how-i-use-ai-to-review-ai-code</link><guid isPermaLink="false">https://newsletter.owainlewis.com/p/how-i-use-ai-to-review-ai-code</guid><dc:creator><![CDATA[Owain Lewis]]></dc:creator><pubDate>Fri, 27 Mar 2026 17:33:31 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/16d5fe07-235d-4369-9a80-9034deeaebf5_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We're offloading more and more of our coding to AI agents. But AI-generated code has more bugs, security issues, and logic errors than human-written code &#8212; and we're generating it faster than any team can review it.</p><p>The answer isn&#8217;t to skip review. It&#8217;s to automate parts of it so humans only spend time on the things that actually require human judgement.</p><p>Here&#8217;s the four-layer setup I use. Each layer filters out a category of problems so the next layer sees less noise.</p><ol><li><p><strong>Automated checks</strong> run your linter, tests, and security scanner before the agent can finish. </p></li><li><p><strong>Local AI review</strong> gets a second agent to review the code before you push. </p></li><li><p><strong>CI review</strong> runs AI code review automatically on every PR. The safety net for when you skip step two (it happens)</p></li><li><p><strong>Human review</strong> handles what&#8217;s left: architecture, business logic, and &#8220;should we even build this?&#8221;</p></li></ol><p>By the time a human looks at the code, the only things remaining are the things only a human can judge. </p><p>Here&#8217;s the setup.</p><h2>Layer 1: Automate The Obvious</h2><p>Claude Code has a feature called hooks. 
A hook is a shell script that runs automatically at certain points in the agent lifecycle (like when the agent finishes a task). If the script fails, the agent is blocked from completing and has to fix the issues first.</p><p>I use a Stop hook that runs my linter and scanner every time Claude finishes work.</p><p>The config goes in your Claude Code settings:</p><pre><code><code>{
  "hooks": {
    "Stop": [
      {
        "hooks": [
          {
            "type": "command",
            "command": ".claude/hooks/stop-checks.sh"
          }
        ]
      }
    ]
  }
}
</code></code></pre><p>The script itself is just whatever checks you already run:</p><pre><code><code>#!/bin/bash
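# set -e aborts on the first failing check; a failed hook blocks the agent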
set -e
rubocop .
brakeman -q
bundle exec rspec --fail-fast</code></code></pre><p>Swap those for whatever your project uses. Ruff and pytest for Python. ESLint for JavaScript. The point is the same: the agent can&#8217;t say &#8220;done&#8221; until these pass.</p><p>This alone catches a surprising amount. Formatting issues, unused imports, type errors, broken tests. None of that makes it into a review.</p><h2>Layer 2: Agent Review</h2><p>After automated checks pass, review the code yourself and get an AI second opinion before you push.</p><p>Two things matter here. First, actually run the code. This sounds obvious but it catches the most embarrassing bugs in two minutes. Second, read the diff. You don&#8217;t need to understand every line; understand the shape of the change. What files were touched? Does the scope match what you asked for? Did the agent silently change something you didn&#8217;t ask it to?</p><p>For the AI review, the key is a fresh context window. Don&#8217;t ask the same agent that wrote the code to review it. It has sunk-cost bias and is less likely to challenge its own decisions.</p><p>There are a few ways to do this:</p><ul><li><p><strong>Custom Claude Code command.</strong> A review prompt in .claude/commands/review.md, paired with a REVIEW file at the project root that encodes your project-specific rules. Portable across tools, fully customisable. Claude Code also ships with some built in plugins. </p></li><li><p><strong>Codex </strong>/review<strong>.</strong> Four presets covering every scenario (base branch, uncommitted changes, specific commit, custom instructions). Priority-ranked findings. The best local review UX I&#8217;ve seen. Bonus: writing with Claude and reviewing with Codex means cross-model review built into your workflow. Different models have different blind spots.</p></li><li><p><strong>CodeRabbit.</strong> /coderabbit:review locally. 40+ linters and scanners running behind the scenes, purpose-built for code review. 
There are many other great code review tools, like Greptile, worth exploring too. </p></li></ul><p>I use a custom review command that reads a REVIEW file at the project root. This file has project-specific rules, things I always want checked.</p><pre><code><code># REVIEW.md

## Project Patterns
- Repository pattern for data access. Direct DB queries in handlers are a flag.
- New API routes need an integration test. Flag if missing.</code></code></pre><p>The general review catches general problems. The project-specific rules catch the things that are unique to your codebase. </p><h2>Layer 3: External Review</h2><p>Sometimes I forget to run the local review. Sometimes I&#8217;m in a rush. So I have an automated check on GitHub that reviews every PR before a human sees it.</p><p>There are a few options for this. Codex has a GitHub integration that reviews PRs automatically. CodeRabbit has a GitHub App that does the same thing. Anthropic has an open source GitHub Action for security-focused review.</p><p>I like having this as a separate layer because it catches things even when I skip the local step. Set it up once, runs on every PR for free.</p><h2>Layer 4: Human Review</h2><p>By the time a teammate opens the PR, the linter has passed, tests are green, and an AI has already flagged obvious issues. The human reviewer doesn&#8217;t need to catch formatting problems or unused variables.</p><p>What&#8217;s left is the stuff only a human can judge. Is this the right approach? Does it solve the actual business problem? Will this cause issues in three months? Five minutes of focused review on those questions is more valuable than thirty minutes of line-by-line reading.</p><h2>TL;DR </h2><p>I spent a long time trying to find the perfect code review setup. The experience was frustrating. There are hundreds of tools, plugins, and approaches, many of them doing the same thing in slightly different ways.</p><p>Don&#8217;t get lost looking for the perfect solution or perfect prompt. Start with Layer 1. Set up your linter and your hooks; that alone eliminates an entire category of review noise. Then find one way to get an AI review locally that you trust. Add CI when you&#8217;re ready.</p><p>Start simple and never accept the first output from an agent. 
</p><p>If you&#8217;re interested in this topic: I also made a <a href="https://www.youtube.com/watch?v=As2xy_cSx00">video walking through the full setup with demos</a> if you prefer to watch.</p>]]></content:encoded></item><item><title><![CDATA[How I delegate work to a team of AI agents]]></title><description><![CDATA[Building systems for delegating work to AI agents]]></description><link>https://newsletter.owainlewis.com/p/how-i-delegate-work-to-a-team-of</link><guid isPermaLink="false">https://newsletter.owainlewis.com/p/how-i-delegate-work-to-a-team-of</guid><dc:creator><![CDATA[Owain Lewis]]></dc:creator><pubDate>Thu, 19 Mar 2026 14:42:29 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/efac063d-a872-435d-9afe-83aa5c351b9a_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey &#128075;,</p><p>Most of us are using AI coding agents the same way. You&#8217;re in the terminal, you&#8217;re very involved. You prompt, you review, you go back and forth. This is still the right way to work in many cases. We need to be in the loop to keep quality high and think through problems. </p><p>But if you&#8217;re working on smaller tasks like bug fixes or documentation updates, you generally don&#8217;t need to be in the loop. I think about this as delegating vs micro-managing. You just want to hand these off to an agent and trust that they&#8217;re going to do the work. </p><p>The problem is there aren&#8217;t many easy ways to do that right now. </p><p>So this week, I built a proof of concept solution to help with this problem. </p><h2>Agent worker</h2><p>I built a simple <a href="https://github.com/owainlewis/agent-worker/tree/main/src">TypeScript worker</a> script that polls a task management system for tickets. When it finds one, it picks it up, delegates the work to Claude Code (headless), runs a series of checks, and opens a pull request. 
The ticket moves to &#8220;In Review.&#8221; I go and check the output.</p><p>I call this an agent control plane. Your tasks manager becomes your interface for delegating work to one or more agents. </p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wg8I!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b4751cc-766a-48fa-ba17-1a2b6c23d070_1954x348.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wg8I!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b4751cc-766a-48fa-ba17-1a2b6c23d070_1954x348.png 424w, https://substackcdn.com/image/fetch/$s_!wg8I!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b4751cc-766a-48fa-ba17-1a2b6c23d070_1954x348.png 848w, https://substackcdn.com/image/fetch/$s_!wg8I!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b4751cc-766a-48fa-ba17-1a2b6c23d070_1954x348.png 1272w, https://substackcdn.com/image/fetch/$s_!wg8I!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b4751cc-766a-48fa-ba17-1a2b6c23d070_1954x348.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wg8I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b4751cc-766a-48fa-ba17-1a2b6c23d070_1954x348.png" width="1456" height="259" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3b4751cc-766a-48fa-ba17-1a2b6c23d070_1954x348.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:259,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:50326,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/191473705?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b4751cc-766a-48fa-ba17-1a2b6c23d070_1954x348.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wg8I!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b4751cc-766a-48fa-ba17-1a2b6c23d070_1954x348.png 424w, https://substackcdn.com/image/fetch/$s_!wg8I!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b4751cc-766a-48fa-ba17-1a2b6c23d070_1954x348.png 848w, https://substackcdn.com/image/fetch/$s_!wg8I!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b4751cc-766a-48fa-ba17-1a2b6c23d070_1954x348.png 1272w, https://substackcdn.com/image/fetch/$s_!wg8I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b4751cc-766a-48fa-ba17-1a2b6c23d070_1954x348.png 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a></figure></div><p>I&#8217;m using Linear - but this works with Jira, Monday.com, or anything with an API. 
</p><p>Task managers are the right way to delegate this kind of work because if you have many things going on at once, you don&#8217;t want to be chatting with agents. You want a way to actually track what they&#8217;re doing. You&#8217;d do the same thing if you&#8217;re working in a team. You wouldn&#8217;t delegate work via chat. You&#8217;d have some kind of system to track it, especially if you&#8217;re working on hundreds of tasks.</p><h2>Pull vs push architecture </h2><p>This is maybe the most interesting architectural decision. Push-based systems like OpenClaw use webhooks. You expose an endpoint, something hits it, the agent starts working. That means your agent runtime is reachable from the internet. Anyone can hit that endpoint. </p><p>A pull-based architecture is different. The agent worker makes outbound requests only. No need for open inbound ports. No exposed servers. If the worker goes down, tickets just sit in the queue until it comes back. The only trade-off is latency. If it takes 60 seconds to pick up a ticket, that&#8217;s fine. 
We don&#8217;t care about latency for this kind of work.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!m5xp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ea6a3d0-66aa-4e88-9834-d3a12e208e22_1542x594.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!m5xp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ea6a3d0-66aa-4e88-9834-d3a12e208e22_1542x594.png 424w, https://substackcdn.com/image/fetch/$s_!m5xp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ea6a3d0-66aa-4e88-9834-d3a12e208e22_1542x594.png 848w, https://substackcdn.com/image/fetch/$s_!m5xp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ea6a3d0-66aa-4e88-9834-d3a12e208e22_1542x594.png 1272w, https://substackcdn.com/image/fetch/$s_!m5xp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ea6a3d0-66aa-4e88-9834-d3a12e208e22_1542x594.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!m5xp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ea6a3d0-66aa-4e88-9834-d3a12e208e22_1542x594.png" width="1456" height="561" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7ea6a3d0-66aa-4e88-9834-d3a12e208e22_1542x594.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:561,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:74395,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/191473705?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ea6a3d0-66aa-4e88-9834-d3a12e208e22_1542x594.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!m5xp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ea6a3d0-66aa-4e88-9834-d3a12e208e22_1542x594.png 424w, https://substackcdn.com/image/fetch/$s_!m5xp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ea6a3d0-66aa-4e88-9834-d3a12e208e22_1542x594.png 848w, https://substackcdn.com/image/fetch/$s_!m5xp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ea6a3d0-66aa-4e88-9834-d3a12e208e22_1542x594.png 1272w, https://substackcdn.com/image/fetch/$s_!m5xp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ea6a3d0-66aa-4e88-9834-d3a12e208e22_1542x594.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>For a system where you&#8217;re giving an AI agent write access to your codebase and the ability to open PRs, we want the smallest possible attack surface. Polling gives you that. </p><h2>Deterministic guardrails around non-deterministic agents</h2><p>One of the challenges when you&#8217;re delegating to agents this way is you can&#8217;t really do an iterative process. You need it to work in one shot and you need full permissions. This is challenging because more often than not agents make mistakes on their first attempt. </p><p>So we wrap the non-deterministic part (the agent writing code) with deterministic checks on both sides. </p><p><strong>Pre-hooks</strong> run before the agent starts. Check out a worktree, git pull, make sure the environment is clean. 
If any of that fails, the agent doesn&#8217;t start.</p><p><strong>Post-hooks</strong> run after the agent finishes. Run tests, run linting, push the code. These are just shell commands.</p><p>The workflow inside the agent is also structured to compensate for the lack of back-and-forth. Write the code, run the tests, then run a code review with CodeRabbit to look for errors, fix anything it finds, and run the tests again. This reduces the number of iterations you need.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">When solving a ticket:

1. Write the code to solve the ticket
2. Run `bun test` and fix any failures
3. Review your changes for code quality. Use CodeRabbit if available
4. Fix any issues found in the review
5. Run `bun test` again to confirm fixes didn't break anything</code></pre></div><h2>Automated code review</h2><p>When you&#8217;re not sitting in the terminal reviewing the code, you need some kind of automated system to do that for you. I use CodeRabbit. It&#8217;s an AI code review tool that integrates with Claude Code and also runs on GitHub when a PR is opened. So every PR the agents open gets reviewed automatically before I even look at it.</p><p>It doesn&#8217;t catch everything. But the PRs I end up reviewing have already been through linting, tests, and an AI code review. The obvious stuff is already handled.</p><h2>Scaling agents</h2><p>What I like about this simple approach is that it scales well. </p><p>You can start with one worker running on your laptop. But you can also run multiple workers on different machines, on a VPS, wherever. Same delegation process, more throughput. </p><p>I showed this in <a href="https://www.youtube.com/watch?v=Zhbx-dj0qHE">this video</a> with two workers picking up two tickets at the same time and completing them in parallel.</p><p>The code is mostly a proof of concept to demonstrate the idea. What&#8217;s interesting here is the architecture, not necessarily the code. Pull-based delegation with deterministic guardrails around non-deterministic workers. That pattern holds regardless of which tools you use.</p><p>Full walkthrough with the demo and all the code is in the video:</p><div id="youtube2-Zhbx-dj0qHE" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;Zhbx-dj0qHE&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/Zhbx-dj0qHE?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Thanks for reading. 
Have an awesome week : )</p><p>P.S. If you want to go deeper on building AI systems, I run a community for people interested in these topics: <a href="https://skool.com/aiengineer">https://skool.com/aiengineer</a></p>]]></content:encoded></item><item><title><![CDATA[The 7 stages of building software with AI (with prompts you can steal)]]></title><description><![CDATA[Real prompts for planning, building, reviewing, and shipping software with AI agents]]></description><link>https://newsletter.owainlewis.com/p/the-7-stages-of-building-software</link><guid isPermaLink="false">https://newsletter.owainlewis.com/p/the-7-stages-of-building-software</guid><dc:creator><![CDATA[Owain Lewis]]></dc:creator><pubDate>Fri, 06 Mar 2026 13:15:47 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/4beb01be-339f-44c5-a01e-c0a2a28e5991_3572x1938.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Every week there&#8217;s a new AI coding framework that promises to revolutionise how you build software. A new agent. A new spec driven agent workflow. A new way to structure your prompts that will supposedly change everything.</p><p>Most of them are packaging the same ideas with different names, and if you&#8217;re feeling overwhelmed by all of it, I think stepping back and looking at the big picture is more useful than chasing the next tool.</p><p>Here&#8217;s what I mean. Every piece of software ever built, at Google, at a two-person startup, on a weekend project, went through some version of the same lifecycle. Requirements. Design. Task breakdown. Build. Review. Deploy. Monitor. The tools change constantly. The lifecycle doesn&#8217;t. It hasn&#8217;t changed in decades, and a new AI framework isn&#8217;t going to change it now.</p><p>What has changed is that AI now accelerates every single stage of that lifecycle, not just the coding step. 
And most people are only using it for one part, code generation, and leaving enormous value on the table everywhere else.</p><p>I want to walk through all seven stages, share how I actually use AI at each one, and give you specific prompts and examples that have worked well for me. Some of this might seem obvious to experienced engineers, but I&#8217;ve been building software for over twenty years and I still find it useful to step back and look at the full picture. Especially now that the tools have changed so dramatically.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lczc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cc0157f-a089-4612-bac1-ab4de7a0428e_3966x1103.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lczc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cc0157f-a089-4612-bac1-ab4de7a0428e_3966x1103.png 424w, https://substackcdn.com/image/fetch/$s_!lczc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cc0157f-a089-4612-bac1-ab4de7a0428e_3966x1103.png 848w, https://substackcdn.com/image/fetch/$s_!lczc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cc0157f-a089-4612-bac1-ab4de7a0428e_3966x1103.png 1272w, https://substackcdn.com/image/fetch/$s_!lczc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cc0157f-a089-4612-bac1-ab4de7a0428e_3966x1103.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!lczc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cc0157f-a089-4612-bac1-ab4de7a0428e_3966x1103.png" width="3966" height="1103" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0cc0157f-a089-4612-bac1-ab4de7a0428e_3966x1103.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1103,&quot;width&quot;:3966,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:415558,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/190099650?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c55f1ae-2a49-4606-99c2-17de88ca0955_3966x1140.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lczc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cc0157f-a089-4612-bac1-ab4de7a0428e_3966x1103.png 424w, https://substackcdn.com/image/fetch/$s_!lczc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cc0157f-a089-4612-bac1-ab4de7a0428e_3966x1103.png 848w, https://substackcdn.com/image/fetch/$s_!lczc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cc0157f-a089-4612-bac1-ab4de7a0428e_3966x1103.png 1272w, https://substackcdn.com/image/fetch/$s_!lczc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cc0157f-a089-4612-bac1-ab4de7a0428e_3966x1103.png 
1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>If you want a full video: get it <a href="https://www.youtube.com/watch?v=O5ph_x4-L50">here</a>.</p><h2>Why planning still matters, even when building is fast</h2><p>On a recent client project, I spent three full days on research and planning before I wrote a single line of code. That probably sounds like a long time when you could just open a terminal and start prompting an agent.</p><p>But here&#8217;s what happened: because I had a clear plan, requirements, technical design, key decisions all documented, I could constantly go back to it as I was building. 
When I hit a fork in the road, the plan had already made the decision for me. When the agent drifted in a direction I didn&#8217;t want, I could point it back to the spec. Over the course of the project, those three days of planning saved me far more time than they cost. The project went smoothly in a way that felt almost unusual.</p><p>One benefit I didn&#8217;t expect: because the requirements were so clearly defined, when it came time to write evals, the agent was able to generate large numbers of them almost automatically. It knew exactly what the software was supposed to do, so it could test against that. Without those clear requirements, the agent wouldn&#8217;t have had enough context to generate useful evals at all. That&#8217;s a downstream benefit of planning that you don&#8217;t really see until you&#8217;ve experienced it. The clarity compounds through every later stage.</p><p>If I hadn&#8217;t done that planning, I know exactly what would have happened, because I&#8217;ve seen it play out dozens of times over my career. You rush into building, you make a decision about your database schema or your auth strategy that feels fine in the moment, and then three weeks later you realise it was wrong. But by then your software is in production, customers are using it, and the cost of reversing that decision is so high that most teams just live with it. I&#8217;ve watched teams carry bad architectural decisions for years because someone rushed the planning phase. That&#8217;s not a hypothetical. It&#8217;s one of the most common patterns in software engineering.</p><p>So the first two stages of the lifecycle are requirements (what are we building and why) and technical design (how are we building it). AI is useful at both. You can have a real conversation with Claude about your architecture, ask it to challenge your assumptions, even prototype multiple approaches quickly to see which one feels right. But the thinking still needs to happen. 
You need to own these decisions.</p><p>Here&#8217;s a simple requirements template that works well:</p><pre><code><code>What: User authentication system
Why: Users need accounts to save preferences
Who: End users of the web app
In scope: Email/password login, signup page
Out of scope: OAuth, password reset (v1), admin roles</code></code></pre><p>That &#8220;out of scope&#8221; section is quietly one of the most useful things you can write. It stops scope creep before it starts, and it gives the agent a clear boundary for what not to build.</p><h2>Breaking work down is where most people go wrong</h2><p>The third stage is task breakdown, and this is the one that makes the biggest difference to the quality of what you get from AI coding agents.</p><p>The instinct is to hand an agent your entire project and say &#8220;build this.&#8221; Don&#8217;t do that. You&#8217;ll get a mess of code that&#8217;s hard to review, hard to test, and hard to understand. What you want instead is a series of small, clear, bounded tasks. Each one with enough context that the agent can do it well without needing to hold your entire application in its head.</p><p>I use a prompt like this to break a spec down into tasks:</p><pre><code><code>Read the spec in .ai/specs/auth.md.

Break it down into independent work items that can be completed
one at a time. Each work item should have a clear title, a short
description of what needs to be done, and any dependencies on
other work items.

Once you have the list, push each work item to Linear as a new task.</code></code></pre><p>And then when I hand a specific task to Claude Code, I give it real context:</p><pre><code><code>Task: Create the login API endpoint
Context: We're using FastAPI with SQLAlchemy async.
Auth is JWT tokens in httpOnly cookies.
User model is already defined in app/models/user.py.
Follow the existing pattern in app/routers/health.py.</code></code></pre><p>The difference between this and a vague &#8220;add login&#8221; prompt is night and day. An LLM is making hundreds of small decisions as it writes your code. Naming conventions, error handling patterns, where to put things. If you give it context, those decisions are well-informed. If you don&#8217;t, it guesses, and it guesses in ways that feel plausible but don&#8217;t fit your application.</p><h2>Review is the step that changed my workflow</h2><p>I keep Claude Code open all day. I run multiple terminal sessions. And I have a slash command specifically for reviewing code. This is the part of the workflow that I think most people skip, and it&#8217;s the part that has made the single biggest difference to the quality of what I ship.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!TE3u!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b3735d2-6a96-45d6-a648-0c9bd0e2c23e_3210x1424.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TE3u!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b3735d2-6a96-45d6-a648-0c9bd0e2c23e_3210x1424.png 424w, https://substackcdn.com/image/fetch/$s_!TE3u!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b3735d2-6a96-45d6-a648-0c9bd0e2c23e_3210x1424.png 848w, https://substackcdn.com/image/fetch/$s_!TE3u!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b3735d2-6a96-45d6-a648-0c9bd0e2c23e_3210x1424.png 1272w, 
https://substackcdn.com/image/fetch/$s_!TE3u!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b3735d2-6a96-45d6-a648-0c9bd0e2c23e_3210x1424.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!TE3u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b3735d2-6a96-45d6-a648-0c9bd0e2c23e_3210x1424.png" width="1456" height="646" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7b3735d2-6a96-45d6-a648-0c9bd0e2c23e_3210x1424.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:646,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:486024,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/190099650?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b3735d2-6a96-45d6-a648-0c9bd0e2c23e_3210x1424.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!TE3u!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b3735d2-6a96-45d6-a648-0c9bd0e2c23e_3210x1424.png 424w, https://substackcdn.com/image/fetch/$s_!TE3u!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b3735d2-6a96-45d6-a648-0c9bd0e2c23e_3210x1424.png 848w, 
https://substackcdn.com/image/fetch/$s_!TE3u!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b3735d2-6a96-45d6-a648-0c9bd0e2c23e_3210x1424.png 1272w, https://substackcdn.com/image/fetch/$s_!TE3u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b3735d2-6a96-45d6-a648-0c9bd0e2c23e_3210x1424.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>After the agent finishes a task, I ask it to review its own work:</p><pre><code><code>Look at the code you just wrote. 
Find any bugs, edge cases,
security issues, or potential problems.</code></code></pre><p>I&#8217;m consistently surprised by how much this catches. Not dramatic, application-breaking bugs. Usually small things. A missing edge case. Input validation that isn&#8217;t there. An error handling path that doesn&#8217;t quite work. But these small things compound. If every change you make introduces one minor issue, over time your codebase degrades in ways that are hard to track down later.</p><p>The reason this works is that generation and review are fundamentally different cognitive tasks (for humans and agents). When the agent is writing code, it&#8217;s focused on making things work. When it&#8217;s reviewing, it&#8217;s looking for problems. These aren&#8217;t the same mode of thinking, and almost every time I run a review pass, it finds something meaningful that it missed the first time around.</p><p>Once I&#8217;ve built out a complete feature, I&#8217;ll also do a secondary review of the whole thing end-to-end. You catch a different class of issues at that level. Things that look fine in isolation but don&#8217;t quite fit together, or patterns that are inconsistent across files. I&#8217;ve started thinking of this as just part of the work now, not an extra step.</p><h2>Deploy and monitor</h2><p>The last two stages are deployment and monitoring. Neither is as glamorous as the build step, but both are areas where AI has saved me more time than I expected.</p><p>For deployment, I&#8217;ve used prompts as simple as:</p><pre><code><code>Commit and save these changes with a clear commit message.
Then push the latest version to GCP Cloud Run.</code></code></pre><p>If you&#8217;re not deeply familiar with CI/CD pipelines or infrastructure configuration, this is one of those areas where AI genuinely shines. You can describe what you want and it will walk you through the setup or just do it for you. Things that used to take an afternoon of reading (truly awful) cloud provider docs now take minutes.</p><p>Monitoring is the stage that most people skip entirely, and then they find out their application is broken because a customer emails them about it. I&#8217;ve seen this happen more times than I&#8217;d like to admit, including on my own projects. The fix is simple: set up error tracking with something like Sentry, add uptime monitoring, configure alerts. You can ask Claude Code to integrate all of this into your application, and the whole thing takes less time than you&#8217;d spend debugging one production incident without it.</p><h2>What I&#8217;ve learned after a year with Claude Code</h2><p>I&#8217;ve been using Claude Code since it first launched, and at this point I use it for essentially all of my development work. But that experience has also taught me something important: it&#8217;s only a powerful tool if you know how to guide it.</p><p>Claude Code still makes a significant number of mistakes. It still makes decisions that don&#8217;t align with what you want. It still needs clear direction, careful planning, and thorough review to produce software you&#8217;d actually be proud of. The agents are getting better all the time, but we&#8217;re not at a point where you can skip the thinking and get good results. I&#8217;m not sure we ever will be, honestly. The thinking is the valuable part.</p><p>The people who are getting the most out of these tools aren&#8217;t the ones with the cleverest prompts or the most elaborate frameworks. 
They&#8217;re the ones who understand the fundamentals of building software (requirements, design, task breakdown, review) and use AI to accelerate each of those stages rather than trying to skip them entirely.</p><p>Every new framework that comes along is ultimately just a different way of sending text to a language model. The framework doesn&#8217;t change the quality of the output. Your thinking before you write the prompt does.</p><p>Vibe coding isn&#8217;t the enemy. Skipping the thinking is. A senior engineer who has done the design work, made the architectural decisions, and broken the work down can move fast within that structure and produce something great. Someone who skips all of that and just prompts their way through will produce a mess, no matter how good the tools are.</p><p>Do the thinking. Then you&#8217;ve earned the right to move fast.</p><div><hr></div><p>I put together a <a href="https://github.com/owainlewis/youtube-tutorials/tree/main/tutorials/stop-vibe-coding">companion repo</a> on GitHub with all the prompts from this piece, plus the presentation slides I used in the <a href="https://www.youtube.com/watch?v=O5ph_x4-L50">video</a>. Clone it, steal the prompts, adapt them to your own workflow.</p>]]></content:encoded></item><item><title><![CDATA[How I'm using OpenAI Codex automations to improve my code]]></title><description><![CDATA[How to create AI agents that work for you 24/7.]]></description><link>https://newsletter.owainlewis.com/p/how-im-using-openai-codex-automations</link><guid isPermaLink="false">https://newsletter.owainlewis.com/p/how-im-using-openai-codex-automations</guid><dc:creator><![CDATA[Owain Lewis]]></dc:creator><pubDate>Sat, 21 Feb 2026 15:17:09 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/68cf5d35-6624-43a5-b7ec-45ab3048621a_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>OpenAI just added a feature to their coding agent, Codex, that most people missed. 
It&#8217;s called Automations.</p><p>If you haven&#8217;t used Codex: it&#8217;s an AI coding assistant, similar to Claude Code. You give it a task in plain English (&#8220;fix this bug&#8221;, &#8220;review this file&#8221;) and it writes the code for you. Think of it as a developer on your team that you hand tasks to.</p><p>Automations let you take any task you&#8217;d give Codex and run it on a schedule. You write a prompt, pick a frequency (every morning, every 3 hours, whatever), and the agent runs that task automatically in the background on repeat. No manual prompting. You&#8217;re not at the keyboard.</p><p>I&#8217;ve been running two of these for a few weeks and they&#8217;ve caught bugs I almost certainly would have missed. One scans for issues and creates Linear tickets. The other picks up those tickets, fixes the code, and opens PRs.</p><p><a href="https://www.youtube.com/watch?v=HAlERUhb1x8">This video</a> walks through the full demo. This edition breaks down the setup and how I wired the two automations together.</p><h2>The Setup</h2><p>Each automation lives as a .toml file (a simple config format) inside a .codex/automations/ folder in your project. It has three things: a prompt (what the agent should do), a schedule (when it runs), and a memory.md file that persists between runs so the agent remembers what happened last time.</p><p>Here&#8217;s what the bug scanner looks like stripped down:</p><pre><code><code>[automation]
name = "Bug Scanner"
cwd = "/workspace/myproject"
schedule = "0 9 * * *"  # Every day at 9am
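# The schedule is standard cron syntax: minute, hour, day-of-month, month, day-of-week.
# For example, "0 */3 * * *" would run every 3 hours instead of once a day at 9am.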

[prompt]
content = """
You are performing a daily code review. Your job is to find critical bugs,
security issues, and unhandled edge cases in the codebase.

For each issue you find:
1. Check whether a Linear ticket already exists for this issue. If it does, skip it.
2. If it's new, create a Linear ticket with the following:
   - Title: concise description of the bug
   - Label: autofix
   - Body: summary, affected files with full paths, customer impact,
     reproduction steps, suggested fix

Use full absolute paths when referencing files. This automation runs inside
a Git worktree, and relative paths will not resolve correctly.

At the end, report: X bugs found, Y skipped as duplicates.
"""</code></code></pre><p>The bug fixer runs on the same schedule:</p><pre><code><code>[automation]
name = "Bug Fixer"
cwd = "/workspace/myproject"
schedule = "0 9 * * *"  # Every day at 9am

[prompt]
content = """
Scan your Linear board for open issues with the label: autofix.

For each issue:
1. Read the bug description in full
2. Check out a new git branch: fix/&lt;issue-id&gt;-&lt;slug&gt;
3. Implement the fix
4. Verify the build passes
5. Open a pull request using the GitHub CLI
6. Move the ticket to In Review status

If anything fails, stop and report the error. Do not silently work around failures.
"""</code></code></pre><p>The memory file sits alongside the automation config and gets updated after each run. It keeps a record of what the agent found and did previously. On the next run, the agent reads it before scanning - so it knows which issues it already reported and won&#8217;t duplicate them.</p><h2>Quality</h2><p>This is what surprised me most. </p><p>The agents write better bug tickets than most developers do. </p><p>The tickets generated by these automations had incredibly detailed descriptions, a suggested fix, and detailed steps to reproduce the issue.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2njF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82118311-dd71-416f-b291-d72ceb1decc6_2134x1406.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2njF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82118311-dd71-416f-b291-d72ceb1decc6_2134x1406.png 424w, https://substackcdn.com/image/fetch/$s_!2njF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82118311-dd71-416f-b291-d72ceb1decc6_2134x1406.png 848w, https://substackcdn.com/image/fetch/$s_!2njF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82118311-dd71-416f-b291-d72ceb1decc6_2134x1406.png 1272w, https://substackcdn.com/image/fetch/$s_!2njF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82118311-dd71-416f-b291-d72ceb1decc6_2134x1406.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!2njF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82118311-dd71-416f-b291-d72ceb1decc6_2134x1406.png" width="1456" height="959" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/82118311-dd71-416f-b291-d72ceb1decc6_2134x1406.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:959,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:413397,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/188714826?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82118311-dd71-416f-b291-d72ceb1decc6_2134x1406.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2njF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82118311-dd71-416f-b291-d72ceb1decc6_2134x1406.png 424w, https://substackcdn.com/image/fetch/$s_!2njF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82118311-dd71-416f-b291-d72ceb1decc6_2134x1406.png 848w, https://substackcdn.com/image/fetch/$s_!2njF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82118311-dd71-416f-b291-d72ceb1decc6_2134x1406.png 1272w, https://substackcdn.com/image/fetch/$s_!2njF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82118311-dd71-416f-b291-d72ceb1decc6_2134x1406.png 
1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Memory</h2><p>Traditional automation tools like Zapier and n8n run fixed flows. They do the same steps every time. These Codex automations are different because they persist memory across runs.</p><p>After each run, the agent writes what it found into the memory file. Here&#8217;s a simplified version of what that looks like after a few runs:</p><pre><code><code># Automation Memory

## 2026-02-18
- Found 3 new issues. Created tickets: LIN-47, LIN-48, LIN-49.
- Skipped 1 issue (duplicate of LIN-44).

## 2026-02-19
- Found 1 new issue. Created ticket: LIN-52.
- Note: The webhook timeout issue (LIN-47) was fixed and merged.
  Removed from watch list.

## 2026-02-20
- No new issues found in webhook or retry modules.
- Flagged auth module for closer review tomorrow - noticed some patterns
  that could lead to session fixation under specific conditions.</code></code></pre><p>On the next run, the agent reads this before starting. It knows what it already reported. It knows what was fixed. It can notice when it flagged something yesterday and follow up on it today.</p><h2>What to Automate</h2><p>The pattern generalises. This works in any agent environment - it&#8217;s just a prompt, a recurring schedule, and a memory file.</p><p>What makes a good candidate:</p><ul><li><p>It&#8217;s something you&#8217;d do on a regular schedule anyway</p></li><li><p>A vague instruction is enough to produce useful output (you don&#8217;t need pixel-perfect determinism)</p></li><li><p>It&#8217;s safe for an agent to try and fail - the output goes somewhere reviewable before anything irrevocable happens</p></li></ul><p>Bug scanning fits all three. So does dependency review, documentation checks, stale ticket cleanup, security audits, release notes, and test coverage monitoring. Anything that currently gets skipped because you&#8217;re busy is a candidate.</p><p>Where it doesn&#8217;t work well: anything where the agent needs to make a decision you&#8217;d want to make yourself, or where a wrong answer is hard to detect in review. </p><h2>Summary</h2><p>If you have a codebase with more than a few thousand lines, set up one scanner this week. </p><p>Write a prompt that asks the agent to do a daily code review and create a ticket for anything new. Run it manually a few times to see what it finds.</p><p>I was really impressed by this feature. The idea is simple but the impact is significant. A bug that never reaches your customers, a task you&#8217;re too busy to do that can now be done automatically, a security issue that is detected early on. 
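The read-before, append-after loop behind this is simple enough to sketch in shell. The file name and entry format here are illustrative, not Codex internals:

```shell
MEMORY="memory.md"

# First run: seed the file.
[ -f "$MEMORY" ] || printf '# Automation Memory\n' > "$MEMORY"

# Before scanning, the agent reads prior findings so it can
# skip issues it has already reported.
cat "$MEMORY"

# After the run, it appends a dated entry describing what it found.
{
  printf '\n## %s\n' "$(date +%F)"
  printf '%s\n' '- Found 1 new issue. Created ticket: LIN-52.'
} >> "$MEMORY"
```

Each run adds one dated section, so the file becomes a running log the agent can scan at the start of the next run.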
</p><p>I made a video covering this in more depth <a href="https://www.youtube.com/watch?v=HAlERUhb1x8">here</a>.</p>]]></content:encoded></item><item><title><![CDATA[Claude Code agent teams explained]]></title><description><![CDATA[Surprising lessons from building an app with a team of agents]]></description><link>https://newsletter.owainlewis.com/p/claude-code-agent-teams-explained</link><guid isPermaLink="false">https://newsletter.owainlewis.com/p/claude-code-agent-teams-explained</guid><dc:creator><![CDATA[Owain Lewis]]></dc:creator><pubDate>Thu, 12 Feb 2026 17:25:21 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/5e3a9e16-a2c7-4249-8f87-27e8add74c14_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey &#128075;,</p><p>Claude Code just shipped a really interesting feature called <a href="https://code.claude.com/docs/en/agent-teams">Agent Teams</a>. Instead of one agent doing everything, you can now run multiple Claude Code instances that work together as a team. Each agent has its own context window, and they can talk to each other directly (which is crazy).</p><p>Using AI agents to write code is standard now. Using one agent in a terminal or IDE feels natural - describe a task, it builds, you review. Straightforward loop.</p><p>Multiple agents talking to each other feels completely different. And, despite this being an early feature, it feels like looking into the future. </p><p>I made a <a href="https://www.youtube.com/watch?v=KuxsOv0q0mo">video</a> showing how to set this up with tmux so you can watch all the agents working together. </p><h2>Subagents vs Agent Teams</h2><p>You&#8217;re probably familiar with subagents in Claude Code. With subagents, each one has its own context window and results return to the parent. Communication is one direction. The parent manages everything. 
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8w_R!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba39f65c-5784-4859-aaf5-6dc04ab8cb28_1404x566.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8w_R!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba39f65c-5784-4859-aaf5-6dc04ab8cb28_1404x566.png 424w, https://substackcdn.com/image/fetch/$s_!8w_R!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba39f65c-5784-4859-aaf5-6dc04ab8cb28_1404x566.png 848w, https://substackcdn.com/image/fetch/$s_!8w_R!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba39f65c-5784-4859-aaf5-6dc04ab8cb28_1404x566.png 1272w, https://substackcdn.com/image/fetch/$s_!8w_R!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba39f65c-5784-4859-aaf5-6dc04ab8cb28_1404x566.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8w_R!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba39f65c-5784-4859-aaf5-6dc04ab8cb28_1404x566.png" width="1404" height="566" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ba39f65c-5784-4859-aaf5-6dc04ab8cb28_1404x566.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:566,&quot;width&quot;:1404,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:59109,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/187762480?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba39f65c-5784-4859-aaf5-6dc04ab8cb28_1404x566.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8w_R!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba39f65c-5784-4859-aaf5-6dc04ab8cb28_1404x566.png 424w, https://substackcdn.com/image/fetch/$s_!8w_R!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba39f65c-5784-4859-aaf5-6dc04ab8cb28_1404x566.png 848w, https://substackcdn.com/image/fetch/$s_!8w_R!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba39f65c-5784-4859-aaf5-6dc04ab8cb28_1404x566.png 1272w, https://substackcdn.com/image/fetch/$s_!8w_R!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba39f65c-5784-4859-aaf5-6dc04ab8cb28_1404x566.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>With agent teams, each agent also has its own context window but they&#8217;re fully independent Claude Code sessions. They can message each other directly. There&#8217;s a shared task list for coordination. It&#8217;s best for long running work that needs discussion and iteration between agents. </p><p>The quick test is: if your agents would benefit from talking to each other, use a team. If they just need to return results, subagents are simpler and cheaper.</p><h2>When To Use This</h2><p>The pattern I keep coming back to is agents reviewing other agents&#8217; work.</p><p>One agent writes code. A second agent reads the output and sends specific feedback. The first agent fixes the issue. The reviewer checks again.</p><p>With subagents, this feedback loop runs through you. 
The reviewer reports back, you read it, you paste it into a new prompt for the builder. You&#8217;re the middleman.</p><p>With agent teams, the reviewer sends notes directly to the builder. The builder fixes it. The reviewer checks again. Multiple rounds without you relaying anything.</p><p>That&#8217;s a small thing on paper. In practice, it changes what kind of work you can hand to agents. Single-pass generation - write this function, generate these tests - works fine with one agent. Multi-pass, long-running work - build something, review it, revise, review again - requires agents that can talk to each other.</p><h2>C Compiler</h2><p>To see where this goes, look at what Anthropic&#8217;s engineering team did. Nicholas Carlini ran sixteen Claude instances to build a <a href="http://anthropic.com/engineering/building-c-compiler">C compiler</a> from scratch in Rust. Two weeks, about two thousand sessions, just under twenty thousand dollars in tokens. The result: a hundred thousand lines of Rust that compiles the Linux kernel. Ninety-nine percent GCC torture test pass rate.</p><p>The interesting part isn&#8217;t the scale. It&#8217;s the change in the type of work agents can do.</p><p>Carlini&#8217;s key observation: &#8220;Claude will work autonomously to solve whatever problem I give it. So it&#8217;s important that the task verifier is nearly perfect.&#8221; The agents weren&#8217;t the bottleneck. The quality of the feedback loop was.</p><p>That&#8217;s the same pattern at a different scale. Builder agents produce code. Reviewer agents check it and send feedback. The builders act on it. The loop runs without a human relaying messages.</p><h2>My Experience</h2><p>I used agent teams to build an app with three Claude instances: a backend agent, a frontend agent, and a code reviewer. 
The reviewer watches both, checks that the API contract lines up, and sends issues to the agent that owns the code.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pafc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c75c1c8-a0bb-4263-ba37-fe0b55489322_2606x1152.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pafc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c75c1c8-a0bb-4263-ba37-fe0b55489322_2606x1152.png 424w, https://substackcdn.com/image/fetch/$s_!pafc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c75c1c8-a0bb-4263-ba37-fe0b55489322_2606x1152.png 848w, https://substackcdn.com/image/fetch/$s_!pafc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c75c1c8-a0bb-4263-ba37-fe0b55489322_2606x1152.png 1272w, https://substackcdn.com/image/fetch/$s_!pafc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c75c1c8-a0bb-4263-ba37-fe0b55489322_2606x1152.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!pafc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c75c1c8-a0bb-4263-ba37-fe0b55489322_2606x1152.png" width="1456" height="644" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6c75c1c8-a0bb-4263-ba37-fe0b55489322_2606x1152.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:644,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:233653,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/187762480?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c75c1c8-a0bb-4263-ba37-fe0b55489322_2606x1152.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!pafc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c75c1c8-a0bb-4263-ba37-fe0b55489322_2606x1152.png 424w, https://substackcdn.com/image/fetch/$s_!pafc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c75c1c8-a0bb-4263-ba37-fe0b55489322_2606x1152.png 848w, https://substackcdn.com/image/fetch/$s_!pafc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c75c1c8-a0bb-4263-ba37-fe0b55489322_2606x1152.png 1272w, https://substackcdn.com/image/fetch/$s_!pafc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c75c1c8-a0bb-4263-ba37-fe0b55489322_2606x1152.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>During the build, the reviewer caught bugs in the initial implementation and delegated back to the front end and back end agents to fix the issues. </p><p>That&#8217;s a closed-loop correction that happened without me. With a single agent, I&#8217;d have caught it during code review. With teams, the feedback loop ran on its own.</p><h2>Tradeoffs </h2><p>Three agents in parallel costs roughly three times as much. I burned through my rate limits making a video about this (and rarely have issues). That&#8217;s the honest trade-off.</p><p>The C compiler project used about twenty thousand dollars in tokens over two weeks. That&#8217;s sixteen agents running in loops. 
For most of us, the question isn&#8217;t &#8220;can I afford sixteen agents&#8221; &#8212; it&#8217;s &#8220;is the closed-loop feedback worth 3x the cost for this particular task?&#8221;</p><h2>Where Is This Heading?</h2><p>Right now, agent teams are experimental. </p><p>But the pattern - agents reviewing agents in a loop, self-correcting, picking up the next task when they&#8217;re done - that&#8217;s clearly the direction we&#8217;re going in. Systems of specialised agents working together on more complex tasks. The C compiler project showed it actually works at scale. Agent teams in Claude Code bring a version of it to your terminal.</p><p>I walked through the full build in my latest <a href="https://www.youtube.com/watch?v=KuxsOv0q0mo">video</a> here. </p><p></p>]]></content:encoded></item><item><title><![CDATA[Your agent workflow doesn't scale (here's the fix)]]></title><description><![CDATA[How to build your agent control plane]]></description><link>https://newsletter.owainlewis.com/p/your-agent-workflow-doesnt-scale</link><guid isPermaLink="false">https://newsletter.owainlewis.com/p/your-agent-workflow-doesnt-scale</guid><dc:creator><![CDATA[Owain Lewis]]></dc:creator><pubDate>Sat, 07 Feb 2026 16:36:17 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/3daa5046-3a49-4472-aac5-893013e4f3b0_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey,</p><p>I&#8217;ve been using a project board to manage my AI agents and it&#8217;s working really well. Instead of prompting back and forth in the terminal, I put tasks on a Linear board. The agents pick up tickets, follow a workflow I&#8217;ve defined, and open PRs. 
I just review the PRs.</p><p>Here&#8217;s the setup.</p><h2>The Setup</h2><p>Two things make this work: an MCP connection to your task board, and a CLAUDE.md file that defines how the agent should work.</p><h3>Connect Claude Code to Linear</h3><p>One command:</p><pre><code><code>claude mcp add --transport http linear-server https://mcp.linear.app/mcp</code></code></pre><p>Open a Claude Code session and authenticate through Linear&#8217;s OAuth flow. Claude can now read tickets, create tickets, update status, and close issues. </p><p>While I&#8217;m using Linear, you could follow this flow in any task management system. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lifx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa39bcc5a-f605-4734-904f-ad545c005c0e_1966x1332.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lifx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa39bcc5a-f605-4734-904f-ad545c005c0e_1966x1332.png 424w, https://substackcdn.com/image/fetch/$s_!lifx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa39bcc5a-f605-4734-904f-ad545c005c0e_1966x1332.png 848w, https://substackcdn.com/image/fetch/$s_!lifx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa39bcc5a-f605-4734-904f-ad545c005c0e_1966x1332.png 1272w, https://substackcdn.com/image/fetch/$s_!lifx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa39bcc5a-f605-4734-904f-ad545c005c0e_1966x1332.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!lifx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa39bcc5a-f605-4734-904f-ad545c005c0e_1966x1332.png" width="1456" height="986" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a39bcc5a-f605-4734-904f-ad545c005c0e_1966x1332.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:986,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:393910,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/187205856?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa39bcc5a-f605-4734-904f-ad545c005c0e_1966x1332.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lifx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa39bcc5a-f605-4734-904f-ad545c005c0e_1966x1332.png 424w, https://substackcdn.com/image/fetch/$s_!lifx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa39bcc5a-f605-4734-904f-ad545c005c0e_1966x1332.png 848w, https://substackcdn.com/image/fetch/$s_!lifx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa39bcc5a-f605-4734-904f-ad545c005c0e_1966x1332.png 1272w, https://substackcdn.com/image/fetch/$s_!lifx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa39bcc5a-f605-4734-904f-ad545c005c0e_1966x1332.png 
1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I break all this setup down in a <a href="https://youtu.be/9YpHBUmwY5M?si=EpdJJBegp7w-sbGH">video here</a>. </p><h3>Encode the Workflow in CLAUDE.md</h3><p>Your project&#8217;s CLAUDE.md file tells the agent how to behave. Here&#8217;s a stripped-down version of what I actually use:</p><pre><code><code>## Linear Integration

- Fetch issues using the Linear MCP tool.
- Always read the parent issue (if one exists) for full context.
- If a description references a spec file, read it before implementing.
- Set issue status to **In Progress** when starting,
  **In Review** after PR creation.

## Branching

Branch format: `&lt;prefix&gt;/&lt;issue-id-lowercase&gt;-&lt;slug&gt;`
- `feature/` for features
- `fix/` for bugs
- `cleanup/` for tech debt

Example: `feature/gra-12-add-supabase-sync`

## Commits

- Format: `&lt;summary&gt; (&lt;ISSUE-ID&gt;)` e.g. `Add Supabase sync (GRA-12)`
- Never commit code that doesn't build. Run `bun run build` first.

## Pull Requests

Create with `gh pr create`. PR body must include:
- Summary of changes
- Verification: `bun run build` result, files changed
- Link to the Linear issue

## Self-Review (required before pushing)

After implementation, launch a sub-agent to review the diff:
- Check for bugs, dead code, security issues, over-engineering</code></code></pre><p>That&#8217;s the whole system. The agent reads this file, follows the workflow, and produces PRs that are structured, verified, and linked to tickets. You write it once and every task follows the same process.</p><p>The agent fetches the ticket, reads context, checks out a branch, implements, runs the build, reviews its own code with a sub-agent, then opens a PR. All defined in a file that lives in your repo.</p><h3>Write Good Tickets</h3><p>The workflow only works if your tickets are clear. Here&#8217;s what a good one looks like:</p><blockquote><p><strong>Add authentication to the dashboard</strong></p><p>Users should be able to log in with email/password. The login form should validate input, create a session, and redirect to the dashboard on success.</p><p><strong>Files to update:</strong> auth.ts</p><p><strong>Acceptance criteria:</strong></p><ul><li><p>Login form renders at /login</p></li><li><p>Invalid email/password shows an error message</p></li><li><p>Successful login creates a session and redirects to /dashboard</p></li><li><p>Unauthenticated users are redirected to /login</p></li></ul><p><strong>Reference:</strong> See specs/auth.md for expected behaviour</p></blockquote><p>More detailed than most developer tickets in the real world. That&#8217;s the point. Clear tickets are what let you step back. Vague tickets pull you back into the terminal.</p><h2>Start Assigning</h2><pre><code><code>Fetch the open tickets on my Linear board and show me what's in the backlog.</code></code></pre><p>Pick a ticket. Tell Claude to work on it. Watch it follow the workflow.</p><p>Before this, I was writing Markdown specs and handing them to agents. It worked for a while. But once I had more than a few tasks going, I couldn&#8217;t keep track of what was done, what was stuck, or what depended on what. Markdown files don&#8217;t have status. 
A board does.</p><p>It&#8217;s the same thing that happens when you grow as an engineer. Early on you just write code and push it. Then you add tests, CI, code review. Not because you want more process, but because you&#8217;ve been burned enough times to know that a bit of structure saves you from a lot of pain.</p><p>Same thing with agents. Prompting in the terminal works fine for small stuff. But once you&#8217;re juggling multiple tasks or building something real, having tickets with clear acceptance criteria and a build step that runs before every PR just makes everything more reliable.</p><h2>How I Decide What to Hand Off</h2><p>Not everything needs the full workflow. Small stuff I still just prompt directly. But for anything that takes more than a few minutes, I put it on the board.</p><p>I started by staying pretty hands-on, watching the agent work through tickets and seeing where it needed better instructions. Over time I got a feel for what it handles well on its own (bug fixes, refactoring, straightforward features) and where I need to stay involved (architectural decisions, anything where I need to see the result before committing to an approach).</p><h2>Summary</h2><p>Pick one feature you&#8217;re working on right now. Break it into three or four tickets on a board. Start assigning tickets to your agent as you would to another engineer. </p><p>Watch it pick up the ticket, implement the work, and open a PR.</p><p>Then see what else you can hand off.</p><p>Thanks for reading. I walked through the full setup in a <a href="https://youtu.be/9YpHBUmwY5M?si=EpdJJBegp7w-sbGH">video here</a>. 
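</p><p>To recap, the whole workflow file described above can be condensed into a sketch like this (the step wording is illustrative; the review checklist is the one shown earlier):</p>

```markdown
# Workflow

1. Fetch the assigned ticket from the Linear board and read it fully.
2. Read any referenced context (specs, linked files).
3. Check out a new branch for the ticket.
4. Implement the change to meet the acceptance criteria.
5. Run the build and fix any failures.
6. Launch a sub-agent to review the diff:
   - Check for bugs, dead code, security issues, over-engineering
7. Open a PR linked to the ticket.
```

<p>Adapt the step names to your own board and repo; the fixed structure is what keeps every task consistent.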
</p>]]></content:encoded></item><item><title><![CDATA[How I code with AI agents (spec-driven development)]]></title><description><![CDATA[An opinionated guide to writing code with AI agents like Claude Code.]]></description><link>https://newsletter.owainlewis.com/p/how-i-code-with-ai-agents-spec-driven</link><guid isPermaLink="false">https://newsletter.owainlewis.com/p/how-i-code-with-ai-agents-spec-driven</guid><dc:creator><![CDATA[Owain Lewis]]></dc:creator><pubDate>Fri, 30 Jan 2026 12:24:42 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/62520764-1ab1-4507-926e-b9f0cc620f1a_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there &#128075;,</p><p>Here&#8217;s a pattern you&#8217;ll recognise: You tell an AI agent to &#8220;add authentication to the app.&#8221; It starts coding immediately. An hour later, you&#8217;re undoing decisions you never asked for.</p><p>Complex code you didn&#8217;t ask for. Password reset flows you didn&#8217;t need. New dependencies you explicitly avoid. The agent was trying to help. It just had no idea what you actually wanted.</p><p>The fix isn&#8217;t better prompting. It&#8217;s a different workflow entirely.</p><p>This post breaks down <strong>spec-driven development</strong> - the practice of defining a specification before letting an agent execute. I&#8217;ll show you exactly what goes in a spec, how it differs from other documents you might write, and a step-by-step workflow I use every day.</p><h2>What Is Spec-Driven Development?</h2><p>Spec-driven development is simple: instead of prompting first and figuring things out as you go, you define a specification up front. A short markdown document that describes what you&#8217;re building, constraints, relevant context, and a list of tasks to complete.</p><p>When you tell an AI agent to &#8220;add authentication,&#8221; there are dozens of decisions to make. Token expiration. Storage approach. Error handling. 
Library choices. If you don&#8217;t specify, the agent guesses. And guesses compound.</p><p>Even though agents like Claude Code can write plans, ask clarifying questions, and resume sessions, you&#8217;ll still want to own the spec yourself. A plan Claude generates lives in a conversation. A spec lives in your repo - a markdown file you can review, edit, version control, and hand to any other agent or teammate. And writing the spec yourself forces you to make decisions rather than just review the agent&#8217;s choices.</p><h2>The Three Documents (and Why They&#8217;re Different)</h2><p>I see people conflating PRDs, design docs, and specs constantly. They serve different purposes.</p><p><strong>Product Requirements Document (PRD):</strong> For humans; product managers, stakeholders. Covers <em>what</em> we&#8217;re building and <em>why</em>. Business value, user stories, success metrics. This is a debate document.</p><p><strong>Technical Design Document:</strong> For engineers. Covers <em>how</em> we&#8217;re building it. Architecture decisions, scalability considerations, security implications. Also debated and reviewed.</p><p><strong>AI Spec:</strong> For agents. This is an <em>execution</em> document - not a debate, a plan. It translates decisions from the PRD and design doc into something an agent can act on.</p><p>In practice, you don&#8217;t always write all three. For a small feature, you might skip straight to a spec. For a large initiative, you&#8217;d have a PRD that spawns multiple design docs, each spawning multiple specs. The spec is always the final translation layer before code.</p><h2>Anatomy of a Good Spec</h2><p>A spec has four parts:</p><p><strong>1. Why (Brief Context)</strong></p><p>Keep this short. One or two sentences about the problem you&#8217;re solving. This helps the agent make intelligent decisions if it encounters ambiguity.</p><p><strong>2. What (Scope)</strong></p><p>Define the boundaries. What features are you building? 
Be specific about implementation details the agent would otherwise guess about.</p><p>Example: &#8220;JWT-based auth with one-hour access tokens and seven-day refresh tokens. Users can register, log in, and refresh tokens.&#8221;</p><p><strong>3. Constraints (Boundaries)</strong></p><p>This is where you prevent the agent from being too eager. What libraries to use. What patterns to follow. What&#8217;s explicitly out of scope.</p><p>Example: &#8220;Use bcrypt for password hashing. Store user data in Postgres via Prisma. Must not add new dependencies. Must not store tokens in the database. Out of scope: password reset, OAuth, email verification.&#8221;</p><p><strong>4. Tasks (Discrete Work Units)</strong></p><p>Break the work into small, verifiable chunks. Each task should specify what to build, which files to touch, and how to verify completion.</p><p>Example:</p><ul><li><p>Task 1: Add user model to Prisma schema. Verify: npx prisma generate succeeds.</p></li><li><p>Task 2: Create registration endpoint. Verify: Test with curl, user appears in database.</p></li><li><p>Task 3: Create login endpoint. Verify: Returns valid JWT on correct credentials.</p></li></ul><h2>When Specs Go Wrong</h2><p>Specs fail in two directions.</p><p><strong>Over-specified:</strong> You&#8217;ve constrained the agent so tightly it can&#8217;t solve the problem. Signs: the agent keeps asking for permission, or produces convoluted code to satisfy contradictory constraints. Fix: loosen constraints, focus on outcomes rather than implementation details.</p><p><strong>Under-specified:</strong> The agent still has to guess. Signs: you review the code and find unexpected decisions - new files, different patterns, surprise dependencies. Fix: add the missing constraints. 
Each surprise is a constraint you forgot to write down.</p><p>The goal is a spec tight enough that the agent can&#8217;t make decisions you&#8217;d disagree with, but loose enough that it can solve problems you didn&#8217;t anticipate.</p><h2>My Workflow</h2><p>It&#8217;s important to point out that this level of planning isn&#8217;t always needed. If you&#8217;re fixing a simple bug, you likely don&#8217;t need extensive planning. Just do it. If you&#8217;re working on something large that might split into many tasks or run over multiple sessions - write a spec. </p><p>Here&#8217;s how I actually use specs day to day:</p><p><strong>Step 1: Generate.</strong> I describe what I want to build to the agent and ask it to write a spec&#8212;not implement the feature. I use a /spec command for this.</p><p><strong>Step 2: Iterate.</strong> I review the spec carefully. The agent will make assumptions. I correct them, add constraints I forgot, remove scope creep. This is where I catch problems before they become code.</p><p><strong>Step 3: Execute.</strong> I open a fresh session. I ask the agent to read the spec and implement Task 1. Review the code. Commit. Move to Task 2.</p><blockquote><p>&#8220;Read &lt;path to spec&gt; and implement T1&#8221;</p></blockquote><p><strong>Step 4: Adapt.</strong> Review the code. Could it be improved? Maybe Task 3 reveals a flaw in the spec. I go back and update it. This isn&#8217;t waterfall, it&#8217;s iterative. The spec is a living document.</p><p>The key insight: <strong>don&#8217;t ask the same agent to plan the work and do the work.</strong> Planning and execution are different modes. An agent that&#8217;s planning will think through edge cases. An agent that&#8217;s executing will rush to ship.</p><h2>Skip the Frameworks?</h2><p>There are a lot of spec-driven development frameworks out there. OpenSpec. Kiro. GitHub Spec Kit. I&#8217;ve tried them.</p><p>To me, they felt like overkill. </p><p>They generate tons of files. 
They want you to define user stories in markdown. They add ceremony that slows you down without adding value.</p><p>Here&#8217;s what you actually need: one slash command that acts as a meta-prompt to generate a spec.</p><blockquote><p>&#8220;/spec implement rate limiting in the API&#8221;</p></blockquote><p>The power of spec-driven development isn&#8217;t in the tooling. It&#8217;s in the practice of thinking before prompting. A fancy framework won&#8217;t fix sloppy thinking. A simple markdown file that forces you to articulate constraints is probably enough. </p><h2>Why This Creates Leverage</h2><p>Spec-driven development isn&#8217;t new. Software teams have always worked this way. PRD &gt; Design doc &gt; Task breakdown &gt; Implementation. The only difference is we&#8217;re handing tasks to agents instead of other developers.</p><p>But here&#8217;s what changes: an agent that executes well-defined specs can move faster than any human. The bottleneck shifts from implementation to specification. Your job becomes defining work clearly enough that an agent can execute it autonomously.</p><h2>Final Thoughts</h2><p>If you&#8217;re building anything non-trivial with AI agents, write a spec first. It prevents agents from guessing. It gives you control over implementation decisions. It produces higher-quality code.</p><p>Working incrementally - one task at a time, reviewed and committed - beats letting an agent generate 10,000 lines you have to untangle later. Match the spec&#8217;s detail to the task&#8217;s complexity. One-liner? Just do it. Small feature? Short spec. Large feature? Detailed spec with many tasks.</p><p>Your job is to architect the work. The agent&#8217;s job is to build it.</p><p>P.S. Watch the video here. 
It contains a link to all the templates I use: </p><div id="youtube2-RhaF4LVAVng" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;RhaF4LVAVng&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/RhaF4LVAVng?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Thanks for reading. Have an awesome week : )</p><p>P.S. If you want to go deeper on building AI systems, I run a community where we build agents hands-on: <a href="https://skool.com/aiengineer">https://skool.com/aiengineer</a></p>]]></content:encoded></item><item><title><![CDATA[The simplest way to build AI agents in 2026]]></title><description><![CDATA[How to build personal AI agents without frameworks, infrastructure, or unnecessary complexity]]></description><link>https://newsletter.owainlewis.com/p/the-simplest-way-to-build-ai-agents</link><guid isPermaLink="false">https://newsletter.owainlewis.com/p/the-simplest-way-to-build-ai-agents</guid><dc:creator><![CDATA[Owain Lewis]]></dc:creator><pubDate>Fri, 09 Jan 2026 17:43:04 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/473cfb04-34db-4f6d-ac2e-557fabf7ec6e_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there &#128075;,</p><p>You can build a working AI agent with just a folder, a markdown file, and a python script.</p><p>No N8N. No LangGraph. No FastAPI. No infrastructure at all.</p><p>The AI agent space has convinced people that building agents requires either expensive no-code platforms or serious engineering overhead. 
I think that&#8217;s backwards - at least for personal use.</p><p>If you&#8217;re a solo builder who wants AI agents that handle your research, automate your workflows, or manage repetitive tasks, you don&#8217;t need production infrastructure. You need something you can build in an afternoon and modify in minutes.</p><p>I call this the <strong>Micro-Agent Architecture</strong>. It&#8217;s the pattern I use for my own agents, and it&#8217;s embarrassingly simple.</p><h2>The Structure</h2><p>Here&#8217;s everything you need:</p><pre><code><code>my-agent/
&#9500;&#9472;&#9472; AGENTS.md
&#9500;&#9472;&#9472; tools/
&#9500;&#9472;&#9472; context/
&#9492;&#9472;&#9472; workspace/</code></code></pre><p>One file and three folders. Let me show you what each one does.</p><h3>AGENTS.md (The Instructions)</h3><p>This is where you tell the agent who it is and what it can do. Think of it as a system prompt you can version control.</p><p>Most modern coding agents read this file on startup. For Claude Code, just add this to your CLAUDE.md and it will read that file.</p><pre><code>@AGENTS.md</code></pre><p>Here&#8217;s a real example: a research agent I use for YouTube content analysis that can fetch videos, research topics, get video transcripts, and even upload my videos for me (writing all the metadata, tags, and descriptions):</p><pre><code><code># YouTube Research Agent

You are a research agent specialising in YouTube content analysis.

## Tools

Use the following tools.

### get_channel_videos

Fetch videos for a YouTube channel. Returns view counts, titles, outlier scores.

uv run tools/youtube.py get_channel_videos @mkbhd --days 30

### get_transcript

Pulls the transcript for a specific video.

uv run tools/youtube.py get_transcript VIDEO_ID</code></code></pre><p>The agent knows its role, knows what tools it has, and knows the workflow for common tasks.</p><h3>Tools (The Scripts)</h3><p>Simple scripts that do specific things. Python, Bash, Node. If you don&#8217;t know how to code, Claude can just write the scripts for you (&#8220;write a python script to fetch youtube videos for a channel. Tell me how to use it&#8221;). The LLM reads AGENTS.md, sees the command, runs it. No SDK. No framework. Just scripts.</p><h3>Context (The Knowledge)</h3><p>Reference material the agent reads before working. Style guides, templates, examples, SOPs. This is how you make agents consistent - by giving them documentation, the same way you&#8217;d onboard a person.</p><h3>Workspace (The Output)</h3><p>Where the agent saves its work. Research, drafts, data. Files that persist between sessions. Everything it creates goes here, so you can review it, edit it, and build on it.</p><p>As an example, when using my YouTube agent, I store complete video transcripts as files and sometimes refer back to them during conversations. </p><h2>How It Works</h2><p>You already have the agent runtime. It&#8217;s Claude Code, Codex, Amp - whatever agentic coding tool you&#8217;re already using. <em><strong>ANY of them</strong></em>. These tools can read files, follow instructions, and, most importantly, run commands. That&#8217;s all an agent needs.</p><p>We treat the agent harness itself (Claude Code, Codex, Goose, Amp) as a largely interchangeable building block. 
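</p><p>To make the tool side concrete, here&#8217;s a minimal sketch of what a script like tools/youtube.py can look like. The subcommands mirror the AGENTS.md example above, but the fetching logic is stubbed out with placeholders - a real version would call the YouTube Data API:</p>

```python
# Minimal sketch of a tool script in the spirit of tools/youtube.py.
# The subcommands mirror the ones documented in AGENTS.md; the fetching
# logic is a placeholder - a real version would call the YouTube Data API.
import argparse


def get_channel_videos(channel, days):
    # Placeholder: return stub data instead of calling the YouTube API.
    return [{"channel": channel, "days": days, "title": "example video"}]


def get_transcript(video_id):
    # Placeholder: return stub text instead of fetching a transcript.
    return f"transcript for {video_id}"


def build_parser():
    parser = argparse.ArgumentParser(description="YouTube research tool")
    sub = parser.add_subparsers(dest="command", required=True)

    videos = sub.add_parser("get_channel_videos", help="List a channel's videos")
    videos.add_argument("channel")
    videos.add_argument("--days", type=int, default=30)

    transcript = sub.add_parser("get_transcript", help="Fetch a video transcript")
    transcript.add_argument("video_id")
    return parser


def main(argv=None):
    # argv=None means "use sys.argv", which is what an agent invoking
    # `uv run tools/youtube.py get_transcript VIDEO_ID` relies on.
    args = build_parser().parse_args(argv)
    if args.command == "get_channel_videos":
        return get_channel_videos(args.channel, args.days)
    return get_transcript(args.video_id)


# Demo invocation with explicit arguments:
print(main(["get_channel_videos", "@mkbhd", "--days", "30"]))
```

<p>The shape is what matters: plain argparse subcommands that an agent can discover from AGENTS.md and run from the shell.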
</p><p>Point your tool at the folder and give it a task:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OG3L!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab544a40-8c04-4b92-99d2-cbc3ebf01f00_2156x944.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OG3L!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab544a40-8c04-4b92-99d2-cbc3ebf01f00_2156x944.png 424w, https://substackcdn.com/image/fetch/$s_!OG3L!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab544a40-8c04-4b92-99d2-cbc3ebf01f00_2156x944.png 848w, https://substackcdn.com/image/fetch/$s_!OG3L!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab544a40-8c04-4b92-99d2-cbc3ebf01f00_2156x944.png 1272w, https://substackcdn.com/image/fetch/$s_!OG3L!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab544a40-8c04-4b92-99d2-cbc3ebf01f00_2156x944.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OG3L!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab544a40-8c04-4b92-99d2-cbc3ebf01f00_2156x944.png" width="1456" height="638" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ab544a40-8c04-4b92-99d2-cbc3ebf01f00_2156x944.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:638,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:340889,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/184043505?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab544a40-8c04-4b92-99d2-cbc3ebf01f00_2156x944.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!OG3L!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab544a40-8c04-4b92-99d2-cbc3ebf01f00_2156x944.png 424w, https://substackcdn.com/image/fetch/$s_!OG3L!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab544a40-8c04-4b92-99d2-cbc3ebf01f00_2156x944.png 848w, https://substackcdn.com/image/fetch/$s_!OG3L!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab544a40-8c04-4b92-99d2-cbc3ebf01f00_2156x944.png 1272w, https://substackcdn.com/image/fetch/$s_!OG3L!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab544a40-8c04-4b92-99d2-cbc3ebf01f00_2156x944.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>The agent reads AGENTS.md, understands its role, runs the tools, and saves everything to workspace. Real research, done automatically, saved locally.</p><p><strong>The folder IS the agent.</strong> Instructions are markdown. Knowledge is markdown. Tools are scripts. Storage is files. The agentic coding tool you already have is the runtime.</p><p>No deployment. No hosting. No complexity. </p><h2>The Insight That Makes This Work</h2><p>Software engineers have been building CLIs and scripts for decades. We write utilities that automate our work. It&#8217;s one of the oldest traditions in the craft.</p><p>Here&#8217;s what I&#8217;ve realised: agents are exceptionally good at using CLIs. Better than humans, actually.</p><p>Think about it. 
An agent can read documentation perfectly, remember every flag, and invoke your scripts hundreds of times without getting tired or making typos. Give it a conversational interface and suddenly your little Python script becomes something you can talk to.</p><p><strong>Any CLI becomes 100x more powerful when you add an intelligence layer to it.</strong></p><p>That YouTube research tool I showed earlier? It&#8217;s just a script. But when an agent uses it, it can analyse fifty channels in parallel, cross-reference the results, and synthesise insights I&#8217;d never have time to find manually.</p><p>And here&#8217;s the thing - anything can become a tool. A Python script. A Bash one-liner. A Docker container. If it runs from a terminal, an agent can use it.</p><p>You&#8217;re not learning a new skill. You&#8217;re amplifying one you already have.</p><h2>Why I Use This Instead of Frameworks</h2><p>For personal agents, frameworks are overhead.</p><p>N8N, LangGraph, and similar tools solve real problems - for teams shipping production systems to users. If you&#8217;re building agents other people will use, you need observability, APIs, error handling, deployment pipelines, all of it.</p><p>But if you&#8217;re building agents for yourself? You don&#8217;t need any of that. You need something you can modify in two minutes when your requirements change. You need something you can understand completely. You need something that doesn&#8217;t break when a framework updates.</p><p>A folder of markdown and scripts gives you that. It&#8217;s not sophisticated. That&#8217;s the point.</p><h2>The Leverage Angle</h2><p>A tool helps you once. A system helps you a thousand times.</p><p>The Micro-Agent Architecture isn&#8217;t about building impressive AI systems. It&#8217;s about building personal agents that help 1 person do the work of 10. </p><div><hr></div><p>Thanks for reading. 
Have an awesome week : )</p><p>P.S. If you want to build agents like this hands-on with other engineers, find more in-depth content here: <a href="https://skool.com/aiengineer">https://skool.com/aiengineer</a></p>]]></content:encoded></item><item><title><![CDATA[AI frameworks worth learning in 2026]]></title><description><![CDATA[A practical breakdown for engineers who are feeling overwhelmed]]></description><link>https://newsletter.owainlewis.com/p/ai-frameworks-worth-learning-in-2026</link><guid isPermaLink="false">https://newsletter.owainlewis.com/p/ai-frameworks-worth-learning-in-2026</guid><dc:creator><![CDATA[Owain Lewis]]></dc:creator><pubDate>Wed, 07 Jan 2026 17:56:40 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/36b8dc8a-d80b-488a-813f-aaa28f3c07d5_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;b5b82c10-5444-4e71-a7b4-18700d0e1d89&quot;,&quot;duration&quot;:null}"></div><p>Hey there &#128075;,</p><p>AI framework hell is real.</p><p>Every week, a new AI framework arrives. And every week, developers ask the same question: which one should I learn?</p><p>LangChain, CrewAI, LangGraph, AutoGen, Spring AI, LlamaIndex, Vercel AI SDK, OpenAI Agents SDK, Google ADK. Everyone&#8217;s telling you to learn the latest one or you&#8217;ll fall behind.</p><p>Here&#8217;s what nobody tells you: you don&#8217;t <em>need</em> any of these. The provider SDKs (OpenAI, Anthropic, Google) are powerful enough on their own. You can build agents and complex multi-step workflows without a framework.</p><p>Frameworks are genuinely useful once you understand what they&#8217;re abstracting away. But the problem is that most developers start with frameworks before they understand the fundamentals. 
When something breaks, they&#8217;re stuck.</p><p>This week, I&#8217;ll show you the frameworks I think are worth learning right now, and what each one is best for.</p><h2>The Trade-Off</h2><p>Every framework is a trade-off. You&#8217;re adding a layer between you and the model. That layer gives you convenience, patterns, and abstractions. It also means bugs you didn&#8217;t write, upgrade paths that break your code, and opinions about how AI apps should work.</p><p>I call this <strong>framework tax</strong>. Not because frameworks are bad, but because they&#8217;re not free. You&#8217;re trading flexibility for convenience. Sometimes that&#8217;s the right trade. Sometimes it isn&#8217;t.</p><p>The provider SDKs (OpenAI, Anthropic, Google) are well-documented, stable, and give you direct control. The Gemini SDK in particular has seamless tool calling out of the box.</p><p>Here&#8217;s the approach I recommend: <strong>SDK-first development</strong>. Start with the raw SDK. Understand how tool calling works. Build a simple agent loop. Feel the edges.</p><p>Then, when you hit a wall, when you need something the SDK doesn&#8217;t give you easily, reach for a framework. Now you understand what it&#8217;s doing for you. You&#8217;re not cargo-culting. You&#8217;re making an informed choice.</p><h2>When Frameworks Genuinely Help</h2><p>Here are the five situations where I&#8217;d reach for one:</p><p><strong>First: a production-ready agent loop.</strong> The core agent loop is simple. Maybe 70 lines of code. But the production details add up: max iterations, timeouts, graceful error recovery, retry logic. Frameworks have battle-tested these patterns. You <em>can</em> write this yourself. The question is whether you want to discover all the edge cases on your own. Plus, you probably don&#8217;t want to write this by hand on every project. </p><p><strong>Second: provider flexibility.</strong> This is the big one. Moving from OpenAI to Azure OpenAI. 
Switching from Anthropic to Bedrock. Testing a new model from a different provider. These changes touch a lot of code if you&#8217;re using raw SDKs. An abstraction layer makes swapping providers a config change. If you think you might switch, or want the option, this alone justifies a framework.</p><p><strong>Third: team standardisation.</strong> On larger teams, frameworks enforce consistent patterns. Same structure, same debugging approach, same conventions. Everyone speaks the same language. This benefit scales with team size. </p><p><strong>Fourth: complex workflows.</strong> Retries, human-in-the-loop approvals, branching logic, parallel execution. If you&#8217;re building a workflow engine, frameworks have already solved the hard parts. You could build it yourself, but you&#8217;d be reinventing solutions that have been refined over years.</p><p><strong>Fifth: multi-agent orchestration.</strong> Handoffs between agents, shared state, delegation patterns. Most apps don&#8217;t need this. But if yours does, frameworks make it easier than rolling your own.</p><h2>Provider-Specific vs Provider-Agnostic</h2><p>One more distinction before the list.</p><p><strong>Provider-specific frameworks</strong> (Google ADK, Claude Agents SDK, OpenAI Agents SDK) are built by the model providers. They&#8217;re optimised for one model family, encode that provider&#8217;s best practices, and typically offer the smoothest experience. The trade-off is commitment. You&#8217;re betting on that provider.</p><p><strong>Provider-agnostic frameworks</strong> (Pydantic AI, LangGraph, Vercel AI SDK) abstract across multiple providers. You trade some optimisation for flexibility. When a new model leapfrogs the competition, you swap a config value instead of rewriting code.</p><p>Neither is universally better. If you&#8217;re all-in on one provider, go provider-specific. If you want options, go agnostic. 
Know which game you&#8217;re playing.</p><div><hr></div><h2>The Five Frameworks Worth Learning</h2><p>Here are the five I&#8217;d actually invest time in - organised by language and use case.</p><h3>1. Pydantic AI (Python)</h3><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!X3HB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93313f6e-6bc9-4fcd-938c-dc2d16081d30_1066x262.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!X3HB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93313f6e-6bc9-4fcd-938c-dc2d16081d30_1066x262.png 424w, https://substackcdn.com/image/fetch/$s_!X3HB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93313f6e-6bc9-4fcd-938c-dc2d16081d30_1066x262.png 848w, https://substackcdn.com/image/fetch/$s_!X3HB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93313f6e-6bc9-4fcd-938c-dc2d16081d30_1066x262.png 1272w, https://substackcdn.com/image/fetch/$s_!X3HB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93313f6e-6bc9-4fcd-938c-dc2d16081d30_1066x262.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!X3HB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93313f6e-6bc9-4fcd-938c-dc2d16081d30_1066x262.png" width="1066" height="262" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/93313f6e-6bc9-4fcd-938c-dc2d16081d30_1066x262.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:262,&quot;width&quot;:1066,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:21889,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/183655124?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93313f6e-6bc9-4fcd-938c-dc2d16081d30_1066x262.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!X3HB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93313f6e-6bc9-4fcd-938c-dc2d16081d30_1066x262.png 424w, https://substackcdn.com/image/fetch/$s_!X3HB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93313f6e-6bc9-4fcd-938c-dc2d16081d30_1066x262.png 848w, https://substackcdn.com/image/fetch/$s_!X3HB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93313f6e-6bc9-4fcd-938c-dc2d16081d30_1066x262.png 1272w, https://substackcdn.com/image/fetch/$s_!X3HB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93313f6e-6bc9-4fcd-938c-dc2d16081d30_1066x262.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>If you&#8217;ve used FastAPI or Pydantic, this feels instantly familiar. 
Same developer experience (types, validation, contracts) applied to LLM applications.</p><p>It handles response validation, retries on malformed outputs, and structured error handling. Works across all major providers.</p><p><strong>Best for:</strong> Python developers who think in types and want reliable, testable agents.</p><h3>2. LangGraph (Python)</h3><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KPN9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5cd89740-a330-4a22-8ca0-d2a443ed7724_1470x306.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KPN9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5cd89740-a330-4a22-8ca0-d2a443ed7724_1470x306.png 424w, https://substackcdn.com/image/fetch/$s_!KPN9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5cd89740-a330-4a22-8ca0-d2a443ed7724_1470x306.png 848w, https://substackcdn.com/image/fetch/$s_!KPN9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5cd89740-a330-4a22-8ca0-d2a443ed7724_1470x306.png 1272w, https://substackcdn.com/image/fetch/$s_!KPN9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5cd89740-a330-4a22-8ca0-d2a443ed7724_1470x306.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KPN9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5cd89740-a330-4a22-8ca0-d2a443ed7724_1470x306.png" width="1456" height="303" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5cd89740-a330-4a22-8ca0-d2a443ed7724_1470x306.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:303,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:29730,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/183655124?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5cd89740-a330-4a22-8ca0-d2a443ed7724_1470x306.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KPN9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5cd89740-a330-4a22-8ca0-d2a443ed7724_1470x306.png 424w, https://substackcdn.com/image/fetch/$s_!KPN9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5cd89740-a330-4a22-8ca0-d2a443ed7724_1470x306.png 848w, https://substackcdn.com/image/fetch/$s_!KPN9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5cd89740-a330-4a22-8ca0-d2a443ed7724_1470x306.png 1272w, https://substackcdn.com/image/fetch/$s_!KPN9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5cd89740-a330-4a22-8ca0-d2a443ed7724_1470x306.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>LangGraph models AI applications as graphs. 
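The idea in miniature, in plain Python (this is not the LangGraph API, just the graph-of-nodes shape it builds on; the node names and routing rule are invented):

```python
# A toy state graph: nodes are functions that update shared state
# and return the name of the next node to run.
def draft(state):
    state["text"] = f"Draft answer to: {state['question']}"
    return "review"

def review(state):
    # Branching: send drafts that are too short back for another pass.
    state["approved"] = len(state["text"]) > 10
    return "done" if state["approved"] else "draft"

NODES = {"draft": draft, "review": review}

def run_graph(state, start="draft"):
    node = start
    while node != "done":
        node = NODES[node](state)  # explicit state at every step
    return state

final = run_graph({"question": "What is RAG?"})
```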
Nodes and edges with explicit state control at every point.</p><p>This shines when your workflow has branches, retries, parallel execution, or human-in-the-loop steps. More complex than simpler frameworks, but that complexity pays off for sophisticated systems.</p><p><strong>Best for:</strong> Production systems with complex workflow logic.</p><h3>3. Vercel AI SDK (TypeScript)</h3><p>The default choice for TypeScript developers. Clean DX, provider-agnostic, swap models without touching frontend code.</p><p><strong>Best for:</strong> TypeScript developers building AI applications. If you&#8217;re in this ecosystem, start here.</p><h3>4. Google Agent Development Kit (ADK)</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PeMw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5efb121e-4d6f-4b63-b474-93712d609b1f_3108x1184.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PeMw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5efb121e-4d6f-4b63-b474-93712d609b1f_3108x1184.png 424w, https://substackcdn.com/image/fetch/$s_!PeMw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5efb121e-4d6f-4b63-b474-93712d609b1f_3108x1184.png 848w, https://substackcdn.com/image/fetch/$s_!PeMw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5efb121e-4d6f-4b63-b474-93712d609b1f_3108x1184.png 1272w, 
https://substackcdn.com/image/fetch/$s_!PeMw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5efb121e-4d6f-4b63-b474-93712d609b1f_3108x1184.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PeMw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5efb121e-4d6f-4b63-b474-93712d609b1f_3108x1184.png" width="1456" height="555" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5efb121e-4d6f-4b63-b474-93712d609b1f_3108x1184.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:555,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:236912,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/183655124?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5efb121e-4d6f-4b63-b474-93712d609b1f_3108x1184.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!PeMw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5efb121e-4d6f-4b63-b474-93712d609b1f_3108x1184.png 424w, https://substackcdn.com/image/fetch/$s_!PeMw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5efb121e-4d6f-4b63-b474-93712d609b1f_3108x1184.png 848w, 
https://substackcdn.com/image/fetch/$s_!PeMw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5efb121e-4d6f-4b63-b474-93712d609b1f_3108x1184.png 1272w, https://substackcdn.com/image/fetch/$s_!PeMw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5efb121e-4d6f-4b63-b474-93712d609b1f_3108x1184.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The first framework I&#8217;ve seen that takes polyglot teams seriously. 
Same mental model across four languages (Java, Go, TypeScript, Python).</p><p>The standout feature is observability. There&#8217;s a built-in web UI (super useful) for traces, tool calls, and agent decisions. You can see exactly why your agent did what it did.</p><p><strong>Best for:</strong> Teams on Google Cloud, or polyglot teams that want consistency across languages.</p><h3>5. Spring AI (Java)</h3><p>For Java shops, this is the natural choice. LLM services exposed through familiar Spring patterns.</p><p>The value isn&#8217;t innovation. It&#8217;s integration. If your team thinks in Spring terms, you add AI capabilities without learning a new paradigm.</p><p><strong>Best for:</strong> Enterprise teams with Spring Boot microservices adding AI incrementally.</p><h2>Final Thoughts</h2><p>Frameworks aren&#8217;t the enemy. But they aren&#8217;t required either.</p><p>Start with the SDK. Understand how the primitives work: tool calling, message formats, streaming. Build something simple. Feel where it gets painful.</p><p>Then, when you have a real problem (provider switching, team coordination, complex workflows), pick the framework that solves that specific problem.</p><p>The goal isn&#8217;t to avoid frameworks. It&#8217;s to use them deliberately. Know what you&#8217;re trading away and what you&#8217;re getting back.</p><p>SDK-first. Then frameworks when they earn it.</p><p>That&#8217;s how you stay out of framework hell.</p><div class="poll-embed" data-attrs="{&quot;id&quot;:429597}" data-component-name="PollToDOM"></div><div><hr></div><p>Thanks for reading. Have an awesome week : )</p><p>P.S. 
If you want to go deeper on building AI systems without the hype, I run a community where we build agents hands-on: <a href="https://skool.com/aiengineer">https://skool.com/aiengineer</a></p>]]></content:encoded></item><item><title><![CDATA[From software engineer to AI engineer (the 2026 roadmap)]]></title><description><![CDATA[Everything you need to go from software engineer to shipping production AI: without chasing every new framework]]></description><link>https://newsletter.owainlewis.com/p/the-complete-ai-engineer-roadmap</link><guid isPermaLink="false">https://newsletter.owainlewis.com/p/the-complete-ai-engineer-roadmap</guid><dc:creator><![CDATA[Owain Lewis]]></dc:creator><pubDate>Wed, 31 Dec 2025 14:55:59 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/77926057-92fa-4c60-a4f7-d060bdba93b6_1200x627.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey friend &#128075;,</p><p>Breaking into AI Engineering can feel overwhelming.</p><p>New tools launch weekly. Tutorials assume you already know everything. Half the advice contradicts the other half. It&#8217;s hard to know where to start, or what actually matters versus what&#8217;s just noise.</p><p>Here&#8217;s the good news: it&#8217;s simpler than it looks.</p><p>AI Engineering is software engineering with LLMs. You&#8217;re not training models from scratch or doing research. You&#8217;re building products that use language models as one component among many.</p><p>AI Engineers spend most of their time on the same things great software engineers focus on: designing reliable systems, writing code, testing properly, and making sure things work in production. The LLM part is maybe 20% of the job. The other 80% is engineering.</p><p>This roadmap is the practical path through the noise. </p><p>Let&#8217;s dive in.</p><h2>Stage 1: Programming and Architecture</h2><p>You&#8217;re a software engineer first. 
Everything else builds on this.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HGs0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b4f1dab-c518-4ed4-a211-962d1cc4c55b_1440x960.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HGs0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b4f1dab-c518-4ed4-a211-962d1cc4c55b_1440x960.jpeg 424w, https://substackcdn.com/image/fetch/$s_!HGs0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b4f1dab-c518-4ed4-a211-962d1cc4c55b_1440x960.jpeg 848w, https://substackcdn.com/image/fetch/$s_!HGs0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b4f1dab-c518-4ed4-a211-962d1cc4c55b_1440x960.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!HGs0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b4f1dab-c518-4ed4-a211-962d1cc4c55b_1440x960.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HGs0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b4f1dab-c518-4ed4-a211-962d1cc4c55b_1440x960.jpeg" width="1440" height="960" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3b4f1dab-c518-4ed4-a211-962d1cc4c55b_1440x960.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:960,&quot;width&quot;:1440,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:467895,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/183045892?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b4f1dab-c518-4ed4-a211-962d1cc4c55b_1440x960.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!HGs0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b4f1dab-c518-4ed4-a211-962d1cc4c55b_1440x960.jpeg 424w, https://substackcdn.com/image/fetch/$s_!HGs0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b4f1dab-c518-4ed4-a211-962d1cc4c55b_1440x960.jpeg 848w, https://substackcdn.com/image/fetch/$s_!HGs0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b4f1dab-c518-4ed4-a211-962d1cc4c55b_1440x960.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!HGs0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b4f1dab-c518-4ed4-a211-962d1cc4c55b_1440x960.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg 
role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Most AI problems are design and architecture problems - the same kind of problems good engineers have always solved. How do the pieces communicate? Where can things fail? Why is latency so high? How do we know if it&#8217;s working? How does data flow from input to output? These questions matter more in AI applications than most people realise.</p><p>Know when to use different types of databases. Have a mental model for how web services work. If terms like &#8220;stateless,&#8221; &#8220;caching,&#8221; or &#8220;message queue&#8221; are unfamiliar, spend time here before moving on.</p><p>One myth to bust early: you don&#8217;t need Python. It&#8217;s popular and has great ecosystem support, but AI systems are language agnostic. Java developers can use Spring AI and Google&#8217;s ADK. 
TypeScript developers have the Vercel AI SDK and great provider support. Use what you know. What matters is understanding fundamentals, not picking the &#8220;right&#8221; language.</p><p>Don&#8217;t skip this stage. People who jump straight to complex AI frameworks end up with impressive demos that fall apart when real users touch them.</p><h2>Stage 2: Working With LLMs</h2><p>Now you&#8217;re ready to add LLMs to your toolkit.</p><p>Start by deeply learning one provider&#8217;s API: OpenAI, Anthropic, or Google. Understand how authentication works, how to handle streaming responses, what happens when you hit rate limits, and how to implement proper retry logic. Know what tokens are and why they matter for both context limits and cost.</p><p><strong>Think about production inference early.</strong> The API you prototype with isn&#8217;t always what you&#8217;ll use in production. OpenAI in production means Azure OpenAI. Gemini means Vertex AI. Claude is available on AWS Bedrock, Azure (via Microsoft Foundry), and GCP. Enterprise platforms offer better SLAs, compliance guarantees, and network control. The SDKs are similar but not identical: authentication differs, and some features lag behind. Know where your application will run before you&#8217;ve built too much to easily switch.</p><p>Master the core capabilities that modern LLMs offer:</p><p><strong>Text generation</strong>: master prompting, the craft of writing instructions that get LLMs to do what you want. The simplest and most overlooked strategy for learning is to master meta-prompting (ask AI to refine and improve your prompts). </p><p><strong>Structured outputs</strong> matter because software systems need predictable structure, not free-form text. Learn how to constrain model outputs to schemas your code can reliably parse.</p><p><strong>Tool calling</strong> is how LLMs take actions in the world. The model doesn&#8217;t just generate text: it decides which function to call and with what arguments. 
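At the API level it&#8217;s a loop: send the conversation plus tool schemas, execute whichever tool the model picks, feed the result back. A minimal sketch with a stubbed model standing in for the real provider API (the stub&#8217;s decision format is invented for illustration; real SDKs return a structured tool call with a name and JSON arguments):

```python
import json

def get_weather(city: str) -> str:
    return f"18C and sunny in {city}"  # a real tool would call an API

TOOLS = {"get_weather": get_weather}

def fake_model(messages):
    # Stand-in for the provider API: pick a tool for user messages,
    # produce a final answer once a tool result is available.
    if messages[-1]["role"] == "user":
        return {"tool_call": {"name": "get_weather",
                              "arguments": json.dumps({"city": "London"})}}
    return {"content": f"The forecast: {messages[-1]['content']}"}

def agent_loop(question: str) -> str:
    messages = [{"role": "user", "content": question}]
    while True:
        reply = fake_model(messages)
        if "tool_call" not in reply:
            return reply["content"]  # model is done, no more tools
        call = reply["tool_call"]
        result = TOOLS[call["name"]](**json.loads(call["arguments"]))
        messages.append({"role": "tool", "content": result})

answer = agent_loop("What's the weather in London?")
```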
Understand this at the API level before reaching for abstractions.</p><p><strong>MCP (Model Context Protocol)</strong> is becoming the standard for connecting LLMs to external data and tools. Instead of writing custom integrations for every database or API, MCP provides a common protocol that works across providers. It&#8217;s the connective tissue between your model and everything it needs to access. Worth understanding even if you don&#8217;t adopt it immediately. </p><p><strong>Multi-modal inputs</strong> are increasingly essential. Models like Gemini process images, audio, and documents natively. A customer support system can accept photos of broken products. Voice agents are getting traction. A research tool can process PDFs with charts and diagrams. If you&#8217;re only thinking text-in-text-out, you&#8217;re missing half of what&#8217;s possible.</p><p>Frameworks can help, but they&#8217;re not where you should start. The SDKs from OpenAI, Anthropic, and Google handle all of this directly. Learn what&#8217;s happening at the API level first. Build a simple loop that reasons and acts using tool calling. Once you genuinely understand what&#8217;s underneath, then evaluate whether a framework adds value for your specific use case.</p><h2>Stage 3: RAG</h2><p>Retrieval-Augmented Generation is how you give LLMs your specific knowledge: your documents, your data, your domain expertise (stuff the LLM wasn&#8217;t trained on).</p><p>The core idea is simple: instead of relying only on what the model learned during training, you retrieve relevant information and include it in the prompt. But RAG isn&#8217;t one specific technique. Vector databases are popular, but they&#8217;re just one strategy. Keyword search, hybrid approaches, and even simple file lookups all count. Pick what fits your use case.</p><p>Don&#8217;t overcomplicate RAG at the start. 
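A keyword retriever fits in a few lines and is a perfectly good starting point (a toy sketch; the documents and word-overlap scoring are illustrative):

```python
def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Naive keyword retrieval: score each document by how many
    # query words it contains. Vector search is one upgrade path.
    words = set(query.lower().split())
    scored = sorted(docs,
                    key=lambda d: len(words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    # Retrieval-augmented generation: retrieved text goes in the prompt.
    context = "\n".join(retrieve(query, docs))
    return f"Answer using this context:\n{context}\n\nQuestion: {query}"

docs = ["Refunds are processed within 5 days.",
        "Our office is in London.",
        "Shipping takes 2 days within the UK."]
prompt = build_prompt("how long do refunds take", docs)
```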
Most improvements come from better retrieval (finding the right information), not from adding sophisticated components.</p><p>The difference between a demo and a production system comes down to three things: measuring retrieval quality (are you actually finding relevant content?), handling failure cases gracefully (what happens when nothing relevant exists?), and keeping your index fresh when source documents change.</p><h2>Stage 4: System Design</h2><p>There&#8217;s a spectrum of approaches for building AI applications. On one end: deterministic workflows with fixed sequences of steps where you control exactly what happens. In the middle: agentic workflows that add flexibility within boundaries you define. On the other end: autonomous agents that plan and execute with significant independence.</p><p>As a general rule: agents are less reliable/predictable but can handle open-ended problems (where you don&#8217;t know the steps in advance). </p><p>Here&#8217;s an important insight: LLM calls are slow. A single call can take seconds. A multi-step agent flow could take minutes. If you run these inside web requests, you&#8217;ll get timeouts and hanging UIs.</p><p>The fix is async patterns you&#8217;ve probably used before. Your API receives the task and puts a message on a queue. The user gets an immediate response. A worker picks up the job, does the LLM calls, stores the result when it&#8217;s done.</p><p>Celery, SQS, Temporal workflows - these are battle-tested tools with automatic retries and easy observability. 
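The pattern in miniature, using an in-process queue as a stand-in for Celery or SQS (the task shape and the fake LLM delay are invented for illustration):

```python
import queue
import threading
import time

tasks: queue.Queue = queue.Queue()
results: dict[str, str] = {}

def submit(task_id: str, prompt: str) -> str:
    # The API handler: enqueue the job and return immediately,
    # so the web request never waits on a slow LLM call.
    tasks.put((task_id, prompt))
    return task_id  # client polls (or gets a webhook) for the result

def worker():
    # Runs outside the request cycle, picking jobs off the queue.
    while True:
        task_id, prompt = tasks.get()
        time.sleep(0.01)  # stand-in for a multi-second LLM call
        results[task_id] = f"answer to: {prompt}"
        tasks.task_done()

threading.Thread(target=worker, daemon=True).start()
submit("job-1", "summarise this document")
tasks.join()  # here only so the demo waits; real clients poll instead
```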
</p><h2>Stage 5: Observability and Testing</h2><p>You can&#8217;t improve what you can&#8217;t see.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!q1zO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82061951-5b30-4a0a-812d-4be0303b69ec_1736x994.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!q1zO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82061951-5b30-4a0a-812d-4be0303b69ec_1736x994.png 424w, https://substackcdn.com/image/fetch/$s_!q1zO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82061951-5b30-4a0a-812d-4be0303b69ec_1736x994.png 848w, https://substackcdn.com/image/fetch/$s_!q1zO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82061951-5b30-4a0a-812d-4be0303b69ec_1736x994.png 1272w, https://substackcdn.com/image/fetch/$s_!q1zO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82061951-5b30-4a0a-812d-4be0303b69ec_1736x994.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!q1zO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82061951-5b30-4a0a-812d-4be0303b69ec_1736x994.png" width="1456" height="834" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/82061951-5b30-4a0a-812d-4be0303b69ec_1736x994.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:834,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:334024,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/183045892?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82061951-5b30-4a0a-812d-4be0303b69ec_1736x994.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!q1zO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82061951-5b30-4a0a-812d-4be0303b69ec_1736x994.png 424w, https://substackcdn.com/image/fetch/$s_!q1zO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82061951-5b30-4a0a-812d-4be0303b69ec_1736x994.png 848w, https://substackcdn.com/image/fetch/$s_!q1zO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82061951-5b30-4a0a-812d-4be0303b69ec_1736x994.png 1272w, https://substackcdn.com/image/fetch/$s_!q1zO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82061951-5b30-4a0a-812d-4be0303b69ec_1736x994.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Every LLM call in your application should be traced. Capture the inputs, the outputs, latency, token usage, and cost. This isn&#8217;t optional for production systems. You need this data to debug issues, understand costs, and improve quality over time.</p><p>Tools like <a href="https://langfuse.com/">Langfuse</a> and <a href="https://www.braintrust.dev/">Braintrust</a> make this easier. The important thing is visibility.</p><p>Testing LLM applications is different from traditional software. Outputs are non-deterministic: same prompt, different responses. A &#8220;good output&#8221; is often subjective. Your regular code should have normal unit tests with mocked LLM responses. 
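Mocking the LLM keeps those tests fast and deterministic; a sketch (the `summarise` app function and the fake client are invented for illustration):

```python
def summarise(client, text: str) -> str:
    # App code under test: prompt construction + post-processing.
    reply = client.complete(f"Summarise in one line: {text}")
    return reply.strip().rstrip(".")

class FakeClient:
    # Stands in for the real SDK client, so no network calls in tests.
    def complete(self, prompt: str) -> str:
        assert "Summarise" in prompt  # the prompt was built correctly
        return "  A short summary.  "

def test_summarise():
    # The deterministic parts (prompting, cleanup) get normal unit tests.
    assert summarise(FakeClient(), "long report") == "A short summary"

test_summarise()
```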
For LLM behaviour itself, you need evaluation.</p><p>Build a dataset of test cases with inputs and some notion of good outputs (ask yourself: can you define what a good output looks like?). Run your application against these regularly. Check that outputs contain required fields, or use another LLM to judge quality. Run evals on every significant change to catch regressions.</p><p>Evals are the only way to confidently modify prompts and logic without breaking things.</p><h2>Stage 6: Deployment</h2><p>Your application isn&#8217;t done until it&#8217;s running reliably for real users.</p><p>For hosting, you have options. Docker + VPS. Platform-as-a-Service providers like <a href="https://render.com/">Render</a> let you deploy without thinking about infrastructure: push your code and it runs. For more control, managed container services like AWS ECS or Google Cloud Run give you automatic scaling and health checks without managing servers directly.</p><p>Cloud infrastructure is no fun, but it&#8217;s often necessary in the real world. Keep things simple if you can. A PaaS that handles the boring stuff lets you focus on your actual application. Only move to more complex setups when you&#8217;ve genuinely outgrown the simple option.</p><p>Set up proper CI/CD so deployments are automated. Make it easy to roll back when something goes wrong. Have logs you can search, traces to see what agents are actually doing, dashboards that show what&#8217;s happening, and alerts for the important stuff.</p><h2>Stage 7: Security and Compliance</h2><p>Security is no longer an afterthought: it&#8217;s often the reason AI projects get killed.</p><p><strong>Data privacy comes first.</strong> Don&#8217;t send customer data to third-party providers like OpenAI unless you know what you&#8217;re doing. Understand your data processing agreements. Know where your data is stored and who can access it. 
Many promising demos become liabilities the moment real customer information flows through them.</p><p>LLM applications have unique attack surfaces:</p><p><strong>Prompt injection</strong> is when malicious inputs try to override your system instructions and make the model do something unintended.</p><p><strong>Data leakage</strong> happens when your model accidentally reveals sensitive information from its context.</p><p>Practical tools to protect your applications:</p><ul><li><p><a href="https://github.com/protectai/llm-guard">LLM Guard</a> - scans inputs and outputs for prompt injections, toxic content, and data leakage.</p></li><li><p><a href="https://github.com/leondz/garak">Garak</a> - command-line vulnerability scanner for LLMs from NVIDIA. Red-team your own application before someone else does.</p></li></ul><p><strong>Human-in-the-loop (HITL)</strong> isn&#8217;t just a UX pattern: it&#8217;s a security architecture. For high-stakes actions, require human approval before the system executes. This catches both model errors and successful attacks.</p><p><strong>A note on local inference:</strong> Tools like Ollama let you run models locally, keeping data off third-party servers entirely. This sounds appealing for privacy, but running your own inference properly is a full-time job. You&#8217;re now responsible for hardware, scaling, model updates, and performance optimisation. Use this as a last resort when compliance requirements leave no alternative: not as a default choice.</p><p>The pattern is straightforward: validate inputs before they reach your model, monitor outputs for sensitive data, require human approval for high-stakes actions, and log everything. 
This isn&#8217;t different from securing any other application: it&#8217;s just that the attack vectors are newer.</p><div><hr></div><h2>The Path Forward</h2><p>Here&#8217;s what to remember:</p><ul><li><p><strong>You&#8217;re a software engineer first.</strong> The fundamentals matter more than the frameworks.</p></li><li><p><strong>Learn the APIs directly.</strong> Understand what&#8217;s underneath before reaching for abstractions.</p></li><li><p><strong>Start simple.</strong> Boring patterns win. Frameworks aren&#8217;t necessary. </p></li><li><p><strong>Think about production early.</strong> Where you deploy matters. So does security.</p></li><li><p><strong>Build things.</strong> A GitHub full of working projects beats any credential.</p></li></ul><p>The path to AI Engineering is simpler than it looks. You don&#8217;t need every framework. You don&#8217;t need to chase every new tool. You need solid engineering fundamentals, direct experience with LLM APIs, and the judgment to keep things as simple as possible.</p><p>The demand for people who can build reliable AI applications far exceeds the supply. If you put in the work to develop genuine skills, you&#8217;ll find plenty of opportunities.</p><p>Now go build something.</p><div><hr></div><p>Thanks for reading. Have an awesome week : )</p><p>P.S. 
If you want to go deeper on building AI systems, I run a community where we build agents hands-on: <a href="https://skool.com/aiengineer">https://skool.com/aiengineer</a></p>]]></content:encoded></item><item><title><![CDATA[The 10x skill for AI engineers in 2026: agent feedback loops]]></title><description><![CDATA[How to give AI coding agents the feedback they need]]></description><link>https://newsletter.owainlewis.com/p/the-10x-skill-for-ai-engineers-in</link><guid isPermaLink="false">https://newsletter.owainlewis.com/p/the-10x-skill-for-ai-engineers-in</guid><dc:creator><![CDATA[Owain Lewis]]></dc:creator><pubDate>Sun, 28 Dec 2025 15:09:05 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/8b8c14a1-b58e-4f0d-b1a2-7716069e209a_1200x627.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there &#128075;,</p><p>Here&#8217;s a truth: no one can build software without feedback.</p><p>Engineers don&#8217;t one-shot code. We make mistakes on the first try. Syntax errors, wrong variable names, off-by-one bugs. But we run the code and use that feedback to self-correct. Red squiggles in the IDE. Stack traces in the terminal. Failing tests. We fix, run again, iterate until it works.</p><p>This is so fundamental we take it for granted. <strong>The feedback loop </strong><em><strong>is</strong></em><strong> the process</strong>.</p><p>Agents are the same way.</p><p>Right now, most engineers treat agents like they should do something we&#8217;ve never done: write working code on the first try without running it. When the agent fails (code doesn&#8217;t work), we call it &#8220;hallucination.&#8221; But imagine trying to write code without having the ability to run tests or run the app to verify it&#8217;s working.</p><p>It&#8217;s not a reasoning problem. 
It&#8217;s a visibility problem.</p><h2>The Problem: You&#8217;ve Become the Loop</h2><p>Watch what happens in most agent workflows:</p><p><strong>The Manual Loop (Slow):</strong> Agent &#8594; You &#8594; Terminal &#8594; You &#8594; Copy/Paste &#8594; Agent</p><p>You&#8217;re the feedback loop. The agent generates code in seconds, but you take minutes to close each iteration.</p><p>You&#8217;ve become the slowest part of the system.</p><p><strong>The Closed Loop (Fast):</strong> Agent &#8596; Terminal</p><h2>Agents Are Brilliant But Blind</h2><p>Here&#8217;s the mental model that changed how I work with agents:</p><p><strong>Before every task, ask: what can my agent actually see?</strong></p><p>Each session starts fresh. No memory of your codebase. No context from yesterday. The agent only knows what&#8217;s in its context window right now.</p><p>If it can&#8217;t see the error, it can&#8217;t fix the error. If it can&#8217;t see the test output, it doesn&#8217;t know something is broken. If it can&#8217;t see the logs, it can&#8217;t debug the integration.</p><p>There are three types of feedback agents need:</p><p><strong>Execution output.</strong> Stack traces with line numbers. The agent needs to run the code and see what happens.</p><p><strong>Test results.</strong> Specific assertion failures. &#8220;Expected 200, got 401&#8221; is actionable. &#8220;Tests failed&#8221; is noise.</p><p><strong>System logs.</strong> API responses, container logs. For anything with dependencies, the bug is often in the integration.</p><p>If you can&#8217;t debug it with the information available, neither can the agent.</p><h2>The Fix: CLAUDE.md</h2><p>Claude Code reads a file called <code>CLAUDE.md</code> from your project root at the start of every conversation. This is where you tell the agent to verify its own work.</p><p>The same concept can be found in other coding agents like OpenCode. </p><pre><code><code># Development Process

## After code changes:
1. Run `uv run pytest` - all tests must pass
2. Run `ruff check . --fix` - fix linting issues

## Rules:
- Do NOT ask me to run tests. Run them yourself.
- If tests fail, read the output, fix, re-run.
- Provide a summary including the files you've changed and test results</code></code></pre><p>This is the first step towards building a closed-loop workflow: the agent writes code, then verifies its own work. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hIBj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a364484-f9dc-4827-97bf-ddfdd117c5b8_1940x988.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hIBj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a364484-f9dc-4827-97bf-ddfdd117c5b8_1940x988.png 424w, https://substackcdn.com/image/fetch/$s_!hIBj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a364484-f9dc-4827-97bf-ddfdd117c5b8_1940x988.png 848w, https://substackcdn.com/image/fetch/$s_!hIBj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a364484-f9dc-4827-97bf-ddfdd117c5b8_1940x988.png 1272w, https://substackcdn.com/image/fetch/$s_!hIBj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a364484-f9dc-4827-97bf-ddfdd117c5b8_1940x988.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hIBj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a364484-f9dc-4827-97bf-ddfdd117c5b8_1940x988.png" width="1456" height="742" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1a364484-f9dc-4827-97bf-ddfdd117c5b8_1940x988.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:742,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:366422,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/182700811?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a364484-f9dc-4827-97bf-ddfdd117c5b8_1940x988.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hIBj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a364484-f9dc-4827-97bf-ddfdd117c5b8_1940x988.png 424w, https://substackcdn.com/image/fetch/$s_!hIBj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a364484-f9dc-4827-97bf-ddfdd117c5b8_1940x988.png 848w, https://substackcdn.com/image/fetch/$s_!hIBj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a364484-f9dc-4827-97bf-ddfdd117c5b8_1940x988.png 1272w, https://substackcdn.com/image/fetch/$s_!hIBj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a364484-f9dc-4827-97bf-ddfdd117c5b8_1940x988.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Commands for Heavy Workflows</h2><p>Tests run fast. But spinning up servers, running E2E suites, checking logs? Make those on-demand commands.</p><p>Create <code>.claude/commands/e2e.md</code>:</p><pre><code><code># End To End Test

Test the endpoint: $ARGUMENTS

1. Run `./scripts/e2e.sh $ARGUMENTS`
2. If it fails, read the logs, fix, test again.</code></code></pre><p>Keep heavy verification logic in shell scripts. The command just tells the agent what to run and what to do when it fails.</p><h2>The Shift</h2><p>Every time you run tests and copy output back, start a server and paste the error, check logs to see what went wrong, you&#8217;re doing work the agent should do.</p><p>The agent can run commands. The agent can read output. The agent can iterate. You just have to tell it what &#8220;working&#8221; looks like.</p><p>Build feedback loops for your coding agents.</p><p>Thanks for reading.</p><p>Have an awesome week : )</p><p>P.S. If you want to go deeper on building professional AI systems, I run a community where we do this hands-on: <a href="https://www.skool.com/aiengineer/about">https://skool.com/aiengineer</a></p>]]></content:encoded></item><item><title><![CDATA[The AI design pattern playbook]]></title><description><![CDATA[A practical reference for every AI system you'll build]]></description><link>https://newsletter.owainlewis.com/p/the-ai-design-pattern-playbook</link><guid isPermaLink="false">https://newsletter.owainlewis.com/p/the-ai-design-pattern-playbook</guid><dc:creator><![CDATA[Owain Lewis]]></dc:creator><pubDate>Sun, 21 Dec 2025 13:12:55 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/5fad39df-b5b9-403f-80f1-c3e925476e4a_1200x627.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there &#128075;,</p><p>When it comes to designing AI systems, it helps to have a high-level view of what patterns are available. You don&#8217;t want to reinvent the wheel every time you start a new project.</p><p>There are two typical ways to structure LLM applications: <strong>workflows</strong> and <strong>agents</strong>.</p><p>These aren&#8217;t mutually exclusive. A workflow is a graph of steps you define upfront (a &#8594; b &#8594; c). You control what happens and in what order. 
An agent is where the LLM controls the flow. It decides what steps to take based on results.</p><p>Agents are powerful when you don&#8217;t know ahead of time what work needs to be done. Deep research is an example - you can&#8217;t predict what searches you&#8217;ll need or what rabbit holes matter until you start exploring. </p><p>Before we dive in, one principle worth internalising: LLM calls are expensive. Not just in cost, but in latency. Every call you add is another round trip, another few seconds of wait time. Everything in this guide is a trade-off between capability and speed. Always ask: do I really need another LLM call here, or can code handle it?</p><p>This is a reference for the patterns I find most useful.</p><h2>Start Here: The Single LLM Call</h2><p>This sounds obvious, but it&#8217;s worth stating: you can go a long way with a single, well-crafted LLM call.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8jAR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d459fc7-8d27-416b-a571-e3348338a588_411x70.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8jAR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d459fc7-8d27-416b-a571-e3348338a588_411x70.png 424w, https://substackcdn.com/image/fetch/$s_!8jAR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d459fc7-8d27-416b-a571-e3348338a588_411x70.png 848w, https://substackcdn.com/image/fetch/$s_!8jAR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d459fc7-8d27-416b-a571-e3348338a588_411x70.png 1272w, 
https://substackcdn.com/image/fetch/$s_!8jAR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d459fc7-8d27-416b-a571-e3348338a588_411x70.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8jAR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d459fc7-8d27-416b-a571-e3348338a588_411x70.png" width="411" height="70" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8d459fc7-8d27-416b-a571-e3348338a588_411x70.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:70,&quot;width&quot;:411,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:4167,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/182225524?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d459fc7-8d27-416b-a571-e3348338a588_411x70.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8jAR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d459fc7-8d27-416b-a571-e3348338a588_411x70.png 424w, https://substackcdn.com/image/fetch/$s_!8jAR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d459fc7-8d27-416b-a571-e3348338a588_411x70.png 848w, https://substackcdn.com/image/fetch/$s_!8jAR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d459fc7-8d27-416b-a571-e3348338a588_411x70.png 1272w, 
https://substackcdn.com/image/fetch/$s_!8jAR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d459fc7-8d27-416b-a571-e3348338a588_411x70.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>CV parsing. Email drafting. Structured data extraction. Classification. Summarisation. A single call with a good prompt handles all of these.</p><p>Before reaching for frameworks or agents, ask yourself: can one prompt do the job? Often the answer is yes. And when it is, you get the fastest possible response time and the lowest possible cost.</p><p>The rest of this guide is for when a single call isn&#8217;t enough.</p><h2>Workflow Patterns</h2><p>You define the graph. The LLM is one step among many - composed with code, API calls, database operations.</p><h3>1. Chain</h3><p>Sequential steps where each builds on the previous.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0fZ7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9be6f113-ea19-4f80-bdac-3ac509f74fbc_784x64.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0fZ7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9be6f113-ea19-4f80-bdac-3ac509f74fbc_784x64.png 424w, https://substackcdn.com/image/fetch/$s_!0fZ7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9be6f113-ea19-4f80-bdac-3ac509f74fbc_784x64.png 848w, 
https://substackcdn.com/image/fetch/$s_!0fZ7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9be6f113-ea19-4f80-bdac-3ac509f74fbc_784x64.png 1272w, https://substackcdn.com/image/fetch/$s_!0fZ7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9be6f113-ea19-4f80-bdac-3ac509f74fbc_784x64.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0fZ7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9be6f113-ea19-4f80-bdac-3ac509f74fbc_784x64.png" width="784" height="64" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9be6f113-ea19-4f80-bdac-3ac509f74fbc_784x64.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:64,&quot;width&quot;:784,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:8121,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/182225524?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9be6f113-ea19-4f80-bdac-3ac509f74fbc_784x64.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0fZ7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9be6f113-ea19-4f80-bdac-3ac509f74fbc_784x64.png 424w, https://substackcdn.com/image/fetch/$s_!0fZ7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9be6f113-ea19-4f80-bdac-3ac509f74fbc_784x64.png 848w, 
https://substackcdn.com/image/fetch/$s_!0fZ7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9be6f113-ea19-4f80-bdac-3ac509f74fbc_784x64.png 1272w, https://substackcdn.com/image/fetch/$s_!0fZ7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9be6f113-ea19-4f80-bdac-3ac509f74fbc_784x64.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>This is the most common pattern. Do a, then b, then c. </p><p>Not every step needs to be an LLM call. You can mix LLM calls with deterministic code. Every LLM call you can replace with deterministic code is latency saved.</p><h3>2. Parallel</h3><p>Run independent operations simultaneously.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7-EP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3af7e09a-00a3-41c5-9241-14b1c4d61bfe_700x278.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7-EP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3af7e09a-00a3-41c5-9241-14b1c4d61bfe_700x278.png 424w, https://substackcdn.com/image/fetch/$s_!7-EP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3af7e09a-00a3-41c5-9241-14b1c4d61bfe_700x278.png 848w, https://substackcdn.com/image/fetch/$s_!7-EP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3af7e09a-00a3-41c5-9241-14b1c4d61bfe_700x278.png 1272w, 
https://substackcdn.com/image/fetch/$s_!7-EP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3af7e09a-00a3-41c5-9241-14b1c4d61bfe_700x278.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7-EP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3af7e09a-00a3-41c5-9241-14b1c4d61bfe_700x278.png" width="700" height="278" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3af7e09a-00a3-41c5-9241-14b1c4d61bfe_700x278.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:278,&quot;width&quot;:700,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:14414,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/182225524?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3af7e09a-00a3-41c5-9241-14b1c4d61bfe_700x278.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7-EP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3af7e09a-00a3-41c5-9241-14b1c4d61bfe_700x278.png 424w, https://substackcdn.com/image/fetch/$s_!7-EP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3af7e09a-00a3-41c5-9241-14b1c4d61bfe_700x278.png 848w, https://substackcdn.com/image/fetch/$s_!7-EP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3af7e09a-00a3-41c5-9241-14b1c4d61bfe_700x278.png 
1272w, https://substackcdn.com/image/fetch/$s_!7-EP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3af7e09a-00a3-41c5-9241-14b1c4d61bfe_700x278.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Image generation is the classic example because it&#8217;s notoriously slow and each image generation can  be parallelised. </p><p>If the operations can be done in parallel, it&#8217;s a no brainer. Use a semaphore to avoid hitting rate limits. </p><pre><code>sem = asyncio.Semaphore(5)

async def generate_image(prompt: str) -&gt; bytes:
    async with sem:
        return await image_model(prompt)

images = await asyncio.gather(
    generate_image("generate an image of ..."),
    generate_image("generate an image of ..."),
    generate_image("generate an image of ..."),
    ...
)</code></pre><h3>3. Route</h3><p>This is my favourite pattern. Classify first, then dispatch to specialised handlers.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QrPR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04709a63-88be-4c0c-8b04-053e869cddfb_784x423.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QrPR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04709a63-88be-4c0c-8b04-053e869cddfb_784x423.png 424w, https://substackcdn.com/image/fetch/$s_!QrPR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04709a63-88be-4c0c-8b04-053e869cddfb_784x423.png 848w, https://substackcdn.com/image/fetch/$s_!QrPR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04709a63-88be-4c0c-8b04-053e869cddfb_784x423.png 1272w, https://substackcdn.com/image/fetch/$s_!QrPR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04709a63-88be-4c0c-8b04-053e869cddfb_784x423.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QrPR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04709a63-88be-4c0c-8b04-053e869cddfb_784x423.png" width="784" height="423" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/04709a63-88be-4c0c-8b04-053e869cddfb_784x423.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:423,&quot;width&quot;:784,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:22581,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/182225524?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04709a63-88be-4c0c-8b04-053e869cddfb_784x423.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QrPR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04709a63-88be-4c0c-8b04-053e869cddfb_784x423.png 424w, https://substackcdn.com/image/fetch/$s_!QrPR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04709a63-88be-4c0c-8b04-053e869cddfb_784x423.png 848w, https://substackcdn.com/image/fetch/$s_!QrPR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04709a63-88be-4c0c-8b04-053e869cddfb_784x423.png 1272w, https://substackcdn.com/image/fetch/$s_!QrPR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04709a63-88be-4c0c-8b04-053e869cddfb_784x423.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Different inputs need different treatment. Billing questions need account context. Technical questions need documentation RAG. Sales inquiries need a human.</p><p>One prompt can&#8217;t handle all of this well. Classify first, then route to the appropriate subsystem. Each branch can have its own prompts, tools, even models.</p><p>Use a fast, cheap model (or code) for classification. Save the expensive model for the actual work.</p><h3>4. 
Map-Reduce</h3><p>Process many items, then synthesise.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EAtI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25f9009e-c7d2-4a3c-9828-492eac8d2fab_784x221.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EAtI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25f9009e-c7d2-4a3c-9828-492eac8d2fab_784x221.png 424w, https://substackcdn.com/image/fetch/$s_!EAtI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25f9009e-c7d2-4a3c-9828-492eac8d2fab_784x221.png 848w, https://substackcdn.com/image/fetch/$s_!EAtI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25f9009e-c7d2-4a3c-9828-492eac8d2fab_784x221.png 1272w, https://substackcdn.com/image/fetch/$s_!EAtI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25f9009e-c7d2-4a3c-9828-492eac8d2fab_784x221.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EAtI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25f9009e-c7d2-4a3c-9828-492eac8d2fab_784x221.png" width="784" height="221" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/25f9009e-c7d2-4a3c-9828-492eac8d2fab_784x221.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:221,&quot;width&quot;:784,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:12723,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/182225524?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25f9009e-c7d2-4a3c-9828-492eac8d2fab_784x221.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EAtI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25f9009e-c7d2-4a3c-9828-492eac8d2fab_784x221.png 424w, https://substackcdn.com/image/fetch/$s_!EAtI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25f9009e-c7d2-4a3c-9828-492eac8d2fab_784x221.png 848w, https://substackcdn.com/image/fetch/$s_!EAtI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25f9009e-c7d2-4a3c-9828-492eac8d2fab_784x221.png 1272w, https://substackcdn.com/image/fetch/$s_!EAtI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25f9009e-c7d2-4a3c-9828-492eac8d2fab_784x221.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Due diligence across 50 contracts. Research synthesis across 20 papers. Log analysis across gigabytes of files. 
No way any of this fits in one context window.</p><p>The map phase splits out the work. The reduce phase is where you lose information, so be deliberate about it. For critical details, keep structured data rather than summarising to prose too early.</p><h3>5. Orchestrator-Workers</h3><p>Use this when you can&#8217;t predict the subtasks upfront but still want workflow-level control. This is similar to map-reduce, but with an LLM dynamically making a plan.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0LIL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd4c275f-40bf-4251-8669-99a709cbe29b_784x266.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0LIL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd4c275f-40bf-4251-8669-99a709cbe29b_784x266.png 424w, https://substackcdn.com/image/fetch/$s_!0LIL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd4c275f-40bf-4251-8669-99a709cbe29b_784x266.png 848w, https://substackcdn.com/image/fetch/$s_!0LIL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd4c275f-40bf-4251-8669-99a709cbe29b_784x266.png 1272w, https://substackcdn.com/image/fetch/$s_!0LIL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd4c275f-40bf-4251-8669-99a709cbe29b_784x266.png 1456w" sizes="100vw"><img
src="https://substackcdn.com/image/fetch/$s_!0LIL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd4c275f-40bf-4251-8669-99a709cbe29b_784x266.png" width="784" height="266" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fd4c275f-40bf-4251-8669-99a709cbe29b_784x266.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:266,&quot;width&quot;:784,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:15255,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/182225524?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd4c275f-40bf-4251-8669-99a709cbe29b_784x266.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0LIL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd4c275f-40bf-4251-8669-99a709cbe29b_784x266.png 424w, https://substackcdn.com/image/fetch/$s_!0LIL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd4c275f-40bf-4251-8669-99a709cbe29b_784x266.png 848w, https://substackcdn.com/image/fetch/$s_!0LIL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd4c275f-40bf-4251-8669-99a709cbe29b_784x266.png 1272w, https://substackcdn.com/image/fetch/$s_!0LIL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd4c275f-40bf-4251-8669-99a709cbe29b_784x266.png 1456w" sizes="100vw" 
loading="lazy"></picture><div></div></div></a></figure></div><p>Example: &#8220;Add authentication to the app.&#8221; Which files need to change? You don&#8217;t know until you analyse the codebase.</p><p>The orchestrator examines the task and spawns workers (sub-agents) dynamically. This is like the parallel pattern, but the LLM decides at runtime what workers are needed and what each should do.</p><h3>6. Evaluate-Refine</h3><p>Generate, check, improve. Loop until good enough. Most of us do this manually when working with LLMs. We ask a question. Ask for improvements. 
Keep iterating until done.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Bl1O!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4377b42-69bf-4911-9363-01345c94e3e0_503x200.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Bl1O!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4377b42-69bf-4911-9363-01345c94e3e0_503x200.png 424w, https://substackcdn.com/image/fetch/$s_!Bl1O!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4377b42-69bf-4911-9363-01345c94e3e0_503x200.png 848w, https://substackcdn.com/image/fetch/$s_!Bl1O!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4377b42-69bf-4911-9363-01345c94e3e0_503x200.png 1272w, https://substackcdn.com/image/fetch/$s_!Bl1O!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4377b42-69bf-4911-9363-01345c94e3e0_503x200.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Bl1O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4377b42-69bf-4911-9363-01345c94e3e0_503x200.png" width="503" height="200" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c4377b42-69bf-4911-9363-01345c94e3e0_503x200.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:200,&quot;width&quot;:503,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:9080,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/182225524?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4377b42-69bf-4911-9363-01345c94e3e0_503x200.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Bl1O!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4377b42-69bf-4911-9363-01345c94e3e0_503x200.png 424w, https://substackcdn.com/image/fetch/$s_!Bl1O!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4377b42-69bf-4911-9363-01345c94e3e0_503x200.png 848w, https://substackcdn.com/image/fetch/$s_!Bl1O!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4377b42-69bf-4911-9363-01345c94e3e0_503x200.png 1272w, https://substackcdn.com/image/fetch/$s_!Bl1O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4377b42-69bf-4911-9363-01345c94e3e0_503x200.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Example: generate a blog post, get feedback from an LLM, improve it based on that feedback.</p><p>The evaluator doesn&#8217;t have to be an LLM. Code checks are often better. Run the tests. 
Validate the schema. Lint the output. Deterministic evaluation is faster, cheaper, and more reliable.</p><h3>7. Fallback</h3><p>Try cheap first. Escalate when needed.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5Lg6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef94636d-94e0-49cf-bd6c-533d019665d6_784x124.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5Lg6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef94636d-94e0-49cf-bd6c-533d019665d6_784x124.png 424w, https://substackcdn.com/image/fetch/$s_!5Lg6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef94636d-94e0-49cf-bd6c-533d019665d6_784x124.png 848w, https://substackcdn.com/image/fetch/$s_!5Lg6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef94636d-94e0-49cf-bd6c-533d019665d6_784x124.png 1272w, https://substackcdn.com/image/fetch/$s_!5Lg6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef94636d-94e0-49cf-bd6c-533d019665d6_784x124.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5Lg6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef94636d-94e0-49cf-bd6c-533d019665d6_784x124.png" width="784" height="124" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ef94636d-94e0-49cf-bd6c-533d019665d6_784x124.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:124,&quot;width&quot;:784,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:9625,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/182225524?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef94636d-94e0-49cf-bd6c-533d019665d6_784x124.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5Lg6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef94636d-94e0-49cf-bd6c-533d019665d6_784x124.png 424w, https://substackcdn.com/image/fetch/$s_!5Lg6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef94636d-94e0-49cf-bd6c-533d019665d6_784x124.png 848w, https://substackcdn.com/image/fetch/$s_!5Lg6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef94636d-94e0-49cf-bd6c-533d019665d6_784x124.png 1272w, https://substackcdn.com/image/fetch/$s_!5Lg6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef94636d-94e0-49cf-bd6c-533d019665d6_784x124.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Most requests are straightforward. &#8220;What are your opening hours?&#8221; doesn&#8217;t need a frontier model.</p><p>Route everything through a fast model first. 
When confidence is low, escalate to something more powerful. This can cut costs dramatically while maintaining quality where it matters.</p><p>The trick is reliable confidence detection. </p><h2>Agent Patterns</h2><p>Unlike a workflow, an agent makes decisions about control flow. The LLM decides what to do next. You provide instructions and tools, and set the boundaries. The agent decides what tools to call. </p><h3>8. Tool Loop</h3><p>The core agent pattern. Call tools until the task is done.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2qKa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F554bff61-b016-4633-87f0-4df66f0084ca_517x180.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2qKa!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F554bff61-b016-4633-87f0-4df66f0084ca_517x180.png 424w, https://substackcdn.com/image/fetch/$s_!2qKa!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F554bff61-b016-4633-87f0-4df66f0084ca_517x180.png 848w, https://substackcdn.com/image/fetch/$s_!2qKa!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F554bff61-b016-4633-87f0-4df66f0084ca_517x180.png 1272w, https://substackcdn.com/image/fetch/$s_!2qKa!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F554bff61-b016-4633-87f0-4df66f0084ca_517x180.png 1456w" sizes="100vw"><img
src="https://substackcdn.com/image/fetch/$s_!2qKa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F554bff61-b016-4633-87f0-4df66f0084ca_517x180.png" width="517" height="180" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/554bff61-b016-4633-87f0-4df66f0084ca_517x180.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:180,&quot;width&quot;:517,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:8720,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/182225524?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F554bff61-b016-4633-87f0-4df66f0084ca_517x180.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2qKa!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F554bff61-b016-4633-87f0-4df66f0084ca_517x180.png 424w, https://substackcdn.com/image/fetch/$s_!2qKa!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F554bff61-b016-4633-87f0-4df66f0084ca_517x180.png 848w, https://substackcdn.com/image/fetch/$s_!2qKa!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F554bff61-b016-4633-87f0-4df66f0084ca_517x180.png 1272w, https://substackcdn.com/image/fetch/$s_!2qKa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F554bff61-b016-4633-87f0-4df66f0084ca_517x180.png 1456w" sizes="100vw" 
loading="lazy"></picture><div></div></div></a></figure></div><p>Claude Code is a good example. Read file, make change, run tests, see error, fix error, run tests again. The model decides what to do based on what it observes.</p><p>The critical mechanism is feedback. Model writes buggy code - sees the stack trace - fixes it. API returns an error - model adjusts parameters. Without feeding errors back to the model, agents can&#8217;t easily self-correct.</p><p>The obvious downside of agents: you give up control and predictability in exchange for more power. </p><h3>9. Plan-Execute</h3><p>Separate thinking from doing.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6xTr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9484b076-5e17-4d14-a538-b6308d77c421_784x129.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6xTr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9484b076-5e17-4d14-a538-b6308d77c421_784x129.png 424w, https://substackcdn.com/image/fetch/$s_!6xTr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9484b076-5e17-4d14-a538-b6308d77c421_784x129.png 848w, https://substackcdn.com/image/fetch/$s_!6xTr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9484b076-5e17-4d14-a538-b6308d77c421_784x129.png 1272w, https://substackcdn.com/image/fetch/$s_!6xTr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9484b076-5e17-4d14-a538-b6308d77c421_784x129.png 1456w" sizes="100vw"><img
src="https://substackcdn.com/image/fetch/$s_!6xTr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9484b076-5e17-4d14-a538-b6308d77c421_784x129.png" width="784" height="129" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9484b076-5e17-4d14-a538-b6308d77c421_784x129.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:129,&quot;width&quot;:784,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:10256,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/182225524?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9484b076-5e17-4d14-a538-b6308d77c421_784x129.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6xTr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9484b076-5e17-4d14-a538-b6308d77c421_784x129.png 424w, https://substackcdn.com/image/fetch/$s_!6xTr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9484b076-5e17-4d14-a538-b6308d77c421_784x129.png 848w, https://substackcdn.com/image/fetch/$s_!6xTr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9484b076-5e17-4d14-a538-b6308d77c421_784x129.png 1272w, https://substackcdn.com/image/fetch/$s_!6xTr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9484b076-5e17-4d14-a538-b6308d77c421_784x129.png 1456w" sizes="100vw" 
loading="lazy"></picture><div></div></div></a></figure></div><p>Deep research is the canonical use case. You can&#8217;t know upfront what searches you&#8217;ll need, what sources matter, what rabbit holes are worth exploring. The agent plans an investigation, executes steps, and replans as it learns.</p><div><hr></div><h3>10. Human-in-the-Loop</h3><p>Pause for approval on high-stakes actions. For a simple terminal agent, this is a trivial if statement:</p><pre><code><code>for tool_call in response.tool_calls:
    if requires_approval(tool_call):
        approved = input(f"Execute {tool_call.name}? (y/n): ")
        if approved != "y":
            continue
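    # requires_approval() and execute() are assumed helpers, not a library
    # API. A minimal requires_approval might simply gate on a set of
    # sensitive tool names, e.g.:
    #   SENSITIVE = {"send_email", "issue_refund", "delete_data"}
    #   def requires_approval(tc): return tc.name in SENSITIVE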
    
    result = execute(tool_call)</code></code></pre><p>That&#8217;s really all it is. Before executing sensitive operations - sending emails, making payments, deleting data - ask first.</p><p>Auto-approve small refunds. Human review for large ones. Auto-send routine confirmations. Human review for anything sensitive.</p><p>This is how you build trust in a new system. Start with humans approving everything. Track what gets approved versus rejected. Identify patterns. Automate the safe categories. Keep humans on the edge cases.</p><p>Over time, the system earns more autonomy.</p><h2>Choosing Between Them</h2><p>Start with the simplest option: a single LLM call. Only add complexity when you have a clear reason.</p><p>Default to workflows over agents. Workflows are predictable, debuggable, and easier to reason about. You know exactly what will happen because you defined the graph.</p><p>Reach for agents when you don&#8217;t know the steps upfront. Research, coding, customer questions. The flexibility is worth the unpredictability.</p><p>Every pattern is a trade-off. The best AI systems aren&#8217;t the most sophisticated - they&#8217;re the simplest thing that solves the problem.</p><div class="poll-embed" data-attrs="{&quot;id&quot;:422326}" data-component-name="PollToDOM"></div><p>Thanks for reading. Have an awesome week : )</p><p>P.S. 
If you want to go deeper on building AI systems, I run a community where we build these patterns hands-on: <a href="https://skool.com/aiengineer">https://skool.com/aiengineer</a></p>]]></content:encoded></item><item><title><![CDATA[My 8 principles for agentic coding]]></title><description><![CDATA[The shift from AI coding to agentic coding is a shift in identity]]></description><link>https://newsletter.owainlewis.com/p/my-8-principles-for-agentic-coding</link><guid isPermaLink="false">https://newsletter.owainlewis.com/p/my-8-principles-for-agentic-coding</guid><dc:creator><![CDATA[Owain Lewis]]></dc:creator><pubDate>Sun, 14 Dec 2025 16:34:05 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/b2697405-e43a-437c-80ce-cc7d3439f836_1200x627.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey friend,</p><p>I&#8217;ve been coding for 20 years. I&#8217;ve written code in Scala, OCaml, Lisp, Ruby, Python, Java; I genuinely love the craft. But lately, I&#8217;ve had to rethink what that craft actually is.</p><p>AI coding was phase one - AI helps you write code faster, but you&#8217;re still driving every decision. Useful, but limited.</p><p>We&#8217;re in phase two now. Agents (like Claude Code) that plan, execute, test, deliver. You review what comes back. The job is moving up the stack: <em>building systems that write better code than you could write yourself</em>.</p><p>The shift from AI coding to agentic coding is a shift in identity.</p><p>You&#8217;re not a <em>developer</em> who uses AI tools. You&#8217;re an <em>engineer</em> who builds systems that build software.</p><p>Here are the 8 principles that finally made it click for me.</p><h2>1. Your Agent Is Capable But Contextless</h2><p>Agents can read codebases, edit files, run terminal commands. They&#8217;re genuinely capable.</p><p>But every session starts empty. No memory of your architecture. No understanding of your conventions. 
No awareness of what you tried yesterday.</p><p>Before getting frustrated that &#8220;AI sucks&#8221;, ask yourself: does the agent have everything it needs to succeed without me? If yes, let it run. If no, either provide the missing context or plan to stay in the loop.</p><h2>2. Plan First, Execute Second</h2><p>This is probably the biggest unlock: don&#8217;t ask the same agent to plan the work and do the work.</p><p>When you tell an agent to &#8220;build feature X,&#8221; it tends to rush. It makes assumptions. It starts coding immediately and wanders.</p><p>Instead, use a two-step process:</p><ol><li><p><strong>Plan:</strong> Ask an agent to analyze the problem and write a plan. Review it. Iterate until it meets your standards.</p></li><li><p><strong>Execute:</strong> Start a <strong>fresh</strong> session. The new agent reads the approved plan and executes with focus.</p></li></ol><p>The pattern: Plan &#8594; Review &#8594; Execute (new terminal session) &#8594; Ship.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1KsO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b17cdd8-b4bc-4a10-90be-13808c6067bc_1376x1100.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1KsO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b17cdd8-b4bc-4a10-90be-13808c6067bc_1376x1100.png 424w, https://substackcdn.com/image/fetch/$s_!1KsO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b17cdd8-b4bc-4a10-90be-13808c6067bc_1376x1100.png 848w, 
https://substackcdn.com/image/fetch/$s_!1KsO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b17cdd8-b4bc-4a10-90be-13808c6067bc_1376x1100.png 1272w, https://substackcdn.com/image/fetch/$s_!1KsO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b17cdd8-b4bc-4a10-90be-13808c6067bc_1376x1100.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1KsO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b17cdd8-b4bc-4a10-90be-13808c6067bc_1376x1100.png" width="1376" height="1100" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5b17cdd8-b4bc-4a10-90be-13808c6067bc_1376x1100.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1100,&quot;width&quot;:1376,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:186784,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/181577815?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b17cdd8-b4bc-4a10-90be-13808c6067bc_1376x1100.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1KsO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b17cdd8-b4bc-4a10-90be-13808c6067bc_1376x1100.png 424w, 
https://substackcdn.com/image/fetch/$s_!1KsO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b17cdd8-b4bc-4a10-90be-13808c6067bc_1376x1100.png 848w, https://substackcdn.com/image/fetch/$s_!1KsO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b17cdd8-b4bc-4a10-90be-13808c6067bc_1376x1100.png 1272w, https://substackcdn.com/image/fetch/$s_!1KsO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b17cdd8-b4bc-4a10-90be-13808c6067bc_1376x1100.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Here&#8217;s an example of a Claude Code command to generate a detailed plan from a high-level instruction. </p><pre><code>&gt; /plan create a new python project for a rag agent using postgresql and hybrid search.</code></pre><p>Iterate on the plan:</p><pre><code>&gt; Update the plan to use uv for package management. Use Gemini as the model. Skip frameworks and use the SDK directly. </code></pre><p>Match spec weight to task weight. Most tasks are small. The spec should be too.</p><div><hr></div><h2>3. Match Your Involvement to the Task</h2><p>Not every task needs the same level of attention.</p><ul><li><p><strong>Low Ambiguity:</strong> Writing a unit test, adding a standard endpoint, fixing a lint error. Hand these off completely and go get coffee.</p></li><li><p><strong>High Ambiguity:</strong> Designing an auth system, refactoring core abstractions, debugging race conditions. Keep yourself in the loop; treat it like pair programming.</p></li></ul><p>The goal isn&#8217;t maximum autonomy everywhere. It&#8217;s spending your attention where it matters.</p><h2>4. The SDLC Still Applies</h2><p>The software development lifecycle didn&#8217;t disappear. It just runs differently with agents. Plan, code, test, review, document: agents can handle all five phases if you set them up for it.</p><p>The mistake I see: skipping straight to code. When agents plan first, they execute better. When they test their own work, they self-correct. </p><div><hr></div><h2>5. Stack Your Leverage Points</h2><p>A few things multiply your agent&#8217;s effectiveness:</p><p>Context files. A README, a conventions doc, an AGENTS/CLAUDE.md. If it exists in a file, you don&#8217;t have to explain it in prompts.</p><p>Runnable tests. This is the highest leverage thing you can provide. 
If an agent can run tests and see green or red, it validates its own work without waiting for you.</p><p>Concrete plans. A good plan includes verification steps. When &#8220;done&#8221; is defined by a passing test, the agent knows when to stop.</p><p>Reusable workflows. Solve a process once, save it. Planning, testing, shipping: each becomes something you invoke rather than explain.</p><h2>6. Write for Agents, Not Humans</h2><p>Documentation for humans assumes shared context. Documentation for agents needs to be blunt and assume nothing. Provide commands to run. Don&#8217;t be vague. Define success criteria concretely.</p><h2>7. Encode Your Workflows</h2><p>If you&#8217;re typing a long, complex prompt more than twice, save it.</p><p>Here&#8217;s an example, the workflow I use for creating pull requests:</p><ol><li><p>Check the branch name, recent commits, and changed files</p></li><li><p>Think hard about what this change accomplishes and why</p></li><li><p>Write a PR title and concise body with summary, changes, and testing notes</p></li><li><p>Push the branch, run <code>gh pr create</code>, and return the URL</p></li></ol><p>I type one command, the agent handles everything. 
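</p><p>One way to save it: Claude Code lets you define custom slash commands as markdown files under <code>.claude/commands/</code>. Here&#8217;s a rough sketch of the workflow above as a command file (the file name and exact wording are my own, not from the original post):</p>

```markdown
<!-- .claude/commands/pr.md (invoked as /pr) -->
1. Check the current branch name, recent commits, and changed files.
2. Think hard about what this change accomplishes and why.
3. Write a PR title and a concise body with summary, changes, and testing notes.
4. Push the branch, run `gh pr create`, and return the PR URL.
```

<p>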
Five minutes to write, hundreds of uses afterward.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nVcR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4a3e68-88bb-4fda-8654-015425346065_2112x1066.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nVcR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4a3e68-88bb-4fda-8654-015425346065_2112x1066.png 424w, https://substackcdn.com/image/fetch/$s_!nVcR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4a3e68-88bb-4fda-8654-015425346065_2112x1066.png 848w, https://substackcdn.com/image/fetch/$s_!nVcR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4a3e68-88bb-4fda-8654-015425346065_2112x1066.png 1272w, https://substackcdn.com/image/fetch/$s_!nVcR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4a3e68-88bb-4fda-8654-015425346065_2112x1066.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nVcR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4a3e68-88bb-4fda-8654-015425346065_2112x1066.png" width="1456" height="735" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ec4a3e68-88bb-4fda-8654-015425346065_2112x1066.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:735,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:256626,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/181577815?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4a3e68-88bb-4fda-8654-015425346065_2112x1066.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nVcR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4a3e68-88bb-4fda-8654-015425346065_2112x1066.png 424w, https://substackcdn.com/image/fetch/$s_!nVcR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4a3e68-88bb-4fda-8654-015425346065_2112x1066.png 848w, https://substackcdn.com/image/fetch/$s_!nVcR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4a3e68-88bb-4fda-8654-015425346065_2112x1066.png 1272w, https://substackcdn.com/image/fetch/$s_!nVcR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4a3e68-88bb-4fda-8654-015425346065_2112x1066.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Pair a planning workflow with an execution workflow and you have a system: Plan &#8594; Build &#8594; Ship.</p><h2>8. Measure Your Progress</h2><p>How do you know you&#8217;re getting better at this?</p><ul><li><p><strong>Longer autonomous runs:</strong> The agent works for 10 minutes without asking a clarifying question.</p></li><li><p><strong>Fewer iteration cycles:</strong> Tasks complete in 1-2 rounds instead of 5-6.</p></li><li><p><strong>Higher first-try success:</strong> Tests pass without manual fixes more often.</p></li></ul><p>When the agent gets stuck, don&#8217;t just fix the code-fix the workflow so it won&#8217;t get stuck there next time. That&#8217;s how you improve a system. </p><p>Thanks for reading. 
</p><div class="poll-embed" data-attrs="{&quot;id&quot;:419336}" data-component-name="PollToDOM"></div><p></p>]]></content:encoded></item><item><title><![CDATA[AI code has no taste]]></title><description><![CDATA[The shift from writing code to building systems that write code. How to stop optimising for speed and start doing impossible work.]]></description><link>https://newsletter.owainlewis.com/p/ai-code-has-no-taste</link><guid isPermaLink="false">https://newsletter.owainlewis.com/p/ai-code-has-no-taste</guid><dc:creator><![CDATA[Owain Lewis]]></dc:creator><pubDate>Thu, 04 Dec 2025 17:05:19 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/9a2ddd90-ae55-4379-9c3d-7da4f01c781b_1376x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey friend &#128075;,</p><p>AI code has no taste.</p><p>It doesn&#8217;t know why you avoided that dependency. Why you chose that abstraction. Why the naming convention matters. It produces functioning code with no opinion - code that passes review but that nobody wants to maintain.</p><p>Most people see this as AI&#8217;s limitation. I see it as an invitation.</p><p>Because AI code has no taste of its own. It&#8217;s waiting for yours.</p><p>The rules you&#8217;ve built up over years of engineering - the heuristics that live in your head during code review, the patterns you enforce without thinking - they can become explicit. They can become instructions or SOPs.</p><p>And once they do, something interesting happens.</p><p>Your taste scales.</p><h2>The Shift</h2><p>There&#8217;s a transition happening that most engineers haven&#8217;t named yet.</p><p>Most of us are still &#8220;in the loop.&#8221; Prompting back and forth. Reviewing every output. Tweaking, fixing, regenerating. The AI codes faster, but you&#8217;re still the bottleneck.</p><p>The shift is stepping out of the loop entirely.</p><p>You stop doing the work. 
You start building systems that do the work - better than you could do it yourself, at a scale you could never sustain.</p><p>Not &#8220;AI as assistant.&#8221; Not autocomplete on steroids. Something more fundamental: you encode your taste, your standards, your judgment into a system. Then you let it run.</p><p>The craft doesn&#8217;t disappear. It moves. Out of the code, into the rules that shape the code. Out of the output, into the system that produces the output.</p><p>This is a different job. And it requires an uncomfortable question.</p><h2>The Question Most People Aren&#8217;t Asking</h2><p>When I was managing software teams, I saw the same pattern constantly.</p><p>A team gets stuck on some tedious process. They brainstorm improvements. They shave 10% off the time. Everyone feels productive.</p><p>But that&#8217;s not strategy. That&#8217;s minor optimisation.</p><p>The real question was always simpler: &#8220;Why are we doing this work at all?&#8221;</p><p>I see the same thing with AI now. Most engineers ask: &#8220;How can AI help me do my old work faster?&#8221;</p><p>Fine question. But you&#8217;re still in the loop. Thinking small. The constraint isn&#8217;t the model. It&#8217;s not the tools. It&#8217;s you - still reviewing everything, thinking &#8220;only I can write this code well&#8221;, still the ceiling on what gets shipped.</p><p>The better question: <strong>How do I use AI to do previously impossible work - at a quality level that reflects or exceeds my standards?</strong></p><p>Not &#8220;code faster.&#8221; Build something that couldn&#8217;t exist without the system.</p><p>Let me make this concrete.</p><h2>What This Looks Like</h2><p>A few weeks ago I built a system that generates ambient electronic mixes - focus music for coding. It produces about 2 hours of original music daily, runs a 24/7 YouTube livestream, and keeps going without me.</p><p>I could not do this manually. 
Not &#8220;it would take a long time&#8221; - I literally could not sustain this output.</p><p>But I didn&#8217;t just prompt &#8220;generate lo-fi beats&#8221; and walk away. That would produce garbage.</p><p>I built filter chains to mix audio to my taste - EQ, compression, analog warmth. I listened to hundreds of outputs to tweak parameters. I encoded my standards into every part of the pipeline.</p><p>The craft didn&#8217;t go into each track. The craft went into the system.</p><p>Same principle applies to code. You don&#8217;t review every line the agent writes. You encode your standards into the rules it follows - architectural preferences, naming conventions, the stuff you&#8217;d flag in PR review. The agent becomes the system. Your taste becomes the instructions.</p><p>You build the system that builds the system.</p><p>So how do you know if you&#8217;ve actually done this, or just built another automation?</p><h2>Two Tests</h2><p><strong>The Impossible Test.</strong> Could a motivated human do this sustainably - without burning out or cutting corners?</p><p>Scheduling a cron job? Automation. An agent that monitors your on-call queue, root-causes incidents, and pushes a fix before you&#8217;ve opened your laptop? No human can do that.</p><p>Linting code? Automation. An agent that reviews every PR against your team&#8217;s architectural principles, catches subtle violations, and explains <em>why</em> it flagged them - across 50 PRs a day, without getting tired or sloppy? Impossible.</p><p><strong>The Craft Test.</strong> Does the output reflect the taste of whoever built it?</p><p>If a different person built this system, would the results be different? If the answer is no, there&#8217;s no craft encoded. Just a generic pipeline anyone could spin up.</p><p>Scale without craft produces slop. 
Craft without scale means you&#8217;re still in the loop, doing everything yourself.</p><p>Scale <em>plus</em> craft is the unlock.</p><p>But passing both tests once is easy. Keeping quality high over time - that&#8217;s the hard part.</p><h2>Closed Loops</h2><p>Every system drifts toward garbage. Entropy always wins unless you fight it.</p><p>When you&#8217;re coding manually, <em>you</em> are the feedback loop. You notice when something&#8217;s off. In a system that runs without you, that loop has to be built in.</p><p>Three components:</p><p><strong>The Generator</strong> produces output - the LLM, the agent, the pipeline.</p><p><strong>The Sensor</strong> measures quality (evals). Did tests pass? Does it follow conventions? Did latency spike?</p><p><strong>The Controller</strong> enforces your standards. It rejects bad outputs, adjusts parameters, and decides what ships and what gets thrown away.</p><p>The craft lives in the Controller. That&#8217;s where you encode &#8220;good enough&#8221; vs &#8220;not good enough.&#8221;</p><p>For code: the Sensor runs your test suite, reviews for coherence across the codebase, checks whether anything could be simplified, and flags needless complexity. The Controller decides whether to retry, refactor, or escalate to a human.</p><p>Without this loop, quality drifts. Always. This is where most &#8220;AI slop&#8221; comes from - not bad models, but missing feedback loops.</p><h2>Build This</h2><p>Want to try it? Start small.</p><p>Pick something repetitive - code review, documentation, test generation. Something you do weekly. Then see if you can encode it as a system that does the work without you. 
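</p><p>The Generator/Sensor/Controller loop can be sketched in a few lines of Python. Everything here is illustrative: in a real system <code>generate</code> would call an LLM and <code>sensor</code> would run your evals:</p>

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Result:
    output: str
    accepted: bool
    attempts: int

def run_loop(generate: Callable[[str], str],
             sensor: Callable[[str], bool],
             prompt: str,
             max_attempts: int = 3) -> Result:
    """Controller: regenerate until the sensor accepts, then ship;
    if we run out of attempts, stop and escalate to a human."""
    output = ""
    for attempt in range(1, max_attempts + 1):
        output = generate(prompt)   # Generator: the LLM / agent / pipeline
        if sensor(output):          # Sensor: tests, conventions, latency checks
            return Result(output, True, attempt)
    return Result(output, False, max_attempts)  # Rejected: a human decides

# Toy run: the sensor encodes one standard ("must name a root cause").
draft = run_loop(lambda p: f"Summary of {p}: root cause was a Redis timeout",
                 lambda out: "root cause" in out,
                 "incident-4711")
```

<p>The taste lives in <code>sensor</code> and in the retry/escalate policy; change those and the same skeleton ships very different work.</p><p>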
Before you build the pipeline:</p><ol><li><p><strong>Write down your standards.</strong> What makes output &#8220;good&#8221; vs &#8220;acceptable&#8221; vs &#8220;garbage&#8221;?</p></li><li><p><strong>Build a sensor.</strong> How will you measure whether output meets those standards automatically?</p></li><li><p><strong>Build a controller.</strong> What happens when it fails? Retry? Adjust? Flag for review?</p></li></ol><p>You&#8217;ll learn more about your own taste by trying to encode it than you ever did by just doing the work yourself.</p><h2>The New Job</h2><p>Here&#8217;s the thing most engineers haven&#8217;t internalised yet:</p><p>This isn&#8217;t a productivity hack. It&#8217;s a different job.</p><p>You stop doing the work. You start building systems that do the work - better than you could do it yourself, at a scale you could never sustain.</p><p>The engineers who figure this out first will build things the rest of us can&#8217;t compete with. Not because they&#8217;re smarter. Because they stepped out of the loop.</p><p>Different question. Different results.</p><p>Have an awesome week :)</p><div><hr></div><p>Want to build these systems alongside other engineers doing the same? 
I run a community where we work on real projects and share what&#8217;s actually working.</p><p>&#128073; <a href="https://skool.com/aiengineer">Join the AI Engineering Community</a></p>]]></content:encoded></item><item><title><![CDATA[4 context engineering strategies every AI engineer needs to know]]></title><description><![CDATA[The thing nobody explains about building AI agents.]]></description><link>https://newsletter.owainlewis.com/p/4-context-engineering-strategies</link><guid isPermaLink="false">https://newsletter.owainlewis.com/p/4-context-engineering-strategies</guid><dc:creator><![CDATA[Owain Lewis]]></dc:creator><pubDate>Thu, 27 Nov 2025 17:01:45 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/6b8f5ce0-efc5-4409-90a7-957415c3bc78_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey friend &#128075;,</p><p>A few months ago, I was building an AI agent to help engineers debug production issues. The idea was simple: pull logs from multiple sources, find patterns, and explain what went wrong.</p><blockquote><p>&#8220;Search the logs and tell me why this alert fired.&#8221;</p></blockquote><p>The agent would come back with something like:</p><blockquote><p>&#8220;At 14:32 UTC, the checkout service started returning 503 errors. The root cause was the Redis cache hitting memory limits. The issue self-resolved at 14:47.&#8221;</p></blockquote><p>Incredible, right?</p><p>Except it didn&#8217;t work.</p><p>The log data was massive and noisy. Within a few conversational turns, I&#8217;d maxed out the context window. The agent couldn&#8217;t keep all those log outputs in memory. It would start strong, then eventually fail or hallucinate.</p><p>The solution wasn&#8217;t to switch models or add more data. It was to rethink the context management strategy.</p><h2>What Is Context Engineering?</h2><p>When you talk to an AI model, it sees more than just your prompts. 
Your instructions, the conversation so far, tool call results, documents-all of it sits in this window together.</p><p>Andrej Karpathy has a useful mental model for this: the LLM is the CPU, and the context window is the RAM. It&#8217;s the model&#8217;s working memory. Everything has to fit there.</p><p>But it&#8217;s not just about overflow. Even before you hit the limit, models suffer from &#8220;context rot&#8221;-performance degrades as more tokens are added, even within the window size.</p><p>Think about finding one important note on a desk. Easy with 10 papers. Hard with 1,000. The note is still there-but good luck finding it.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UyL8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4eae328-3957-41c3-936a-e386e419f2a3_1766x1100.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UyL8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4eae328-3957-41c3-936a-e386e419f2a3_1766x1100.png 424w, https://substackcdn.com/image/fetch/$s_!UyL8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4eae328-3957-41c3-936a-e386e419f2a3_1766x1100.png 848w, https://substackcdn.com/image/fetch/$s_!UyL8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4eae328-3957-41c3-936a-e386e419f2a3_1766x1100.png 1272w, https://substackcdn.com/image/fetch/$s_!UyL8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4eae328-3957-41c3-936a-e386e419f2a3_1766x1100.png 1456w" 
sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UyL8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4eae328-3957-41c3-936a-e386e419f2a3_1766x1100.png" width="1456" height="907" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d4eae328-3957-41c3-936a-e386e419f2a3_1766x1100.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:907,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:206210,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/180013006?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4eae328-3957-41c3-936a-e386e419f2a3_1766x1100.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UyL8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4eae328-3957-41c3-936a-e386e419f2a3_1766x1100.png 424w, https://substackcdn.com/image/fetch/$s_!UyL8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4eae328-3957-41c3-936a-e386e419f2a3_1766x1100.png 848w, https://substackcdn.com/image/fetch/$s_!UyL8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4eae328-3957-41c3-936a-e386e419f2a3_1766x1100.png 1272w, 
https://substackcdn.com/image/fetch/$s_!UyL8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4eae328-3957-41c3-936a-e386e419f2a3_1766x1100.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Drew Breunig outlined four ways bad context breaks your agent:</p><ul><li><p><strong>Context Poisoning:</strong> A hallucination enters context and corrupts all future reasoning</p></li><li><p><strong>Context Distraction:</strong> Too much context overwhelms the model</p></li><li><p><strong>Context Confusion:</strong> Irrelevant information influences 
responses</p></li><li><p><strong>Context Clash:</strong> Different parts of the context contradict each other</p></li></ul><p>If your agent works at first then drifts later, one of these is usually why.</p><h2>Why This Matters For Agents</h2><p>Here&#8217;s the thing that makes this click: LLMs are stateless.</p><p>They don&#8217;t &#8220;remember&#8221; anything between calls. Every time you call the model, you pass in the entire conversation history via an API call.</p><ul><li><p>&#8594; User asks a question (20 tokens) </p></li><li><p>&#8594; Assistant decides to call a tool (50 tokens) </p></li><li><p>&#8594; Tool returns results (2,000 tokens) </p></li><li><p>&#8594; Assistant reasons about the results (100 tokens) </p></li><li><p>&#8594; ...repeat 50 times...</p></li></ul><p>Eventually, you&#8217;re passing hundreds of thousands of tokens just to generate the next sentence.</p><p>This is the context engineering problem.</p><div><hr></div><div class="poll-embed" data-attrs="{&quot;id&quot;:411450}" data-component-name="PollToDOM"></div><div><hr></div><h2>The Four Strategies</h2><p>So how do you actually manage context? There are four main strategies. We&#8217;ll use Claude Code (a terminal AI agent) as a reference because it uses all of these.</p><h3>1. Write (External Memory)</h3><p>Don&#8217;t keep everything in context. Have your agent write important stuff somewhere external.</p><p>Claude Code writes its plans to disk. It also uses a TodoWrite tool to persist task state. When debugging a complex issue across 15 files, instead of holding &#8220;fixed auth.ts, need to check db.ts, then run tests&#8221; in context, it writes each step to a structured todo list. The todos live outside the window-the agent references them when needed, not constantly.</p><p>Cursor and Windsurf use rules files. ChatGPT saves memories across sessions. Same idea: give your agent a write_to_scratch tool that writes findings and plans to a file. 
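</p><p>Here&#8217;s a minimal sketch of that idea in Python. The file name, function names, and schema below are illustrative, not any framework&#8217;s actual API:</p>

```python
from pathlib import Path

SCRATCH = Path("scratchpad.md")

def write_to_scratch(note: str) -> str:
    """Tool: append a finding or plan step to external memory on disk."""
    with SCRATCH.open("a", encoding="utf-8") as f:
        f.write(note.rstrip() + "\n")
    return f"saved to {SCRATCH}"

def read_scratch() -> str:
    """Tool: pull the notes back into context only when the agent asks."""
    return SCRATCH.read_text(encoding="utf-8") if SCRATCH.exists() else ""

# The only thing that sits in context every turn is this short tool schema.
WRITE_TOOL = {
    "name": "write_to_scratch",
    "description": "Persist findings or plan steps outside the context window.",
    "parameters": {
        "type": "object",
        "properties": {"note": {"type": "string"}},
        "required": ["note"],
    },
}
```

<p>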
Those notes don&#8217;t cost attention until the agent pulls them back in.</p><h3>2. Select (Just-in-Time Retrieval)</h3><p>Some people dump all docs and tools into context upfront. Don&#8217;t do this.</p><p>Claude Code never reads an entire codebase upfront. It uses Glob to find file paths matching a pattern (e.g., <code>**/*.ts</code>), Grep to locate specific code references, then Read to pull in only the relevant file. A question like &#8220;where is authentication handled?&#8221; triggers a targeted search, not a 50-file dump into context.</p><p>Keep references instead (file paths, database queries). When the agent needs the content, it loads it on demand. And if you have 50 tools, the model parses 50 descriptions every turn - keep your toolset minimal or dynamically load definitions based on the task.</p><p>Claude Skills is a recent feature that uses this approach: the agent reads a short description of each skill, not the entire definition.</p><p>Here&#8217;s the algorithm:</p><ol><li><p>Give the agent a compressed summary of the tools (&#8220;Use this tool if the user asks about LinkedIn posts&#8221;)</p></li><li><p>Have it read the full tool description dynamically only when it decides it&#8217;s needed</p></li></ol><h3>3. Compress and Prune</h3><p>Even with a 200k token window, a messy context leads to bad answers.</p><p><strong>Summarization:</strong> If you&#8217;ve used Claude Code, you&#8217;ve seen this. When the window fills, it summarizes the conversation, preserving architectural decisions but dropping the exploration that led there.</p><p><strong>Context editing (pruning):</strong> Sometimes you don&#8217;t need a summary. You just need to delete. Anthropic found that simply removing stale tool outputs reduced token usage by 84% on long-running tasks.</p><ul><li><p>Did the agent run an <code>ls -la</code> command 10 turns ago? Delete the output. The model already used that info.</p></li><li><p>Did a tool return 5,000 lines of logs? 
Summarize it to &#8220;Found 847 errors, 92% were Redis timeouts: org.redis.client.RedisTimeoutException: Redis server response timeout (3000 ms) occurred for command: (GET),&#8221; then delete the raw data.</p></li></ul><h3>4. Isolate (Multi-Agent Systems)</h3><p>This is my favourite technique for complex tasks. Instead of one agent drowning in context, split the work.</p><p>Claude Code spawns specialized agents by type: Explore for codebase navigation, Plan for architecture decisions, claude-code-guide for documentation lookup. Each operates in its own context window. If the user asks &#8220;how does billing work?&#8221; and &#8220;what&#8217;s in the docs about webhooks?&#8221;, two agents run in parallel, each with fresh context, returning focused summaries to the main conversation.</p><p>When delegating to a sub-agent, the prompt is compressed: &#8220;Find all API endpoints that modify user data&#8221; rather than passing the full conversation history. The sub-agent explores freely, then returns a summary. The orchestrator never sees the 30 files the sub-agent read, just the 500-token answer.</p><p>Uses more total tokens. Gets better results.</p><h2>TL;DR</h2><p>Managing context is critical when building long-running AI agents. The window fills up quickly. </p><p>The counterintuitive thing about context windows is that bigger doesn&#8217;t always mean better. A 200k window full of noise performs worse than a 20k window with exactly what matters. Context engineering isn&#8217;t about cramming more in. It&#8217;s about curating what the model sees.</p><p>How to solve this:</p><ul><li><p><strong>Write:</strong> Save state to external files. </p></li><li><p><strong>Select:</strong> Load data only when needed. </p></li><li><p><strong>Compress:</strong> Summarise history and delete stale tool outputs. 
</p></li><li><p><strong>Isolate:</strong> Use sub-agents to encapsulate high-token tasks.</p></li></ul><p>Back to my logs agent: the fix was combining a few of these strategies together. What felt like a model limitation was actually a context engineering problem.</p><div><hr></div><p>Thanks for reading.</p><p>Have an awesome week : )</p><p>P.S. If you want to go deeper on building AI systems, I run a community where we build agents hands-on: https://skool.com/aiengineer</p><div><hr></div><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.owainlewis.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading The AI Engineer. Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[How to build AI RAG agents with the new gemini file tool. 
]]></title><description><![CDATA[Google just made RAG stupidly simple.]]></description><link>https://newsletter.owainlewis.com/p/how-to-build-ai-rag-agents-with-the</link><guid isPermaLink="false">https://newsletter.owainlewis.com/p/how-to-build-ai-rag-agents-with-the</guid><dc:creator><![CDATA[Owain Lewis]]></dc:creator><pubDate>Tue, 18 Nov 2025 17:17:09 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/32d3af40-671b-4ba4-aab8-011c13ee5051_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there &#128075;,</p><p>Infrastructure complexity sucks.</p><p>As someone who&#8217;s spent years building public cloud services, I still find it frustrating how much time we waste on undifferentiated infrastructure setup.</p><p>You&#8217;ve probably been here before:</p><p>You want to build a simple RAG agent over your company docs. Suddenly you&#8217;re drowning in vector database setup (which one should I use?), writing complex embedding pipelines, spending weeks on undifferentiated infrastructure and wondering: why is this so hard?</p><p>For production systems with complex needs, this overhead makes sense. But for freelance projects, prototypes, and MVPs? It&#8217;s often a huge time sink. </p><p>Google recently shipped a feature that eliminates most of the headaches around basic RAG: the File Search Tool in the Gemini API.</p><p>I just built a customer support agent using it. The whole thing was a few lines of Python. No vector database. No embedding pipeline. No chunking logic.</p><p>Let me show you exactly how it works.</p><h2>What Is Gemini File Search Tool?</h2><p>Gemini File Search Tool handles the entire RAG pipeline through a simple API:</p><p>Create a store &#8594; Upload documents &#8594; Start querying. 
That&#8217;s it.</p><p>Behind the scenes, Google handles document parsing, automatic chunking with configurable overlap, semantic search using <code>gemini-embedding-001</code>, and citation extraction with grounding metadata.</p><p>Everything that used to take weeks of infrastructure work now takes one API call.</p><h2>Building a Customer Support Agent (Step by Step)</h2><p>Let me walk you through building a real FAQ agent using actual code from the Google documentation. This kind of setup has a lot of value to businesses (for example allowing employees to get instant answers to their questions rather than waiting for a human to respond). </p><h3>Step 1: Create A File Store </h3><p>The actual API is remarkably simple</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8HO9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57101fc0-272f-4ab0-be70-5f348d98ff17_1536x456.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8HO9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57101fc0-272f-4ab0-be70-5f348d98ff17_1536x456.png 424w, https://substackcdn.com/image/fetch/$s_!8HO9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57101fc0-272f-4ab0-be70-5f348d98ff17_1536x456.png 848w, https://substackcdn.com/image/fetch/$s_!8HO9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57101fc0-272f-4ab0-be70-5f348d98ff17_1536x456.png 1272w, 
https://substackcdn.com/image/fetch/$s_!8HO9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57101fc0-272f-4ab0-be70-5f348d98ff17_1536x456.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8HO9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57101fc0-272f-4ab0-be70-5f348d98ff17_1536x456.png" width="1456" height="432" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/57101fc0-272f-4ab0-be70-5f348d98ff17_1536x456.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:432,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:83419,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/179262372?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57101fc0-272f-4ab0-be70-5f348d98ff17_1536x456.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8HO9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57101fc0-272f-4ab0-be70-5f348d98ff17_1536x456.png 424w, https://substackcdn.com/image/fetch/$s_!8HO9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57101fc0-272f-4ab0-be70-5f348d98ff17_1536x456.png 848w, 
https://substackcdn.com/image/fetch/$s_!8HO9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57101fc0-272f-4ab0-be70-5f348d98ff17_1536x456.png 1272w, https://substackcdn.com/image/fetch/$s_!8HO9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57101fc0-272f-4ab0-be70-5f348d98ff17_1536x456.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Step 2: Upload Your Docs</h3><p>We list all files in our docs directory and upload them to the file store. That&#8217;s it. 
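</p><p>In text form, steps 1 and 2 together might look like the sketch below. It uses the google-genai Python SDK (<code>pip install google-genai</code>, with <code>GEMINI_API_KEY</code> set in the environment); the method names such as <code>file_search_stores.create</code> and <code>upload_to_file_search_store</code> follow Google&#8217;s File Search documentation at the time of writing and may change between releases.</p>

```python
# Sketch: create a File Search store and upload a docs directory.
# Assumes the google-genai SDK and a GEMINI_API_KEY in the environment;
# the API surface follows Google's docs and may change.
from pathlib import Path

from google import genai

client = genai.Client()

# Step 1: create the store
store = client.file_search_stores.create(config={"display_name": "support-docs"})

# Step 2: upload every Markdown file in the docs directory
for path in Path("docs").glob("*.md"):
    client.file_search_stores.upload_to_file_search_store(
        file=str(path),
        file_search_store_name=store.name,
        config={"display_name": path.stem},
    )
```

<p>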
Your documents are now chunked, embedded, indexed, and ready to query.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OrP8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57460bd2-65d9-4a1e-9a81-1a32990edb67_1895x546.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OrP8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57460bd2-65d9-4a1e-9a81-1a32990edb67_1895x546.png 424w, https://substackcdn.com/image/fetch/$s_!OrP8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57460bd2-65d9-4a1e-9a81-1a32990edb67_1895x546.png 848w, https://substackcdn.com/image/fetch/$s_!OrP8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57460bd2-65d9-4a1e-9a81-1a32990edb67_1895x546.png 1272w, https://substackcdn.com/image/fetch/$s_!OrP8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57460bd2-65d9-4a1e-9a81-1a32990edb67_1895x546.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OrP8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57460bd2-65d9-4a1e-9a81-1a32990edb67_1895x546.png" width="1895" height="546" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/57460bd2-65d9-4a1e-9a81-1a32990edb67_1895x546.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:546,&quot;width&quot;:1895,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:173111,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/179262372?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd208afd0-054e-4de4-89ac-7816f5f0cb54_1912x648.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!OrP8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57460bd2-65d9-4a1e-9a81-1a32990edb67_1895x546.png 424w, https://substackcdn.com/image/fetch/$s_!OrP8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57460bd2-65d9-4a1e-9a81-1a32990edb67_1895x546.png 848w, https://substackcdn.com/image/fetch/$s_!OrP8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57460bd2-65d9-4a1e-9a81-1a32990edb67_1895x546.png 1272w, https://substackcdn.com/image/fetch/$s_!OrP8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57460bd2-65d9-4a1e-9a81-1a32990edb67_1895x546.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Optionally, you can control the chunking if you need to. 
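</p><p>As a sketch, a custom chunking configuration can be passed at upload time. The <code>chunking_config</code> field names here follow Google&#8217;s File Search documentation and are an assumption that may change between SDK releases.</p>

```python
# Sketch: control chunk size and overlap when uploading (google-genai SDK).
# Field names are taken from Google's docs and may change.
from google import genai

client = genai.Client()

operation = client.file_search_stores.upload_to_file_search_store(
    file="docs/faq.md",
    file_search_store_name="fileSearchStores/your-store-id",  # store from Step 1
    config={
        "chunking_config": {
            "white_space_config": {
                "max_tokens_per_chunk": 200,
                "max_overlap_tokens": 20,
            }
        }
    },
)
```

<p>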
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Q1T6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78448751-b5ae-48e7-b0fb-dcf9e0c258c2_1496x670.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Q1T6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78448751-b5ae-48e7-b0fb-dcf9e0c258c2_1496x670.png 424w, https://substackcdn.com/image/fetch/$s_!Q1T6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78448751-b5ae-48e7-b0fb-dcf9e0c258c2_1496x670.png 848w, https://substackcdn.com/image/fetch/$s_!Q1T6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78448751-b5ae-48e7-b0fb-dcf9e0c258c2_1496x670.png 1272w, https://substackcdn.com/image/fetch/$s_!Q1T6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78448751-b5ae-48e7-b0fb-dcf9e0c258c2_1496x670.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Q1T6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78448751-b5ae-48e7-b0fb-dcf9e0c258c2_1496x670.png" width="1456" height="652" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/78448751-b5ae-48e7-b0fb-dcf9e0c258c2_1496x670.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:652,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:138576,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/179262372?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78448751-b5ae-48e7-b0fb-dcf9e0c258c2_1496x670.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Q1T6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78448751-b5ae-48e7-b0fb-dcf9e0c258c2_1496x670.png 424w, https://substackcdn.com/image/fetch/$s_!Q1T6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78448751-b5ae-48e7-b0fb-dcf9e0c258c2_1496x670.png 848w, https://substackcdn.com/image/fetch/$s_!Q1T6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78448751-b5ae-48e7-b0fb-dcf9e0c258c2_1496x670.png 1272w, https://substackcdn.com/image/fetch/$s_!Q1T6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78448751-b5ae-48e7-b0fb-dcf9e0c258c2_1496x670.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Step 3: Create Your Agent </h3><p>You can simply use the file store as a tool when working with Gemini. This makes it incredibly easy. If you&#8217;re using another model or framework, you could wrap the API and provide it as a regular tool. 
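</p><p>As a sketch, querying with the store attached as a tool might look like this (google-genai SDK; the <code>FileSearch</code> tool config follows Google&#8217;s documentation and the store name is a placeholder from Step 1).</p>

```python
# Sketch: answer a question grounded in the file store (google-genai SDK).
# Tool and config names follow Google's File Search docs and may change.
from google import genai
from google.genai import types

client = genai.Client()

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="How do I reset my password?",
    config=types.GenerateContentConfig(
        tools=[
            types.Tool(
                file_search=types.FileSearch(
                    # Placeholder: the store name returned in Step 1
                    file_search_store_names=["fileSearchStores/your-store-id"]
                )
            )
        ]
    ),
)
print(response.text)
```

<p>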
Here&#8217;s the most minimal example of a RAG agent I could come up with.</p><p>This is basically all you need to build RAG agent in Gemini.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!i61t!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56fa481e-7d80-4337-a960-86dedb6e056d_1642x1374.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!i61t!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56fa481e-7d80-4337-a960-86dedb6e056d_1642x1374.png 424w, https://substackcdn.com/image/fetch/$s_!i61t!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56fa481e-7d80-4337-a960-86dedb6e056d_1642x1374.png 848w, https://substackcdn.com/image/fetch/$s_!i61t!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56fa481e-7d80-4337-a960-86dedb6e056d_1642x1374.png 1272w, https://substackcdn.com/image/fetch/$s_!i61t!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56fa481e-7d80-4337-a960-86dedb6e056d_1642x1374.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!i61t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56fa481e-7d80-4337-a960-86dedb6e056d_1642x1374.png" width="1642" height="1374" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/56fa481e-7d80-4337-a960-86dedb6e056d_1642x1374.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1374,&quot;width&quot;:1642,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:277658,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/179262372?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1edd94de-6780-4847-8a8d-09d537b524fa_1668x1374.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!i61t!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56fa481e-7d80-4337-a960-86dedb6e056d_1642x1374.png 424w, https://substackcdn.com/image/fetch/$s_!i61t!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56fa481e-7d80-4337-a960-86dedb6e056d_1642x1374.png 848w, https://substackcdn.com/image/fetch/$s_!i61t!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56fa481e-7d80-4337-a960-86dedb6e056d_1642x1374.png 1272w, https://substackcdn.com/image/fetch/$s_!i61t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56fa481e-7d80-4337-a960-86dedb6e056d_1642x1374.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Final Thoughts</h2><p>Google priced this aggressively. You only pay a one-off cost when uploading your docs. </p><p>File Search removes the infrastructure tax from basic RAG. It lets you validate your idea in hours instead of weeks.</p><p>If you want the complete code, you can get it all <a href="https://owainlewis.com">here</a> for free.</p><div><hr></div><p>Thanks for reading.</p><p>Have an awesome week : )</p><p><strong>P.S.</strong> If you&#8217;re tired of learning this stuff alone, I run a community where ambitious software engineers master production AI by building real working projects together. 
The fastest path to master AI engineering: <a href="https://skool.com/aiengineer">https://skool.com/aiengineer</a></p>]]></content:encoded></item><item><title><![CDATA[Build production AI agents with LiteLLM (in 70 lines of code)]]></title><description><![CDATA[One interface for every AI provider. Switch between OpenAI, Anthropic, Google in seconds. No refactoring.]]></description><link>https://newsletter.owainlewis.com/p/build-production-ai-agents-with-litellm</link><guid isPermaLink="false">https://newsletter.owainlewis.com/p/build-production-ai-agents-with-litellm</guid><dc:creator><![CDATA[Owain Lewis]]></dc:creator><pubDate>Thu, 06 Nov 2025 17:26:53 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/0570eac6-9b34-4913-b118-09a389278627_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>One of the biggest pain points for developers working with Generative AI is the <strong>explosion of incompatible provider SDKs</strong>.</p><p>Every major AI provider has a completely different API. Different request formats. Different response parsing. Different tool schemas. Different authentication.</p><p>Want to switch from OpenAI to Anthropic? That&#8217;s not a config change. That&#8217;s days of refactoring. Every request builder, response parser, error handler, and tool definition has to change.</p><p>Want a fallback when OpenAI goes down? You&#8217;re maintaining parallel implementations. Want to test if Claude is cheaper? You&#8217;re duplicating your entire LLM layer.</p><p>If you hardcode to one provider, you&#8217;re locked in. When new models come out you have to refactor all your code. </p><h2>Frameworks</h2><p>The natural response is reaching for a framework. LangChain, PydanticAI - they promise to abstract away provider differences.</p><p>And they do. But they replace provider lock-in with framework lock-in.</p><p>Now you&#8217;re learning abstractions. What&#8217;s a Chain? What&#8217;s a Runnable? 
When do you use LLMChain vs ConversationChain? The learning curve is steep.</p><p>Need custom logic? You&#8217;re fighting opinionated patterns. Bug in production? You&#8217;re debugging through abstraction layers trying to figure out what is actually sent to the models. New version ships? Breaking changes force refactoring.</p><p>Here&#8217;s the thing: agent logic is straightforward. Deciding what to do next, calling tools, managing state - that&#8217;s basic software engineering. You can write a capable agent loop in 70 lines of code.</p><h2>What You&#8217;re About to Learn</h2><p>This tutorial shows you how to build production-ready AI agents in simple Python with no complex agent frameworks required.</p><p>We&#8217;ll use LiteLLM - a lightweight library that standardizes all LLM providers onto a single interface.</p><p>You&#8217;ll learn how to:</p><ul><li><p>Write agents that work with any AI model (OpenAI, Anthropic, Google, local models) without needing frameworks. </p></li><li><p>Switch between providers with a single line change</p></li><li><p>Define tools once using a standard schema that works everywhere</p></li></ul><p>The complete agent: ~70 lines of code. No abstractions. No magic. Just simple Python you control and understand.</p><h2>What Is LiteLLM?</h2><p>LiteLLM comes in two forms:</p><p><strong>LiteLLM Python SDK</strong> - For developers building LLM applications who want to integrate directly into their Python code. It provides unified access to 100+ LLMs with built-in retry/fallback logic across multiple deployments.</p><p><strong>LiteLLM Proxy Server</strong> - For teams that need a centralized LLM gateway. 
This is typically used by Gen AI Enablement and ML Platform teams who want to manage LLM access across multiple projects with unified cost tracking, logging, guardrails, and caching.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!q4nK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b7a1d43-48a4-4828-bf79-02f00518956c_2050x884.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!q4nK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b7a1d43-48a4-4828-bf79-02f00518956c_2050x884.png 424w, https://substackcdn.com/image/fetch/$s_!q4nK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b7a1d43-48a4-4828-bf79-02f00518956c_2050x884.png 848w, https://substackcdn.com/image/fetch/$s_!q4nK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b7a1d43-48a4-4828-bf79-02f00518956c_2050x884.png 1272w, https://substackcdn.com/image/fetch/$s_!q4nK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b7a1d43-48a4-4828-bf79-02f00518956c_2050x884.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!q4nK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b7a1d43-48a4-4828-bf79-02f00518956c_2050x884.png" width="1456" height="628" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0b7a1d43-48a4-4828-bf79-02f00518956c_2050x884.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:628,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:791873,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/178018301?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b7a1d43-48a4-4828-bf79-02f00518956c_2050x884.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!q4nK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b7a1d43-48a4-4828-bf79-02f00518956c_2050x884.png 424w, https://substackcdn.com/image/fetch/$s_!q4nK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b7a1d43-48a4-4828-bf79-02f00518956c_2050x884.png 848w, https://substackcdn.com/image/fetch/$s_!q4nK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b7a1d43-48a4-4828-bf79-02f00518956c_2050x884.png 1272w, https://substackcdn.com/image/fetch/$s_!q4nK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b7a1d43-48a4-4828-bf79-02f00518956c_2050x884.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>LiteLLM has built-in fallbacks and retries. If your primary model fails: maybe OpenAI is down, or you hit a rate limit, LiteLLM can automatically try a series of fallback models in sequence (super useful).</p><p>In this article, we&#8217;re using <em><strong>only the</strong></em> <em><strong>SDK</strong> </em>since it&#8217;s very useful for building simple AI agents that can use different AI models through simple configuration. 
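</p><p>To make the single-interface idea concrete, here is a minimal sketch using the LiteLLM SDK (<code>pip install litellm</code>, with the relevant provider API key in the environment). Switching providers means changing only the model string.</p>

```python
# Sketch: one completion() call, any provider (LiteLLM SDK).
# Assumes the matching API key (OPENAI_API_KEY, ANTHROPIC_API_KEY, ...)
# is set in the environment.
from litellm import completion

# Swap to e.g. "claude-3-5-sonnet-20240620" or "gemini/gemini-2.5-flash"
# without touching any other code.
MODEL = "gpt-4o-mini"

response = completion(
    model=MODEL,
    messages=[{"role": "user", "content": "Summarise LiteLLM in one sentence."}],
)
print(response.choices[0].message.content)
```

<p>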
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4n-V!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6abf024d-eba2-481a-b4c8-9484363d7a16_1914x736.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4n-V!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6abf024d-eba2-481a-b4c8-9484363d7a16_1914x736.png 424w, https://substackcdn.com/image/fetch/$s_!4n-V!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6abf024d-eba2-481a-b4c8-9484363d7a16_1914x736.png 848w, https://substackcdn.com/image/fetch/$s_!4n-V!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6abf024d-eba2-481a-b4c8-9484363d7a16_1914x736.png 1272w, https://substackcdn.com/image/fetch/$s_!4n-V!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6abf024d-eba2-481a-b4c8-9484363d7a16_1914x736.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4n-V!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6abf024d-eba2-481a-b4c8-9484363d7a16_1914x736.png" width="1456" height="560" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6abf024d-eba2-481a-b4c8-9484363d7a16_1914x736.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:560,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:166032,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/178018301?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6abf024d-eba2-481a-b4c8-9484363d7a16_1914x736.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4n-V!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6abf024d-eba2-481a-b4c8-9484363d7a16_1914x736.png 424w, https://substackcdn.com/image/fetch/$s_!4n-V!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6abf024d-eba2-481a-b4c8-9484363d7a16_1914x736.png 848w, https://substackcdn.com/image/fetch/$s_!4n-V!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6abf024d-eba2-481a-b4c8-9484363d7a16_1914x736.png 1272w, https://substackcdn.com/image/fetch/$s_!4n-V!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6abf024d-eba2-481a-b4c8-9484363d7a16_1914x736.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Switching providers with LiteLLM is a single-line config change. Your agent logic doesn&#8217;t change. Your tool definitions don&#8217;t change. Your error handling doesn&#8217;t change. Just the model string.</p><h3>Building the Agent: Tool Definitions</h3><p>Now let&#8217;s build the agent. For an AI agent to be useful, it needs a way to call tools to do work (these tools can be simple local deterministic code or remote API calls). </p><p>In LiteLLM, tools are defined using the OpenAI function calling schema. You specify a name, description, and parameters using JSON Schema. These tool definitions work with every LLM provider that supports function calling. OpenAI, Anthropic, Google - they all use the same definitions. 
No provider-specific schemas.</p><p>Here&#8217;s a tool definition for executing bash commands:</p><pre><code>tools = [
    {
        "name": "bash_command",
        "description": "Execute a bash command (e.g. 'ps aux')",
        "function": bash_command,
        "requires_approval": True,
        "parameters": {
            "type": "object",
            "properties": {
                "command": {
                    "type": "string",
                    "description": "The bash command to execute",
                }
            },
            "required": ["command"],
        },
    },
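    # The bash_command value above is our own Python callable (the article
    # shows its implementation only as a screenshot). A minimal sketch,
    # assuming a simple y/N approval prompt before running anything:
    #
    #   import subprocess
    #
    #   def bash_command(command: str) -> str:
    #       if input(f"Run {command!r}? [y/N] ").strip().lower() != "y":
    #           return "Command rejected by user"
    #       proc = subprocess.run(command, shell=True, capture_output=True, text=True)
    #       return proc.stdout + proc.stderr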
]</code></pre><h2>Implementing Tool Execution</h2><p>Tool definitions tell the LLM what tools exist. Tool execution is the code that runs them.</p><p>For <code>bash_command</code>, we implement a function that takes the command, shows it to the user for approval, and executes it using Python&#8217;s <code>subprocess</code> module. When using potentially dangerous tools like this, it&#8217;s good practice to add an approval step (a simple user prompt to confirm if it&#8217;s OK to execute the command). </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!36n1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d112945-ef0a-420c-9ed6-c5e3290fc676_1542x748.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!36n1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d112945-ef0a-420c-9ed6-c5e3290fc676_1542x748.png 424w, https://substackcdn.com/image/fetch/$s_!36n1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d112945-ef0a-420c-9ed6-c5e3290fc676_1542x748.png 848w, https://substackcdn.com/image/fetch/$s_!36n1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d112945-ef0a-420c-9ed6-c5e3290fc676_1542x748.png 1272w, https://substackcdn.com/image/fetch/$s_!36n1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d112945-ef0a-420c-9ed6-c5e3290fc676_1542x748.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!36n1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d112945-ef0a-420c-9ed6-c5e3290fc676_1542x748.png" width="1456" height="706" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1d112945-ef0a-420c-9ed6-c5e3290fc676_1542x748.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:706,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:172179,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/178018301?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d112945-ef0a-420c-9ed6-c5e3290fc676_1542x748.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!36n1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d112945-ef0a-420c-9ed6-c5e3290fc676_1542x748.png 424w, https://substackcdn.com/image/fetch/$s_!36n1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d112945-ef0a-420c-9ed6-c5e3290fc676_1542x748.png 848w, https://substackcdn.com/image/fetch/$s_!36n1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d112945-ef0a-420c-9ed6-c5e3290fc676_1542x748.png 1272w, https://substackcdn.com/image/fetch/$s_!36n1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d112945-ef0a-420c-9ed6-c5e3290fc676_1542x748.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>After executing the tool, the response gets sent back to the LLM as the tool result. The agent then uses the result to decide what to do next.</p><h2>The Agent Loop</h2><p>Here&#8217;s where it all comes together. The agent loop is the orchestration logic that makes this an agent instead of just an LLM call.</p><blockquote><p>Moving forward, when I talk about agents I&#8217;m going to use this:</p><p><strong>An LLM agent runs tools in a loop to achieve a goal.</strong></p><p>https://simonwillison.net/2025/Sep/18/agents/</p></blockquote><p>We start with the user&#8217;s query. 
Then we loop:</p><ol><li><p>Call the LLM with messages and tools</p></li><li><p>If the LLM wants to call a tool, we execute it and add the result to the conversation</p></li><li><p>If the LLM doesn&#8217;t call any tools, we have the final answer</p></li></ol><pre><code># AI agent loop: Call LLM &#8594; Execute tools &#8594; Repeat until done
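# (Assumptions, not shown in this snippet: completion comes from
#   from litellm import completion
# and self.messages was seeded with the user's query, e.g.
#   self.messages = [{"role": "user", "content": user_query}])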
for _ in range(self.max_iterations):
    
    # Ask LLM what to do passing in previous context
    response = completion(model=self.model, messages=self.messages, tools=tools)
    message = response.choices[0].message
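    # (Optional hardening, an assumption beyond the original snippet:
    # litellm's completion() also accepts num_retries, and recent versions
    # accept fallbacks, e.g.
    #   completion(model=self.model, messages=self.messages, tools=tools,
    #              num_retries=2, fallbacks=["gpt-4o-mini"])
    # where "gpt-4o-mini" is just an illustrative fallback model name.)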
    
    # No tools needed? Return final answer to user!
    if not message.tool_calls:
        return message.content
        
    # Record the assistant's tool-call message before appending tool results
    # (OpenAI-style chat APIs require it to precede the tool messages)
    self.messages.append(message)

    # Execute each tool the LLM wants
    for tool_call in message.tool_calls:
        result = execute_tool(tool_call)
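        # execute_tool (defined elsewhere) is expected to return an
        # OpenAI-style tool message, e.g.
        #   {"role": "tool", "tool_call_id": tool_call.id, "content": output}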
        self.messages.append(result)</code></pre><p>The LLM sees tool results and decides what to do next. Maybe it calls another tool. Maybe it has enough information to answer.</p><h2>Why This Approach Works</h2><p>This core agent is about 70 lines of code. No framework. No abstractions. Just clear Python.</p><p>The agent logic is straightforward because we&#8217;re not fighting provider integrations. LiteLLM handles all the provider-specific complexity. We focus on the orchestration: deciding what to do, executing tools, managing state.</p><h2>Production Considerations</h2><p>This is a minimal example, but it&#8217;s production-ready in the sense that it&#8217;s simple, debuggable, and extensible.</p><p>Want to add more tools? Define them and add to the tools list.</p><p>Want custom approval logic? Add it to the tool execution functions.</p><p>Want logging or tracing? Simple to add or use a gateway. </p><p>Want to stream responses? LiteLLM supports that too.</p><p>Need LLM retries? Already supported by LiteLLM. </p><p>The code is yours. You understand it. You control it. No framework magic to debug or work around.</p><h2>The Bottom Line</h2><p>Building AI agents doesn&#8217;t require heavyweight frameworks. It requires standardized model access. LiteLLM gives you that standardization. One interface for 100+ providers. Standardized tool calling. Built-in fallbacks. Automatic cost tracking.</p><p>The result is agent code that&#8217;s simpler, more portable, and easier to maintain. You can switch providers in seconds. You can add complex logic without fighting abstractions. You can debug issues without diving through framework internals.</p><p>The complete code is on <a href="https://github.com/the-ai-engineer/ai-engineer-tutorials/blob/main/src/02-ai-agents-lite-llm/02-agent.py">GitHub</a> with examples.</p><div><hr></div><p>Thanks for reading. 
If you find mistakes, disagree, have feedback, or want to chat about anything <a href="https://www.linkedin.com/in/lewisowain/">send me a DM</a>. </p><p>Have an awesome week : )</p><p>P.S. If you&#8217;re serious about mastering AI engineering and want to build production systems like this, join the <strong>AI Engineer Community</strong> where we build projects from scratch and learn by doing. Inside: AI Agents from Scratch course ($499 value), hands-on projects, insider lessons from 20 years in tech, and direct access to a community of ambitious engineers. &#8594; <a href="https://skool.com/aiengineer">https://skool.com/aiengineer</a></p><p></p>]]></content:encoded></item><item><title><![CDATA[How I’m using Claude skills to become 10x more productive]]></title><description><![CDATA[The new Claude feature that's saving me hours every single day.]]></description><link>https://newsletter.owainlewis.com/p/how-im-using-claude-skills-to-become</link><guid isPermaLink="false">https://newsletter.owainlewis.com/p/how-im-using-claude-skills-to-become</guid><dc:creator><![CDATA[Owain Lewis]]></dc:creator><pubDate>Wed, 22 Oct 2025 16:02:43 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/0e3126f7-6e83-43e7-89dd-519c70aa72ac_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there &#128075;,</p><p>You&#8217;ve probably done this a hundred times:</p><p>Start a new Claude chat. Copy your prompts. Paste your context. Explain how you want things formatted. Get your result. Close the chat. Next day? 
Start over.</p><p>Anthropic just released Claude Skills, and it eliminates this entire cycle.</p><p>Skills turn your repetitive explanations into reusable, executable instructions that Claude loads automatically whenever needed.</p><p><strong>Before Skills:</strong> &#8220;I need to explain this to Claude again.&#8221;</p><p><strong>After Skills:</strong> &#8220;I should turn this into a skill so I never explain it again.&#8221;</p><p>I&#8217;ve personally found this a really useful feature (I&#8217;m obsessed with it), so I&#8217;ll walk through some practical ways to use Claude Skills in your own work.</p><h2>What are Claude Skills?</h2><p>Claude Skills are portable instruction packages that teach Claude how to perform specialized tasks your way.</p><p>A Skill is simply a folder that contains:</p><ul><li><p><strong>SKILL.md</strong> &#8212; A markdown file with your instructions, patterns, and best practices</p></li><li><p><strong>Supporting resources</strong> &#8212; Scripts, templates, configuration files, or reference code</p></li><li><p><strong>Execution logic</strong> &#8212; Optional code for tasks that need to work the same way every time</p></li></ul><p><strong>The beauty of Skills:</strong> they work everywhere: Claude UI, Claude Code, and the API. You build the Skill once, and it follows you across every Claude product you use.</p><p>I think of Skills like SOPs or plugins that codify how to do a specific task. You write a skill once (e.g. a code review skill, or a LinkedIn post writing skill) and can use it in the UI, in your agents, and in Claude Code. </p><p>When a conversation starts, Claude scans skill names and descriptions (maybe 50 tokens per skill). If something matches, it loads the full details. You can have dozens of skills available without wasting context. This is an important point (e.g. 
an MCP server can consume a lot of the context window). </p><p>Each skill is just a folder with a <code>SKILL.md</code> file and any other supporting files or scripts (skills can run code). </p><pre><code>---
name: your-skill-name
description: what this skill helps with
---

# Instructions
[Be specific about structure, tone, and output.]
</code></pre><p>Zip it, upload once, and you can use it in every Claude product: desktop, Claude Code, and via the API.</p><h2>Example: Code Review Skill</h2><p>Here&#8217;s an example of what a simple skill might look like for code review standards in your team. </p><pre><code>---
name: Code Review
description: Review code for security, performance, and team standards. Checks for SQL injection, XSS vulnerabilities, and enforces naming conventions.
version: 1.0.0
---

# Team Code Review Skill

## Overview
This Skill performs comprehensive code reviews based on our team&#8217;s standards, security requirements, and performance best practices.

## When to Use This Skill
- User asks to &#8220;review this code&#8221; or &#8220;check my PR&#8221;
- User mentions &#8220;security review&#8221; or &#8220;performance check&#8221;
- Before committing significant changes

## Review Checklist

### Performance Checks
1. **Database Queries**
   - Flag N+1 query patterns
   - Suggest eager loading for related data
   - Recommend indexing for frequently queried fields

2. **Code Efficiency**
   - Identify unnecessary loops or redundant operations
   - Suggest caching for expensive computations
   - Flag blocking operations in async contexts

### Team Standards
1. **Naming Conventions**
   - Functions: `snake_case`
   - Classes: `PascalCase`
   - Constants: `UPPER_SNAKE_CASE`
   - Private methods: `_leading_underscore`

2. **File Structure**
   - Routes in `/routes`
   - Models in `/models`
   - Business logic in `/services`
   - Utilities in `/utils`

...</code></pre><h2>Skills vs Projects vs MCP</h2><p>You might be thinking: &#8220;Isn&#8217;t this what Model Context Protocol (MCP) solves?&#8221;</p><p>Not quite. Here&#8217;s the distinction:</p><p><strong>MCP (Model Context Protocol):</strong></p><ul><li><p>Connects Claude to external data sources and tools</p></li><li><p>Examples: Databases, APIs, file systems, Slack, GitHub</p></li><li><p>Purpose: Giving Claude ACCESS to information</p></li></ul><p><strong>Skills:</strong></p><ul><li><p>Teach Claude HOW to perform tasks</p></li><li><p>Examples: Code patterns, workflows, best practices</p></li><li><p>Purpose: Giving Claude EXPERTISE and standards</p></li></ul><h2>What Should You Use Skills For?</h2><p>Build skills for any task you do more than once:</p><p><strong>LinkedIn posts:</strong> Specific format, tone, and structure that works for technical content. No more copy-pasting guidelines or having to explain how to write a hook back-and-forth.</p><p><strong>Video prompts:</strong> Instructions for generating cinematic prompts for AI video tools like Veo3. Detailed formula, examples, cinematography best practices all packaged up.</p><p><strong>Document writing:</strong> Templates for documents you write often (e.g. a PRD writing skill). </p><p>The pattern: any task I do repeatedly goes into a skill.</p><h2>3. How To Create A Skill In The UI</h2><p>The quickest way to create a skill is to use the built-in skill builder feature in Claude. Just ask it to help you build a skill and describe what you need it to do. It will create a skill and package it up as a zip file you can upload. 
</p><p>To upload your skill: Go to Settings &gt; Capabilities &gt; Skills </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sGxq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e938258-be55-4e7c-a8e1-e73c1759ac28_2316x1266.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sGxq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e938258-be55-4e7c-a8e1-e73c1759ac28_2316x1266.png 424w, https://substackcdn.com/image/fetch/$s_!sGxq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e938258-be55-4e7c-a8e1-e73c1759ac28_2316x1266.png 848w, https://substackcdn.com/image/fetch/$s_!sGxq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e938258-be55-4e7c-a8e1-e73c1759ac28_2316x1266.png 1272w, https://substackcdn.com/image/fetch/$s_!sGxq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e938258-be55-4e7c-a8e1-e73c1759ac28_2316x1266.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sGxq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e938258-be55-4e7c-a8e1-e73c1759ac28_2316x1266.png" width="1456" height="796" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0e938258-be55-4e7c-a8e1-e73c1759ac28_2316x1266.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:796,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:326013,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/176772233?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e938258-be55-4e7c-a8e1-e73c1759ac28_2316x1266.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!sGxq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e938258-be55-4e7c-a8e1-e73c1759ac28_2316x1266.png 424w, https://substackcdn.com/image/fetch/$s_!sGxq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e938258-be55-4e7c-a8e1-e73c1759ac28_2316x1266.png 848w, https://substackcdn.com/image/fetch/$s_!sGxq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e938258-be55-4e7c-a8e1-e73c1759ac28_2316x1266.png 1272w, https://substackcdn.com/image/fetch/$s_!sGxq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e938258-be55-4e7c-a8e1-e73c1759ac28_2316x1266.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Once uploaded, you can use it in any chat: &#8220;Use the LinkedIn skill to turn this article into a high quality post&#8221;.</p><h2>4. How To Create A Skill In Claude Code</h2><p>Skills can be used in Claude Code. I often use Claude Code for non-obvious tasks like writing, but you can also add skills for code review, API standards, etc. </p><p>Simply put your skills in <code>.claude/skills/</code>.</p><h2>5. 
How To Create A Skill In Python</h2><p>If you&#8217;re building AI systems, you can manage skills programmatically.</p><p>You can create, update, list, and delete skills via the API and use them in your AI projects or workflows with the Anthropic client.</p><p>If you want to get a full code example, I&#8217;ve shared an example on Github <a href="https://github.com/the-ai-engineer/ai-engineer-tutorials/tree/main/src/claude-skills">here</a>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!q2xC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcda15883-4aa9-4eed-94e3-9af977ce5c19_2770x1512.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!q2xC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcda15883-4aa9-4eed-94e3-9af977ce5c19_2770x1512.png 424w, https://substackcdn.com/image/fetch/$s_!q2xC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcda15883-4aa9-4eed-94e3-9af977ce5c19_2770x1512.png 848w, https://substackcdn.com/image/fetch/$s_!q2xC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcda15883-4aa9-4eed-94e3-9af977ce5c19_2770x1512.png 1272w, https://substackcdn.com/image/fetch/$s_!q2xC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcda15883-4aa9-4eed-94e3-9af977ce5c19_2770x1512.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!q2xC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcda15883-4aa9-4eed-94e3-9af977ce5c19_2770x1512.png" width="1456" height="795" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cda15883-4aa9-4eed-94e3-9af977ce5c19_2770x1512.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:795,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:368152,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.owainlewis.com/i/176772233?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcda15883-4aa9-4eed-94e3-9af977ce5c19_2770x1512.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!q2xC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcda15883-4aa9-4eed-94e3-9af977ce5c19_2770x1512.png 424w, https://substackcdn.com/image/fetch/$s_!q2xC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcda15883-4aa9-4eed-94e3-9af977ce5c19_2770x1512.png 848w, https://substackcdn.com/image/fetch/$s_!q2xC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcda15883-4aa9-4eed-94e3-9af977ce5c19_2770x1512.png 1272w, https://substackcdn.com/image/fetch/$s_!q2xC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcda15883-4aa9-4eed-94e3-9af977ce5c19_2770x1512.png 
1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>This opens up workflows you couldn&#8217;t do before:</p><ul><li><p>Version control your skills in GitHub</p></li><li><p>Deploy them programmatically in CI/CD</p></li><li><p>Test them systematically</p></li><li><p>Build systems that leverage consistent expertise at scale</p></li><li><p>Reuse skills across AI projects</p></li></ul><p>The API approach means your skills become infrastructure. 
They&#8217;re not just UI features - they&#8217;re components in your production systems.</p><h2>Summary</h2><p>I&#8217;ve found this feature really helpful for productivity and think a lot of providers will copy it. </p><p>Here&#8217;s what I recommend: Pick ONE thing you explain to Claude repeatedly this week. Document it in a SKILL.md file. Test it. Refine it. Next week, add another. In three months, you&#8217;ll have a library of expertise that follows you everywhere. Same standards. Same patterns. Same quality. </p><p>The developers who win with AI aren&#8217;t the ones writing better prompts. They&#8217;re the ones who systematically document their expertise, then apply it across every interaction with AI.</p><p>Thanks for reading. </p><p>Have an awesome week : )</p><h2>Useful Links</h2><ul><li><p>https://www.anthropic.com/news/skills</p></li><li><p>https://github.com/the-ai-engineer/ai-engineer-tutorials/tree/main/src/claude-skills</p></li><li><p>https://github.com/anthropics/claude-cookbooks/tree/main/skills/notebooks</p></li></ul><div><hr></div><p>P.S. Want to become the engineer companies fight to hire? Join my private AI Engineering Community - ship production systems through hands-on projects, get exclusive courses, and tactics I only share with members. </p><p>Get access here: https://skool.com/aiengineer</p>]]></content:encoded></item></channel></rss>