<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Jellyfish Research]]></title><description><![CDATA[Insights about software engineering teams from the Jellyfish Research team]]></description><link>https://jellyfishresearch.substack.com</link><image><url>https://substackcdn.com/image/fetch/$s_!Kioy!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57d4c023-fbf3-4996-b9dd-1cb76e830f75_1280x1280.png</url><title>Jellyfish Research</title><link>https://jellyfishresearch.substack.com</link></image><generator>Substack</generator><lastBuildDate>Tue, 23 Jun 2026 19:12:19 GMT</lastBuildDate><atom:link href="https://jellyfishresearch.substack.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Nik Albarran]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[jellyfishresearch@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[jellyfishresearch@substack.com]]></itunes:email><itunes:name><![CDATA[Nik Albarran]]></itunes:name></itunes:owner><itunes:author><![CDATA[Nik Albarran]]></itunes:author><googleplay:owner><![CDATA[jellyfishresearch@substack.com]]></googleplay:owner><googleplay:email><![CDATA[jellyfishresearch@substack.com]]></googleplay:email><googleplay:author><![CDATA[Nik Albarran]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[The State of AI: H1 2026]]></title><description><![CDATA[Exploding Spend, Diverging Returns]]></description><link>https://jellyfishresearch.substack.com/p/the-state-of-ai-h1-2026</link><guid 
isPermaLink="false">https://jellyfishresearch.substack.com/p/the-state-of-ai-h1-2026</guid><dc:creator><![CDATA[Tomas Pardinas]]></dc:creator><pubDate>Tue, 23 Jun 2026 14:20:58 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/6520301f-379d-4f93-abcf-7e5aaf6380bb_1962x1134.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h1><strong>Exploding Spend, Diverging Returns</strong></h1><p>2026 has been a big year for AI so far, with agents taking center stage and tools like Claude Code going mainstream. We put together a summary of what we've researched so far. The story of H1 2026 is <strong>Exploding Spend, Diverging Returns</strong>. You can read the first half of our report below, and if you want the full version, you can download it <a href="https://jellyfish.co/state-of-ai-software-engineering/">here</a>.</p><h1><strong><span>Introduction</span></strong></h1><p style="text-align: justify;"><span>In March 2026, we saw the first user spend more on tokens in a month than a junior developer earns. Power users are bending what is possible in the new AI-SDLC, pulling away from the median engineer. Spend is exploding, returns are diverging, and most teams cannot yet tell which side of that divergence they are on.</span></p><p style="text-align: justify;"><span>Whether AI helps engineers write code is no longer the interesting question. The sharper one is about outcomes: can an agent accomplish a goal when it is set up for success, and what does that setup actually cost? Two more sit behind it. Why is the bill rising faster than the sticker price, and why do a few teams convert tokens into throughput while most do not?</span></p><p style="text-align: justify;"><span>This release distills Jellyfish&#8217;s H1 2026 research to answer those questions.</span></p><p style="text-align: justify;"><strong><span>Exploding spend, diverging returns</span></strong><span> is the story of H1 &#8217;26. 
This report highlights a major shift in how AI is used in software development:</span></p><ol><li><p style="text-align: justify;"><strong><span>The cost shock. </span></strong><span>Spend has gone from a rounding error to a real constraint, rising 16x across the board with a tail that now rivals a junior engineer&#8217;s salary.</span></p></li><li><p style="text-align: justify;"><strong><span>The economics of tokens. </span></strong><span>What AI actually costs, why effective cost keeps rising even when sticker prices hold still, and where more tokens stop buying more output.</span></p></li><li><p style="text-align: justify;"><strong><span>The harness advantage. </span></strong><span>What the best teams invest in instead of raw spend: context, harnesses, and agent-friendly codebases.</span></p></li></ol><p style="text-align: justify;"><span>Leading teams are chasing what we call the &#8220;AI pop&#8221;: a clear jump in productivity and throughput. Standing in their way is what our research calls the</span><strong><span> agentic barrier:</span></strong><span> the hurdles in human attention, infrastructure, and cost that teams hit as they push agents further. There is no single way through it, but the industry is pressing on two fronts:</span></p><ul><li><p style="text-align: justify;"><span>Running agents on more complex, longer tasks with less supervision: </span><strong><span>Autonomy</span></strong></p></li><li><p style="text-align: justify;"><span>Running more agents in parallel: </span><strong><span>Concurrency</span></strong></p></li></ul><p style="text-align: justify;"><span>Both patterns drive an explosion of token consumption. At the same time, new models and agentic features keep raising the effective cost of running the same workload, so bills pile up and become unsustainable for users who are not converting tokens into productivity. A minority is converting, and converting hard. 
That is the divergence we are witnessing.</span></p><p style="text-align: justify;"><span>AI is collapsing old processes and creating new ones at an extremely fast pace. We hope this research helps you navigate what comes next in 2026.</span></p><p><strong><span>PART 1</span></strong></p><h1><strong><span>The cost shock - explosion of intelligence</span></strong></h1><p style="text-align: justify;"><em><span>Cost is rising across the board, with a runaway tail.</span></em></p><p style="text-align: justify;"><span>For most of 2025, AI spend in software engineering was a rounding error. It was a predictable utility: you spent a little more on tokens, you got a little more code. It scaled linearly and cleanly.</span></p><p style="text-align: justify;"><span>In the first half of 2026, that relationship broke.</span></p><p style="text-align: justify;"><span>The shift to autonomous, long-running agentic workflows didn&#8217;t just increase usage; it fundamentally changed the unit economics of software development. As teams moved from &#8216;AI as a chat assistant&#8217; to &#8216;AI as an engineer,&#8217; token consumption exploded, not by percentage points but by orders of magnitude. By March 2026, we saw the first users spending more on tokens in a month than a junior developer earns. We call it the </span><strong><span>cost shock</span></strong><span>: a structural decoupling of spend from output that has caught the industry off guard.</span></p><p style="text-align: justify;"><strong><span>Cost per user has risen across the board:</span></strong><span> at a typical company the median user went from about $5 a month to $81 in ten months, a 16x climb, and even mid-tier (P75) companies went from $10 to $170. 
This is a broad-based shift in what software development costs in the new AI-SDLC.</span></p><h3 style="text-align: justify;"><span>AI cost is rising broadly across companies: a typical company&#8217;s cost per user grew 16x in ten months</span></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cqSO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd164548f-7f2a-4c40-a7b9-6ff29cc331aa_2045x968.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cqSO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd164548f-7f2a-4c40-a7b9-6ff29cc331aa_2045x968.png 424w, https://substackcdn.com/image/fetch/$s_!cqSO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd164548f-7f2a-4c40-a7b9-6ff29cc331aa_2045x968.png 848w, https://substackcdn.com/image/fetch/$s_!cqSO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd164548f-7f2a-4c40-a7b9-6ff29cc331aa_2045x968.png 1272w, https://substackcdn.com/image/fetch/$s_!cqSO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd164548f-7f2a-4c40-a7b9-6ff29cc331aa_2045x968.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cqSO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd164548f-7f2a-4c40-a7b9-6ff29cc331aa_2045x968.png" width="2045" height="968" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d164548f-7f2a-4c40-a7b9-6ff29cc331aa_2045x968.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:968,&quot;width&quot;:2045,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:164820,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cqSO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd164548f-7f2a-4c40-a7b9-6ff29cc331aa_2045x968.png 424w, https://substackcdn.com/image/fetch/$s_!cqSO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd164548f-7f2a-4c40-a7b9-6ff29cc331aa_2045x968.png 848w, https://substackcdn.com/image/fetch/$s_!cqSO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd164548f-7f2a-4c40-a7b9-6ff29cc331aa_2045x968.png 1272w, https://substackcdn.com/image/fetch/$s_!cqSO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd164548f-7f2a-4c40-a7b9-6ff29cc331aa_2045x968.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: center;"><strong><span>Figure 1. </span></strong><em><span>Per-company cost per user by company percentile, Aug 2025 to May 2026. The median company&#8217;s typical user went from $5 to $81 a month, a 16x rise, and the increase is broad-based.</span></em></p><h2><strong><span>Spend is rising across the board, and the extremes are shocking</span></strong></h2><p style="text-align: justify;"><span>On top of that broad rise sits an </span><strong><span>extremely skewed tail</span></strong><span>. In May 2026, more than half of users still spent under $100 a month, yet the mean ($248) was 3.5x the median and the 99th percentile ($2,452) was 35x the median. 
A small number of power users spend enormously, and </span><strong><span>at the very end of that tail the numbers start to rival human salaries.</span></strong></p><p style="text-align: justify;"><span>Charting the single heaviest user&#8217;s monthly spend against a US junior developer&#8217;s salary (about $12,500 a month) shows how far the tail reaches: that user touched and at times exceeded the line in spring 2026.</span></p><h3 style="text-align: justify;"><span>At the very tail, a single user&#8217;s bill has touched a junior engineer&#8217;s salary</span></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7Ofw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e7cdf06-632a-4d81-939c-ddc82f8d73ad_2048x937.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7Ofw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e7cdf06-632a-4d81-939c-ddc82f8d73ad_2048x937.png 424w, https://substackcdn.com/image/fetch/$s_!7Ofw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e7cdf06-632a-4d81-939c-ddc82f8d73ad_2048x937.png 848w, https://substackcdn.com/image/fetch/$s_!7Ofw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e7cdf06-632a-4d81-939c-ddc82f8d73ad_2048x937.png 1272w, https://substackcdn.com/image/fetch/$s_!7Ofw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e7cdf06-632a-4d81-939c-ddc82f8d73ad_2048x937.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!7Ofw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e7cdf06-632a-4d81-939c-ddc82f8d73ad_2048x937.png" width="2048" height="937" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5e7cdf06-632a-4d81-939c-ddc82f8d73ad_2048x937.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:937,&quot;width&quot;:2048,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:183677,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7Ofw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e7cdf06-632a-4d81-939c-ddc82f8d73ad_2048x937.png 424w, https://substackcdn.com/image/fetch/$s_!7Ofw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e7cdf06-632a-4d81-939c-ddc82f8d73ad_2048x937.png 848w, https://substackcdn.com/image/fetch/$s_!7Ofw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e7cdf06-632a-4d81-939c-ddc82f8d73ad_2048x937.png 1272w, https://substackcdn.com/image/fetch/$s_!7Ofw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e7cdf06-632a-4d81-939c-ddc82f8d73ad_2048x937.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft 
pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: center;"><strong><span>Figure 2. </span></strong><em><span>Highest single-user monthly spend vs a US junior developer salary (~$12,500/mo). 
The tail touches salary territory but does not hold there.</span></em></p><p style="text-align: justify;"><span>Although this behavior is concentrated in the top few percent of users at the top few percent of companies, it may be a glimpse of where AI usage in software development is headed.</span></p><p style="text-align: justify;"><span>Dollars follow tokens, so the next question is how fast token consumption is climbing, and then why the bill rises even when prices do not.</span></p><h2><strong><span>Token spend tripled since October, and power users are decoupling from the median</span></strong></h2><p style="text-align: justify;"><span>Tokens are the units in which LLMs process text, and AI labs price and bill usage per million tokens.</span></p><p style="text-align: justify;"><span>We have been tracking token usage, and something changed in January. Opus 4.5, Anthropic&#8217;s main model, was released in November 2025, and usage patterns shifted sharply afterward.</span></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!g6-v!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faad8b32a-2cf0-4d63-b375-c93a925fc26b_1472x846.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!g6-v!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faad8b32a-2cf0-4d63-b375-c93a925fc26b_1472x846.png 424w, https://substackcdn.com/image/fetch/$s_!g6-v!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faad8b32a-2cf0-4d63-b375-c93a925fc26b_1472x846.png 848w, 
https://substackcdn.com/image/fetch/$s_!g6-v!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faad8b32a-2cf0-4d63-b375-c93a925fc26b_1472x846.png 1272w, https://substackcdn.com/image/fetch/$s_!g6-v!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faad8b32a-2cf0-4d63-b375-c93a925fc26b_1472x846.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!g6-v!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faad8b32a-2cf0-4d63-b375-c93a925fc26b_1472x846.png" width="1456" height="837" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/aad8b32a-2cf0-4d63-b375-c93a925fc26b_1472x846.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:837,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!g6-v!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faad8b32a-2cf0-4d63-b375-c93a925fc26b_1472x846.png 424w, https://substackcdn.com/image/fetch/$s_!g6-v!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faad8b32a-2cf0-4d63-b375-c93a925fc26b_1472x846.png 848w, 
https://substackcdn.com/image/fetch/$s_!g6-v!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faad8b32a-2cf0-4d63-b375-c93a925fc26b_1472x846.png 1272w, https://substackcdn.com/image/fetch/$s_!g6-v!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faad8b32a-2cf0-4d63-b375-c93a925fc26b_1472x846.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p style="text-align: center;"><strong><span>Figure 3. </span></strong><em><span>Weekly Claude Code tokens per user, by percentile. 
Median usage is flat; the top decile separates and accelerates.</span></em></p><p style="text-align: justify;"><span>Token consumption climbed steeply from the start of 2026, more than tripling for power users (P90) from 50M to 170M tokens per week.</span></p><p style="text-align: justify;"><span>We can also see a </span><strong><span>divergence</span></strong><span> between power users (P90) and the median (P50). Power users </span><strong><span>completely decoupled</span></strong><span> from the rest, benefiting from the increase in throughput and widening the </span><strong><span>gap</span></strong><span> from 50M tokens to 150M tokens.</span></p><h2><strong><span>Who is justifying the bill, and why most cannot follow</span></strong></h2><p style="text-align: justify;"><span>If a thin tail is pulling away, the obvious question is what that tail does differently. Three frontier behaviors set it apart: they put more work through agents, they let agents run longer with less supervision, and they run more agents at once. 
Each one helps explain who sits on the paying-off side of the divergence, and why the rest are locked out for now.</span></p><h3 style="text-align: justify;"><span>Autonomous agents are spreading: leading teams now run 1 in 3 PRs through agents</span></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!T2oR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f569669-7f2f-42ed-91d6-9e1d30a48fb7_2048x939.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!T2oR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f569669-7f2f-42ed-91d6-9e1d30a48fb7_2048x939.png 424w, https://substackcdn.com/image/fetch/$s_!T2oR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f569669-7f2f-42ed-91d6-9e1d30a48fb7_2048x939.png 848w, https://substackcdn.com/image/fetch/$s_!T2oR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f569669-7f2f-42ed-91d6-9e1d30a48fb7_2048x939.png 1272w, https://substackcdn.com/image/fetch/$s_!T2oR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f569669-7f2f-42ed-91d6-9e1d30a48fb7_2048x939.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!T2oR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f569669-7f2f-42ed-91d6-9e1d30a48fb7_2048x939.png" width="2048" height="939" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9f569669-7f2f-42ed-91d6-9e1d30a48fb7_2048x939.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:939,&quot;width&quot;:2048,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:137177,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!T2oR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f569669-7f2f-42ed-91d6-9e1d30a48fb7_2048x939.png 424w, https://substackcdn.com/image/fetch/$s_!T2oR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f569669-7f2f-42ed-91d6-9e1d30a48fb7_2048x939.png 848w, https://substackcdn.com/image/fetch/$s_!T2oR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f569669-7f2f-42ed-91d6-9e1d30a48fb7_2048x939.png 1272w, https://substackcdn.com/image/fetch/$s_!T2oR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f569669-7f2f-42ed-91d6-9e1d30a48fb7_2048x939.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong><span>Figure 4. </span></strong><em><span>Share of pull requests opened by autonomous agents, by company percentile. The median holds near 2% while the 90th percentile bends steadily upward.</span></em></p><p style="text-align: justify;"><span>The clearest picture of the widening gap is the </span><strong><span>share of pull requests opened by autonomous agents</span></strong><span>, tracked at the company level. At the median, the line barely lifts off 2%. At the 90th percentile it rises and keeps rising: 10% in January, 14.5% by the end of February, 35% by April, against roughly 2% a year earlier. The top of the market is not adopting agents so much as compounding on them, and the distance to the median grows month over month rather than closing.</span></p><p style="text-align: justify;"><span>What are those leading teams doing differently while running autonomous agents? 
When we sort Claude Code agent sessions by length and by how often a human interrupted them, supervision barely matters in short and medium sessions.</span></p><p style="text-align: justify;"><span>In the longest sessions, the top 1% by duration, the pattern turns stark: runs left largely alone (under 30% human intervention) produce </span><strong><span>about eleven times the net lines of code of heavily supervised runs of the same length</span></strong><span>. Low supervision yields the most lines of code, and the biggest impact lands in the power-user segment. What duration measures here is whether the agent was set up to succeed without rescue, and the output gap suggests </span><strong><span>power users are engineering that setup differently from their peers</span></strong><span>.</span></p><h3 style="text-align: justify;"><span>Long-running agents with light supervision produce 11x more code</span></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wCdu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fcf7f80-ea45-48b1-bad3-62c6c99df55c_1472x922.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wCdu!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fcf7f80-ea45-48b1-bad3-62c6c99df55c_1472x922.png 424w, https://substackcdn.com/image/fetch/$s_!wCdu!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fcf7f80-ea45-48b1-bad3-62c6c99df55c_1472x922.png 848w, 
https://substackcdn.com/image/fetch/$s_!wCdu!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fcf7f80-ea45-48b1-bad3-62c6c99df55c_1472x922.png 1272w, https://substackcdn.com/image/fetch/$s_!wCdu!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fcf7f80-ea45-48b1-bad3-62c6c99df55c_1472x922.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wCdu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fcf7f80-ea45-48b1-bad3-62c6c99df55c_1472x922.png" width="1456" height="912" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0fcf7f80-ea45-48b1-bad3-62c6c99df55c_1472x922.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:912,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wCdu!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fcf7f80-ea45-48b1-bad3-62c6c99df55c_1472x922.png 424w, https://substackcdn.com/image/fetch/$s_!wCdu!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fcf7f80-ea45-48b1-bad3-62c6c99df55c_1472x922.png 848w, 
https://substackcdn.com/image/fetch/$s_!wCdu!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fcf7f80-ea45-48b1-bad3-62c6c99df55c_1472x922.png 1272w, https://substackcdn.com/image/fetch/$s_!wCdu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fcf7f80-ea45-48b1-bad3-62c6c99df55c_1472x922.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p style="text-align: center;"><strong><span>Figure 5. 
</span></strong><em><span>Average net lines per turn by session length and supervision level. Long-running turns with low supervision generate about 11x more net code than turns of the same length with high supervision.</span></em></p><p style="text-align: justify;"><span>The longest sessions in the dataset, the top 0.1%, ran past 94 minutes, and 83% of them involved little human contact start to finish. According to </span><strong><a href="https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/"><span>METR</span></a></strong><span>, the length of tasks AI can complete at a 50% success rate is doubling every 7 months. At this pace, by the end of 2026 we will have agents able to run for as long as 4 hours on a single task.</span></p><p style="text-align: justify;"><span>Setting agents up to complete long tasks autonomously is becoming a huge unlock, but only P99 users are seeing the gains so far. If net lines of added code are a signal of productivity in long workflows, power users have worked out how to run them and are more productive for it.</span></p><p style="text-align: justify;"><strong><span>Autonomy</span></strong><span> is one of the challenges that teams are still trying to overcome. We asked ourselves: how many of these agents are power users able to run </span><strong><span>concurrently</span></strong><span>? To find out, we studied agent concurrency.</span></p><h3 style="text-align: justify;"><span>Concurrency hits a ceiling: 84% run just one or two agents at once</span></h3><p style="text-align: justify;"><span>If running an agent in a long loop toward a goal until it completes is what unlocks output, the obvious next step is to run </span><strong><span>more agents</span></strong><span> at once. 
But we found that high concurrency is rare.</span></p><p style="text-align: justify;"><span>Peak concurrent agent use, measured across customers&#8217; turns (each session is composed of multiple turns), hits a ceiling: </span><strong><span>84% of active users top out at one or two agents at a time</span></strong><span>, and even the small group launching four or more spends </span><strong><span>over 80% of session time attending to a single agent</span></strong><span>.</span></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!j7HH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9adadba-0f2a-4a51-b5c4-24af852f8e34_2048x1125.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!j7HH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9adadba-0f2a-4a51-b5c4-24af852f8e34_2048x1125.png 424w, https://substackcdn.com/image/fetch/$s_!j7HH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9adadba-0f2a-4a51-b5c4-24af852f8e34_2048x1125.png 848w, https://substackcdn.com/image/fetch/$s_!j7HH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9adadba-0f2a-4a51-b5c4-24af852f8e34_2048x1125.png 1272w, https://substackcdn.com/image/fetch/$s_!j7HH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9adadba-0f2a-4a51-b5c4-24af852f8e34_2048x1125.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!j7HH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9adadba-0f2a-4a51-b5c4-24af852f8e34_2048x1125.png" width="2048" height="1125" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c9adadba-0f2a-4a51-b5c4-24af852f8e34_2048x1125.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1125,&quot;width&quot;:2048,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:270709,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!j7HH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9adadba-0f2a-4a51-b5c4-24af852f8e34_2048x1125.png 424w, https://substackcdn.com/image/fetch/$s_!j7HH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9adadba-0f2a-4a51-b5c4-24af852f8e34_2048x1125.png 848w, https://substackcdn.com/image/fetch/$s_!j7HH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9adadba-0f2a-4a51-b5c4-24af852f8e34_2048x1125.png 1272w, https://substackcdn.com/image/fetch/$s_!j7HH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9adadba-0f2a-4a51-b5c4-24af852f8e34_2048x1125.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" 
class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: center;"><strong><span>Figure 6. </span></strong><em><span>Peak concurrent agents per active user. Almost everyone runs one or two; the agentic barrier is human attention, not tooling.</span></em></p><p style="text-align: justify;"><span>This is the </span><strong><span>agentic barrier</span></strong><span> at its clearest. Human attention is the blocker on multi-agent workflows, and attention does not parallelize. Scaling infrastructure to orchestrate more than a pair of agents is not trivial either, and the industry is still building the tooling for users to scale agents. 
As Karpathy put it, </span><em><a href="https://x.com/karpathy/status/2031767720933634100"><span>we will need a bigger IDE</span></a></em><span>.</span></p><p style="text-align: justify;"><span>While agent concurrency currently faces a functional ceiling, the market is introducing solutions to address these limitations. These include Anthropic&#8217;s release of agent teams, as well as platforms like </span><a href="https://www.conductor.build/"><span>Conductor</span></a><span> that facilitate parallel agent execution. For organizations aiming to scale productivity, investing in new infrastructure and tooling, alongside strategies for managing human attention, has become essential. We think the 16% of users running 3+ agents in multi-agent workflows overlap heavily with the teams rolling out autonomous agents and capturing the pull-request gains shown in Figure 4.</span></p><p style="text-align: justify;"><span>Cost shock is affecting companies broadly, but these frontier behaviors are creating a divergence in productivity that median users aren&#8217;t seeing.</span></p><p style="text-align: justify;"><span>These three behaviors are why the gains stay concentrated. Attention does not parallelize, so only outliers turn heavy spend into output while everyone else pays a rising bill for less. 
The next part looks at why the bill climbs even when sticker prices hold still, and why more tokens eventually stop buying more output.</span></p><p><strong><span>PART 2</span></strong></p><h1><strong><span>The economics of tokens</span></strong></h1><p style="text-align: justify;"><em><span>Where the cost comes from, and why more stops working.</span></em></p><p style="text-align: justify;"><span>At GTC 2026, Jensen Huang told the room that a $500,000 engineer who isn&#8217;t spending </span><a href="https://www.rdworldonline.com/nvidia-ceo-jensen-huang-says-spend-250k-on-ai-tokens-annually-or-hell-go-ape/"><span>$250,000 a year on tokens is a wasted seat</span></a><span>. Tokens, in his framing, now belong in the comp package. Sam Altman went further, floating universal basic compute for everyone. The industry pitch is that more tokens are always worth buying, and this part is where that pitch meets the invoice. We looked into whether it is always true, and whether AI labs genuinely mean it or are selling a narrative.</span></p><p style="text-align: justify;"><span>To dig into this, we studied the price per token: how the </span><strong><span>effective cost</span></strong><span> changes when a new model is released, how new </span><strong><span>agentic features</span></strong><span> push token consumption, and whether </span><strong><span>tokenmaxxing</span></strong><span> is the right strategy for companies to adopt.</span></p><h3><em>If you want to read the full report, you can find it <a href="https://jellyfish.co/state-of-ai-software-engineering/">here</a></em></h3><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://jellyfishresearch.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://jellyfishresearch.substack.com/subscribe?"><span>Subscribe 
now</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[How building an MCP App changed the way we understand the value of MCP]]></title><description><![CDATA[It&#8217;s been over a year now since we released our Jellyfish MCP.]]></description><link>https://jellyfishresearch.substack.com/p/how-building-an-mcp-app-changed-the</link><guid isPermaLink="false">https://jellyfishresearch.substack.com/p/how-building-an-mcp-app-changed-the</guid><dc:creator><![CDATA[Sofia Thompson]]></dc:creator><pubDate>Thu, 11 Jun 2026 13:30:36 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/1d7b77ff-437f-46fc-9bbf-ea1b2fbc679c_2400x1256.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>It&#8217;s been over a year now since we released our Jellyfish MCP. We&#8217;ve learned a ton, had a lot of fun keeping up with the latest technological developments, and delivered real tangible value to our customers along the way. Since the beginning of this year alone, usage (requests) has nearly tripled and the number of customers using our MCP has grown around 40% month over month.</p><p>The most important value that it brings to <em>us</em>, however, is how usage patterns and customer interactions with it inform our thinking about the future of our product. As more and more SaaS companies move towards a &#8220;headless&#8221; product vision, we&#8217;ve been thinking a lot about where Jellyfish fits into that world, and the MCP has provided a perfect testing ground for our hypotheses and experiments.</p><p>For a product like ours, MCP isn&#8217;t just a new piece of UI. Its promise is deeper: a flexible, conversational access point to the underlying data. A user can point an LLM at our server and interrogate their Jellyfish data in essentially any way they want, including questions we never built a view for in the web app. The data stops being trapped behind the views we happened to build. 
The user brings their own client, asks whatever they want, and gets the answer back in whatever shape is useful that day: a table, a summary, or a chart, without us having to anticipate the question in advance.</p><p>So when Anthropic launched MCP Apps earlier this year, we were thrilled by the thought of being able to offer our customers a new, highly flexible way to interact with their data. We thought this might be the thing we had been waiting for to finally bridge the gap between our MCP and the primary product surface. We had to try it out, and we found what we thought was a perfect use case.</p><h2>The Experiment</h2><p>A customer of ours, a Jellyfish MCP power user, had been running a Monday-morning ritual on top of our MCP server for months. He&#8217;d paste a carefully refined prompt into Claude Desktop and it would pull a curated subset of his initiatives from Jellyfish, format them into a card grid, and generate a week-over-week prose summary. He&#8217;d built the workflow himself, refined it over time, and while a few of the steps were manual and a little annoying, for the most part it worked.</p><p>We thought an MCP App could be a great way to convert his improvised solution into something more polished, and in many ways it was. Just not in all the ways we wanted it to be. We spoke to him about it, identified his pain points, and set out to build him a better solution. The gap we set out to close was the lack of a persistent, refreshable, and shareable home for his workflow: he was regenerating his summary from scratch every week, with no reliable record of old data and no way to refresh without retyping his long, carefully crafted prompt. 
So we built him the real version: a highly configurable interactive dashboard, served from our MCP server, rendered inline in Claude Desktop.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!d4Kz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c981b08-e618-4f78-be81-c6401f3cefb3_1401x1327.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!d4Kz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c981b08-e618-4f78-be81-c6401f3cefb3_1401x1327.png 424w, https://substackcdn.com/image/fetch/$s_!d4Kz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c981b08-e618-4f78-be81-c6401f3cefb3_1401x1327.png 848w, https://substackcdn.com/image/fetch/$s_!d4Kz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c981b08-e618-4f78-be81-c6401f3cefb3_1401x1327.png 1272w, https://substackcdn.com/image/fetch/$s_!d4Kz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c981b08-e618-4f78-be81-c6401f3cefb3_1401x1327.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!d4Kz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c981b08-e618-4f78-be81-c6401f3cefb3_1401x1327.png" width="1401" height="1327" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5c981b08-e618-4f78-be81-c6401f3cefb3_1401x1327.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1327,&quot;width&quot;:1401,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!d4Kz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c981b08-e618-4f78-be81-c6401f3cefb3_1401x1327.png 424w, https://substackcdn.com/image/fetch/$s_!d4Kz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c981b08-e618-4f78-be81-c6401f3cefb3_1401x1327.png 848w, https://substackcdn.com/image/fetch/$s_!d4Kz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c981b08-e618-4f78-be81-c6401f3cefb3_1401x1327.png 1272w, https://substackcdn.com/image/fetch/$s_!d4Kz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c981b08-e618-4f78-be81-c6401f3cefb3_1401x1327.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>It had a card grid of major active initiatives, FTE investment, progress bars, projected dates, week-over-week deltas, and per-card drill-downs into weekly trajectory data for each one. There were status and team filters at the top, plus a saved combination of fields, statuses, and team scope that the user could configure to their liking. Mechanically, it was five MCP tools, an HTML resource for the iframe, and local JSON persistence for configuration preferences in the user&#8217;s home directory. The whole thing ran as part of our existing MCP server, with no separate deployment, no hosted infrastructure, and no new auth flow. We used the standard patterns for MCP Apps: iframe-as-resource, companion tools backing every interactive control, and the ability to fetch additional data on demand without an additional LLM round-trip. It worked exactly as promised.</p><h2>The realization</h2><p>Ultimately, an MCP App doesn&#8217;t extend the flexible-data-access promise. 
In some ways it even cuts against it. The tool layer gives the user open-ended ways to interrogate the data. An MCP App takes a slice of that and freezes it back into a fixed, pre-built view. That&#8217;s genuinely useful, but it <em>is</em> still a view, with the same essential shape and constraints as any other UI we&#8217;d built.</p><p>So the offering ends up being a third category of thing, sitting alongside (1) our main web app and (2) the flexible tool layer. It doesn&#8217;t have the infinite flexibility of the tool layer, and it doesn&#8217;t have the depth of the full app. It&#8217;s its own surface, in between.</p><p>That&#8217;s not a failure mode; it&#8217;s just what an App is, and it&#8217;s worth being clear-eyed about that before you build one. For a user who already lives in our main app, a thinner version of that, embedded in chat, isn&#8217;t actually much of an upgrade. The chat container, the iframe sandbox, and the available tool surface all cap how much depth you can push through it.</p><p>The chat surface is for glanceable previews and quick actions. The full product is for deep work. The tool layer is for everything in between that we hadn&#8217;t anticipated or built a view for.</p><h2>Where we&#8217;re going next</h2><p>While we do still see genuine value in MCP Apps, and we&#8217;re planning to release the one we have, for now we&#8217;re largely focusing our efforts on expanding and reinforcing the underlying infrastructure on top of which our MCP is built. This means designing better, more efficient, MCP-friendly endpoints, implementing a more flexible permissions structure for improved shareability, and adding better instrumentation.</p><p>What we want our MCP to offer people right now is <em>more of the flexibility:</em> the open-ended, interrogate-it-any-way-you-want access that drew us to the protocol in the first place. 
That&#8217;s the real value of MCP to us, and this project helped to clarify that.</p><p>We&#8217;ve been doing our homework and learning about some really interesting, creative solutions that others are implementing when faced with these same challenges, and we&#8217;re testing out some of them now. Stay tuned for more about how we&#8217;re approaching these challenges and what we&#8217;ve been learning along the way!</p>]]></content:encoded></item><item><title><![CDATA[Product Update: The Jellyfish MCP Is Now Documentation-Aware]]></title><description><![CDATA[Grounded answers, plus real docs for your agentic coding loops]]></description><link>https://jellyfishresearch.substack.com/p/product-update-the-jellyfish-mcp</link><guid isPermaLink="false">https://jellyfishresearch.substack.com/p/product-update-the-jellyfish-mcp</guid><dc:creator><![CDATA[Tomas Pardinas]]></dc:creator><pubDate>Wed, 03 Jun 2026 14:17:26 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/00cab37b-66f3-4e98-a1f4-e3f73dc59ed6_2400x1256.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!55rF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9244f3b-7095-4d0c-a532-92d24ad3f20b_2048x1072.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!55rF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9244f3b-7095-4d0c-a532-92d24ad3f20b_2048x1072.png 424w, https://substackcdn.com/image/fetch/$s_!55rF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9244f3b-7095-4d0c-a532-92d24ad3f20b_2048x1072.png 
848w, https://substackcdn.com/image/fetch/$s_!55rF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9244f3b-7095-4d0c-a532-92d24ad3f20b_2048x1072.png 1272w, https://substackcdn.com/image/fetch/$s_!55rF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9244f3b-7095-4d0c-a532-92d24ad3f20b_2048x1072.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!55rF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9244f3b-7095-4d0c-a532-92d24ad3f20b_2048x1072.png" width="1456" height="762" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a9244f3b-7095-4d0c-a532-92d24ad3f20b_2048x1072.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:762,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!55rF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9244f3b-7095-4d0c-a532-92d24ad3f20b_2048x1072.png 424w, https://substackcdn.com/image/fetch/$s_!55rF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9244f3b-7095-4d0c-a532-92d24ad3f20b_2048x1072.png 848w, 
https://substackcdn.com/image/fetch/$s_!55rF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9244f3b-7095-4d0c-a532-92d24ad3f20b_2048x1072.png 1272w, https://substackcdn.com/image/fetch/$s_!55rF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9244f3b-7095-4d0c-a532-92d24ad3f20b_2048x1072.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p style="text-align: justify;">The Jellyfish MCP just got a little smarter about Jellyfish itself. 
With our latest update, your AI assistant can now reach into the Jellyfish help center &#8212; searching our documentation and reading full articles on demand &#8212; so the answers it gives are <strong>grounded</strong> in how the product actually works.</p><p style="text-align: justify;">If you&#8217;ve used the Jellyfish MCP to ask about your team&#8217;s effort allocation, delivery metrics, or AI impact, you already know how much faster it is to get insights in a single conversation. But until now, questions about <em>how Jellyfish works</em> &#8212; how to configure a feature, where a metric comes from, how to set something up &#8212; were answered from the <strong>model&#8217;s general knowledge alone</strong>. That left room for <strong>vague or out-of-date guidance.</strong></p><p style="text-align: justify;"><strong>What&#8217;s new</strong></p><p style="text-align: justify;">We added two new tools, backed by two new endpoints, that connect the MCP to the Jellyfish help center. When you ask a &#8220;how do I&#8230;&#8221; or &#8220;how does this work&#8221; question, the assistant first searches our documentation to find the most relevant articles, then pulls in their full content as context before answering.</p><p style="text-align: justify;"><strong>What this means for you</strong></p><ul><li><p style="text-align: justify;">Fewer generic answers. Guidance reflects the current product, not a best guess.</p></li><li><p style="text-align: justify;">The source is right there. Responses are backed by the actual help articles, so you can dig deeper when you want to.</p></li><li><p style="text-align: justify;">One conversation, end to end. 
Ask about your data and how to act on it without leaving the assistant or hunting through docs in another tab.</p></li></ul><p style="text-align: justify;"><strong>How to use it</strong></p><p style="text-align: justify;">You don&#8217;t need to do anything special. Just ask your questions naturally, and the assistant decides on its own when to consult the help center. This is especially powerful in two patterns:</p><ul><li><p><strong>Ask questions:</strong> Use plain language via your agent, for example <em>&#8220;How do I set up SSO in Jellyfish?&#8221;</em> or <em>&#8220;How does AI Impact measure adoption?&#8221;</em> The assistant searches the documentation, pulls the most relevant article, and answers with steps grounded in how the product actually works, with no tab-switching and no guesswork.</p></li></ul><ul><li><p><strong>Build context for agentic coding loops:</strong> If you&#8217;re using an AI agent to write code against the Jellyfish API, the MCP can now feed it the right documentation as it works. Instead of the agent guessing at endpoints or parameters, it can search the help center, read the relevant articles, and ground its implementation in real, current docs. 
The result is more accurate code, fewer wrong turns, and a tighter loop &#8212; the agent pulls its own context exactly when it needs it rather than relying on what it happened to know.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5Uos!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c02cc1b-df22-4334-8610-3d163968fbdc_1986x2144.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5Uos!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c02cc1b-df22-4334-8610-3d163968fbdc_1986x2144.png 424w, https://substackcdn.com/image/fetch/$s_!5Uos!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c02cc1b-df22-4334-8610-3d163968fbdc_1986x2144.png 848w, https://substackcdn.com/image/fetch/$s_!5Uos!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c02cc1b-df22-4334-8610-3d163968fbdc_1986x2144.png 1272w, https://substackcdn.com/image/fetch/$s_!5Uos!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c02cc1b-df22-4334-8610-3d163968fbdc_1986x2144.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5Uos!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c02cc1b-df22-4334-8610-3d163968fbdc_1986x2144.png" width="692" height="747.0533736153071" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8c02cc1b-df22-4334-8610-3d163968fbdc_1986x2144.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2144,&quot;width&quot;:1986,&quot;resizeWidth&quot;:692,&quot;bytes&quot;:491181,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://jellyfishresearch.substack.com/i/200175361?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd06dbb8-56a6-4efd-af9d-61ba9bbc08d8_1986x2464.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5Uos!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c02cc1b-df22-4334-8610-3d163968fbdc_1986x2144.png 424w, https://substackcdn.com/image/fetch/$s_!5Uos!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c02cc1b-df22-4334-8610-3d163968fbdc_1986x2144.png 848w, https://substackcdn.com/image/fetch/$s_!5Uos!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c02cc1b-df22-4334-8610-3d163968fbdc_1986x2144.png 1272w, https://substackcdn.com/image/fetch/$s_!5Uos!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c02cc1b-df22-4334-8610-3d163968fbdc_1986x2144.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div></li></ul><p style="text-align: justify;"><strong>Image:</strong> A documentation example showing how the MCP returns reliable information about our API.</p><p style="text-align: justify;"><strong>Setup:</strong></p><p style="text-align: justify;">Nothing changes about how you use the MCP. If you already have the Claude Desktop extension installed, you will need to update the MCP config to the latest release, 1.1.1.</p><ul><li><p><strong>For MCP Extension Users:</strong> Update your MCP config with the new release. 
Install the extension by downloading the .mcpb file from the <a href="https://github.com/Jellyfish-AI/jellyfish-mcp/releases">Jellyfish MCP GitHub repository</a>, double-clicking it to install, and providing your Jellyfish API token when prompted.</p></li><li><p><strong>For NPM users: </strong>The update will be pushed automatically, so no setup is required for existing users. New users can follow the instructions in the <a href="https://github.com/Jellyfish-AI/jellyfish-mcp/releases">Jellyfish MCP GitHub repository</a>.</p></li></ul><p style="text-align: justify;">Grounding AI in <strong>trustworthy</strong>, current context is one of the hardest parts of making it genuinely useful, and connecting the MCP to our own documentation is a meaningful step in that direction.</p><p style="text-align: justify;">It&#8217;s one more way we&#8217;re working to make sure that when you ask Jellyfish a question, you get an answer you can <strong>rely on and trust</strong>.</p>]]></content:encoded></item><item><title><![CDATA[Our Journey with OpenClaw: Jellyclaw]]></title><description><![CDATA[At Jellyfish, we on the research team are looking for ways to innovate and research new trends. 
OpenClaw had been on our radar since its inception, and we decided to start this journey to understand]]></description><link>https://jellyfishresearch.substack.com/p/our-journey-with-openclaw-jellyclaw</link><guid isPermaLink="false">https://jellyfishresearch.substack.com/p/our-journey-with-openclaw-jellyclaw</guid><pubDate>Thu, 21 May 2026 19:41:20 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/6a03fb54-d89f-4d40-8725-e5bd2b526eca_2400x1256.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Authors: Josiah Bruner, Tomas Pardi&#241;as</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SJ9H!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b90184d-58ef-44f9-95f0-0034b6c3d176_2400x1256.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SJ9H!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b90184d-58ef-44f9-95f0-0034b6c3d176_2400x1256.png 424w, https://substackcdn.com/image/fetch/$s_!SJ9H!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b90184d-58ef-44f9-95f0-0034b6c3d176_2400x1256.png 848w, https://substackcdn.com/image/fetch/$s_!SJ9H!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b90184d-58ef-44f9-95f0-0034b6c3d176_2400x1256.png 1272w, https://substackcdn.com/image/fetch/$s_!SJ9H!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b90184d-58ef-44f9-95f0-0034b6c3d176_2400x1256.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!SJ9H!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b90184d-58ef-44f9-95f0-0034b6c3d176_2400x1256.png" width="1456" height="762" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9b90184d-58ef-44f9-95f0-0034b6c3d176_2400x1256.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:762,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:75912,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://jellyfishresearch.substack.com/i/198745476?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b90184d-58ef-44f9-95f0-0034b6c3d176_2400x1256.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!SJ9H!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b90184d-58ef-44f9-95f0-0034b6c3d176_2400x1256.png 424w, https://substackcdn.com/image/fetch/$s_!SJ9H!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b90184d-58ef-44f9-95f0-0034b6c3d176_2400x1256.png 848w, https://substackcdn.com/image/fetch/$s_!SJ9H!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b90184d-58ef-44f9-95f0-0034b6c3d176_2400x1256.png 1272w, https://substackcdn.com/image/fetch/$s_!SJ9H!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b90184d-58ef-44f9-95f0-0034b6c3d176_2400x1256.png 
1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><h1>Context</h1><p style="text-align: justify;">When OpenClaw came out, it changed the AI industry much as ChatGPT did and rapidly became one of the fastest-growing repos in history by star count.</p><p style="text-align: justify;">Even major AI players weighed in, describing it as:</p><blockquote><p style="text-align: justify;"><em>The most important software release in history and the &#8220;new operating system&#8221; for agentic AI <a href="https://fathomjournal.org/3b2e0f42smm/51e19d92-5uQcDQs2SfQ.html">(1)</a></em></p></blockquote><p style="text-align: 
justify;">Our journey started at launch in February and immediately hit major roadblocks. OpenClaw seemed to be a great tool for consumers to experiment with, but major security concerns were still unresolved, leading to a wave of articles calling it &#8220;Unusable&#8221; or likening it to &#8220;Giving the keys to the kingdom&#8221;.</p><p style="text-align: justify;">This slowed us down while we verified whether the concerns were genuine, but good ideas eventually win out, so we decided to start simple and take the necessary precautions.</p><p style="text-align: justify;">Our research shows that autonomous agents are becoming more common: in our most recent study, power users ran agents on a single task for more than 160 minutes. Although that finding was based on Claude Code usage, we realized the same behaviour is emerging in other agents like OpenClaw, and we decided to research them.</p><h1 style="text-align: justify;">Build it simple and show it works</h1><p style="text-align: justify;">The first time a big idea is released, it generates a lot of excitement and suspicion, and fear can override reasoning entirely. This has happened throughout history with major inventions. My favourite example is bridges: when they were first built, citizens did not believe they were safe to use. Proving otherwise took a &#8220;show me, don&#8217;t tell me&#8221; moment. In 1884, a year after the Brooklyn Bridge opened, P. T. Barnum led 21 elephants across it to show it was safe: <em>show me, don&#8217;t tell me</em>.</p><p style="text-align: justify;">We decided to start simple and move on to more complex use cases as we went. 
If we could at least demonstrate in a simple environment that OpenClaw is safe, then we could move faster to understand OpenClaw&#8217;s capabilities.</p><h1 style="text-align: justify;">Security</h1><p style="text-align: justify;">Jellyfish&#8217;s security program heavily prioritizes understanding <em>risk</em> and ensuring such risk is reasonable for the business and our customers before pursuing any initiatives. We do this using standard security engineering practices (threat modeling, etc.), but uniquely incorporate actuarial-style concepts to ensure there is statistical rigor in our methods.</p><p style="text-align: justify;">OpenClaw deployments are a unique challenge. OpenClaw is highly configurable, and its own security model is highly dependent on such configuration (its <a href="https://docs.openclaw.ai/gateway/security">security documentation</a> is about 41 pages long!). Further, the attack surface is quite large and changes are made quickly. At the time of writing, there have been <a href="https://github.com/jgamblin/OpenClawCVEs/#-all-security-advisories-165">165 security advisories</a> (48 CVEs) since its release, which is unusually high.</p><p style="text-align: justify;">A few things quickly became apparent during the security review:</p><ol><li><p style="text-align: justify;">There is a lot of <em>uncertainty</em> stemming from the product, the use case, and the non-determinism of LLMs.</p></li><li><p style="text-align: justify;">The primary way to reliably manage risk is to ensure the <em>impact</em> of any issues is small.</p></li><li><p style="text-align: justify;">The secondary way to manage risk is to constrain the capabilities. 
This is actually very hard because as you make a system more flexible, the capabilities increase superlinearly.</p></li></ol><p style="text-align: justify;">This is no surprise and suggests a decent model for thinking about risk of agentic systems:</p><p>Agentic systems&#8217; risk profile is related strongly to two factors:</p><ol><li><p><em>Flexibility of the harness</em></p></li><li><p><em>Sensitivity of the harness.</em></p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!RsnD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb4fb8db-4a42-45c0-ba32-7039c1f36e21_1600x1200.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!RsnD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb4fb8db-4a42-45c0-ba32-7039c1f36e21_1600x1200.png 424w, https://substackcdn.com/image/fetch/$s_!RsnD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb4fb8db-4a42-45c0-ba32-7039c1f36e21_1600x1200.png 848w, https://substackcdn.com/image/fetch/$s_!RsnD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb4fb8db-4a42-45c0-ba32-7039c1f36e21_1600x1200.png 1272w, https://substackcdn.com/image/fetch/$s_!RsnD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb4fb8db-4a42-45c0-ba32-7039c1f36e21_1600x1200.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!RsnD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb4fb8db-4a42-45c0-ba32-7039c1f36e21_1600x1200.png" width="1456" height="1092" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/db4fb8db-4a42-45c0-ba32-7039c1f36e21_1600x1200.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1092,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!RsnD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb4fb8db-4a42-45c0-ba32-7039c1f36e21_1600x1200.png 424w, https://substackcdn.com/image/fetch/$s_!RsnD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb4fb8db-4a42-45c0-ba32-7039c1f36e21_1600x1200.png 848w, https://substackcdn.com/image/fetch/$s_!RsnD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb4fb8db-4a42-45c0-ba32-7039c1f36e21_1600x1200.png 1272w, https://substackcdn.com/image/fetch/$s_!RsnD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb4fb8db-4a42-45c0-ba32-7039c1f36e21_1600x1200.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset 
pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: center;"><em>Graph showing risk as it relates to flexibility (f) and sensitivity (s).</em></p><p style="text-align: center;"><em>Note that risk grows more quickly with flexibility than with sensitivity.</em></p><p style="text-align: justify;">To address sensitivity, we set our first security requirement for this project: it must not be even remotely possible for OpenClaw to touch corporate (and definitely not production) data.</p><p style="text-align: justify;">This is a guarantee we <em>really</em> needed. 
To ensure this, we proposed an architecture where:</p><ol><li><p style="text-align: justify;">OpenClaw is deployed:</p><ol><li><p style="text-align: justify;">In a dedicated AWS account; and</p></li><li><p style="text-align: justify;">With account access limited to a few IT and security people; and</p></li><li><p style="text-align: justify;">In a dedicated VPC; and</p></li><li><p style="text-align: justify;">In its own EC2 instance; and</p></li><li><p style="text-align: justify;">With security group settings that limit any network movement in the VPC; and</p></li><li><p style="text-align: justify;">With IAM policies that allow the EC2 instance to make Bedrock API calls and nothing else</p></li></ol></li></ol><p style="text-align: justify;">At first glance, this seemed extremely thorough, but a key risk vector remained: ultimately our employees need to interact with this service (using a Slack channel), and we can&#8217;t guarantee they won&#8217;t accidentally provide something sensitive. We also didn&#8217;t want Slack to become an attack vector if the OpenClaw instance were to become compromised. This led to another slew of requirements, this time constraining flexibility:</p><ol><li><p style="text-align: justify;">Read-only file systems</p></li><li><p style="text-align: justify;">Dedicated internal Slack app with minimal permissions</p></li><li><p style="text-align: justify;">Disable &#8220;Elevated Tools&#8221; in OpenClaw</p></li><li><p style="text-align: justify;">Disable &#8220;Plugins&#8221; in OpenClaw</p></li><li><p style="text-align: justify;">Auto-update to ensure security updates are applied</p></li><li><p style="text-align: justify;">TLS to ensure communication is encrypted</p></li><li><p style="text-align: justify;">No incoming traffic (yes, even with Slack as an &#8220;input&#8221;. 
Thanks, <a href="https://docs.slack.dev/apis/events-api/using-socket-mode/">socket mode</a>)</p></li></ol><p style="text-align: justify;">This is a lot, but it was important, especially for a phase 1 experiment. The 5-year risk of deploying OpenClaw without mitigations was estimated to be <strong>two orders of magnitude </strong>higher than with these applied.</p><p style="text-align: justify;">At this point we felt confident that we could get something deployed and used by research without any significant risk to the business.</p><h1 style="text-align: justify;">Solution: Lightsail</h1><p style="text-align: justify;">Since we have most of our LLMs in Bedrock, using a built-in AWS solution to run the v0 sounded like a good first iteration.</p><p style="text-align: justify;">We pursued two approaches in parallel to see which worked better:</p><ol><li><p style="text-align: justify;">Deploying our own EC2 instance and configuring OpenClaw on it manually using Docker; or</p></li><li><p style="text-align: justify;">Using the (at the time just released) AWS Lightsail option</p></li></ol><p style="text-align: justify;">We quickly discovered Lightsail was the better option for an MVP, since it gave us:</p><ol><li><p style="text-align: justify;">Credential-less management</p></li><li><p style="text-align: justify;">An OpenClaw deployment that was mostly configured in a &#8220;secure default&#8221; way</p></li><li><p style="text-align: justify;">Automatic TLS management</p></li><li><p style="text-align: justify;">Dedicated networking and EC2 instances</p></li></ol><p style="text-align: justify;">Of course, there were downsides:</p><ol><li><p style="text-align: justify;">It&#8217;s not particularly configurable, so things like read-only filesystem mounts were less trivial.</p></li><li><p style="text-align: justify;">Some things, like filesystem access control, are more opaque.</p></li></ol><p style="text-align: justify;">We discovered that some of 
these requirements were already built in for us, and setup was easier than with an EC2 instance. The tradeoff was losing some capabilities, such as setting up a read-only filesystem mount via Lightsail.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xVWX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b9ea3f9-9e9d-48a1-b6b7-0d4ad5cb959a_2048x1595.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xVWX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b9ea3f9-9e9d-48a1-b6b7-0d4ad5cb959a_2048x1595.png 424w, https://substackcdn.com/image/fetch/$s_!xVWX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b9ea3f9-9e9d-48a1-b6b7-0d4ad5cb959a_2048x1595.png 848w, https://substackcdn.com/image/fetch/$s_!xVWX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b9ea3f9-9e9d-48a1-b6b7-0d4ad5cb959a_2048x1595.png 1272w, https://substackcdn.com/image/fetch/$s_!xVWX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b9ea3f9-9e9d-48a1-b6b7-0d4ad5cb959a_2048x1595.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xVWX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b9ea3f9-9e9d-48a1-b6b7-0d4ad5cb959a_2048x1595.png" width="1456" height="1134" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5b9ea3f9-9e9d-48a1-b6b7-0d4ad5cb959a_2048x1595.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1134,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xVWX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b9ea3f9-9e9d-48a1-b6b7-0d4ad5cb959a_2048x1595.png 424w, https://substackcdn.com/image/fetch/$s_!xVWX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b9ea3f9-9e9d-48a1-b6b7-0d4ad5cb959a_2048x1595.png 848w, https://substackcdn.com/image/fetch/$s_!xVWX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b9ea3f9-9e9d-48a1-b6b7-0d4ad5cb959a_2048x1595.png 1272w, https://substackcdn.com/image/fetch/$s_!xVWX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b9ea3f9-9e9d-48a1-b6b7-0d4ad5cb959a_2048x1595.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Some other interesting features came built in:</strong></p><ol><li><p><strong>Memory Snapshots: </strong>Version control of our setup, saved in AWS</p></li><li><p><strong>Buckets: </strong>Storage in AWS for data that gives our OpenClaw agent context</p></li></ol><h1 style="text-align: justify;">Initial Thoughts and Next Steps</h1><p>Good news: It worked! We now have a persistent agent running that employees can communicate with.</p><p>Bad news: It basically just acts like ChatGPT did when it was first released. It can answer a few questions, perform benign actions, and use the Slack channel as persistent context, but not much else. 
This is not surprising: we knew this was going to be heavily constrained.</p><p>This is the start of our journey running agentic solutions, and we have specific use cases we want to unlock next for both research purposes and productivity gains:</p><ul><li><p>Tracking telemetry (OTEL)</p></li><li><p>Running cron jobs for long periods of time</p></li><li><p>Giving it write access to its own workspace to see if it can automate non-sensitive workflows</p></li><li><p>Could we have it write software and push to a dedicated repo? What happens if we do?</p></li></ul><p>We will continue searching for the value/risk optimum and exploring how far this technology can take us.</p>]]></content:encoded></item><item><title><![CDATA[The Opus 4.7 Tokenizer Tax]]></title><description><![CDATA[Anthropic quietly raised Opus prices by changing the tokenizer, not the price tag. Here's what it actually costs when measured across 17K real-world Claude Code users.]]></description><link>https://jellyfishresearch.substack.com/p/the-opus-47-tokenizer-tax</link><guid isPermaLink="false">https://jellyfishresearch.substack.com/p/the-opus-47-tokenizer-tax</guid><dc:creator><![CDATA[Nicholas Arcolano]]></dc:creator><pubDate>Thu, 07 May 2026 19:42:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!KzAO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b955dab-6cb1-4975-9af5-43f7f28e8a79_3600x2100.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We&#8217;ve all had a few weeks to sit with Opus 4.7, and the consensus is that Anthropic just raised the price of Opus by as much as ~40% without changing the price tag.</p><p>The nominal per-token rate is identical to Opus 4.6 ($5/$25 per million), but a new tokenizer maps the same input to more tokens. 
Anthropic&#8217;s own stated range was 1.0-1.35x, but others such as Simon Willison are reporting inflation at levels up to 46%. (He saw this, for example, on the model&#8217;s own system prompt.)</p><p>Same prompt + same words &#8594; bigger bill.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gfh_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbff3d5b0-50d9-494d-9daf-b7fa62c74bf3_2816x1536.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gfh_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbff3d5b0-50d9-494d-9daf-b7fa62c74bf3_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!gfh_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbff3d5b0-50d9-494d-9daf-b7fa62c74bf3_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!gfh_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbff3d5b0-50d9-494d-9daf-b7fa62c74bf3_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!gfh_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbff3d5b0-50d9-494d-9daf-b7fa62c74bf3_2816x1536.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gfh_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbff3d5b0-50d9-494d-9daf-b7fa62c74bf3_2816x1536.png" width="1456" height="794" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bff3d5b0-50d9-494d-9daf-b7fa62c74bf3_2816x1536.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:794,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:4822232,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://jellyfishresearch.substack.com/i/199374408?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbff3d5b0-50d9-494d-9daf-b7fa62c74bf3_2816x1536.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gfh_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbff3d5b0-50d9-494d-9daf-b7fa62c74bf3_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!gfh_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbff3d5b0-50d9-494d-9daf-b7fa62c74bf3_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!gfh_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbff3d5b0-50d9-494d-9daf-b7fa62c74bf3_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!gfh_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbff3d5b0-50d9-494d-9daf-b7fa62c74bf3_2816x1536.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Tomasz Tunguz framed this well in his recent newsletter: smarter models used to be <em>cheaper</em> per outcome. Opus 4.5 needed 76% fewer tokens than Sonnet to reach the same result, making it 60% cheaper despite a higher sticker price. Opus 4.7 reverses the pattern. He calls it a sawtooth: resolution goes up and cost goes up, then efficiency gains bring it back down. Rinse and repeat. The net effect across each cycle is more tokens consumed industrywide.</p><p>This connects directly to the data we&#8217;ve been tracking at Jellyfish. Recently I shared data showing 10x token cost for ~2x throughput at the highest usage decile... and that exponential cost curve just got steeper. 
If your heaviest AI users were already in the $90/PR range, they may have quietly crossed into $125+ territory in the last week without writing a single line of code differently.</p><p>So what has the tokenizer tax turned out to be in reality? Our current estimate: <strong>17%</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KzAO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b955dab-6cb1-4975-9af5-43f7f28e8a79_3600x2100.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KzAO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b955dab-6cb1-4975-9af5-43f7f28e8a79_3600x2100.png 424w, https://substackcdn.com/image/fetch/$s_!KzAO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b955dab-6cb1-4975-9af5-43f7f28e8a79_3600x2100.png 848w, https://substackcdn.com/image/fetch/$s_!KzAO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b955dab-6cb1-4975-9af5-43f7f28e8a79_3600x2100.png 1272w, https://substackcdn.com/image/fetch/$s_!KzAO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b955dab-6cb1-4975-9af5-43f7f28e8a79_3600x2100.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KzAO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b955dab-6cb1-4975-9af5-43f7f28e8a79_3600x2100.png" width="1456" height="849" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0b955dab-6cb1-4975-9af5-43f7f28e8a79_3600x2100.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:849,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:227407,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://jellyfishresearch.substack.com/i/199374408?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b955dab-6cb1-4975-9af5-43f7f28e8a79_3600x2100.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KzAO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b955dab-6cb1-4975-9af5-43f7f28e8a79_3600x2100.png 424w, https://substackcdn.com/image/fetch/$s_!KzAO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b955dab-6cb1-4975-9af5-43f7f28e8a79_3600x2100.png 848w, https://substackcdn.com/image/fetch/$s_!KzAO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b955dab-6cb1-4975-9af5-43f7f28e8a79_3600x2100.png 1272w, https://substackcdn.com/image/fetch/$s_!KzAO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b955dab-6cb1-4975-9af5-43f7f28e8a79_3600x2100.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>What&#8217;s going on? First, note that input tokens are just part of the equation. In practice, the real costs depend on how much you use Claude Code, the proportion of extended thinking (since input tokens are proportionally less of the total as thinking time increases), and the type of work you do. (For example, perhaps Opus 4.7 works more efficiently, resulting in fewer tokens overall?)</p><p>To sort this out, we took a look at Jellyfish data across 17K Claude Code users who switched from Opus 4.6 to 4.7. 
Using a statistical model to control for individual differences in the type and amount of work across developers, we saw that switching to 4.7 was associated with a 17% increase in cost on average, with this &#8220;tax rate&#8221; decreasing as thinking time increases (and increasing as thinking time decreases).</p><p>The chart above illustrates what this looks like for three scenarios, based on level of Claude Code token consumption: low (bottom 20%), median, and high (top 20%). These figures represent typical levels of weekly usage cost, proportion of extended thinking, and total expected change due to switching from Opus 4.6 to 4.7.</p><p>The headline: the tokenizer tax is real and significant, but also less than half what you&#8217;d estimate from first principles. It&#8217;s a good reminder that these systems are complex, and each engineering team&#8217;s use cases are unique &#8211; the only reliable way to know the actual cost impact is to measure it yourself.</p>]]></content:encoded></item><item><title><![CDATA[AI Agent Autonomy]]></title><description><![CDATA[Autonomous agents are becoming more ubiquitous, with releases from the industry and users experimenting with running an AI on their computer. 
We asked ourselves: what does autonomy look like in AI age]]></description><link>https://jellyfishresearch.substack.com/p/ai-agent-autonomy</link><guid isPermaLink="false">https://jellyfishresearch.substack.com/p/ai-agent-autonomy</guid><dc:creator><![CDATA[Tomas Pardinas]]></dc:creator><pubDate>Thu, 07 May 2026 13:48:01 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/ff982fd0-3501-4b6a-b931-920fb727ba7b_960x630.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!y0NM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44afc2d5-ce06-4ebc-8ef7-57d5e19d01f6_2048x1072.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!y0NM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44afc2d5-ce06-4ebc-8ef7-57d5e19d01f6_2048x1072.png 424w, https://substackcdn.com/image/fetch/$s_!y0NM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44afc2d5-ce06-4ebc-8ef7-57d5e19d01f6_2048x1072.png 848w, https://substackcdn.com/image/fetch/$s_!y0NM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44afc2d5-ce06-4ebc-8ef7-57d5e19d01f6_2048x1072.png 1272w, https://substackcdn.com/image/fetch/$s_!y0NM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44afc2d5-ce06-4ebc-8ef7-57d5e19d01f6_2048x1072.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!y0NM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44afc2d5-ce06-4ebc-8ef7-57d5e19d01f6_2048x1072.png" width="717" height="375.24313186813185" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/44afc2d5-ce06-4ebc-8ef7-57d5e19d01f6_2048x1072.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:762,&quot;width&quot;:1456,&quot;resizeWidth&quot;:717,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!y0NM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44afc2d5-ce06-4ebc-8ef7-57d5e19d01f6_2048x1072.png 424w, https://substackcdn.com/image/fetch/$s_!y0NM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44afc2d5-ce06-4ebc-8ef7-57d5e19d01f6_2048x1072.png 848w, https://substackcdn.com/image/fetch/$s_!y0NM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44afc2d5-ce06-4ebc-8ef7-57d5e19d01f6_2048x1072.png 1272w, https://substackcdn.com/image/fetch/$s_!y0NM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44afc2d5-ce06-4ebc-8ef7-57d5e19d01f6_2048x1072.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft 
pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2 style="text-align: justify;">Main Findings:</h2><ul><li><p style="text-align: justify;"><strong>Sessions on average are very short in span, but the top 0.1% of agents are running for more than an hour</strong></p></li><li><p style="text-align: justify;"><strong>New models like Opus 4.7 are increasing the time agents can run on long-running tasks</strong></p></li><li><p><strong>Long running agents are more productive both in commits and PRs and do more feature work and less refactoring work compared to short running agents</strong></p></li><li><p><strong>+80% of long running turns present low supervision involvement and low supervision produces the most code during long tasks  </strong></p></li></ul><p style="text-align: justify;">Below we present our 
methodology and findings in detail. </p><p style="text-align: justify;">Autonomy is a nascent concept in AI agents, and the industry is only beginning to adapt to the idea of delegating to a system that runs for hours. The trend seems to be that users will get more comfortable with agents that run longer before arriving at a decision. According to <a href="https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/">METR</a>, the length of tasks AI can do is doubling every 7 months. At this pace, by the end of 2026 agents will be able to run for as long as 4 hours on a single task.</p><p style="text-align: justify;">We are developing this research to better understand how autonomous runs are evolving, how productive these agents are, what type of work autonomy is useful for, and whether there are unexpected side effects for users running these long tasks.</p><p style="text-align: justify;">To study autonomy, we focus on two traits: </p><ul><li><p style="text-align: justify;">Length of task - to capture how long agents are capable of running productively.</p></li><li><p style="text-align: justify;">Supervision - to understand how humans interact with agents running for long periods of time.</p></li></ul><h2>Methodology </h2><p style="text-align: justify;">To start: what is an agent and how should we define it? There are a variety of definitions of what an agent is and how we should measure it. 
</p><p style="text-align: justify;">Another complexity of studying agents is the fast evolution of the agent landscape and its workflows. As we concluded in our previous research, within a span of months new workflows appeared that moved the user away from the IDE toward a fully agentic workflow.</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;6a401350-51e2-4c16-b1cc-7581abcf50ee&quot;,&quot;caption&quot;:&quot;As AI coding tools mature, some questions are rising about AI usage: How my team is using agents differently and what the workflow looks like ? How should we measure the different agents and workflows?&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;lg&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Measuring Agentic Workflows &quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:411773788,&quot;name&quot;:&quot;Tomas Pardinas&quot;,&quot;bio&quot;:&quot;Product 
Researcher&quot;,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!RrRs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9e0ee42-8daa-4853-8a1d-2b43c0f89f67_200x200.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2026-02-20T20:59:36.062Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!Hp7k!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf2aefd8-5160-4fac-af93-726aff24993c_1138x428.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://jellyfishresearch.substack.com/p/measuring-agentic-workflows&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:188555742,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:7,&quot;comment_count&quot;:0,&quot;publication_id&quot;:6816196,&quot;publication_name&quot;:&quot;Jellyfish Research&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!Kioy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57d4c023-fbf3-4996-b9dd-1cb76e830f75_1280x1280.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p style="text-align: justify;"></p><p style="text-align: justify;">We took a pragmatic approach to studying agents and adopted the following <a href="https://simonwillison.net/2025/Sep/18/agents/">definition</a>: <em>&#8220;<strong>An LLM agent runs tools in a loop to achieve a goal&#8221;. </strong></em></p><p style="text-align: justify;">For this research we won&#8217;t propose a definition of autonomy; instead, we simply define a threshold to categorize agents by turn duration and human involvement with the agent. 
To understand autonomy, we focus on agents at the top percentile of turn duration. We define a turn as a sequence of consecutive CLI interactions bounded by human input. To analyze supervision, we condense turns into sessions, since productivity metrics align more naturally at the session level than at the turn level.</p><p style="text-align: justify;">Our main findings are derived from a proprietary dataset that combines OTEL schema telemetry from Claude Code with productivity metrics sourced from GitHub signals. More information on metric definitions can be found in the appendix. </p><h2>Main Findings:</h2><h3><strong>Sessions on average are very short in span, but the top 0.1% of agents are running for more than an hour </strong></h3><p><em>How long are agents running autonomously? </em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-OQT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc652a74-fb18-45de-ad51-740f3eca3de9_989x562.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-OQT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc652a74-fb18-45de-ad51-740f3eca3de9_989x562.png 424w, https://substackcdn.com/image/fetch/$s_!-OQT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc652a74-fb18-45de-ad51-740f3eca3de9_989x562.png 848w, https://substackcdn.com/image/fetch/$s_!-OQT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc652a74-fb18-45de-ad51-740f3eca3de9_989x562.png 1272w, 
https://substackcdn.com/image/fetch/$s_!-OQT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc652a74-fb18-45de-ad51-740f3eca3de9_989x562.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-OQT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc652a74-fb18-45de-ad51-740f3eca3de9_989x562.png" width="989" height="562" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dc652a74-fb18-45de-ad51-740f3eca3de9_989x562.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:562,&quot;width&quot;:989,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:30600,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-OQT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc652a74-fb18-45de-ad51-740f3eca3de9_989x562.png 424w, https://substackcdn.com/image/fetch/$s_!-OQT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc652a74-fb18-45de-ad51-740f3eca3de9_989x562.png 848w, https://substackcdn.com/image/fetch/$s_!-OQT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc652a74-fb18-45de-ad51-740f3eca3de9_989x562.png 1272w, 
https://substackcdn.com/image/fetch/$s_!-OQT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc652a74-fb18-45de-ad51-740f3eca3de9_989x562.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;"><strong>Figure 1. </strong>Distribution of Claude Code session duration in wall minutes across percentiles (P50, P90, P95, P99, P99.9). The median session lasts just 1 minute, but durations grow sharply in the long tail &#8212; reaching 8 minutes at P95, 27 minutes at P99, and 94 minutes (~1.6 hours) at P99.9. 
</p><p style="text-align: justify;">The median agent session lasts just one minute. To understand autonomy in agents, it is useful to zoom into the long tail and examine the behaviour of longer sessions, where only a subset of agents handle long-running tasks. </p><p style="text-align: justify;">Understanding the evolution of long tasks gives us a view of how users are running agents in new, creative ways and on more complex tasks. </p><h3><strong>New models like Opus 4.7 are increasing the time agents can run on long-running tasks</strong></h3><p><em>Are long running agents running for longer periods of time? </em></p><p style="text-align: justify;">One interesting pattern is how long-running tasks are evolving with new models. We found that over the last 2 months Opus 4.6 was stable in terms of long-running turns, whereas Opus 4.7, once released, pushed the boundaries of how long agents are capable of running. 
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!q_NM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1095989-2cc2-4257-9876-807d9e2ebc26_896x499.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!q_NM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1095989-2cc2-4257-9876-807d9e2ebc26_896x499.png 424w, https://substackcdn.com/image/fetch/$s_!q_NM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1095989-2cc2-4257-9876-807d9e2ebc26_896x499.png 848w, https://substackcdn.com/image/fetch/$s_!q_NM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1095989-2cc2-4257-9876-807d9e2ebc26_896x499.png 1272w, https://substackcdn.com/image/fetch/$s_!q_NM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1095989-2cc2-4257-9876-807d9e2ebc26_896x499.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!q_NM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1095989-2cc2-4257-9876-807d9e2ebc26_896x499.png" width="896" height="499" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c1095989-2cc2-4257-9876-807d9e2ebc26_896x499.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:499,&quot;width&quot;:896,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:57952,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!q_NM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1095989-2cc2-4257-9876-807d9e2ebc26_896x499.png 424w, https://substackcdn.com/image/fetch/$s_!q_NM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1095989-2cc2-4257-9876-807d9e2ebc26_896x499.png 848w, https://substackcdn.com/image/fetch/$s_!q_NM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1095989-2cc2-4257-9876-807d9e2ebc26_896x499.png 1272w, https://substackcdn.com/image/fetch/$s_!q_NM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1095989-2cc2-4257-9876-807d9e2ebc26_896x499.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;"><strong>Figure 2. </strong>P99.9 turn duration in wall-clock minutes over time, segmented by model (Opus 4.6 vs. Opus 4.7), shown as a smoothed trend with daily values. Opus 4.6 holds steady around 70&#8211;100 minutes, while Opus 4.7 starts lower in mid-March, spikes above 150 minutes during early adoption, and stabilizes near 90 minutes by late April.</p><p style="text-align: justify;">The length of time an agent can run has changed across model releases. We checked this trend across models: users are running progressively longer tasks, but only when they use the latest models, such as Opus 4.7.</p><p style="text-align: justify;">With each new model release, users appear to explore its capabilities and build trust in how long it can run, until durations settle into a stable range where long runs are consistent. 
The increase isn&#8217;t immediate. We think this is because models become more efficient and users need time to explore their capabilities before they attempt longer tasks. </p><h3><strong>Long-running agents are more productive in both commits and PRs </strong></h3><p><em>How do long-running turns compare in productivity with short-running turns? </em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!oD7g!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c9daa25-11e8-4013-b2cb-949b5e8f6842_1291x496.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!oD7g!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c9daa25-11e8-4013-b2cb-949b5e8f6842_1291x496.png 424w, https://substackcdn.com/image/fetch/$s_!oD7g!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c9daa25-11e8-4013-b2cb-949b5e8f6842_1291x496.png 848w, https://substackcdn.com/image/fetch/$s_!oD7g!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c9daa25-11e8-4013-b2cb-949b5e8f6842_1291x496.png 1272w, https://substackcdn.com/image/fetch/$s_!oD7g!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c9daa25-11e8-4013-b2cb-949b5e8f6842_1291x496.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!oD7g!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c9daa25-11e8-4013-b2cb-949b5e8f6842_1291x496.png" width="728" 
height="279.69635941130906" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3c9daa25-11e8-4013-b2cb-949b5e8f6842_1291x496.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:496,&quot;width&quot;:1291,&quot;resizeWidth&quot;:728,&quot;bytes&quot;:44963,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!oD7g!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c9daa25-11e8-4013-b2cb-949b5e8f6842_1291x496.png 424w, https://substackcdn.com/image/fetch/$s_!oD7g!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c9daa25-11e8-4013-b2cb-949b5e8f6842_1291x496.png 848w, https://substackcdn.com/image/fetch/$s_!oD7g!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c9daa25-11e8-4013-b2cb-949b5e8f6842_1291x496.png 1272w, https://substackcdn.com/image/fetch/$s_!oD7g!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c9daa25-11e8-4013-b2cb-949b5e8f6842_1291x496.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Figure 3. </strong>Average commits per turn and merged PRs per turn across four turn-duration cohorts (Below P90, P90&#8211;99, P99&#8211;99.9, P99.9+). Both metrics scale roughly linearly with duration: commits per turn rise from 0.020 to 0.404 (~20x), and PRs per turn rise from 0.016 to 0.178 (~11x). This analysis reflects all interactive Claude Code turns linked to git activity.</p><p style="text-align: justify;">Productivity favours long-running agents. Although these turns run for more than an hour, they produce substantially more commits and merged PRs per turn than short-running turns.</p><p style="text-align: justify;">This pattern is reflected in the idea of <em>tokenmaxing</em>: to see more-than-proportional gains in productivity, we need to look at the top 0.1% of long-running tasks. 
Long-running agents seem to pay off whatever task they are doing, which raises the question: <em>what type of tasks are these long-running agents working on? Are they generalists or specialists? </em>We explore this in the following section.</p><h3><strong>Long-running agents do more feature work and less refactoring </strong></h3><p><em>What type of tasks are long-running agents working on? </em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!69wP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F920963c8-e936-4ba1-8fa3-474969bf00e6_1390x491.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!69wP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F920963c8-e936-4ba1-8fa3-474969bf00e6_1390x491.png 424w, https://substackcdn.com/image/fetch/$s_!69wP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F920963c8-e936-4ba1-8fa3-474969bf00e6_1390x491.png 848w, https://substackcdn.com/image/fetch/$s_!69wP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F920963c8-e936-4ba1-8fa3-474969bf00e6_1390x491.png 1272w, https://substackcdn.com/image/fetch/$s_!69wP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F920963c8-e936-4ba1-8fa3-474969bf00e6_1390x491.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!69wP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F920963c8-e936-4ba1-8fa3-474969bf00e6_1390x491.png" width="1390" height="491" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/920963c8-e936-4ba1-8fa3-474969bf00e6_1390x491.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:491,&quot;width&quot;:1390,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!69wP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F920963c8-e936-4ba1-8fa3-474969bf00e6_1390x491.png 424w, https://substackcdn.com/image/fetch/$s_!69wP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F920963c8-e936-4ba1-8fa3-474969bf00e6_1390x491.png 848w, https://substackcdn.com/image/fetch/$s_!69wP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F920963c8-e936-4ba1-8fa3-474969bf00e6_1390x491.png 1272w, https://substackcdn.com/image/fetch/$s_!69wP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F920963c8-e936-4ba1-8fa3-474969bf00e6_1390x491.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Figure 4. </strong>Pull request type distribution (feat, fix, refactor, chore, other, test, docs, infra/style) by turn-duration cohort. The mix is largely stable across cohorts (feat ranges 22&#8211;25%, fix 25&#8211;27%, and refactor 11&#8211;14%), suggesting long-running agents are not specialized but rather doing the same work mix at greater volume. This analysis reflects PRs attributable to Claude Code turns across the duration distribution.</p><p>If we had to categorize long-running agents, they would look closer to generalists: they do more work overall, but in similar proportions to their short-running peers. The caveat is that they tend to do slightly more feature work (about 3 percentage points more) and slightly less refactoring (about 2 percentage points less). 
</p><h3><strong>The most common feature tasks involve API or UI services</strong></h3><p><em>Feature development tasks in P99.9 sessions </em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!D0pJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2faeb13b-8b0e-4c17-8fbe-167f5769e38d_2048x1124.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!D0pJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2faeb13b-8b0e-4c17-8fbe-167f5769e38d_2048x1124.png 424w, https://substackcdn.com/image/fetch/$s_!D0pJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2faeb13b-8b0e-4c17-8fbe-167f5769e38d_2048x1124.png 848w, https://substackcdn.com/image/fetch/$s_!D0pJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2faeb13b-8b0e-4c17-8fbe-167f5769e38d_2048x1124.png 1272w, https://substackcdn.com/image/fetch/$s_!D0pJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2faeb13b-8b0e-4c17-8fbe-167f5769e38d_2048x1124.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!D0pJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2faeb13b-8b0e-4c17-8fbe-167f5769e38d_2048x1124.png" width="2048" height="1124" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2faeb13b-8b0e-4c17-8fbe-167f5769e38d_2048x1124.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1124,&quot;width&quot;:2048,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:169426,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!D0pJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2faeb13b-8b0e-4c17-8fbe-167f5769e38d_2048x1124.png 424w, https://substackcdn.com/image/fetch/$s_!D0pJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2faeb13b-8b0e-4c17-8fbe-167f5769e38d_2048x1124.png 848w, https://substackcdn.com/image/fetch/$s_!D0pJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2faeb13b-8b0e-4c17-8fbe-167f5769e38d_2048x1124.png 1272w, https://substackcdn.com/image/fetch/$s_!D0pJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2faeb13b-8b0e-4c17-8fbe-167f5769e38d_2048x1124.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Figure 5. </strong>Distribution of feature types built across long-running (P99.9) sessions, ranked by share of feature-development sessions. API &amp; Backend Services (18.2%) and UI &amp; Frontend (14.7%) dominate, followed by Authentication &amp; Authorization (10.6%), Data Pipelines &amp; ETL (9.4%), and Integration &amp; External APIs (8.5%); the long tail covers Database &amp; Storage, Notifications, Configuration, Monitoring, and Security &amp; Compliance, each below 6%. This analysis reflects 341 P99.9 feature-development sessions classified by commit and PR text. </p><p>Zooming in on feature development, we wanted to know whether this is productive work or whether agents are drifting into unrelated work. The commits and PRs show agents building all sorts of features, with backend and frontend work accounting for the most sessions. 
</p><p style="text-align: justify;">But <em>why</em> do agents run for so long to accomplish these tasks? We have two hypotheses: either agents drift from their main objective and take longer than needed, or they wait inside a sequential workflow with heavy external dependencies.</p><p style="text-align: justify;">In a small sample of P99.9 runs we saw, for example, one agent run for more than 3 hours because it was waiting for a CI/CD process to finish, and another run a cron job every 10 minutes without producing any output, behaving more like a &#8220;bot&#8221; than an agent. </p><p style="text-align: justify;">We then studied each of these feature types in depth, excluding bot-like behaviour, and analyzed commit and PR text to understand why each might take longer to finish. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6ZjZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae309b80-db2b-4c59-b3ae-f465c35fa2ae_2047x931.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6ZjZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae309b80-db2b-4c59-b3ae-f465c35fa2ae_2047x931.png 424w, https://substackcdn.com/image/fetch/$s_!6ZjZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae309b80-db2b-4c59-b3ae-f465c35fa2ae_2047x931.png 848w, 
https://substackcdn.com/image/fetch/$s_!6ZjZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae309b80-db2b-4c59-b3ae-f465c35fa2ae_2047x931.png 1272w, https://substackcdn.com/image/fetch/$s_!6ZjZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae309b80-db2b-4c59-b3ae-f465c35fa2ae_2047x931.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6ZjZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae309b80-db2b-4c59-b3ae-f465c35fa2ae_2047x931.png" width="2047" height="931" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ae309b80-db2b-4c59-b3ae-f465c35fa2ae_2047x931.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:931,&quot;width&quot;:2047,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:423959,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6ZjZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae309b80-db2b-4c59-b3ae-f465c35fa2ae_2047x931.png 424w, https://substackcdn.com/image/fetch/$s_!6ZjZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae309b80-db2b-4c59-b3ae-f465c35fa2ae_2047x931.png 848w, 
https://substackcdn.com/image/fetch/$s_!6ZjZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae309b80-db2b-4c59-b3ae-f465c35fa2ae_2047x931.png 1272w, https://substackcdn.com/image/fetch/$s_!6ZjZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae309b80-db2b-4c59-b3ae-f465c35fa2ae_2047x931.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Figure 6. 
</strong>Qualitative deep-dive into the top six feature categories from long-running (P99.9) sessions, pairing each category&#8217;s share with a &#8220;why it runs long&#8221; annotation. API &amp; Backend (18.2%) drives duration through iterative refinement and waits on CI/CD, integration tests, and cross-service coordination; UI &amp; Frontend (14.7%) and Data Pipelines &amp; ETL (9.4%) run long mainly through idle waits on E2E suites, visual regression checks, and Spark/schema-migration jobs; Auth &amp; Authorization (10.6%) shows the lowest idle ratio, meaning more genuinely active development on security and edge cases; Integration &amp; External APIs (8.5%) is dominated by waits on third-party systems; AI/ML &amp; Analytics (6.5%) reflects model-training and experiment-validation cycles. </p><p style="text-align: justify;">Although agents run for long periods, they are probably not active the whole time: depending on the task, they spend part of it waiting on external dependencies. Long agent runtimes might simply reflect how today&#8217;s software is built, and as agents compress their internal cycle time, inefficiencies in sequential workflows will become more evident. </p><h3><strong>Supervision</strong></h3><p style="text-align: justify;">We define supervision as the percentage of time a human is involved per turn. We want to understand how supervision affects long-running tasks and the quality of the code they produce. 
</p><p style="text-align: justify;"><em>Does supervision affect the quality of work, and if so, how?</em></p><h3><strong>More than 80% of long-running turns show low supervision </strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!scHG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bb78654-6995-44f1-982a-fd6958e47b6f_658x495.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!scHG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bb78654-6995-44f1-982a-fd6958e47b6f_658x495.png 424w, https://substackcdn.com/image/fetch/$s_!scHG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bb78654-6995-44f1-982a-fd6958e47b6f_658x495.png 848w, https://substackcdn.com/image/fetch/$s_!scHG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bb78654-6995-44f1-982a-fd6958e47b6f_658x495.png 1272w, https://substackcdn.com/image/fetch/$s_!scHG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bb78654-6995-44f1-982a-fd6958e47b6f_658x495.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!scHG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bb78654-6995-44f1-982a-fd6958e47b6f_658x495.png" width="658" height="495" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4bb78654-6995-44f1-982a-fd6958e47b6f_658x495.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:495,&quot;width&quot;:658,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:34006,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!scHG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bb78654-6995-44f1-982a-fd6958e47b6f_658x495.png 424w, https://substackcdn.com/image/fetch/$s_!scHG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bb78654-6995-44f1-982a-fd6958e47b6f_658x495.png 848w, https://substackcdn.com/image/fetch/$s_!scHG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bb78654-6995-44f1-982a-fd6958e47b6f_658x495.png 1272w, https://substackcdn.com/image/fetch/$s_!scHG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bb78654-6995-44f1-982a-fd6958e47b6f_658x495.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Figure 7. </strong>Share of turns falling into Low (&#8804;30%), Medium (30&#8211;50%), and High (&gt;50%) supervision bands across the four turn-duration cohorts, where supervision is defined as the percentage of turn time the human is actively involved. Low supervision rises sharply with duration, from 9% in sub-P90 turns to 83% in P99.9+ turns, while high-supervision turns nearly disappear in the long tail. </p><h3><strong>Low supervision produces the most code during long tasks</strong></h3><p>In long-duration turns, sessions where the human barely types (&#8804;30% user share) generate roughly 11x the net lines per turn of high-supervision sessions. 
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lkkp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1db2c6cd-8f2f-4ade-ab31-3d5ee57495db_625x529.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lkkp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1db2c6cd-8f2f-4ade-ab31-3d5ee57495db_625x529.png 424w, https://substackcdn.com/image/fetch/$s_!lkkp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1db2c6cd-8f2f-4ade-ab31-3d5ee57495db_625x529.png 848w, https://substackcdn.com/image/fetch/$s_!lkkp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1db2c6cd-8f2f-4ade-ab31-3d5ee57495db_625x529.png 1272w, https://substackcdn.com/image/fetch/$s_!lkkp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1db2c6cd-8f2f-4ade-ab31-3d5ee57495db_625x529.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lkkp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1db2c6cd-8f2f-4ade-ab31-3d5ee57495db_625x529.png" width="625" height="529" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1db2c6cd-8f2f-4ade-ab31-3d5ee57495db_625x529.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:529,&quot;width&quot;:625,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:33367,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lkkp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1db2c6cd-8f2f-4ade-ab31-3d5ee57495db_625x529.png 424w, https://substackcdn.com/image/fetch/$s_!lkkp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1db2c6cd-8f2f-4ade-ab31-3d5ee57495db_625x529.png 848w, https://substackcdn.com/image/fetch/$s_!lkkp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1db2c6cd-8f2f-4ade-ab31-3d5ee57495db_625x529.png 1272w, https://substackcdn.com/image/fetch/$s_!lkkp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1db2c6cd-8f2f-4ade-ab31-3d5ee57495db_625x529.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;"><strong>Figure 8. </strong>Average net lines of code (additions minus deletions) per turn, split by duration cohort (Short, Medium, Long) and supervision band (Low, Medium, High). Long-duration, low-supervision turns produce ~11,800 net lines on average &#8212; roughly 11x the output of long-duration, high-supervision turns (~1,000 net lines) &#8212; and the gap is invisible in shorter cohorts. </p><p style="text-align: justify;">As we observe, the long-duration turns tend to create more lines of code when there is low supervision, reflecting the new workflow of &#8220;letting it run&#8221; and letting agents build the code without checking or corroborating. What&#8217;s interesting is that when supervision is high for the longest runs, the net added code is less than that of the previous (medium-duration) cohort. 
This could be because users who run agents for longer also set a higher bar for correcting what Claude generates, or simply because the type of task requires less code and more external dependencies. </p><p style="text-align: justify;">We don&#8217;t know whether the human supervision in these runs is a response to degradation after the agent has run for many hours (so that more human correction is required), or simply reflects the type of task, which requires less code when there is human involvement. Low supervision may also correlate with sessions where the agent drifts &#8212; taking the user down a rabbit hole without converging on a stable solution. Drift, coherence, and supervision are areas we plan to continue monitoring in future research.</p><h3><strong>What&#8217;s Next? </strong></h3><p style="text-align: justify;">If 2026 was the year of coding agents, 2026-2027 will be the year of autonomous agents running 24/7. We suspect that new bottlenecks will emerge, such as infrastructure and reviewing the work that agents run in the background. Infrastructure will become critical as agents scale up: having a surface area available for any number of agents requesting programmatic <strong>access will be business critical.</strong>  </p><p style="text-align: justify;">In future work, we want to extend our research into how code quality might be affected by autonomous agents and how solutions like Openclaw, Hermes agents, NemoClaw and headless Claude Code agents are being used under the new SDLC model. If these solutions become the new industry standard, agents will diffuse into areas beyond coding.  </p><h3><strong>Notes &amp; Limitations: </strong></h3><ul><li><p><strong>Data coverage:</strong> We can&#8217;t differentiate between a human interrupting a run and a human providing more context to the agent. </p></li><li><p><strong>User linkage:</strong> Only sessions from companies with GitHub integrations (via Jellyfish) produce commit/PR linkage. 
Companies without GitHub data contribute turn-duration metrics but not productivity metrics. </p></li><li><p><strong>OTEL Schema:</strong> The schema we use as our data source doesn&#8217;t expose a direct invocation-mode field, so we would have to infer it from terminal.type and query_source, which we haven&#8217;t done.</p></li></ul><h3><strong>Appendix </strong></h3><ul><li><p><strong>Agent:</strong> A unique session of a Claude Code instance, compatible with Simon Willison&#8217;s definition of an agent. </p></li></ul><ul><li><p><strong>Session:</strong> One unique session_id in the OTEL telemetry. Represents a single continuous Claude Code process invocation.</p></li><li><p><strong>Turn:</strong> A stretch of consecutive CLI-active minutes within a session, bounded by human interactions. Starts at the first CLI minute after a user minute (or session start) and ends just before the next user minute (or session end).</p></li><li><p><strong>Active minutes:</strong> Count of minutes in the turn with at least one CLI heartbeat.</p></li><li><p><strong>Wall minutes:</strong> (turn_end &#8722; turn_start) / 60 + 1. 
Includes idle gaps where the CLI was not active</p></li></ul>]]></content:encoded></item><item><title><![CDATA[Teaching Your AI Agent Pays Off]]></title><description><![CDATA[Investing in skills and context files yields measurable productivity gains]]></description><link>https://jellyfishresearch.substack.com/p/teaching-your-ai-agent-pays-off</link><guid isPermaLink="false">https://jellyfishresearch.substack.com/p/teaching-your-ai-agent-pays-off</guid><dc:creator><![CDATA[Nicholas Arcolano]]></dc:creator><pubDate>Wed, 29 Apr 2026 19:15:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!xzLQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfc5c6cf-160f-46cf-a0a7-ce52d705ac53_3599x2387.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We analyzed Jellyfish data across 135K developers and 670 companies. The question was: how do investments in agent customization (particularly, context files like CLAUDE.md, Cursor rules, and custom skills) improve outcomes?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xzLQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfc5c6cf-160f-46cf-a0a7-ce52d705ac53_3599x2387.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xzLQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfc5c6cf-160f-46cf-a0a7-ce52d705ac53_3599x2387.png 424w, https://substackcdn.com/image/fetch/$s_!xzLQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfc5c6cf-160f-46cf-a0a7-ce52d705ac53_3599x2387.png 848w, 
https://substackcdn.com/image/fetch/$s_!xzLQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfc5c6cf-160f-46cf-a0a7-ce52d705ac53_3599x2387.png 1272w, https://substackcdn.com/image/fetch/$s_!xzLQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfc5c6cf-160f-46cf-a0a7-ce52d705ac53_3599x2387.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xzLQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfc5c6cf-160f-46cf-a0a7-ce52d705ac53_3599x2387.png" width="1456" height="966" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bfc5c6cf-160f-46cf-a0a7-ce52d705ac53_3599x2387.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:966,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:254081,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://jellyfishresearch.substack.com/i/199370971?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfc5c6cf-160f-46cf-a0a7-ce52d705ac53_3599x2387.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xzLQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfc5c6cf-160f-46cf-a0a7-ce52d705ac53_3599x2387.png 424w, 
https://substackcdn.com/image/fetch/$s_!xzLQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfc5c6cf-160f-46cf-a0a7-ce52d705ac53_3599x2387.png 848w, https://substackcdn.com/image/fetch/$s_!xzLQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfc5c6cf-160f-46cf-a0a7-ce52d705ac53_3599x2387.png 1272w, https://substackcdn.com/image/fetch/$s_!xzLQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfc5c6cf-160f-46cf-a0a7-ce52d705ac53_3599x2387.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>We found that not only does investing in these customizations increase pull request output, but the results also compound &#8211; the more you invest in these tools, the more additional PRs you ship.</p><p>Controlling for other factors (e.g. individual baseline output, temporal trends, and overall AI usage intensity), <strong>developers with 6+ customizations ship an additional 2.3 PRs/week over what they would have done otherwise</strong>.</p><p>Overall, doubling your cumulative investment in these tools is associated with an <strong>additional 29% increase</strong> in PRs merged per person per week.</p><p>So, developers who are spending time teaching their AI tools are shipping meaningfully more code. It will be interesting to see how this trend evolves as context engineering becomes more of a team-level practice than an individual one!</p>]]></content:encoded></item><item><title><![CDATA[Tokenmaxxing and the Agentic Barrier]]></title><description><![CDATA[Tokens are like rocket fuel, and just like a rocket, going faster and faster requires burning exponentially more]]></description><link>https://jellyfishresearch.substack.com/p/tokenmaxxing-and-the-agentic-barrier</link><guid isPermaLink="false">https://jellyfishresearch.substack.com/p/tokenmaxxing-and-the-agentic-barrier</guid><dc:creator><![CDATA[Nicholas Arcolano]]></dc:creator><pubDate>Wed, 15 Apr 2026 19:37:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!GLg7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c951ebb-0b33-4c93-a99f-5b10451a18c1_3206x3151.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>There&#8217;s been a lot of talk lately about &#8220;tokenmaxxing&#8221; &#8211; the idea that the right metric for AI transformation success is simply using as many tokens as possible.</p><p>I get the 
appeal. It&#8217;s a crude metric, but it can be useful in a crisis... kind of like checking someone&#8217;s pulse tells you they&#8217;re not dead. But just like your heart rate, you want to be thoughtful about when and how you try to max it out.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GLg7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c951ebb-0b33-4c93-a99f-5b10451a18c1_3206x3151.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GLg7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c951ebb-0b33-4c93-a99f-5b10451a18c1_3206x3151.png 424w, https://substackcdn.com/image/fetch/$s_!GLg7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c951ebb-0b33-4c93-a99f-5b10451a18c1_3206x3151.png 848w, https://substackcdn.com/image/fetch/$s_!GLg7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c951ebb-0b33-4c93-a99f-5b10451a18c1_3206x3151.png 1272w, https://substackcdn.com/image/fetch/$s_!GLg7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c951ebb-0b33-4c93-a99f-5b10451a18c1_3206x3151.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GLg7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c951ebb-0b33-4c93-a99f-5b10451a18c1_3206x3151.png" width="1456" height="1431" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0c951ebb-0b33-4c93-a99f-5b10451a18c1_3206x3151.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1431,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:441569,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://jellyfishresearch.substack.com/i/199371788?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c951ebb-0b33-4c93-a99f-5b10451a18c1_3206x3151.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GLg7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c951ebb-0b33-4c93-a99f-5b10451a18c1_3206x3151.png 424w, https://substackcdn.com/image/fetch/$s_!GLg7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c951ebb-0b33-4c93-a99f-5b10451a18c1_3206x3151.png 848w, https://substackcdn.com/image/fetch/$s_!GLg7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c951ebb-0b33-4c93-a99f-5b10451a18c1_3206x3151.png 1272w, https://substackcdn.com/image/fetch/$s_!GLg7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c951ebb-0b33-4c93-a99f-5b10451a18c1_3206x3151.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This chart tells the broader story: pull request throughput and tokens per PR by token usage decile across ~7,500 engineers in Q1.</p><p>The PR throughput curve (purple) is encouraging: it climbs steadily from 0.77 PRs/week at the lowest token usage to 2.15 at the highest. More tokens = more output.</p><p>But look at the token cost curve (orange). Tokens per PR goes up <strong>exponentially</strong>. The median developer uses about 7M tokens per PR, versus 69M (!!) at the top decile. That&#8217;s roughly 10x more tokens for about 2x the throughput.</p><div class="pullquote"><p>Tokens are like rocket fuel, and just like a rocket, going faster requires exponentially more of it.</p></div><p>The practical implication is that when it comes to tokenmaxxing, there&#8217;s a sweet spot. 
You&#8217;ll get way more bang for your buck by getting everyone in your org into the middle of the adoption curve (deciles D4 through D6) than by pushing a small group into the stratosphere. Broad, moderate adoption beats narrow, extreme usage.</p><p>What about the promise of massive productivity gains from fleets of autonomous agents? Those can absolutely be unlocked, but they require serious investments in agent infrastructure, sandboxed environments, and context engineering, and most companies in 2026 are still in the early stages of addressing these challenges. Until those are overcome, there's still an "agentic barrier" &#8211; a speed of light you can't get past, no matter how many tokens you burn.</p><p></p>]]></content:encoded></item><item><title><![CDATA[Measuring AI Agent Concurrency]]></title><description><![CDATA[A research piece on AI agent concurrency using Claude Code telemetry]]></description><link>https://jellyfishresearch.substack.com/p/measuring-ai-agent-concurrency</link><guid isPermaLink="false">https://jellyfishresearch.substack.com/p/measuring-ai-agent-concurrency</guid><dc:creator><![CDATA[Tomas Pardinas]]></dc:creator><pubDate>Mon, 23 Mar 2026 16:08:48 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/39741d22-b9e4-477e-a735-5c71192f047c_1200x630.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Agents, agents, and yes, more agents. This is the decade of agents, and every company, startup, and business is looking at how to use and implement them. Understanding how agents run at scale helps us understand how AI agents affect productivity.</p><p style="text-align: justify;">We analyzed millions of data points from Claude Code telemetry and asked: <em>How many agents is each user running at the same time? Are users spending most of their time with multiple agents? 
How many companies are running multi-agentic systems?</em></p><h2 style="text-align: justify;">Main Findings:</h2><p style="text-align: justify;"><em>User activity findings</em></p><ul><li><p style="text-align: justify;"><strong>Multi-agent systems are just starting out, and users mostly run one or two agents at a time.</strong> Running multiple agents at once is not yet a widely adopted pattern. 84% of users are still running one or two agents at a time.</p></li><li><p style="text-align: justify;"><strong>Users running multiple agents spend most of their time with one agent. </strong>Users running more than three agents at a time still spend most of their time in a single agent&#8217;s terminal. Even among users running multiple agents, 80% of their time is spent within a single agent. We can&#8217;t verify how this workflow is being used, but we have some hypotheses:</p><ul><li><p style="text-align: justify;"><strong>Attention Span: </strong>Users hit the <em>attention barrier</em> when managing more than two agents at the same time.</p></li><li><p><strong>Background Autonomy: </strong>Some agents run in the background with minimal supervision, reducing the need for active attention and showing early signs of background autonomy.</p></li></ul></li></ul><p style="text-align: justify;"><strong>Users running multi-agent systems are more productive per hour, and more engaged overall. </strong>They commit more, are more productive per hour, and run agents with more <em>background autonomy</em>. 
Engagement and <em>maybe</em> curiosity seem to be what drives this set of users to run multiple workflows in parallel.</p><p style="text-align: justify;"><em>Company activity findings</em></p><ul><li><p style="text-align: justify;"><strong>2.5% of companies are using a </strong><em><strong>&#8220;fleet of agents&#8221;</strong></em><strong>; the majority of companies are running 2 agents in parallel</strong>. Agents are being implemented in an orchestrated pattern, but we are not seeing a significant number of companies implementing agents at scale. We suspect this is related to infrastructure challenges, budget limitations, or the learning curve. </p></li><li><p style="text-align: justify;"><strong>Companies running a </strong><em><strong>fleet of agents</strong></em><strong> spend only 14% of their time in multi-agentic mode. </strong>We can&#8217;t validate whether this is a consequence of the multi-agentic system or a limitation of the current state of companies implementing this pattern. This pattern mirrors the user-level finding. Are companies running true background-autonomy agents that don&#8217;t require supervision, or are interactive per-user agentic workflows limited by the initial experiment and system design? </p></li></ul><p style="text-align: justify;">Below we present our methodology and findings in detail. Our main conclusion is that managing multi-agent systems will require a new paradigm. Neither current infrastructure nor human attention is sufficient to scale agent concurrency through interactive management alone. The transition from interactive concurrency to background autonomy&#8212;what we might call the &#8220;agentic barrier&#8221;&#8212;is the phase change the industry needs to navigate. Novel solutions are being developed by startups and power users to build orchestration layers that reduce the need for constant human attention. 
This was our first step toward understanding how users run agents concurrently, the current stage of maturity, and what comes next.</p><h2>Methodology </h2><p style="text-align: justify;">What is an agent and how should we define it? There are a variety of definitions of what an agent is and how we should measure it. </p><p style="text-align: justify;">Another complexity of studying agents is the fast evolution of the agent landscape and its workflows, as we concluded in our previous research<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a>: in a span of months, new workflows appeared that moved the user away from the IDE toward a fully agentic workflow. </p><p style="text-align: justify;">With these two challenges, how should we study and measure agents running concurrently?</p><p style="text-align: justify;">We took a pragmatic approach to studying agents and adopted the following definition: <em>&#8220;<strong>An LLM agent runs tools in a loop to achieve a goal&#8221;.</strong></em><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a><em><strong> </strong></em></p><p style="text-align: justify;">Next, we defined a set of metrics to measure agentic concurrency using Claude Code telemetry data. The dataset covers a rolling 30-day window with ~19M rows of active time data. 
This dataset offers a useful sample for studying concurrency at the session level, though it has limitations: it captures only one tool in the market and does not reflect agent concurrency across different providers.</p><h2>Main Findings:</h2><h3><strong>Multi-agent systems are just starting out and users run one or two agents</strong></h3><p style="text-align: justify;"><em>How many agents are users running concurrently?</em> We measured this by tracking how many Claude Code sessions are active in the same time interval per user.</p><p style="text-align: justify;">Most users run agents solo or in pairs&#8212;84% of users still run one or two agents at a time. This surprised us: we expected a heavier distribution toward three or four-plus concurrent agents.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pWbJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a07ec11-3d63-43c3-88bf-fbc7c5f99472_882x558.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pWbJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a07ec11-3d63-43c3-88bf-fbc7c5f99472_882x558.png 424w, https://substackcdn.com/image/fetch/$s_!pWbJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a07ec11-3d63-43c3-88bf-fbc7c5f99472_882x558.png 848w, https://substackcdn.com/image/fetch/$s_!pWbJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a07ec11-3d63-43c3-88bf-fbc7c5f99472_882x558.png 1272w, 
https://substackcdn.com/image/fetch/$s_!pWbJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a07ec11-3d63-43c3-88bf-fbc7c5f99472_882x558.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!pWbJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a07ec11-3d63-43c3-88bf-fbc7c5f99472_882x558.png" width="882" height="558" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8a07ec11-3d63-43c3-88bf-fbc7c5f99472_882x558.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:558,&quot;width&quot;:882,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!pWbJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a07ec11-3d63-43c3-88bf-fbc7c5f99472_882x558.png 424w, https://substackcdn.com/image/fetch/$s_!pWbJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a07ec11-3d63-43c3-88bf-fbc7c5f99472_882x558.png 848w, https://substackcdn.com/image/fetch/$s_!pWbJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a07ec11-3d63-43c3-88bf-fbc7c5f99472_882x558.png 1272w, 
https://substackcdn.com/image/fetch/$s_!pWbJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a07ec11-3d63-43c3-88bf-fbc7c5f99472_882x558.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Figure 1. </strong><em>Distribution of users by peak concurrent agents. The vast majority of users (84%) run only one or two agents at a time, with a steep drop-off at three agents (11.4%) and only 4.7% reaching four or more. 
Data reflects the 30-day window of Claude Code telemetry.</em></p><p style="text-align: justify;">The average user may still be in the early stages of the learning curve. But there is also a deeper question: Can we sustain multiple spans of attention to run different projects at the same time?</p><p style="text-align: justify;">One limitation of our data: we cannot determine whether users running multiple agents are breaking down the same project across sessions or running entirely separate projects. We also cannot see whether individual sessions spawn Claude Code sub-agents internally. Nevertheless, we asked: <em>Do users interact equally with all of their agents when running them simultaneously?</em></p><h3><strong>Users running multiple agents spend most of their time with one agent</strong></h3><p><em>Can users actively work with multiple sessions in parallel?</em></p><p>Surprisingly, running multiple agents simultaneously does not mean the agents are used equally; most are used in a more passive manner. Is this a feature or a bug? We can&#8217;t verify how these workflows are being used, but we have some hypotheses:</p><ol><li><p style="text-align: justify;"><strong>Attention Span:</strong> Users struggle to sustain the same level of attention across all of their agents. As Andrej Karpathy put it, &#8220;<em>We will need a bigger IDE</em>&#8221;<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a>. The current tooling assumes a single-focus interaction model, and splitting attention across multiple terminals degrades the quality of each interaction.</p></li><li><p style="text-align: justify;"><strong>Background &#8220;autonomy&#8221; emerging:</strong> Some tasks require less active supervision than others, allowing certain agents to run semi-independently. 
This is not true background autonomy&#8212;where agents execute end-to-end without human input&#8212;but rather a transitional pattern where one agent gets active attention while others idle or run simple tasks.</p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!k0E4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F418c8106-abf3-4c66-845d-0ee22703b234_1600x670.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!k0E4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F418c8106-abf3-4c66-845d-0ee22703b234_1600x670.png 424w, https://substackcdn.com/image/fetch/$s_!k0E4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F418c8106-abf3-4c66-845d-0ee22703b234_1600x670.png 848w, https://substackcdn.com/image/fetch/$s_!k0E4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F418c8106-abf3-4c66-845d-0ee22703b234_1600x670.png 1272w, https://substackcdn.com/image/fetch/$s_!k0E4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F418c8106-abf3-4c66-845d-0ee22703b234_1600x670.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!k0E4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F418c8106-abf3-4c66-845d-0ee22703b234_1600x670.png" width="1456" height="610" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/418c8106-abf3-4c66-845d-0ee22703b234_1600x670.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:610,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!k0E4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F418c8106-abf3-4c66-845d-0ee22703b234_1600x670.png 424w, https://substackcdn.com/image/fetch/$s_!k0E4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F418c8106-abf3-4c66-845d-0ee22703b234_1600x670.png 848w, https://substackcdn.com/image/fetch/$s_!k0E4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F418c8106-abf3-4c66-845d-0ee22703b234_1600x670.png 1272w, https://substackcdn.com/image/fetch/$s_!k0E4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F418c8106-abf3-4c66-845d-0ee22703b234_1600x670.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Figure 2. </strong><em>Time-weighted concurrency distribution by user tier. Even users who have reached higher peak concurrency levels spend the overwhelming majority of their active time&#8212;over 80%&#8212;focused on a single agent. Only Sustained users (peak &#8805; 4) show a meaningful share of multi-agent time, with 13.8% at two agents and 3.9% at four or more.</em></p><p style="text-align: justify;">The attention barrier limits interactive concurrency. When a user manages multiple agents interactively, they are context-switching &#8212; prompting one agent, waiting, switching to another, prompting again. The agents take turns being active because the human can only drive one at a time. Adding more terminals does not add parallel throughput; it adds more things to serialize through. 
Current tooling assumes a single-focus interaction model, and splitting attention across agents degrades the quality of each interaction rather than multiplying output.</p><p style="text-align: justify;">Background autonomy is what unlocks actual concurrency. The small fraction of time that users do sustain multiple agents maps almost directly to moments where one agent runs semi-independently while the user drives another. This is not yet full background autonomy, in which agents execute end-to-end without human input, but a transitional pattern where one agent receives active attention while others coast on simpler or longer-running tasks. It is this emerging autonomy, not multitasking skill, that creates the real multi-agent minutes in the data.</p><p style="text-align: justify;">The implication is clear: scaling AI leverage per developer is not a matter of opening more terminals. It requires agents capable of independent execution and tooling designed to orchestrate them. Until then, human attention remains the binding constraint on AI throughput.</p><h3><strong>Multi-agent users are more productive per hour and more engaged overall</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kOcN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7fb6d4e0-22c3-44ef-bd63-c499a318fb3b_1600x955.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kOcN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7fb6d4e0-22c3-44ef-bd63-c499a318fb3b_1600x955.png 424w, 
https://substackcdn.com/image/fetch/$s_!kOcN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7fb6d4e0-22c3-44ef-bd63-c499a318fb3b_1600x955.png 848w, https://substackcdn.com/image/fetch/$s_!kOcN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7fb6d4e0-22c3-44ef-bd63-c499a318fb3b_1600x955.png 1272w, https://substackcdn.com/image/fetch/$s_!kOcN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7fb6d4e0-22c3-44ef-bd63-c499a318fb3b_1600x955.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kOcN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7fb6d4e0-22c3-44ef-bd63-c499a318fb3b_1600x955.png" width="1456" height="869" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7fb6d4e0-22c3-44ef-bd63-c499a318fb3b_1600x955.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:869,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kOcN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7fb6d4e0-22c3-44ef-bd63-c499a318fb3b_1600x955.png 424w, 
https://substackcdn.com/image/fetch/$s_!kOcN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7fb6d4e0-22c3-44ef-bd63-c499a318fb3b_1600x955.png 848w, https://substackcdn.com/image/fetch/$s_!kOcN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7fb6d4e0-22c3-44ef-bd63-c499a318fb3b_1600x955.png 1272w, https://substackcdn.com/image/fetch/$s_!kOcN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7fb6d4e0-22c3-44ef-bd63-c499a318fb3b_1600x955.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line 
x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Figure 3. </strong><em>Behavioral comparison across peak concurrency tiers. Sustained users (peak &#8805; 4) show dramatically higher autonomy (47.5% vs 0%), commits per hour (2.3 vs 0.0), total active hours (32.5 vs 0.7), and total commits (69.5 vs 0) compared to single-agent users. Each panel shows median values per tier across a 30-day window.</em></p><h2>Company View</h2><p style="text-align: justify;">We wanted to explore how companies are running agents to see whether organizational patterns differ from individual user behavior. In this section, we analyze company-level concurrency patterns and the maturity stages that emerged.</p><p style="text-align: justify;">We categorized each company&#8217;s maturity stage by evaluating the behavior of the majority of its users and assigning the corresponding level.</p><h3><strong>2.5% of companies are using a &#8220;fleet of agents&#8221;; the majority are running two agents in parallel</strong></h3><p><em>What level of maturity are companies exhibiting? How many agents are companies running concurrently? 
</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ekmx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F264ce558-0616-456d-85f4-092d7981ccc7_1484x730.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ekmx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F264ce558-0616-456d-85f4-092d7981ccc7_1484x730.png 424w, https://substackcdn.com/image/fetch/$s_!Ekmx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F264ce558-0616-456d-85f4-092d7981ccc7_1484x730.png 848w, https://substackcdn.com/image/fetch/$s_!Ekmx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F264ce558-0616-456d-85f4-092d7981ccc7_1484x730.png 1272w, https://substackcdn.com/image/fetch/$s_!Ekmx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F264ce558-0616-456d-85f4-092d7981ccc7_1484x730.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ekmx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F264ce558-0616-456d-85f4-092d7981ccc7_1484x730.png" width="1456" height="716" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/264ce558-0616-456d-85f4-092d7981ccc7_1484x730.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:716,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ekmx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F264ce558-0616-456d-85f4-092d7981ccc7_1484x730.png 424w, https://substackcdn.com/image/fetch/$s_!Ekmx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F264ce558-0616-456d-85f4-092d7981ccc7_1484x730.png 848w, https://substackcdn.com/image/fetch/$s_!Ekmx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F264ce558-0616-456d-85f4-092d7981ccc7_1484x730.png 1272w, https://substackcdn.com/image/fetch/$s_!Ekmx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F264ce558-0616-456d-85f4-092d7981ccc7_1484x730.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Figure 4. </strong><em>Agentic maturity distribution across companies. The majority of companies (58.1%) fall into the Orchestrator tier, typically running two agents in parallel, while 39.4% remain at the Starter stage. Only 2.5% of companies have reached the Fleet+ tier, indicating that large-scale multi-agent deployments are still rare.</em></p><p style="text-align: justify;">Interestingly, when we analyze behaviour at the company level, the majority of companies act as <em>orchestrators</em>: most users at these companies run two agents in parallel.</p><p style="text-align: justify;">We suspect this is part of a learning curve in which users are becoming more proficient and companies are providing richer orchestration tools and budget to <em>test and learn.</em> We can&#8217;t validate whether this is a trend that will lead to more companies running <em>fleets</em> of agents. 
But based on our previous research, it seems likely that more advanced infrastructure for running fleets of agents will become available, which could accelerate movement along the maturity curve.</p><h3 style="text-align: justify;"><strong>Companies running fleets of agents spend only 14% of their time in multi-agent mode</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YDPm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa29277ac-0367-444a-91df-a58be75a4701_1622x660.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YDPm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa29277ac-0367-444a-91df-a58be75a4701_1622x660.png 424w, https://substackcdn.com/image/fetch/$s_!YDPm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa29277ac-0367-444a-91df-a58be75a4701_1622x660.png 848w, https://substackcdn.com/image/fetch/$s_!YDPm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa29277ac-0367-444a-91df-a58be75a4701_1622x660.png 1272w, https://substackcdn.com/image/fetch/$s_!YDPm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa29277ac-0367-444a-91df-a58be75a4701_1622x660.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YDPm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa29277ac-0367-444a-91df-a58be75a4701_1622x660.png" width="1456" height="592" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a29277ac-0367-444a-91df-a58be75a4701_1622x660.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:592,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:136686,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://jellyfishresearch.substack.com/i/191617258?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa29277ac-0367-444a-91df-a58be75a4701_1622x660.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!YDPm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa29277ac-0367-444a-91df-a58be75a4701_1622x660.png 424w, https://substackcdn.com/image/fetch/$s_!YDPm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa29277ac-0367-444a-91df-a58be75a4701_1622x660.png 848w, https://substackcdn.com/image/fetch/$s_!YDPm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa29277ac-0367-444a-91df-a58be75a4701_1622x660.png 1272w, https://substackcdn.com/image/fetch/$s_!YDPm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa29277ac-0367-444a-91df-a58be75a4701_1622x660.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;"><strong>Figure 5. </strong><em>Global and per-stage breakdown of time spent in multi-agent versus single-agent mode. Across all companies, only 6.1% of more than 7 million total active minutes involve multiple concurrent agents.</em></p><p style="text-align: justify;">Companies running fleets of agents spend only 14% of their active time in multi-agent mode. This mirrors the individual user finding: interactive concurrency (actively managing multiple agents) consumes a small fraction of total time, likely because human attention remains the binding constraint. At the same time, we can&#8217;t validate whether these are the first sparks of background autonomy, with agents running without requiring the user&#8217;s attention. 
</p><p style="text-align: justify;">Regardless of which explanation holds, this suggests a future where managing a fleet of agents could require less active time on each individual agent but more time in a meta-system that organizes and orchestrates agents, something like what Cursor built around <a href="https://cursor.com/blog/self-driving-codebases?utm_source=tldrai">self-driving codebases</a>, which orchestrates thousands of agents simultaneously, or emerging coordination apps like the <a href="https://docs.conductor.build/">Conductor app</a>.</p><h3><strong>More active days, more multi-agent minutes, and agents running outside of normal hours</strong></h3><p><em>Are mature companies doing something different from nascent ones?</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hqhB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82c93149-df2e-4d5d-9286-232bcff4bc04_1600x1134.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hqhB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82c93149-df2e-4d5d-9286-232bcff4bc04_1600x1134.png 424w, https://substackcdn.com/image/fetch/$s_!hqhB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82c93149-df2e-4d5d-9286-232bcff4bc04_1600x1134.png 848w, https://substackcdn.com/image/fetch/$s_!hqhB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82c93149-df2e-4d5d-9286-232bcff4bc04_1600x1134.png 1272w, 
https://substackcdn.com/image/fetch/$s_!hqhB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82c93149-df2e-4d5d-9286-232bcff4bc04_1600x1134.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hqhB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82c93149-df2e-4d5d-9286-232bcff4bc04_1600x1134.png" width="1456" height="1032" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/82c93149-df2e-4d5d-9286-232bcff4bc04_1600x1134.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1032,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hqhB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82c93149-df2e-4d5d-9286-232bcff4bc04_1600x1134.png 424w, https://substackcdn.com/image/fetch/$s_!hqhB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82c93149-df2e-4d5d-9286-232bcff4bc04_1600x1134.png 848w, https://substackcdn.com/image/fetch/$s_!hqhB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82c93149-df2e-4d5d-9286-232bcff4bc04_1600x1134.png 1272w, 
https://substackcdn.com/image/fetch/$s_!hqhB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82c93149-df2e-4d5d-9286-232bcff4bc04_1600x1134.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Figure 6. </strong><em>Behavioral triggers that differentiate company maturity stages across four dimensions. 
Fleet+ companies show substantially higher active days (21 vs 5), sessions per active day (7 vs 2), multi-agent minutes (194 vs 2), and off-hours activity (223 vs 20 minutes) compared to Starter companies.</em></p><p style="text-align: justify;">The off-hours activity signal is particularly noteworthy. Fleet+ companies show more than 10x the off-hours agent activity of Starter companies (223 vs 20 minutes), suggesting early signs of background autonomy: agents running outside of normal working hours with reduced human supervision. Could this be the first spark of experiments with long-running background agents, trying to replicate an <em>OpenClaw</em> experience of agents running on their <em>own computer</em> 24/7? That is also plausible.</p><h3><strong>Notes &amp; Limitations</strong></h3><p>This research was a first step. We analyzed what we could with current agentic activity data, and we want to be transparent about what our data cannot tell us:</p><ul><li><p>We analyzed Claude Code telemetry at the session level. We defined a Claude Code session as a unit of agent activity, but we could not measure sub-agents spawned by Claude Code within a session.</p></li><li><p>This analysis covers a 30-day period, which is our current analytical window. Agentic patterns like concurrency are emerging and evolving quickly. We plan to extend this analysis over time to detect trends, but the current time frame limits what can be observed.</p></li><li><p>We wanted to contrast different providers to compare Claude Code&#8217;s agentic features against alternatives, but we were limited by the data currently available to us.</p></li></ul><h3><strong>What&#8217;s Next?</strong></h3><p style="text-align: justify;">Seeing only a small fraction of companies show multi-agent patterns reminds us that, although the industry is moving fast, organizations need time to adopt new features and patterns in their current workflows. 
Learning how to measure agentic behavior such as concurrency was a necessary first step toward understanding how the software development cycle is evolving. </p><p style="text-align: justify;">The <strong>agentic barrier</strong> (or attention barrier) is a central challenge. Our data shows that the transition from interactive concurrency to background autonomy is where the industry stalls. Users can spin up multiple agents, but they cannot meaningfully interact with more than one or two at a time. Crossing this barrier&#8212;through better orchestration tooling, trust frameworks, and infrastructure&#8212;is the phase change that will unlock true multi-agent productivity.</p><p style="text-align: justify;">Infrastructure for <strong>background autonomy</strong> is needed. Companies should start asking how their infrastructure will support running a fleet of agents that operate with minimal human oversight. This goes beyond simply scaling terminals&#8212;it requires new patterns for monitoring, error handling, and human-in-the-loop escalation.</p><p style="text-align: justify;">From a research perspective, we will keep monitoring autonomy to understand how this transition is happening and how users are solving the challenges described above.</p><h3><strong>Appendix</strong></h3><ul><li><p><strong>Agent: </strong>A unique session of a Claude Code instance, compatible with Simon Willison&#8217;s definition of an agent. 
</p></li><li><p><strong>Concurrency: </strong>How many agents run simultaneously.</p></li><li><p><strong>Interactive Concurrency: </strong>Agents running at the same time that require human interaction to proceed.</p></li><li><p><strong>Background Autonomy: </strong>Agents capable of running with minimal or no human intervention.</p></li><li><p><strong>Peak Concurrency:</strong> Maximum number of agents running at the same time.</p></li><li><p><strong>Autonomous Streak:</strong> Consecutive minutes in a single session with only CLI activity, unbroken by any user action (type=user). Gaps in agent activity (e.g., waiting for API or build responses) do not break the streak; only human intervention does.</p></li><li><p><strong>Autonomous CLI Minutes:</strong> Total CLI-only minutes within qualifying autonomous streaks for a user or session. Minutes from short, non-qualifying streaks are excluded.</p></li><li><p><strong>Autonomous Hours:</strong> Total independent agent work time in qualifying streaks.</p></li><li><p><strong>Autonomous Streak Count:</strong> The number of qualifying autonomous streaks for a user or session.</p></li></ul><p><strong>Concurrency Type:</strong></p><p>User tiers are classified by peak concurrency:</p><ul><li><p>Single (peak = 1)</p></li><li><p>Experimenter (peak = 2)</p></li><li><p>Explorer (peak = 3)</p></li><li><p>Sustained (peak &#8805; 4)</p></li></ul><p>Company maturity stages use the company median max peak:</p><ul><li><p>Starter (&#8804; 1)</p></li><li><p>Orchestrator (&#8804; 3)</p></li><li><p>Fleet Manager (&#8804; 6)</p></li><li><p>Fleet+ (&gt; 6)</p></li></ul><p>Companies are categorized by their dominant active cohort to assign a maturity stage.</p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div 
class="footnote-content"><p>https://jellyfishresearch.substack.com/p/measuring-agentic-workflows</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>https://simonwillison.net/2025/Sep/18/agents/</p><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p><a href="https://x.com/karpathy/status/2031767720933634100">Link</a></p></div></div>]]></content:encoded></item><item><title><![CDATA[Measuring Agentic Workflows ]]></title><description><![CDATA[A framework to understand what and how agents work and how the new workflows are evolving]]></description><link>https://jellyfishresearch.substack.com/p/measuring-agentic-workflows</link><guid isPermaLink="false">https://jellyfishresearch.substack.com/p/measuring-agentic-workflows</guid><dc:creator><![CDATA[Tomas Pardinas]]></dc:creator><pubDate>Fri, 20 Feb 2026 20:59:36 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Hp7k!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf2aefd8-5160-4fac-af93-726aff24993c_1138x428.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>As AI coding tools mature, questions about AI usage are emerging: How is my team using agents differently, and what does the workflow look like? How should we measure the different agents and workflows? 
</em></p><p>We analyzed usage from three AI coding tools across our customer base throughout 2025 &#8212; GitHub Copilot, Cursor, and Claude Code &#8212; and asked: what does each tool actually do, how should we measure it, and what would a fair comparison look like?</p><h3><strong>What We Measured and How</strong></h3><p>Our dataset covers 90M suggestions from January through December 2025, aggregated through the Jellyfish platform from vendor APIs. For each tool, we tracked <strong>suggestion acceptance rate, line acceptance rate, lines per suggestion, and lines per acceptance</strong>. </p><p>We then assigned each tool to a group that reflects how engineers are using it and which new workflows are emerging.</p><h3><strong>A Framework for Understanding AI Coding Agents</strong></h3><p>We started by analyzing and comparing suggestion acceptance rates, and immediately saw that each tool sits at a <strong>different point on the spectrum</strong>. On one end is the full agentic experience of Claude Code; on the other are GitHub Copilot&#8217;s inline completions and Chat at lower to mid acceptance rates. 
This tells us something important: each tool is being used in a fundamentally different way, for a different purpose, at a different level of trust.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uhp-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F688e9004-21fe-45fe-8023-65515ca9ca33_1456x842.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uhp-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F688e9004-21fe-45fe-8023-65515ca9ca33_1456x842.png 424w, https://substackcdn.com/image/fetch/$s_!uhp-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F688e9004-21fe-45fe-8023-65515ca9ca33_1456x842.png 848w, https://substackcdn.com/image/fetch/$s_!uhp-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F688e9004-21fe-45fe-8023-65515ca9ca33_1456x842.png 1272w, https://substackcdn.com/image/fetch/$s_!uhp-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F688e9004-21fe-45fe-8023-65515ca9ca33_1456x842.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uhp-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F688e9004-21fe-45fe-8023-65515ca9ca33_1456x842.png" width="1456" height="842" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/688e9004-21fe-45fe-8023-65515ca9ca33_1456x842.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:842,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uhp-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F688e9004-21fe-45fe-8023-65515ca9ca33_1456x842.png 424w, https://substackcdn.com/image/fetch/$s_!uhp-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F688e9004-21fe-45fe-8023-65515ca9ca33_1456x842.png 848w, https://substackcdn.com/image/fetch/$s_!uhp-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F688e9004-21fe-45fe-8023-65515ca9ca33_1456x842.png 1272w, https://substackcdn.com/image/fetch/$s_!uhp-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F688e9004-21fe-45fe-8023-65515ca9ca33_1456x842.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>We use suggestion acceptance rate (%) to position each tool and define four distinct workflow tiers:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cjc1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F560a3615-bf18-47d6-ab23-20c584eb2c7c_1125x818.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cjc1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F560a3615-bf18-47d6-ab23-20c584eb2c7c_1125x818.png 424w, https://substackcdn.com/image/fetch/$s_!cjc1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F560a3615-bf18-47d6-ab23-20c584eb2c7c_1125x818.png 848w, 
https://substackcdn.com/image/fetch/$s_!cjc1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F560a3615-bf18-47d6-ab23-20c584eb2c7c_1125x818.png 1272w, https://substackcdn.com/image/fetch/$s_!cjc1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F560a3615-bf18-47d6-ab23-20c584eb2c7c_1125x818.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cjc1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F560a3615-bf18-47d6-ab23-20c584eb2c7c_1125x818.png" width="728" height="529.3368888888889" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/560a3615-bf18-47d6-ab23-20c584eb2c7c_1125x818.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:818,&quot;width&quot;:1125,&quot;resizeWidth&quot;:728,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cjc1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F560a3615-bf18-47d6-ab23-20c584eb2c7c_1125x818.png 424w, https://substackcdn.com/image/fetch/$s_!cjc1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F560a3615-bf18-47d6-ab23-20c584eb2c7c_1125x818.png 848w, 
https://substackcdn.com/image/fetch/$s_!cjc1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F560a3615-bf18-47d6-ab23-20c584eb2c7c_1125x818.png 1272w, https://substackcdn.com/image/fetch/$s_!cjc1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F560a3615-bf18-47d6-ab23-20c584eb2c7c_1125x818.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><em>We drew some distinctions within each product to compare fairly, within the limits of our data taxonomy. 
For example, GitHub Copilot spans W1 and W2. Cursor operates at W3. Claude Code operates at W4.</em></p><h2><strong>Comparing normalized suggestions</strong></h2><h3><strong>What Is a &#8220;Suggestion&#8221;?</strong></h3><p>A suggestion is defined differently across tools, so acceptance measures something different in each:</p><ul><li><p><strong>Copilot/Cursor</strong>: How often users Tab-accept inline ghost completions or blocks of code.</p></li><li><p><strong>Claude Code</strong>: How often users approve the proposed file changes or commands.</p></li></ul><p>Since we identified four different workflows, we wanted to understand whether this behavioral difference was reflected in both suggestion acceptance and line acceptance rates:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!g2pd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd978c9b4-3f28-49cc-a661-07ca1f4bb024_894x627.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!g2pd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd978c9b4-3f28-49cc-a661-07ca1f4bb024_894x627.png 424w, https://substackcdn.com/image/fetch/$s_!g2pd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd978c9b4-3f28-49cc-a661-07ca1f4bb024_894x627.png 848w, https://substackcdn.com/image/fetch/$s_!g2pd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd978c9b4-3f28-49cc-a661-07ca1f4bb024_894x627.png 1272w, 
https://substackcdn.com/image/fetch/$s_!g2pd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd978c9b4-3f28-49cc-a661-07ca1f4bb024_894x627.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!g2pd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd978c9b4-3f28-49cc-a661-07ca1f4bb024_894x627.png" width="894" height="627" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d978c9b4-3f28-49cc-a661-07ca1f4bb024_894x627.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:627,&quot;width&quot;:894,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!g2pd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd978c9b4-3f28-49cc-a661-07ca1f4bb024_894x627.png 424w, https://substackcdn.com/image/fetch/$s_!g2pd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd978c9b4-3f28-49cc-a661-07ca1f4bb024_894x627.png 848w, https://substackcdn.com/image/fetch/$s_!g2pd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd978c9b4-3f28-49cc-a661-07ca1f4bb024_894x627.png 1272w, 
https://substackcdn.com/image/fetch/$s_!g2pd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd978c9b4-3f28-49cc-a661-07ca1f4bb024_894x627.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h1><strong>Findings</strong></h1><p>Normalizing by lines rather than suggestions exposes something important about how each workflow actually works.</p><ol><li><p><strong>Copilot completions: </strong>A 5-point gap between suggestion rate and line rate. 
The units are small enough that partial acceptance rarely makes sense &#8212; what gets accepted, stays.</p></li><li><p><strong>Cursor: </strong>A 40-point gap. Developers accept 4 out of 5 Cursor suggestions &#8212; but then keep only about half the lines. The real work happens after acceptance: reviewing the diff, editing what doesn&#8217;t fit, deleting what was wrong. Cursor&#8217;s 81% suggestion rate describes the starting point of a review process, not the end of one. We also think that IDE users review the code and reject lines as a matter of course, because review is part of that workflow. Since Cursor uses multiple models, there is a follow-up question: <em>Could it be that some models are affecting the output more than others?</em> That remains open.</p></li><li><p><strong>Claude Code: </strong>Line acceptance rate can&#8217;t be calculated from available data, but the workflow is inherently binary. When you accept Claude Code output you&#8217;re taking an entire implementation &#8212; there&#8217;s no line-by-line diff to partially approve. Developers either trust the result and move forward, or reject it and iterate. This workflow makes sense: it is a continuous agent workflow that developers approach with a forward mindset. Instead of reviewing and deleting what doesn&#8217;t work, the user accepts most of the code in blocks, checks whether it is functional, and then accepts another block of code to fix bugs.</p></li></ol><h1><strong>Trends: How are workflows evolving?</strong></h1><p>The trend in acceptance rates over time shows how each workflow evolved in the market. 
</p><p>In May&#8211;June 2025, <a href="https://cursor.com/changelog/page/4">Cursor released its 1.0 experience</a> with significant enhancements including &#8220;background agent&#8221; and &#8220;tool search enhancement.&#8221; This was a pivotal moment: acceptance rates spiked and the tool shifted firmly into W3 (IDE-Native), moving away from the chat-heavy W2 experience that had characterized earlier usage. Once Cursor stabilized above 90% acceptance, that metric essentially lost signal.</p><p>On the other side, Copilot Chat started 2025 around 50% acceptance and stabilized at 10&#8211;20% &#8212; a substantial drop. We think this reflects a shift in how teams are using it: less for direct code generation, more as a contextual thinking tool. Chat becomes a way to understand a problem or explore an approach before actually building, not a way to ship lines. An open question worth following: <em>did users migrate from using W2 alone, or did they converge on a combined W2 (understanding) + W3 (building) workflow? </em>Something for future research. 
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!XW2u!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b0acb2f-1df7-4c4d-a0d5-5d08bcb98eb6_1057x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!XW2u!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b0acb2f-1df7-4c4d-a0d5-5d08bcb98eb6_1057x1600.png 424w, https://substackcdn.com/image/fetch/$s_!XW2u!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b0acb2f-1df7-4c4d-a0d5-5d08bcb98eb6_1057x1600.png 848w, https://substackcdn.com/image/fetch/$s_!XW2u!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b0acb2f-1df7-4c4d-a0d5-5d08bcb98eb6_1057x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!XW2u!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b0acb2f-1df7-4c4d-a0d5-5d08bcb98eb6_1057x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!XW2u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b0acb2f-1df7-4c4d-a0d5-5d08bcb98eb6_1057x1600.png" width="728" height="1101.9867549668875" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5b0acb2f-1df7-4c4d-a0d5-5d08bcb98eb6_1057x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:1057,&quot;resizeWidth&quot;:728,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!XW2u!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b0acb2f-1df7-4c4d-a0d5-5d08bcb98eb6_1057x1600.png 424w, https://substackcdn.com/image/fetch/$s_!XW2u!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b0acb2f-1df7-4c4d-a0d5-5d08bcb98eb6_1057x1600.png 848w, https://substackcdn.com/image/fetch/$s_!XW2u!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b0acb2f-1df7-4c4d-a0d5-5d08bcb98eb6_1057x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!XW2u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b0acb2f-1df7-4c4d-a0d5-5d08bcb98eb6_1057x1600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This suggests two things:</p><p><strong>a)</strong> Tools are evolving quickly on their <em>core</em> use case &#8212; not their peripheral features.</p><p><strong>b)</strong> Knowing which workflow a tool occupies &#8212; and when that changes &#8212; is critical for measurement. After Cursor&#8217;s 1.0 release, it moved firmly into W3 territory, which means acceptance rate no longer has much signal. The right metric to reach for shifts to how developers are editing the code the IDE generates. 
</p><p>Now that we understand how each workflow behaves, we can find the metrics that actually correlate with quality in each one.</p><h1><strong>One Metric per Workflow</strong></h1><p>To find the right metric for each workflow, it helps to invert the question: <strong>how do we know when a team is using a workflow poorly?</strong></p><ul><li><p>For <strong>Claude Code (W4)</strong>, frequent interruptions signal that teams aren&#8217;t letting the agent run &#8212; which undermines the core value of full agentic output.</p></li><li><p>For <strong>Cursor (W3)</strong>, heavy post-acceptance editing means suggestions aren&#8217;t landing well, often due to insufficient context engineering in the repo.</p></li><li><p>For <strong>Copilot Chat (W2)</strong>, if the chat isn&#8217;t providing useful answers, engineers switch to other sources and the workflow breaks down.</p></li><li><p>For <strong>Copilot Inline (W1)</strong>, a flat or low acceptance rate tells you directly whether suggestions match the patterns engineers actually want to keep.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Hp7k!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf2aefd8-5160-4fac-af93-726aff24993c_1138x428.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Hp7k!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf2aefd8-5160-4fac-af93-726aff24993c_1138x428.png 424w, https://substackcdn.com/image/fetch/$s_!Hp7k!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf2aefd8-5160-4fac-af93-726aff24993c_1138x428.png 848w, 
https://substackcdn.com/image/fetch/$s_!Hp7k!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf2aefd8-5160-4fac-af93-726aff24993c_1138x428.png 1272w, https://substackcdn.com/image/fetch/$s_!Hp7k!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf2aefd8-5160-4fac-af93-726aff24993c_1138x428.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Hp7k!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf2aefd8-5160-4fac-af93-726aff24993c_1138x428.png" width="724" height="272.2952548330404" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bf2aefd8-5160-4fac-af93-726aff24993c_1138x428.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:428,&quot;width&quot;:1138,&quot;resizeWidth&quot;:724,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Hp7k!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf2aefd8-5160-4fac-af93-726aff24993c_1138x428.png 424w, https://substackcdn.com/image/fetch/$s_!Hp7k!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf2aefd8-5160-4fac-af93-726aff24993c_1138x428.png 848w, 
https://substackcdn.com/image/fetch/$s_!Hp7k!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf2aefd8-5160-4fac-af93-726aff24993c_1138x428.png 1272w, https://substackcdn.com/image/fetch/$s_!Hp7k!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf2aefd8-5160-4fac-af93-726aff24993c_1138x428.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><em>Each metric is designed to detect process breakdowns &#8212; so we can surface signals about where the workflow is working and 
where it isn&#8217;t.</em></p><p><strong>Industry Benchmarks:</strong></p><p>Comparing AI tool performance across companies is genuinely difficult &#8212; especially given how fast these tools are evolving and how differently teams use them. Still, looking at benchmark ranges helps teams understand where they stand and calibrate their expectations.</p><p>We explored what these benchmarks look like in practice and whether they surface interesting points to consider when using this framework:</p><h4>Copilot improving, Cursor with a healthy rejection rate, Claude Code with near-zero revert rates</h4><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6A1r!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f65a101-5d85-4b2b-b369-09db69666e54_1456x633.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6A1r!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f65a101-5d85-4b2b-b369-09db69666e54_1456x633.png 424w, https://substackcdn.com/image/fetch/$s_!6A1r!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f65a101-5d85-4b2b-b369-09db69666e54_1456x633.png 848w, https://substackcdn.com/image/fetch/$s_!6A1r!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f65a101-5d85-4b2b-b369-09db69666e54_1456x633.png 1272w, https://substackcdn.com/image/fetch/$s_!6A1r!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f65a101-5d85-4b2b-b369-09db69666e54_1456x633.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!6A1r!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f65a101-5d85-4b2b-b369-09db69666e54_1456x633.png" width="1456" height="633" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6f65a101-5d85-4b2b-b369-09db69666e54_1456x633.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:633,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6A1r!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f65a101-5d85-4b2b-b369-09db69666e54_1456x633.png 424w, https://substackcdn.com/image/fetch/$s_!6A1r!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f65a101-5d85-4b2b-b369-09db69666e54_1456x633.png 848w, https://substackcdn.com/image/fetch/$s_!6A1r!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f65a101-5d85-4b2b-b369-09db69666e54_1456x633.png 1272w, https://substackcdn.com/image/fetch/$s_!6A1r!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f65a101-5d85-4b2b-b369-09db69666e54_1456x633.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ul><li><p><strong>Copilot Inline:</strong> Approximately 3 out of 4 suggestions are still rejected, though these numbers have been improving over the past 6 months. Even among top users, acceptance peaks below 28%, suggesting there&#8217;s still room to improve suggestion fit.</p></li><li><p><strong>Copilot Chat: </strong>Counter-intuitively, most teams using Chat are moving <em>slower</em>, not faster. Our hypothesis is that teams are reaching for Chat on harder, more ambiguous problems &#8212; which increases cycle time by design. This makes Chat&#8217;s value harder to see in throughput metrics, but that doesn&#8217;t mean it&#8217;s absent. <em>(See below for detail.)</em></p></li><li><p><strong>Cursor: </strong>Median rejection rate of 13.7%. 
A lower rate indicates stronger fit between generated code and what developers actually keep. (<strong>Note:</strong> True post-acceptance edit rate wasn&#8217;t directly available; suggestion rejection rate was used as a proxy.)</p></li><li><p><strong>Claude Code: </strong>Agentic output is being merged with near-zero revert rates at the median. This could mean strong output quality &#8212; or it could reflect the early adoption effect, where teams running lower-stakes experiments simply aren&#8217;t generating the kind of code that gets reverted later. Separating these explanations will require more time and higher-volume data.</p></li></ul><p><strong>A Closer Look at Copilot Chat</strong></p><p>The &#8220;AI makes teams slower&#8221; finding deserves a more nuanced look. We split PRs into Simple and Complex (by number of files changed) and compared cycle times to see whether teams were spending more time on harder challenges.</p><p><strong>AI-assisted vs. non-AI cycle times across percentiles:</strong></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lO9z!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b03afa5-dbfc-454f-864b-2441bdab25ed_1456x362.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lO9z!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b03afa5-dbfc-454f-864b-2441bdab25ed_1456x362.png 424w, https://substackcdn.com/image/fetch/$s_!lO9z!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b03afa5-dbfc-454f-864b-2441bdab25ed_1456x362.png 848w, 
https://substackcdn.com/image/fetch/$s_!lO9z!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b03afa5-dbfc-454f-864b-2441bdab25ed_1456x362.png 1272w, https://substackcdn.com/image/fetch/$s_!lO9z!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b03afa5-dbfc-454f-864b-2441bdab25ed_1456x362.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lO9z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b03afa5-dbfc-454f-864b-2441bdab25ed_1456x362.png" width="728" height="181" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9b03afa5-dbfc-454f-864b-2441bdab25ed_1456x362.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:362,&quot;width&quot;:1456,&quot;resizeWidth&quot;:728,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lO9z!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b03afa5-dbfc-454f-864b-2441bdab25ed_1456x362.png 424w, https://substackcdn.com/image/fetch/$s_!lO9z!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b03afa5-dbfc-454f-864b-2441bdab25ed_1456x362.png 848w, 
https://substackcdn.com/image/fetch/$s_!lO9z!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b03afa5-dbfc-454f-864b-2441bdab25ed_1456x362.png 1272w, https://substackcdn.com/image/fetch/$s_!lO9z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b03afa5-dbfc-454f-864b-2441bdab25ed_1456x362.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p><strong>What we found:</strong></p><ul><li><p><strong>Simple PRs:</strong> Chat actually widens the gap &#8212; AI-assisted PRs are slower across all percentiles. This may suggest that developers are over-engineering or over-consulting via Chat on work that is small and considered <em>simple</em>, though we need to validate this further.</p></li><li><p><strong>Complex PRs:</strong> At Median and P75, the gap persists &#8212; AI is still slower. Selection bias (using Chat on harder work) could explain part of this, but not all.</p><ul><li><p>At P90 on Complex PRs the gap essentially vanishes: 7.92d (AI) vs 8.00d (non-AI). This means that the best teams, on the hardest problems, are moving faster with AI to solve the most complex challenges.</p></li><li><p>The teams who extract that value (P90) do so on complex PRs where the alternative isn&#8217;t &#8220;do it faster without AI&#8221; &#8212; it&#8217;s &#8220;get blocked.&#8221;</p></li></ul></li></ul><h1><strong>Conclusions</strong></h1><p>Measuring AI coding tools fairly requires accepting a foundational premise: these tools don&#8217;t all do the same thing. Comparing Copilot&#8217;s suggestion acceptance rate to Claude Code&#8217;s revert rate isn&#8217;t an apples-to-apples comparison &#8212; it&#8217;s comparing individual keystrokes to entire implementations. 
Having a framework to understand different workflows gives teams a principled way to identify which measurement approach fits each tool in their stack. As we have seen, tools and workflows are still evolving, which means regularly re-checking which metric is the right one.</p><p>We have also seen how some teams have extracted good value from these tools, such as improving cycle time or reverting only a fraction of code. Some of this suggests a natural learning curve. On one hand, only advanced teams are gaining efficiencies; on the other, we see cycle time increase, suggesting that gains remain to be seen for beginners.</p><p><em>Methodology</em></p><ul><li><p><strong>Coverage:</strong> January 1 &#8211; December 31, 2025</p></li><li><p><strong>Metric definitions</strong></p></li></ul><ul><li><p><em>Suggestion acceptance rate</em>: % of AI-generated suggestions accepted by the developer </p></li><li><p><em>Line acceptance rate</em>: % of suggested lines retained after acceptance </p></li><li><p><em>Post-acceptance edit rate / Suggestion rejection rate</em>: % of accepted suggestions where generated lines were subsequently removed or modified </p></li><li><p><em>Cycle time delta</em>: Difference in PR cycle time (days) between AI-assisted and non-AI PRs, compared at matching percentiles (P50, P75, P90)</p></li><li><p><em>PR revert rate</em>: % of merged PRs subsequently reverted, segmented by AI tool usage</p></li></ul>]]></content:encoded></item><item><title><![CDATA[Is AI Finally Taming Bug Backlogs?]]></title><description><![CDATA[Bug backlogs never decrease... 
but with AI at least they increase more slowly]]></description><link>https://jellyfishresearch.substack.com/p/is-ai-finally-taming-bug-backlogs</link><guid isPermaLink="false">https://jellyfishresearch.substack.com/p/is-ai-finally-taming-bug-backlogs</guid><dc:creator><![CDATA[Nicholas Arcolano]]></dc:creator><pubDate>Fri, 20 Feb 2026 20:07:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!lnBh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc78751a-49fb-46e8-b1c9-83d507b26194_3600x2216.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>For engineering leaders, growing bug backlogs are like the proverbial death and taxes: another of life&#8217;s certainties.</p><p>In fact, we saw on average that Jellyfish customers added about 30 more bugs (per 100 engineers) to their backlogs than they resolved last quarter.</p><p>However, I&#8217;m hearing from a lot of folks pushing hard on AI transformation &#8211; especially those getting good context engineering in place and deploying autonomous agents &#8211; that they&#8217;re using these tools to better tame their bug backlogs.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lnBh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc78751a-49fb-46e8-b1c9-83d507b26194_3600x2216.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lnBh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc78751a-49fb-46e8-b1c9-83d507b26194_3600x2216.png 424w, 
https://substackcdn.com/image/fetch/$s_!lnBh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc78751a-49fb-46e8-b1c9-83d507b26194_3600x2216.png 848w, https://substackcdn.com/image/fetch/$s_!lnBh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc78751a-49fb-46e8-b1c9-83d507b26194_3600x2216.png 1272w, https://substackcdn.com/image/fetch/$s_!lnBh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc78751a-49fb-46e8-b1c9-83d507b26194_3600x2216.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lnBh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc78751a-49fb-46e8-b1c9-83d507b26194_3600x2216.png" width="1456" height="896" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bc78751a-49fb-46e8-b1c9-83d507b26194_3600x2216.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:896,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:289753,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://jellyfishresearch.substack.com/i/199370261?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc78751a-49fb-46e8-b1c9-83d507b26194_3600x2216.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!lnBh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc78751a-49fb-46e8-b1c9-83d507b26194_3600x2216.png 424w, https://substackcdn.com/image/fetch/$s_!lnBh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc78751a-49fb-46e8-b1c9-83d507b26194_3600x2216.png 848w, https://substackcdn.com/image/fetch/$s_!lnBh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc78751a-49fb-46e8-b1c9-83d507b26194_3600x2216.png 1272w, https://substackcdn.com/image/fetch/$s_!lnBh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc78751a-49fb-46e8-b1c9-83d507b26194_3600x2216.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Interestingly, the data bears it out. The plot above shows net bugs added in Q4 2025 versus AI adoption level (percentage of engineers that are frequent AI coding tool users) for a set of 500+ Jellyfish customers.</p><p>At the lowest AI adoption levels, companies add ~43 net bugs per 100 engineers per quarter. At the highest? Around 22 &#8211; cut nearly in half.</p><p>Maybe someday AI coding tools and agents will help us get to a place where bug backlogs actually <em>shrink</em> instead of grow... or maybe we&#8217;ll have a whole new set of quality challenges to deal with!</p>]]></content:encoded></item><item><title><![CDATA[Looks Like It Really Was a Claude Christmas After All]]></title><description><![CDATA[While everyone else was enjoying the holidays, Claude users' activity definitely spiked over the holidays]]></description><link>https://jellyfishresearch.substack.com/p/looks-like-it-really-was-a-claude</link><guid isPermaLink="false">https://jellyfishresearch.substack.com/p/looks-like-it-really-was-a-claude</guid><dc:creator><![CDATA[Nicholas Arcolano]]></dc:creator><pubDate>Wed, 11 Feb 2026 20:03:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!9ILb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c299a38-ccdc-4cb2-a30c-8bf63d9a7da6_4118x2751.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Last year we reported that AI-assisted PRs contained about 18% more lines of code than non-AI PRs. That felt significant at the time. Now? 
We're seeing 50-160% more code per person depending on the tool, and the gap keeps widening.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9ILb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c299a38-ccdc-4cb2-a30c-8bf63d9a7da6_4118x2751.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9ILb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c299a38-ccdc-4cb2-a30c-8bf63d9a7da6_4118x2751.png 424w, https://substackcdn.com/image/fetch/$s_!9ILb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c299a38-ccdc-4cb2-a30c-8bf63d9a7da6_4118x2751.png 848w, https://substackcdn.com/image/fetch/$s_!9ILb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c299a38-ccdc-4cb2-a30c-8bf63d9a7da6_4118x2751.png 1272w, https://substackcdn.com/image/fetch/$s_!9ILb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c299a38-ccdc-4cb2-a30c-8bf63d9a7da6_4118x2751.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9ILb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c299a38-ccdc-4cb2-a30c-8bf63d9a7da6_4118x2751.png" width="1456" height="973" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2c299a38-ccdc-4cb2-a30c-8bf63d9a7da6_4118x2751.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:973,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:548167,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://jellyfishresearch.substack.com/i/199368994?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c299a38-ccdc-4cb2-a30c-8bf63d9a7da6_4118x2751.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!9ILb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c299a38-ccdc-4cb2-a30c-8bf63d9a7da6_4118x2751.png 424w, https://substackcdn.com/image/fetch/$s_!9ILb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c299a38-ccdc-4cb2-a30c-8bf63d9a7da6_4118x2751.png 848w, https://substackcdn.com/image/fetch/$s_!9ILb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c299a38-ccdc-4cb2-a30c-8bf63d9a7da6_4118x2751.png 1272w, https://substackcdn.com/image/fetch/$s_!9ILb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c299a38-ccdc-4cb2-a30c-8bf63d9a7da6_4118x2751.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This chart shows average lines of code added per person per week, broken out by AI coding tool, from Jellyfish data over the past two months. In January, Claude Code users averaged 2.6x the lines added compared to engineers not using AI tools. Cursor came in at 1.6x, GitHub Copilot at 1.5x.</p><p>One thing that jumped out: look at the end of December. Activity (predictably) dipped across the board for the holidays... except Claude Code, which spiked. If you were online over the break, you may have heard folks talking about &#8220;Claude Christmas&#8221;, where the November Opus 4.5 release + increased usage limits + holiday downtime led to a big surge in usage. 
Our data corroborates the hype: whatever was happening over the holidays, Claude Code users were shipping.</p><p>To be clear: lines of code isn&#8217;t the whole story &#8211; quality is paramount, more code isn&#8217;t always better code, and all this volume has the potential to create bottlenecks elsewhere in the SDLC. But this is a striking signal about how much raw output varies by tool, and it&#8217;s consistent with the PR throughput patterns we&#8217;ve been tracking.</p><p>Last year was about exploration. This year is all about scaling. If January is any indication, it&#8217;s going to be a wild ride!</p>]]></content:encoded></item><item><title><![CDATA[Three Improvements to the Jellyfish MCP]]></title><description><![CDATA[We&#8217;re continuously improving the Jellyfish MCP to give you a better experience with AI-powered engineering insights.]]></description><link>https://jellyfishresearch.substack.com/p/three-improvements-to-the-jellyfish</link><guid isPermaLink="false">https://jellyfishresearch.substack.com/p/three-improvements-to-the-jellyfish</guid><dc:creator><![CDATA[Sophie Goldstein]]></dc:creator><pubDate>Tue, 10 Feb 2026 15:32:50 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Kioy!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57d4c023-fbf3-4996-b9dd-1cb76e830f75_1280x1280.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We&#8217;re continuously improving the <a href="https://github.com/Jellyfish-AI/jellyfish-mcp">Jellyfish MCP</a> to give you a better experience with AI-powered engineering insights. Here&#8217;s what&#8217;s updated in the newest version:</p><ol><li><p><strong>Data now outputs in TOON format instead of JSON. </strong>How you send data to AI models matters more than you might think&#8212;send too much or in the wrong format, and you&#8217;re wasting tokens. 
We ran <a href="https://open.substack.com/pub/jellyfishresearch/p/the-token-tax-why-we-switched-to?utm_campaign=post-expanded-share&amp;utm_medium=web">an analysis</a> on Jellyfish data comparing formats and found that <a href="https://github.com/toon-format/toon">TOON</a>, a YAML-like format, is more token-efficient than JSON in most cases. The MCP now automatically translates data into TOON before sending it to your AI host application.</p></li><li><p><strong>Rebuilt from Python to Node.js. </strong>The MCP server and Claude Desktop extension now share the same codebase in Node.js, which means faster updates and more consistent behavior across platforms. The Python version is no longer being updated, but the source code for the previous version is still available <a href="https://github.com/Jellyfish-AI/jellyfish-mcp/tree/529714a472646460130c5edf2ff20f2e3c8d0d56">here</a>.</p></li><li><p><strong>Docker support. </strong>We published a <a href="https://hub.docker.com/r/jellyfishco/jellyfish-mcp">Docker image on Docker Hub</a> so you can run the MCP server in a container. Once configured, Docker handles everything&#8212;no manual installs, and updates pull automatically so you&#8217;re always on the latest version.</p></li></ol><p><em>What this means for you: </em>Nothing changes about how you use the MCP. Just make sure you&#8217;re on the latest version. If you want to try the new Docker setup, follow the directions in the <a href="https://github.com/Jellyfish-AI/jellyfish-mcp">README</a>, and once configured, it will automatically stay up to date.</p><p>The hardest part of building an MCP isn&#8217;t the engineering &#8212; it&#8217;s managing context. Every tool response sends data to an LLM, and figuring out how much context to include is an ongoing challenge when the model is essentially a black box. There&#8217;s no formula for getting it right, at least not yet. 
So we&#8217;re iterating: testing different approaches, exploring new ideas, and learning as we go. The switch to TOON came directly out of this process, and it won&#8217;t be the last change we make. If you&#8217;re building an MCP too, our experience has been that this kind of iteration is where the real progress happens.</p>]]></content:encoded></item><item><title><![CDATA[Measuring ROI in AI era: The language effect]]></title><description><![CDATA[An analysis of AI ROI, acceptance rate and programming languages]]></description><link>https://jellyfishresearch.substack.com/p/measuring-roi-in-ai-era-the-language</link><guid isPermaLink="false">https://jellyfishresearch.substack.com/p/measuring-roi-in-ai-era-the-language</guid><dc:creator><![CDATA[Tomas Pardinas]]></dc:creator><pubDate>Fri, 23 Jan 2026 18:36:44 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!kewq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62c14833-3e4b-425c-a4dc-0f87c973a226_1600x985.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3>Overview</h3><p><em>An emerging question is how companies should analyze the return on investment (ROI) of AI. 
In particular, we focused on whether the programming language in use significantly influences the acceptance rate and, consequently, the ROI.</em></p><p><em>We analyzed 750+ million lines of AI-suggested code across companies using multiple AI coding tools and discovered that programming language <strong>choice affects AI ROI</strong>.</em></p><h3>Introduction</h3><p>Every three to six months, a new LLM record is shattered on a public leaderboard, yet for the average developer, the &#8220;revolutionary&#8221; shift rarely feels reflected in the daily git commit. This was true until Anthropic released Opus 4.5: suddenly there was a &#8220;before&#8221; and an &#8220;after&#8221;, and code started to move at another level of quality and speed.</p><p>But here&#8217;s the question that experience raises: <em>how do you actually measure that impact?</em></p><p>While industry benchmarks focus on capabilities like quality, performance and safety, they often miss the metrics that truly matter to a business: productivity gains or value generated.</p><p>Quantifying the value derived from lines of code is inherently complex, subjective, and prone to disagreement. 
Ultimately, value is a business-level concept, tied directly to the organization&#8217;s mission and assessed through core business metrics. However, we can use the adoption of generated &#8220;tokens&#8221; as a proxy for value creation. If an AI tool produces valuable output, its tokens are adopted; if not, they are rejected. By measuring the share of generated tokens that are accepted into our codebase, we can gain a clearer perspective on the value (or at least the productivity) generated by using AI, and consequently a <em>proxy</em> for its ROI.</p><p>In this piece, we measure acceptance rate not in tokens, but in lines of code. If tokens are the new &#8220;currency&#8221; of AI, then acceptance rate (lines accepted / lines generated) is the <em>exchange rate</em> that tells you what that currency is actually worth in the real world.</p><h3>Main Findings</h3><p>Our analysis revealed a clear and consistent pattern across the industry:</p><ul><li><p><strong>The Language Gap is Real:</strong> Acceptance rates for suggestions in <strong>code</strong> <strong>languages</strong> like Go, Python, and Ruby are significantly higher, at <strong>22&#8211;30%</strong>, than for <strong>configuration</strong> languages such as JSON and YAML, which see only <strong>10&#8211;20% acceptance</strong>. This represents a persistent <strong>2&#8211;3x performance difference</strong> across all companies.</p></li><li><p><strong>Language Type Predicts Success:</strong> <strong>Go</strong> is the most accepted language among the options, leading with a <strong>30.28%</strong> acceptance rate across 121 companies. Python, despite having a high volume of suggestions (<strong>16.2M lines</strong>), has a <strong>24.74%</strong> acceptance rate. 
JSON trails significantly, with only a <strong>10.30%</strong> acceptance rate across 151 companies.</p></li><li><p><strong>Volume Doesn&#8217;t Explain It:</strong> There is no correlation between a language&#8217;s acceptance rate and the volume of code generated in it: the number of lines suggested does not predict the share that gets accepted.</p></li></ul><h3>Coding Language: The most valuable language per output</h3><p>When we group languages into three categories (Code, Config and Markup), the separation becomes clear: code languages lead in value generated, while markup languages trail. Comparing medians, code languages score almost 2x higher than markup languages. We think this gap exists because of three characteristics innate to each language: <strong>Complexity, Context and Variation</strong>.</p><ul><li><p><strong>Code languages:</strong> Median ~24%, range 21-30%</p></li><li><p><strong>Config languages:</strong> Median ~15%, range 10-20%</p></li><li><p><strong>Markup languages:</strong> Median ~13%, range 10-14%</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kewq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62c14833-3e4b-425c-a4dc-0f87c973a226_1600x985.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kewq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62c14833-3e4b-425c-a4dc-0f87c973a226_1600x985.png 424w, https://substackcdn.com/image/fetch/$s_!kewq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62c14833-3e4b-425c-a4dc-0f87c973a226_1600x985.png 848w, 
https://substackcdn.com/image/fetch/$s_!kewq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62c14833-3e4b-425c-a4dc-0f87c973a226_1600x985.png 1272w, https://substackcdn.com/image/fetch/$s_!kewq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62c14833-3e4b-425c-a4dc-0f87c973a226_1600x985.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kewq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62c14833-3e4b-425c-a4dc-0f87c973a226_1600x985.png" width="1600" height="985" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/62c14833-3e4b-425c-a4dc-0f87c973a226_1600x985.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:985,&quot;width&quot;:1600,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:132231,&quot;alt&quot;:&quot;Language Categories Distribution&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Language Categories Distribution" title="Language Categories Distribution" srcset="https://substackcdn.com/image/fetch/$s_!kewq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62c14833-3e4b-425c-a4dc-0f87c973a226_1600x985.png 424w, https://substackcdn.com/image/fetch/$s_!kewq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62c14833-3e4b-425c-a4dc-0f87c973a226_1600x985.png 848w, 
https://substackcdn.com/image/fetch/$s_!kewq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62c14833-3e4b-425c-a4dc-0f87c973a226_1600x985.png 1272w, https://substackcdn.com/image/fetch/$s_!kewq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62c14833-3e4b-425c-a4dc-0f87c973a226_1600x985.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><em>Language Categories Distribution</em></p><h2>Complexity, Context and Variation</h2><p>Languages that have diverse patterns, 
flexibility and rich signals, as code languages do, tend to thrive in LLM environments, in contrast to config files, which are rigid and limited. Some examples of the differences between languages are shown below:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7SQD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42bb985f-633b-46f3-878f-5160fab38e95_1600x901.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7SQD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42bb985f-633b-46f3-878f-5160fab38e95_1600x901.png 424w, https://substackcdn.com/image/fetch/$s_!7SQD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42bb985f-633b-46f3-878f-5160fab38e95_1600x901.png 848w, https://substackcdn.com/image/fetch/$s_!7SQD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42bb985f-633b-46f3-878f-5160fab38e95_1600x901.png 1272w, https://substackcdn.com/image/fetch/$s_!7SQD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42bb985f-633b-46f3-878f-5160fab38e95_1600x901.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7SQD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42bb985f-633b-46f3-878f-5160fab38e95_1600x901.png" width="1456" height="820" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/42bb985f-633b-46f3-878f-5160fab38e95_1600x901.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:820,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7SQD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42bb985f-633b-46f3-878f-5160fab38e95_1600x901.png 424w, https://substackcdn.com/image/fetch/$s_!7SQD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42bb985f-633b-46f3-878f-5160fab38e95_1600x901.png 848w, https://substackcdn.com/image/fetch/$s_!7SQD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42bb985f-633b-46f3-878f-5160fab38e95_1600x901.png 1272w, https://substackcdn.com/image/fetch/$s_!7SQD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42bb985f-633b-46f3-878f-5160fab38e95_1600x901.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><em>AI suggestions see higher acceptance rates in code languages (like Python, Go, TypeScript, Java) thanks to their complexity, varied patterns, and rich contextual information (imports, types, hierarchies). This lets AI provide genuinely helpful alternatives and completions.</em></p><p><em>Conversely, acceptance rates are lower in formulaic configuration languages (like JSON, YAML). These have low pattern complexity, require one correct format, and are often context-poor (standalone key-value pairs). Since these formats are brittle, developers reject &#8220;close enough&#8221; suggestions, limiting AI&#8217;s value beyond basic autocomplete.</em></p><h2>Language Matters More Than You Think: The Top 10 Is Predominantly Code Languages</h2><p>Across 750+ million lines of AI-suggested code from multiple AI coding tools, we found a clear hierarchy. Traditional programming languages consistently achieve 2-3x higher acceptance rates than configuration and markup languages. 
<strong>The top ten languages by acceptance rate are predominantly code languages, with Go at the top. </strong>If your team writes primarily config files, even the best AI tool will show &#8220;disappointing&#8221; overall numbers. That&#8217;s not a tool failure; it&#8217;s the &#8220;language effect&#8221; in action.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gi2M!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58c8be33-908e-4d1a-9ba8-f487f3608c08_1600x893.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gi2M!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58c8be33-908e-4d1a-9ba8-f487f3608c08_1600x893.png 424w, https://substackcdn.com/image/fetch/$s_!gi2M!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58c8be33-908e-4d1a-9ba8-f487f3608c08_1600x893.png 848w, https://substackcdn.com/image/fetch/$s_!gi2M!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58c8be33-908e-4d1a-9ba8-f487f3608c08_1600x893.png 1272w, https://substackcdn.com/image/fetch/$s_!gi2M!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58c8be33-908e-4d1a-9ba8-f487f3608c08_1600x893.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gi2M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58c8be33-908e-4d1a-9ba8-f487f3608c08_1600x893.png" width="1600" height="893" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/58c8be33-908e-4d1a-9ba8-f487f3608c08_1600x893.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:893,&quot;width&quot;:1600,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:117918,&quot;alt&quot;:&quot;Language Acceptance Rates&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Language Acceptance Rates" title="Language Acceptance Rates" srcset="https://substackcdn.com/image/fetch/$s_!gi2M!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58c8be33-908e-4d1a-9ba8-f487f3608c08_1600x893.png 424w, https://substackcdn.com/image/fetch/$s_!gi2M!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58c8be33-908e-4d1a-9ba8-f487f3608c08_1600x893.png 848w, https://substackcdn.com/image/fetch/$s_!gi2M!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58c8be33-908e-4d1a-9ba8-f487f3608c08_1600x893.png 1272w, https://substackcdn.com/image/fetch/$s_!gi2M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58c8be33-908e-4d1a-9ba8-f487f3608c08_1600x893.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><em>Language Acceptance Rates (%)</em></p><h3>Volume vs Acceptance: No Correlation</h3><p>One question we wanted to answer was: <em>&#8220;Does the volume of code generated affect the quality of the tokens generated?&#8221;</em> The answer seems to be <em>no: volume doesn&#8217;t predict acceptance</em>. Python (most widely used) and Go (mid-volume) both achieve high acceptance rates because they&#8217;re code languages. JSON (high volume) and Markdown (low volume) both struggle because they&#8217;re configuration/markup formats.</p><p>If volume mattered, we&#8217;d see a diagonal trend. Instead, we see horizontal clustering by language type. This might be because acceptance rate depends more on how the underlying LLM performs in each context engineering environment and under its particular conditions. 
Although LLM output tends to degrade as context length and task difficulty grow, we see no correlation between volume and acceptance once languages are grouped by type.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bPoT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ce82174-4d87-498a-aab2-ad1362efe291_1600x949.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bPoT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ce82174-4d87-498a-aab2-ad1362efe291_1600x949.png 424w, https://substackcdn.com/image/fetch/$s_!bPoT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ce82174-4d87-498a-aab2-ad1362efe291_1600x949.png 848w, https://substackcdn.com/image/fetch/$s_!bPoT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ce82174-4d87-498a-aab2-ad1362efe291_1600x949.png 1272w, https://substackcdn.com/image/fetch/$s_!bPoT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ce82174-4d87-498a-aab2-ad1362efe291_1600x949.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bPoT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ce82174-4d87-498a-aab2-ad1362efe291_1600x949.png" width="1600" height="949" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2ce82174-4d87-498a-aab2-ad1362efe291_1600x949.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:949,&quot;width&quot;:1600,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:165736,&quot;alt&quot;:&quot;Volume vs Acceptance Scatter&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Volume vs Acceptance Scatter" title="Volume vs Acceptance Scatter" srcset="https://substackcdn.com/image/fetch/$s_!bPoT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ce82174-4d87-498a-aab2-ad1362efe291_1600x949.png 424w, https://substackcdn.com/image/fetch/$s_!bPoT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ce82174-4d87-498a-aab2-ad1362efe291_1600x949.png 848w, https://substackcdn.com/image/fetch/$s_!bPoT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ce82174-4d87-498a-aab2-ad1362efe291_1600x949.png 1272w, https://substackcdn.com/image/fetch/$s_!bPoT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ce82174-4d87-498a-aab2-ad1362efe291_1600x949.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><em>Volume vs Acceptance</em></p><h2>Practical Implications: What This Means for Your Team</h2><h3>Measuring ROI</h3><ul><li><p><strong>Measure Time Saved, Not Just Cost:</strong> Track the duration of content creation and review cycles before and after AI adoption. (Jellyfish <a href="https://jellyfish.co/platform/jellyfish-ai-impact/">AI Impact</a> tracks this for you!)</p></li><li><p><strong>Convert Accepted Lines into Financial Value:</strong> Measure the acceptance rate of tokens and calculate ROI using the following formula, where Dev Cost is the developer cost per unit of time:</p><p><strong>ROI</strong> = [(Acceptance Rate &#215; Total Tokens &#215; Time Saved per Token &#215; Dev Cost) - AI Cost] / AI Cost</p></li></ul><h2>Conclusion</h2><p>Programming language choice has a <strong>larger impact</strong> on AI code acceptance rates than which AI tool you use. 
Our analysis of <strong>750+ million lines</strong> across <strong>239+ companies</strong> and <strong>multiple AI coding tools</strong> shows that traditional programming languages (Go, Python, Ruby, TypeScript) achieve <strong>22-30%</strong> acceptance rates, while configuration and markup formats (JSON, YAML, Markdown) achieve <strong>10-20%</strong> - a consistent <strong>2-3x difference</strong>.</p><p>Before you blame your AI coding assistant for low ROI, check your token efficiency and how your language distribution affects ROI. Before you switch tools, measure acceptance rates by language category. Before you set team targets, adjust expectations based on what languages your team actually writes.</p><h2>Methodology</h2><p><strong>Data Source:</strong> Jellyfish AI Coding Assistant Usage Analytics</p><p><strong>Time Period:</strong> January 2025 - December 2025</p><p><strong>Coverage:</strong> <strong>750+ million lines</strong> suggested across multiple AI coding tools</p><p><strong>Analysis Approach:</strong></p><ul><li><p>Language analysis aggregated across all tools to identify universal patterns</p></li><li><p>Acceptance rate = (lines accepted / lines suggested) &#215; 100</p></li><li><p>Categories defined by language purpose: Code, Config, Markup</p></li></ul><p><strong>Key Metrics:</strong></p><ul><li><p>Acceptance rate by language (aggregated across all tools)</p></li><li><p>Total lines suggested per language</p></li><li><p>Company adoption breadth per language</p></li><li><p>Category-level distributions</p></li></ul>]]></content:encoded></item><item><title><![CDATA[The Token Tax: Why we Switched to TOON in the Jellyfish MCP]]></title><description><![CDATA[Part of the "AI Engineering at Jellyfish" Series]]></description><link>https://jellyfishresearch.substack.com/p/the-token-tax-why-we-switched-to</link><guid isPermaLink="false">https://jellyfishresearch.substack.com/p/the-token-tax-why-we-switched-to</guid><dc:creator><![CDATA[Sophie Goldstein]]></dc:creator><pubDate>Mon, 22 Dec 2025 18:04:53 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!b4gY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F250c61f9-475c-4b9c-bbca-8d1da0ee23f2_1075x666.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2><strong>Context Fragility</strong></h2><p>Large Language Models (LLMs) are powerful tools capable of executing complex tasks. As we continue to leverage and implement them, it has become increasingly clear that while a good prompt is essential, it is only half the battle. The difference between a helpful response and a hallucination often lies not just in what you ask, but also in the context you provide.</p><p>Tackling more complex applications - such as asking an LLM to refactor a codebase or analyze team performance metrics - requires providing the right context. But context is finite and fragile. 
Even as context windows expand from 200K (<a href="https://platform.claude.com/docs/en/about-claude/models/overview">Claude</a>) to 400K (<a href="https://platform.openai.com/docs/models/gpt-5.2">GPT-5.2</a>) or 1M (<a href="https://ai.google.dev/gemini-api/docs/gemini-3">Gemini</a>) tokens<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a>, larger volumes of data can negatively affect a model&#8217;s attention, leading to lower accuracy. While newer models like GPT-5.2 are designed to <a href="https://www.cometapi.com/what-is-gpt-5-2-of-5-major-updates-in-gpt-5-2/#2-how-has-long-text-comprehension-and-cross-document-reasoning-improved">improve long-context reliability</a> and cross-document reasoning, the risk of attention saturation remains a primary bottleneck for performance.</p><p>Managing this finite space has given rise to <a href="https://blog.langchain.com/context-engineering-for-agents">Context Engineering</a>. 
As we discover more about <a href="https://www.dbreunig.com/2025/06/22/how-contexts-fail-and-how-to-fix-them.html">how contexts fail</a>, developers are learning <a href="https://www.dbreunig.com/2025/06/26/how-to-fix-your-context.html">how to fix their context</a> by deciding which tokens the model sees during inference. <a href="https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents">Anthropic</a> describes this practice as &#8220;the set of strategies for curating and maintaining the optimal set of tokens (information) during LLM inference.&#8221; Limiting the number of tokens passed to an LLM is no longer just about avoiding limits or costs; it is about ensuring the model remains sharp and focused.</p><h2><strong>Jellyfish MCP</strong></h2><p>Earlier this year, we released the <a href="https://github.com/Jellyfish-AI/jellyfish-mcp">Jellyfish MCP</a> server to expose team-performance data and analytics from the Jellyfish platform. The MCP wraps our external API to return information on metrics, allocations, deliverables, and more. Each MCP tool corresponds to a specific API endpoint, and our current implementation returns the endpoint&#8217;s raw response in JSON format.</p><p>As we&#8217;ve continued building the MCP and learning about LLM best practices, a recurring theme has emerged around format optimization: <a href="https://www.improvingagents.com/blog/best-nested-data-format">structured text formats, such as Markdown and YAML, tend to be significantly more context-efficient</a> and easier for LLMs to interpret than verbose formats like JSON.</p><p>Because the MCP simply relays output from our API, this question is central to our design: does changing the output format meaningfully reduce context consumption without compromising the accuracy of the data? 
We explored format optimization as a low-complexity, high-impact technique to manage context bloat.</p><h2><strong>Methodology</strong></h2><p>Since the MCP currently returns raw JSON from the Jellyfish API, we decided to test three different output formats:</p><ol><li><p><strong>Pretty JSON</strong>: Human-readable format with indentation and line breaks&#8212;easy to scan but heavy on whitespace.</p></li><li><p><strong>Minified JSON</strong>: Strips all unnecessary whitespace from JSON, producing a compact but less readable output optimized for machine processing.</p></li><li><p><strong><a href="https://github.com/toon-format/toon">TOON</a> (Token-Oriented Object Notation)</strong>: Format designed specifically for LLM consumption, transforming JSON into a tabular notation that aims to reduce token count while maintaining readability. According to its <a href="https://github.com/toon-format/toon?tab=readme-ov-file#key-features">documentation</a>, TOON uses roughly 40% fewer tokens than JSON while achieving 74% accuracy compared to JSON&#8217;s 70%.</p></li></ol><p><em>Pretty JSON:</em></p><pre><code><code>{
  "context": {
    "task": "Our favorite hikes together",
    "location": "Boulder",
    "season": "spring_2025"
  },
  "friends": [
    "ana",
    "luis",
    "sam"
  ],
  "hikes": [
    {
      "id": 1,
      "name": "Blue Lake Trail",
      "distanceKm": 7.5,
      "elevationGain": 320
    },
    {
      "id": 2,
      "name": "Ridge Overlook",
      "distanceKm": 9.2,
      "elevationGain": 540
    }
  ]
}</code></code></pre><p><em>Minified JSON:</em></p><pre><code><code>{"context":{"task":"Our favorite hikes together","location":"Boulder","season":"spring_2025"},"friends":["ana","luis","sam"],"hikes":[{"id":1,"name":"Blue Lake Trail","distanceKm":7.5,"elevationGain":320},{"id":2,"name":"Ridge Overlook","distanceKm":9.2,"elevationGain":540}]}</code></code></pre><p><em>TOON:</em></p><pre><code><code>context:
  task: Our favorite hikes together
  location: Boulder
  season: spring_2025
friends[3]: ana,luis,sam
hikes[2]{id,name,distanceKm,elevationGain}:
  1,Blue Lake Trail,7.5,320
  2,Ridge Overlook,9.2,540</code></code></pre><p>We tested these three formats across 25 endpoints from the Jellyfish API, each with different response characteristics&#8212;some returned deeply nested structures while others returned flat lists. After converting each response into all three formats, we measured token counts using <a href="https://platform.claude.com/docs/en/build-with-claude/token-counting">Anthropic&#8217;s tokenizer</a> for Claude Opus 4.5. We then compared the counts to see which format performed best.</p><h2><strong>Results</strong></h2><h3><strong>Overall Performance Comparison</strong></h3><p>Across all 25 endpoints tested, <strong>TOON consistently outperformed Pretty JSON by consuming fewer tokens</strong>, achieving a median <strong>reduction</strong> in token count of <strong>42.69%</strong>. This result is unsurprising&#8212;Pretty JSON relies heavily on whitespace for readability, using indentation and line breaks to make nested structures human-parsable. 
TOON achieves similar readability through its compact notation while eliminating most of this whitespace overhead, resulting in substantial token savings across the board.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!b4gY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F250c61f9-475c-4b9c-bbca-8d1da0ee23f2_1075x666.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!b4gY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F250c61f9-475c-4b9c-bbca-8d1da0ee23f2_1075x666.png 424w, https://substackcdn.com/image/fetch/$s_!b4gY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F250c61f9-475c-4b9c-bbca-8d1da0ee23f2_1075x666.png 848w, https://substackcdn.com/image/fetch/$s_!b4gY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F250c61f9-475c-4b9c-bbca-8d1da0ee23f2_1075x666.png 1272w, https://substackcdn.com/image/fetch/$s_!b4gY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F250c61f9-475c-4b9c-bbca-8d1da0ee23f2_1075x666.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!b4gY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F250c61f9-475c-4b9c-bbca-8d1da0ee23f2_1075x666.png" width="1075" height="666" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/250c61f9-475c-4b9c-bbca-8d1da0ee23f2_1075x666.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:666,&quot;width&quot;:1075,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:40033,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://jellyfishresearch.substack.com/i/182034051?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed996e2b-7ffc-4497-bd5f-499b2777b081_1075x666.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!b4gY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F250c61f9-475c-4b9c-bbca-8d1da0ee23f2_1075x666.png 424w, https://substackcdn.com/image/fetch/$s_!b4gY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F250c61f9-475c-4b9c-bbca-8d1da0ee23f2_1075x666.png 848w, https://substackcdn.com/image/fetch/$s_!b4gY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F250c61f9-475c-4b9c-bbca-8d1da0ee23f2_1075x666.png 1272w, https://substackcdn.com/image/fetch/$s_!b4gY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F250c61f9-475c-4b9c-bbca-8d1da0ee23f2_1075x666.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>The comparison between Minified JSON and TOON reveals a more nuanced picture. For many endpoints, TOON remained the more token-efficient format. This is because TOON excels with flat, uniform arrays of objects&#8212;when every item in an array shares the same fields at the same level, TOON can collapse the structure into a compact tabular format, declaring field names once as a header and representing each object as a simple CSV-like row. This eliminates the repeated key names that inflate JSON.</p><h3><strong>Performance by Endpoint Category</strong></h3><p>Interestingly, the 11 allocation endpoints told a different story: for these, Minified JSON was consistently more token-efficient than TOON. 
Upon investigating why these endpoints behaved differently, we found that their responses share specific structural characteristics: nested objects within repeated entries and dictionaries that use dynamic keys as data values. When array entries contain nested objects or sub-arrays, TOON cannot flatten them into tabular rows and falls back to YAML-like indentation, which adds whitespace overhead. Similarly, when dictionaries use dynamic keys as values rather than a consistent schema, there&#8217;s no repeating structure for TOON to factor out. <strong>In these cases, Minified JSON&#8217;s approach of stripping all whitespace produces a more compact result.</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sAus!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2684a899-21ec-413d-828f-1a5c63287bac_1932x588.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sAus!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2684a899-21ec-413d-828f-1a5c63287bac_1932x588.png 424w, https://substackcdn.com/image/fetch/$s_!sAus!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2684a899-21ec-413d-828f-1a5c63287bac_1932x588.png 848w, https://substackcdn.com/image/fetch/$s_!sAus!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2684a899-21ec-413d-828f-1a5c63287bac_1932x588.png 1272w, 
https://substackcdn.com/image/fetch/$s_!sAus!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2684a899-21ec-413d-828f-1a5c63287bac_1932x588.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sAus!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2684a899-21ec-413d-828f-1a5c63287bac_1932x588.png" width="1456" height="443" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2684a899-21ec-413d-828f-1a5c63287bac_1932x588.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:443,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:183506,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://jellyfishresearch.substack.com/i/182034051?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2684a899-21ec-413d-828f-1a5c63287bac_1932x588.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!sAus!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2684a899-21ec-413d-828f-1a5c63287bac_1932x588.png 424w, https://substackcdn.com/image/fetch/$s_!sAus!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2684a899-21ec-413d-828f-1a5c63287bac_1932x588.png 848w, 
https://substackcdn.com/image/fetch/$s_!sAus!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2684a899-21ec-413d-828f-1a5c63287bac_1932x588.png 1272w, https://substackcdn.com/image/fetch/$s_!sAus!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2684a899-21ec-413d-828f-1a5c63287bac_1932x588.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2><strong>MCP Implementation</strong></h2><p>Based on our findings, we decided to implement TOON as the standard output format for the 
Jellyfish MCP&#8217;s Node server. Although Minified JSON proved more efficient for some nested endpoints like allocations, TOON reduced context bloat for the majority of our tools. This efficiency is particularly valuable because the output can optionally be sent to both Hugging Face (for <a href="https://huggingface.co/meta-llama/Llama-Prompt-Guard-2-86M">Prompt Guard</a>) and the LLM host, so the token savings apply twice. Beyond context management, there is a secondary benefit: human readability. Unlike dense JSON blocks, TOON&#8217;s tabular output makes it significantly easier for users to quickly scan and verify the raw data returned by the MCP.</p><h2><strong>Practical Application</strong></h2><p>We wanted to move beyond theoretical benchmarks and test whether these token savings carry over to a real-world workflow. To do this, we conducted a functional test in Claude Desktop using the Claude Opus 4.5 model. We paired the Jellyfish MCP with a Jellyfish Visualization Skill - a custom Claude Skill we created that allows the LLM to generate dashboard-style visualizations using the Jellyfish style. The Visualization Skill is not optimized for context; we used it primarily to test the real-world application of the Jellyfish MCP.</p><p>We tasked the model with two sequential prompts in a single thread, comparing the total token output of the tool calls in Minified JSON versus TOON. Our token measurement focused strictly on the output tokens generated by the tool calls, excluding the Visualization Skill definition and conversational overhead.</p><p>When configured to return Minified JSON, Claude executed 15 distinct tool calls. After switching the output to TOON, it triggered 17 calls - 15 of which were identical to those in the JSON run. Despite this higher volume of requests, <strong>the TOON format achieved a 9.39% reduction in total token consumption</strong>. 
Consistent with our earlier findings, the allocations tools were the only ones where TOON consumed more context than Minified JSON. If we exclude the allocations tools from the calculation, the difference rises to a <strong>23.33% decrease in token count when using TOON</strong>.</p><h2><strong>Conclusion</strong></h2><p>Our testing produced two clear findings:</p><ol><li><p>TOON was the most effective format for our MCP, providing the best balance of readability and context efficiency across the majority of our tools.</p></li><li><p>Minified JSON was most effective in cases where the data structure was nested and dynamic, such as the allocations endpoints.</p></li></ol><p>Based on our findings, our next set of updates for the <a href="https://github.com/Jellyfish-AI/jellyfish-mcp">Jellyfish MCP&#8217;s</a> Node server will prioritize the reduction of context bloat, establishing TOON as the standard. <strong>These changes, along with several housekeeping updates and the new DevEx API endpoint, are scheduled to be released to the repository shortly</strong>, providing teams with deeper visibility into their developer experience metrics.</p><h2><strong>Looking Ahead</strong></h2><p>Looking at the broader landscape, this work is just one piece of the puzzle. As we continue to refine how we feed data to LLMs, we are keeping an eye on emerging solutions that streamline this process further. Platforms like <a href="https://tessl.io/">Tessl</a> are pioneering &#8220;Spec-Driven Development,&#8221; which creates structured, AI-ready context by default. Similarly, frameworks like <a href="https://www.langchain.com/">LangChain</a> are introducing middleware for agents, allowing developers to programmatically filter and compress context before it ever reaches the model.</p><p>Ultimately, our work with TOON and Minified JSON is about maximizing the value of every token. But efficient formatting is only the immediate hurdle. 
The more space we save on raw data, the more capacity we create for <a href="https://youtu.be/ZAPxC3SOX_o?si=UQxHf12As9P81Z2Z">Memory Engineering</a>. While context engineering optimizes the current session, memory engineering focuses on how agents persist information, learn from past interactions, and evolve over time. By reducing the bloat of today&#8217;s context, we are effectively clearing the stage for the persistent, evolving memory of tomorrow.</p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>One token generally corresponds to approximately four characters of English text, or roughly three-quarters of a word</p></div></div>]]></content:encoded></item><item><title><![CDATA[Luddites versus Lovers: is AI Coding Polarized, or is There a Middle Ground?]]></title><description><![CDATA[Coding in the Age of AI]]></description><link>https://jellyfishresearch.substack.com/p/luddites-versus-lovers-is-ai-coding</link><guid isPermaLink="false">https://jellyfishresearch.substack.com/p/luddites-versus-lovers-is-ai-coding</guid><dc:creator><![CDATA[Tomas Pardinas]]></dc:creator><pubDate>Thu, 11 Dec 2025 15:52:22 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/4e1efd77-bd52-4037-9df9-151aa435b76e_1212x1261.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In the age of AI, engineering organizations everywhere are pushing adoption to position engineers for productivity gains. But spend any time in the engineering blogosphere, and you&#8217;ll read strong opinions resisting or rejecting these new tools. 
This led us to wonder: with the benefit of a large customer base across the industry, are we seeing a spectrum of adoption of AI coding tools &#8212; or is there a stark divide between engineers who are adopting these new tools enthusiastically, and the resisters who aren&#8217;t? Are we in a world of AI &#8220;Luddites versus lovers&#8221;?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!w9pA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd178a43f-ad72-4d72-b0dc-10a5ba4fa9e0_1212x1261.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!w9pA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd178a43f-ad72-4d72-b0dc-10a5ba4fa9e0_1212x1261.png 424w, https://substackcdn.com/image/fetch/$s_!w9pA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd178a43f-ad72-4d72-b0dc-10a5ba4fa9e0_1212x1261.png 848w, https://substackcdn.com/image/fetch/$s_!w9pA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd178a43f-ad72-4d72-b0dc-10a5ba4fa9e0_1212x1261.png 1272w, https://substackcdn.com/image/fetch/$s_!w9pA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd178a43f-ad72-4d72-b0dc-10a5ba4fa9e0_1212x1261.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!w9pA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd178a43f-ad72-4d72-b0dc-10a5ba4fa9e0_1212x1261.png" width="302" height="314.2095709570957" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d178a43f-ad72-4d72-b0dc-10a5ba4fa9e0_1212x1261.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;normal&quot;,&quot;height&quot;:1261,&quot;width&quot;:1212,&quot;resizeWidth&quot;:302,&quot;bytes&quot;:4200585,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!w9pA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd178a43f-ad72-4d72-b0dc-10a5ba4fa9e0_1212x1261.png 424w, https://substackcdn.com/image/fetch/$s_!w9pA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd178a43f-ad72-4d72-b0dc-10a5ba4fa9e0_1212x1261.png 848w, https://substackcdn.com/image/fetch/$s_!w9pA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd178a43f-ad72-4d72-b0dc-10a5ba4fa9e0_1212x1261.png 1272w, https://substackcdn.com/image/fetch/$s_!w9pA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd178a43f-ad72-4d72-b0dc-10a5ba4fa9e0_1212x1261.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">The Leader of the Luddites (Wikipedia)</figcaption></figure></div><p>We analyzed more than a hundred companies and thousands of users and discovered some interesting patterns. We defined adoption in terms of <strong>usage percentage</strong>&#8212;the proportion of workdays on which an AI coding tool was used&#8212;and categorized users into three buckets: Low (0&#8211;20%), Moderate (21&#8211;80%), and High (81&#8211;100%) usage.</p><p>Our analysis revealed that we don&#8217;t live in a world of Luddites versus lovers: rather, there is a middle ground of adoption. Among our findings:</p><ol><li><p><strong>Partial Polarization:</strong></p><ol><li><p>Usage distribution shows a right-skewed pattern, with 31.7% of users in the low usage bucket (0-20%) and only a small secondary peak toward the high end.</p></li><li><p>Our measure of polarization - the Polarization Index - remains stable over time, with only a slight decrease over the past few months.</p></li></ol></li></ol><ol start="2"><li><p><strong>Tool-Specific Differences:</strong> GitHub Copilot shows the highest Polarization Index (1.28), while Amazon Q shows the lowest (0.48), suggesting that adoption patterns vary considerably by tool.</p></li><li><p><strong>Middle Segment Holding:</strong> The moderate usage group (21-80%) has remained stable at ~40% of users over the past few months, showing no evidence of collapse.</p></li></ol><p>Rather than polarization, AI usage follows a <strong>graduated adoption curve</strong> where most users fall into low-to-moderate usage, with a long tail of power users. 
This suggests AI tools are widely <strong>tried,</strong> but deeply adopted by only a subset of users.</p><h2>Usage Distribution: Skewed Toward Low Usage, But With a Middle Ground</h2><p><em>Distribution Histogram</em></p><p>The distribution is <strong>weakly bimodal</strong> &#8212; we see clustering at the low end (a clear peak at 1-10% usage), but no corresponding peak at high usage, just a minor mode (at 61-70% usage) within a gradual tail. The median user uses AI tools <strong>28.4%</strong> of available workdays, suggesting moderate adoption is typical.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Oxii!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55c0d19d-4148-4ed2-bdda-8c22b29dc4aa_1600x719.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Oxii!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55c0d19d-4148-4ed2-bdda-8c22b29dc4aa_1600x719.png 424w, https://substackcdn.com/image/fetch/$s_!Oxii!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55c0d19d-4148-4ed2-bdda-8c22b29dc4aa_1600x719.png 848w, https://substackcdn.com/image/fetch/$s_!Oxii!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55c0d19d-4148-4ed2-bdda-8c22b29dc4aa_1600x719.png 1272w, https://substackcdn.com/image/fetch/$s_!Oxii!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55c0d19d-4148-4ed2-bdda-8c22b29dc4aa_1600x719.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Oxii!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55c0d19d-4148-4ed2-bdda-8c22b29dc4aa_1600x719.png" width="1600" height="719" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/55c0d19d-4148-4ed2-bdda-8c22b29dc4aa_1600x719.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:719,&quot;width&quot;:1600,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:151076,&quot;alt&quot;:&quot;Distribution Histogram&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Distribution Histogram" title="Distribution Histogram" srcset="https://substackcdn.com/image/fetch/$s_!Oxii!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55c0d19d-4148-4ed2-bdda-8c22b29dc4aa_1600x719.png 424w, https://substackcdn.com/image/fetch/$s_!Oxii!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55c0d19d-4148-4ed2-bdda-8c22b29dc4aa_1600x719.png 848w, https://substackcdn.com/image/fetch/$s_!Oxii!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55c0d19d-4148-4ed2-bdda-8c22b29dc4aa_1600x719.png 1272w, https://substackcdn.com/image/fetch/$s_!Oxii!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55c0d19d-4148-4ed2-bdda-8c22b29dc4aa_1600x719.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft 
pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Polarization Index Remains Stable, Slight Decrease Over Time</h2><p><em>Polarization Trend</em></p><p>We define a <strong>Polarization Index</strong> as: (Low Usage + High Usage) / Moderate Usage. 
Over the past 6 months, the index has declined slightly: the small negative slope suggests that more users are moving into the moderate usage range, which may indicate growing openness to adopting AI.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!k1XH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f1e30ac-6ac8-43d9-b767-e8c63da4b3b1_1600x693.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!k1XH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f1e30ac-6ac8-43d9-b767-e8c63da4b3b1_1600x693.png 424w, https://substackcdn.com/image/fetch/$s_!k1XH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f1e30ac-6ac8-43d9-b767-e8c63da4b3b1_1600x693.png 848w, https://substackcdn.com/image/fetch/$s_!k1XH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f1e30ac-6ac8-43d9-b767-e8c63da4b3b1_1600x693.png 1272w, https://substackcdn.com/image/fetch/$s_!k1XH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f1e30ac-6ac8-43d9-b767-e8c63da4b3b1_1600x693.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!k1XH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f1e30ac-6ac8-43d9-b767-e8c63da4b3b1_1600x693.png" width="1456" height="631" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2f1e30ac-6ac8-43d9-b767-e8c63da4b3b1_1600x693.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:631,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Polarization Trend&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Polarization Trend" title="Polarization Trend" srcset="https://substackcdn.com/image/fetch/$s_!k1XH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f1e30ac-6ac8-43d9-b767-e8c63da4b3b1_1600x693.png 424w, https://substackcdn.com/image/fetch/$s_!k1XH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f1e30ac-6ac8-43d9-b767-e8c63da4b3b1_1600x693.png 848w, https://substackcdn.com/image/fetch/$s_!k1XH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f1e30ac-6ac8-43d9-b767-e8c63da4b3b1_1600x693.png 1272w, https://substackcdn.com/image/fetch/$s_!k1XH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f1e30ac-6ac8-43d9-b767-e8c63da4b3b1_1600x693.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" 
stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Significant Tool-Specific Polarization Differences</h2><p><em>Tool Comparison</em></p><p>A different story emerges when we analyze polarization at the tool level. GitHub Copilot contrasts significantly with Amazon Q, suggesting that each tool drives different usage patterns. 
Breaking down each tool by usage bucket reveals distinct adoption profiles.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0nLI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2633078b-935f-469c-ae3c-eb4d7a2069b3_1600x954.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0nLI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2633078b-935f-469c-ae3c-eb4d7a2069b3_1600x954.png 424w, https://substackcdn.com/image/fetch/$s_!0nLI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2633078b-935f-469c-ae3c-eb4d7a2069b3_1600x954.png 848w, https://substackcdn.com/image/fetch/$s_!0nLI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2633078b-935f-469c-ae3c-eb4d7a2069b3_1600x954.png 1272w, https://substackcdn.com/image/fetch/$s_!0nLI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2633078b-935f-469c-ae3c-eb4d7a2069b3_1600x954.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0nLI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2633078b-935f-469c-ae3c-eb4d7a2069b3_1600x954.png" width="1456" height="868" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2633078b-935f-469c-ae3c-eb4d7a2069b3_1600x954.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:868,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Tool Comparison&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Tool Comparison" title="Tool Comparison" srcset="https://substackcdn.com/image/fetch/$s_!0nLI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2633078b-935f-469c-ae3c-eb4d7a2069b3_1600x954.png 424w, https://substackcdn.com/image/fetch/$s_!0nLI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2633078b-935f-469c-ae3c-eb4d7a2069b3_1600x954.png 848w, https://substackcdn.com/image/fetch/$s_!0nLI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2633078b-935f-469c-ae3c-eb4d7a2069b3_1600x954.png 1272w, https://substackcdn.com/image/fetch/$s_!0nLI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2633078b-935f-469c-ae3c-eb4d7a2069b3_1600x954.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" 
stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em>Tool Bucket Comparison</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Tc2y!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7fcde3c8-7931-421d-a1d7-b7e8ba9db4df_1600x794.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Tc2y!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7fcde3c8-7931-421d-a1d7-b7e8ba9db4df_1600x794.png 424w, https://substackcdn.com/image/fetch/$s_!Tc2y!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7fcde3c8-7931-421d-a1d7-b7e8ba9db4df_1600x794.png 848w, 
https://substackcdn.com/image/fetch/$s_!Tc2y!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7fcde3c8-7931-421d-a1d7-b7e8ba9db4df_1600x794.png 1272w, https://substackcdn.com/image/fetch/$s_!Tc2y!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7fcde3c8-7931-421d-a1d7-b7e8ba9db4df_1600x794.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Tc2y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7fcde3c8-7931-421d-a1d7-b7e8ba9db4df_1600x794.png" width="1456" height="723" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7fcde3c8-7931-421d-a1d7-b7e8ba9db4df_1600x794.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:723,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Tool Bucket Comparison&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Tool Bucket Comparison" title="Tool Bucket Comparison" srcset="https://substackcdn.com/image/fetch/$s_!Tc2y!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7fcde3c8-7931-421d-a1d7-b7e8ba9db4df_1600x794.png 424w, https://substackcdn.com/image/fetch/$s_!Tc2y!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7fcde3c8-7931-421d-a1d7-b7e8ba9db4df_1600x794.png 848w, 
https://substackcdn.com/image/fetch/$s_!Tc2y!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7fcde3c8-7931-421d-a1d7-b7e8ba9db4df_1600x794.png 1272w, https://substackcdn.com/image/fetch/$s_!Tc2y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7fcde3c8-7931-421d-a1d7-b7e8ba9db4df_1600x794.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>The varying usage levels suggest that each tool&#8217;s use case and characteristics independently influence how heavily it is adopted.</p><ul><li><p><strong>GitHub Copilot</strong> (the most mature tool) shows the highest polarization, with nearly half of users in the low usage range.</p></li><li><p><strong>Cursor</strong> shows a more balanced distribution, with the majority of users in the moderate usage range.</p></li><li><p><strong>Claude Code</strong> has the highest proportion of power users, at 28%.</p></li><li><p><strong>Amazon Q</strong> shows the most uniform distribution across usage ranges.</p></li></ul><p>One likely explanation: mature tools such as GitHub Copilot have had more time to accumulate inactive or low-usage licenses, while newer tools such as Claude Code attract more committed early adopters, creating tool-specific polarization patterns.</p><h2>Distribution Evolution Shows Stability, Not Polarization</h2><p><em>Monthly Snapshots</em></p><p>The <strong>moderate and high usage segments are growing faster</strong> than the low usage segment, which is inconsistent with polarization toward both extremes. This suggests that new users are entering at moderate usage levels, some users are graduating from occasional to frequent usage, and the user base is expanding rather than bifurcating.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MqMZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6e0aa2a-134d-4957-a0a3-258658368b3a_1600x818.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MqMZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6e0aa2a-134d-4957-a0a3-258658368b3a_1600x818.png 424w, 
https://substackcdn.com/image/fetch/$s_!MqMZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6e0aa2a-134d-4957-a0a3-258658368b3a_1600x818.png 848w, https://substackcdn.com/image/fetch/$s_!MqMZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6e0aa2a-134d-4957-a0a3-258658368b3a_1600x818.png 1272w, https://substackcdn.com/image/fetch/$s_!MqMZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6e0aa2a-134d-4957-a0a3-258658368b3a_1600x818.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!MqMZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6e0aa2a-134d-4957-a0a3-258658368b3a_1600x818.png" width="1456" height="744" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e6e0aa2a-134d-4957-a0a3-258658368b3a_1600x818.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:744,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Monthly Snapshots&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Monthly Snapshots" title="Monthly Snapshots" srcset="https://substackcdn.com/image/fetch/$s_!MqMZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6e0aa2a-134d-4957-a0a3-258658368b3a_1600x818.png 424w, 
https://substackcdn.com/image/fetch/$s_!MqMZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6e0aa2a-134d-4957-a0a3-258658368b3a_1600x818.png 848w, https://substackcdn.com/image/fetch/$s_!MqMZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6e0aa2a-134d-4957-a0a3-258658368b3a_1600x818.png 1272w, https://substackcdn.com/image/fetch/$s_!MqMZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6e0aa2a-134d-4957-a0a3-258658368b3a_1600x818.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>Conclusion</h2><p>What we&#8217;re seeing is a lopsided adoption curve: most engineers are dipping their toes in (with low to moderate use), while a smaller group is fully committed. The encouraging news is that this divide is narrowing, with more users moving into the middle ground over time.</p><p>Tool choice matters. People may stick with what just plain works, but newer tools tend to attract committed early adopters who use them more frequently. This has practical implications for engineering leaders:</p><ul><li><p><strong>If you&#8217;re piloting newer tools</strong>, know that early adopters tend to go all-in. This is an opportunity to channel their enthusiasm into building internal champions.</p></li><li><p><strong>The middle-usage segment is your conversion opportunity.</strong> Focus on understanding what would move occasional users to regular usage. The path to AI-powered engineering isn&#8217;t about forcing adoption; it&#8217;s about creating the conditions in which moderate users naturally become power users.</p></li></ul><h2>Methodology</h2><ul><li><p><strong>Usage percentage:</strong> Unique days a user accessed their AI tool &#247; available workdays (Monday&#8211;Friday, from seat activation to analysis end date)</p></li><li><p><strong>Usage buckets:</strong> Low (0&#8211;20%), Moderate (21&#8211;80%), High (81&#8211;100%)</p></li><li><p><strong>Polarization Index:</strong> (Low Users + High Users) &#247; Moderate Users; values &gt;1.5 suggest strong polarization toward the extremes</p></li></ul>]]></content:encoded></item><item><title><![CDATA[AI Coding Tools Not Paying Off? Your Code Architecture Might Be to Blame.]]></title><description><![CDATA[Centralized architectures can lead to 4x increases in productivity, while distributed architectures can lead to little-to-no gains... or even declines.]]></description><link>https://jellyfishresearch.substack.com/p/ai-coding-tools-not-paying-off-your</link><guid isPermaLink="false">https://jellyfishresearch.substack.com/p/ai-coding-tools-not-paying-off-your</guid><dc:creator><![CDATA[Nicholas Arcolano]]></dc:creator><pubDate>Thu, 16 Oct 2025 15:15:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Cjk2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a6f3cda-7d52-457e-8ae6-8ce49699ee99_2201x1783.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote><p><em>Cross-posted at <a href="https://jellyfish.co/blog/ai-coding-tools-not-paying-off-your-code-architecture-might-be-to-blame/">jellyfish.co</a></em></p></blockquote><p>Here at Jellyfish <strong><a href="https://jellyfish.co/blog/jellyfish-openai-team-up-to-measure-impact-of-ai-coding-tools/">we recently teamed up with OpenAI</a></strong> to study how AI coding tools are affecting developer productivity. We found that adoption of AI tools correlated with significant increases in productivity, particularly PR throughput. In fact, achieving full adoption of AI coding tools (i.e. 
every engineer using AI every time they code) <strong>corresponded to a 2.1x increase in PRs merged.</strong></p><p>However, we all know that software engineering teams are complex entities, each with its own constraints, strengths, and challenges. Perhaps you&#8217;re seeing exciting gains&#8230; or perhaps you&#8217;re finding the impact of your AI investments underwhelming. In either case, there are many factors beyond adoption that affect PR throughput. Let&#8217;s talk about the most important: your <strong>code architecture</strong>.</p><h2>What Do We Mean By &#8220;Code Architecture&#8221;?</h2><p>By <strong>code architecture</strong>, we mean the strategy by which your products and services are organized and coordinated across your source code repositories. Think monorepo vs. polyrepo, monolith vs. microservices, or a centralized vs. federated product/platform strategy.</p><p>One simple metric for understanding your code architecture is <strong>weekly active repositories:</strong> how many distinct repositories did you merge code into this week? However, this metric grows roughly linearly with the number of engineers: larger companies tend to have more repositories, representing more products and services as they scale. To correct for that, here at Jellyfish we&#8217;ve developed a new metric: <strong>weekly active repositories per engineer</strong>. This metric is more useful in practice because it characterizes your organization&#8217;s repo strategy independent of scale. 
A low number of active repos per engineer indicates a more consolidated architecture (monorepos, monolithic services, and/or centralized products) while a high number indicates a more distributed architecture (polyrepos, microservices, and/or federated products).</p><p><strong>To understand different code architectures and how they affect AI productivity gains, we investigated data from 321 Jellyfish customers from January to August 2025, representing more than 3.8 million pull requests across 130,000 code repositories.</strong></p><p>The chart below shows the distribution of weekly active repos per engineer over the data set (9,602 total &#8220;company-weeks&#8221;, i.e. snapshots of each company for the various weeks over the observation period).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!s_Ih!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F379a7978-a8cc-49af-aa4c-7d3515d3eedb_2201x1553.webp" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!s_Ih!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F379a7978-a8cc-49af-aa4c-7d3515d3eedb_2201x1553.webp 424w, https://substackcdn.com/image/fetch/$s_!s_Ih!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F379a7978-a8cc-49af-aa4c-7d3515d3eedb_2201x1553.webp 848w, https://substackcdn.com/image/fetch/$s_!s_Ih!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F379a7978-a8cc-49af-aa4c-7d3515d3eedb_2201x1553.webp 1272w, 
https://substackcdn.com/image/fetch/$s_!s_Ih!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F379a7978-a8cc-49af-aa4c-7d3515d3eedb_2201x1553.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!s_Ih!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F379a7978-a8cc-49af-aa4c-7d3515d3eedb_2201x1553.webp" width="1456" height="1027" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/379a7978-a8cc-49af-aa4c-7d3515d3eedb_2201x1553.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1027,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Repos per engineer&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Repos per engineer" title="Repos per engineer" srcset="https://substackcdn.com/image/fetch/$s_!s_Ih!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F379a7978-a8cc-49af-aa4c-7d3515d3eedb_2201x1553.webp 424w, https://substackcdn.com/image/fetch/$s_!s_Ih!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F379a7978-a8cc-49af-aa4c-7d3515d3eedb_2201x1553.webp 848w, https://substackcdn.com/image/fetch/$s_!s_Ih!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F379a7978-a8cc-49af-aa4c-7d3515d3eedb_2201x1553.webp 1272w, 
https://substackcdn.com/image/fetch/$s_!s_Ih!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F379a7978-a8cc-49af-aa4c-7d3515d3eedb_2201x1553.webp 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>This chart illustrates the wide range of distinct code architectures that companies adopt. We&#8217;ve broken the distribution down into quartiles, labeling each with the code architecture that most characterizes it. 
The table below summarizes the different regimes and the types of architecture strategies that give rise to each.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lOw5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5a91750-8784-45e8-b5c7-b58122b230f1_1702x634.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lOw5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5a91750-8784-45e8-b5c7-b58122b230f1_1702x634.png 424w, https://substackcdn.com/image/fetch/$s_!lOw5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5a91750-8784-45e8-b5c7-b58122b230f1_1702x634.png 848w, https://substackcdn.com/image/fetch/$s_!lOw5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5a91750-8784-45e8-b5c7-b58122b230f1_1702x634.png 1272w, https://substackcdn.com/image/fetch/$s_!lOw5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5a91750-8784-45e8-b5c7-b58122b230f1_1702x634.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lOw5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5a91750-8784-45e8-b5c7-b58122b230f1_1702x634.png" width="1456" height="542" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b5a91750-8784-45e8-b5c7-b58122b230f1_1702x634.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:542,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:116349,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://jellyfishresearch.substack.com/i/179153659?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5a91750-8784-45e8-b5c7-b58122b230f1_1702x634.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lOw5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5a91750-8784-45e8-b5c7-b58122b230f1_1702x634.png 424w, https://substackcdn.com/image/fetch/$s_!lOw5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5a91750-8784-45e8-b5c7-b58122b230f1_1702x634.png 848w, https://substackcdn.com/image/fetch/$s_!lOw5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5a91750-8784-45e8-b5c7-b58122b230f1_1702x634.png 1272w, https://substackcdn.com/image/fetch/$s_!lOw5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5a91750-8784-45e8-b5c7-b58122b230f1_1702x634.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><h2>How Code Architecture Affects AI Impact</h2><p>As in previous analyses, we specify a company-wide <strong>AI adoption rate</strong> that measures how regularly engineers in an organization are using AI coding tools. It is defined as the fraction of active coding days an engineer used AI, averaged across all engineers on the team.</p><p>When we segment the data by code architecture, we find that coding behaviors and the effects of AI adoption vary widely across the cohorts. The chart below contains four scatter plots, where each data point is a snapshot of a single company and week, indicating the PRs merged per engineer and AI adoption rate for that company-week. Each scatter plot corresponds to a different code architecture (quartile) &#8211; centralized, balanced, distributed, and highly distributed. 
We also fit trend lines to measure the relationship between changes in AI adoption rate and changes in PR throughput.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Cjk2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a6f3cda-7d52-457e-8ae6-8ce49699ee99_2201x1783.webp" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Cjk2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a6f3cda-7d52-457e-8ae6-8ce49699ee99_2201x1783.webp 424w, https://substackcdn.com/image/fetch/$s_!Cjk2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a6f3cda-7d52-457e-8ae6-8ce49699ee99_2201x1783.webp 848w, https://substackcdn.com/image/fetch/$s_!Cjk2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a6f3cda-7d52-457e-8ae6-8ce49699ee99_2201x1783.webp 1272w, https://substackcdn.com/image/fetch/$s_!Cjk2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a6f3cda-7d52-457e-8ae6-8ce49699ee99_2201x1783.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Cjk2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a6f3cda-7d52-457e-8ae6-8ce49699ee99_2201x1783.webp" width="1456" height="1179" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8a6f3cda-7d52-457e-8ae6-8ce49699ee99_2201x1783.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1179,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Quartile analysis repos&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Quartile analysis repos" title="Quartile analysis repos" srcset="https://substackcdn.com/image/fetch/$s_!Cjk2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a6f3cda-7d52-457e-8ae6-8ce49699ee99_2201x1783.webp 424w, https://substackcdn.com/image/fetch/$s_!Cjk2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a6f3cda-7d52-457e-8ae6-8ce49699ee99_2201x1783.webp 848w, https://substackcdn.com/image/fetch/$s_!Cjk2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a6f3cda-7d52-457e-8ae6-8ce49699ee99_2201x1783.webp 1272w, https://substackcdn.com/image/fetch/$s_!Cjk2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a6f3cda-7d52-457e-8ae6-8ce49699ee99_2201x1783.webp 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p>The first thing to notice is that the average volume of PRs varies across the code architectures, increasing as code becomes less centralized. <strong>Independent of AI coding practices, teams with highly distributed architectures merge approximately 2x as many PRs per engineer as those with centralized architectures</strong> (5.3 vs. 2.7 PRs per engineer on average).</p><p>Note that this difference does <em>not</em> suggest a disparity in overall productivity. Rather, the higher PR volume associated with more distributed architectures is primarily due to the higher level of cross-repo coordination required. 
In a distributed architecture, shipping a single feature typically requires changes across many services, and a single upgrade or code migration may have to be duplicated across multiple repositories, resulting in a higher number of pull requests on average.</p><p>Most interestingly, the correlation between AI adoption and PR throughput also varies across code architectures. <strong>Teams with centralized and balanced architectures can expect a ~4x change in PRs per engineer</strong> (approximately double what we observed in our previous study) when going from 0% to 100% adoption of AI coding tools. <strong>Teams with distributed architectures can expect a ~2x change</strong> &#8211; about half of what we observe for centralized and balanced architectures.</p><p>For the top quartile, the relationship between AI adoption and throughput is the noisiest, and the trend line has a slight <em>negative</em> slope, suggesting that <strong>adoption of AI tools may actually be slowing down teams with highly distributed code architectures. 
</strong>We summarize all of the results in the table below.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_M-z!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc57ecb6-e4c0-40d2-ba72-59be21073f4f_2338x846.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_M-z!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc57ecb6-e4c0-40d2-ba72-59be21073f4f_2338x846.png 424w, https://substackcdn.com/image/fetch/$s_!_M-z!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc57ecb6-e4c0-40d2-ba72-59be21073f4f_2338x846.png 848w, https://substackcdn.com/image/fetch/$s_!_M-z!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc57ecb6-e4c0-40d2-ba72-59be21073f4f_2338x846.png 1272w, https://substackcdn.com/image/fetch/$s_!_M-z!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc57ecb6-e4c0-40d2-ba72-59be21073f4f_2338x846.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_M-z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc57ecb6-e4c0-40d2-ba72-59be21073f4f_2338x846.png" width="1456" height="527" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bc57ecb6-e4c0-40d2-ba72-59be21073f4f_2338x846.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:527,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Engineering repos_2&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Engineering repos_2" title="Engineering repos_2" srcset="https://substackcdn.com/image/fetch/$s_!_M-z!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc57ecb6-e4c0-40d2-ba72-59be21073f4f_2338x846.png 424w, https://substackcdn.com/image/fetch/$s_!_M-z!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc57ecb6-e4c0-40d2-ba72-59be21073f4f_2338x846.png 848w, https://substackcdn.com/image/fetch/$s_!_M-z!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc57ecb6-e4c0-40d2-ba72-59be21073f4f_2338x846.png 1272w, https://substackcdn.com/image/fetch/$s_!_M-z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc57ecb6-e4c0-40d2-ba72-59be21073f4f_2338x846.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p>The obvious question is, what is causing these disparities? One explanation is the importance of <strong>context </strong>to the effectiveness of AI coding tools. In the past year we have seen huge advances in the abilities of LLMs to perform indexing and search as well as tool use, and in many ways today&#8217;s AI tools are limited only by the quality of the context provided to them. <strong>This may be why code architecture matters: the more centralized the code, the easier it is for AI coding tools to find the context they need to complete work accurately, effectively, and with all relevant changes included.</strong> By contrast, highly distributed systems likely rely on human experts who understand the architecture and can manage the complex cross-repo coordination required to push code effectively.</p><p>Does this mean that organizations with highly distributed architectures are doomed? 
Not at all: context engineering and AI agents are evolving fast, and these organizations may have a lot to gain through better automation of coding work that spans repositories, services, and products, especially if it is highly redundant and can be easily validated. Coding, testing, and <strong><a href="https://jellyfish.co/blog/impact-of-ai-code-review-agents/">review agents</a></strong> that can coordinate work across repositories could yield huge gains for these kinds of architectures.</p><p>For now though, the data is clear: your code architecture is a critical (and perhaps unexpected) factor in the return on your AI investments. As you scale your use of these powerful new tools, take a hard look at your code architecture &#8211; <strong>a more consolidated codebase may be the key to unlocking the promise of bigger AI productivity gains.</strong></p>]]></content:encoded></item><item><title><![CDATA[Better Code, or Just Bigger?]]></title><description><![CDATA[AI-assisted pull requests are 18% larger.]]></description><link>https://jellyfishresearch.substack.com/p/better-code-or-just-bigger</link><guid isPermaLink="false">https://jellyfishresearch.substack.com/p/better-code-or-just-bigger</guid><dc:creator><![CDATA[Nicholas Arcolano]]></dc:creator><pubDate>Tue, 02 Sep 2025 15:24:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!DCla!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac3e4704-1e7e-4b9a-90ae-52b6b347e75c_1999x1330.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote><p><em>Cross-posted at <a href="https://jellyfish.co/blog/ai-assisted-pull-requests-are-18-larger/">jellyfish.co</a></em></p></blockquote><p>Last month we shared some exciting work we did in collaboration with folks at OpenAI <strong><a href="https://jellyfish.co/blog/jellyfish-openai-team-up-to-measure-impact-of-ai-coding-tools/">investigating the impact of AI coding 
tools</a></strong> across millions of pull requests for more than 500 companies.</p><p>Among other things, we found that full adoption of AI coding tools correlated with a 113% increase in PR throughput. (Check out OpenAI&#8217;s take on the analysis <strong><a href="https://www.linkedin.com/posts/aaron-ronnie-chatterji_theres-a-lot-of-buzz-and-real-debate-about-activity-7362158640918630404-9i9g?utm_source=share&amp;utm_medium=member_desktop&amp;rcm=ACoAAArFJU8BN29tPoEA6-a1i6Ho1N45qxndAxg">here</a></strong> and our full post <strong><a href="https://jellyfish.co/blog/jellyfish-openai-team-up-to-measure-impact-of-ai-coding-tools/">here</a></strong>.)</p><h2>AI-assisted PRs are getting bigger</h2><p>As we continue to dig deeper into the data, one interesting discovery is that, in addition to increasing PR throughput and improving cycle times, teams with high levels of AI adoption are also pushing <em>larger</em> PRs.</p><p>As before, we defined a company-wide &#8220;AI Adoption Rate&#8221;, which reflects how regularly engineers on a team are using AI tools (the fraction of active coding days on which an engineer used AI, averaged across all engineers on the team). For the plot below, each data point is a snapshot of the average additions per PR and the level of AI adoption for a given company and week.</p><p>Going from 0% to 100% AI adoption corresponds to an increase from 74.8 additions per PR to 88.4 additions per PR &#8211; an 18.2% increase. 
In other words, as companies increase AI adoption, their product features and platform enhancements tend to involve more lines of code than before.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!DCla!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac3e4704-1e7e-4b9a-90ae-52b6b347e75c_1999x1330.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DCla!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac3e4704-1e7e-4b9a-90ae-52b6b347e75c_1999x1330.png 424w, https://substackcdn.com/image/fetch/$s_!DCla!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac3e4704-1e7e-4b9a-90ae-52b6b347e75c_1999x1330.png 848w, https://substackcdn.com/image/fetch/$s_!DCla!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac3e4704-1e7e-4b9a-90ae-52b6b347e75c_1999x1330.png 1272w, https://substackcdn.com/image/fetch/$s_!DCla!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac3e4704-1e7e-4b9a-90ae-52b6b347e75c_1999x1330.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!DCla!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac3e4704-1e7e-4b9a-90ae-52b6b347e75c_1999x1330.png" width="1456" height="969" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ac3e4704-1e7e-4b9a-90ae-52b6b347e75c_1999x1330.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:969,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;companies with higher AI use push larger PRs&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="companies with higher AI use push larger PRs" title="companies with higher AI use push larger PRs" srcset="https://substackcdn.com/image/fetch/$s_!DCla!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac3e4704-1e7e-4b9a-90ae-52b6b347e75c_1999x1330.png 424w, https://substackcdn.com/image/fetch/$s_!DCla!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac3e4704-1e7e-4b9a-90ae-52b6b347e75c_1999x1330.png 848w, https://substackcdn.com/image/fetch/$s_!DCla!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac3e4704-1e7e-4b9a-90ae-52b6b347e75c_1999x1330.png 1272w, https://substackcdn.com/image/fetch/$s_!DCla!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac3e4704-1e7e-4b9a-90ae-52b6b347e75c_1999x1330.png 1456w" sizes="100vw" fetchpriority="high"></picture>
</div></a></figure></div><h2>Why are PRs getting bigger?</h2><p>Many factors could be driving this change. 
To look a little deeper, let&#8217;s compare some metrics for the top quartile of companies in the plot above (75&#8211;100% AI adoption) versus the bottom quartile (0&#8211;25% AI adoption).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nLAf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd999e5d9-b0be-42a9-a5c3-48bf8386bc5d_1262x1208.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nLAf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd999e5d9-b0be-42a9-a5c3-48bf8386bc5d_1262x1208.png 424w, https://substackcdn.com/image/fetch/$s_!nLAf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd999e5d9-b0be-42a9-a5c3-48bf8386bc5d_1262x1208.png 848w, https://substackcdn.com/image/fetch/$s_!nLAf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd999e5d9-b0be-42a9-a5c3-48bf8386bc5d_1262x1208.png 1272w, https://substackcdn.com/image/fetch/$s_!nLAf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd999e5d9-b0be-42a9-a5c3-48bf8386bc5d_1262x1208.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nLAf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd999e5d9-b0be-42a9-a5c3-48bf8386bc5d_1262x1208.png" width="1262" height="1208" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d999e5d9-b0be-42a9-a5c3-48bf8386bc5d_1262x1208.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1208,&quot;width&quot;:1262,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:212488,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://jellyfishresearch.substack.com/i/179155005?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd999e5d9-b0be-42a9-a5c3-48bf8386bc5d_1262x1208.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nLAf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd999e5d9-b0be-42a9-a5c3-48bf8386bc5d_1262x1208.png 424w, https://substackcdn.com/image/fetch/$s_!nLAf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd999e5d9-b0be-42a9-a5c3-48bf8386bc5d_1262x1208.png 848w, https://substackcdn.com/image/fetch/$s_!nLAf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd999e5d9-b0be-42a9-a5c3-48bf8386bc5d_1262x1208.png 1272w, https://substackcdn.com/image/fetch/$s_!nLAf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd999e5d9-b0be-42a9-a5c3-48bf8386bc5d_1262x1208.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p>Looking at these metrics, we can see that most of the change appears to be from net new code, not rewritten code. Companies in the top quartile are only pushing 4.1% more deletions on average, versus 14.5% more net additions. Interestingly, we don&#8217;t see a big increase in the number of files per PR, suggesting AI is generating more code within the same files, not necessarily finding more changes to make across files.</p><p>If we take a look across file types, we see that companies in the top quartile are using certain languages and file formats at a much higher rate (up to 76% more), particularly TypeScript, Python, YAML, and Markdown. 
This trend aligns with the fact that LLMs generally perform better when working with these highly structured, well-documented programming languages and supporting configuration and documentation formats. The AI-assisted code that high-adoption companies are writing in these formats tends to be more explicit, with more comments and more conditional code (e.g. exception handling) than would be written without AI assistance.</p><h2>Is bigger code better?</h2><p>Here at Jellyfish we&#8217;ve found that an increase in PR throughput is typically a good thing, corresponding to more value delivered to customers. But what about <em>bigger</em> code? That depends. It could mean code that is more robust and well-commented, but it could also mean code that is overly complex and harder to debug, iterate on, and maintain. As AI technology continues to evolve, we&#8217;ll keep looking at how AI coding assistants and agents are impacting code generation &#8211; both quantity <em>and</em> quality.</p>]]></content:encoded></item><item><title><![CDATA[Jellyfish and OpenAI Team Up to Measure the Impact of AI Coding Tools]]></title><description><![CDATA[Study of over 500 companies reveals more than 2x increase in PR throughput.]]></description><link>https://jellyfishresearch.substack.com/p/jellyfish-and-openai-team-up-to-measure</link><guid isPermaLink="false">https://jellyfishresearch.substack.com/p/jellyfish-and-openai-team-up-to-measure</guid><dc:creator><![CDATA[Nicholas Arcolano]]></dc:creator><pubDate>Wed, 13 Aug 2025 15:30:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Nh3U!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67b6f7c4-9eef-4e1c-bad2-87b37c4508cb_1999x1330.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote><p><em>Cross-posted at <a 
href="https://jellyfish.co/blog/jellyfish-openai-team-up-to-measure-impact-of-ai-coding-tools/">jellyfish.co</a></em></p></blockquote><p>When it comes to determining the <strong><a href="https://jellyfish.co/platform/jellyfish-ai-impact/">impact of AI coding tools</a></strong>, reliable data has been scarce, and published results can tell conflicting stories. For instance, a recent study from <strong><a href="https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/">METR</a></strong> found that under some conditions AI tools can actually slow developers down, while an earlier study from <strong><a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4945566">Microsoft</a></strong> showed significant gains.</p><p>At Jellyfish, we&#8217;ve been collecting a robust set of AI impact data across more than 500 companies, covering the coding activity of tens of thousands of engineers, millions of pull requests, and billions of lines of code. Real-world data at this scale can help us understand what is actually happening in the field as companies adopt AI.</p><p>We recently teamed up with the folks at OpenAI to take a closer look at how AI tools are affecting coding productivity and quality. Here&#8217;s what we found:</p><h2>Increasing AI adoption means shipping more code, and faster</h2><p>We defined a company-wide &#8220;AI Adoption Rate&#8221; that reflects how regularly engineers on a team use AI tools: the fraction of active coding days on which an engineer used AI, averaged across all engineers on the team. 
In the following plots, each data point is a snapshot of a given performance metric and a company&#8217;s level of AI adoption for a single week, allowing us to see trends as companies achieve higher levels of AI adoption.</p><p>What we see is that in terms of PR throughput, going from 0% AI adoption to 100% adoption corresponds to an increase from 1.36 PRs per engineer on average to 2.90 PRs &#8211; a 113% increase.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Nh3U!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67b6f7c4-9eef-4e1c-bad2-87b37c4508cb_1999x1330.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Nh3U!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67b6f7c4-9eef-4e1c-bad2-87b37c4508cb_1999x1330.png 424w, https://substackcdn.com/image/fetch/$s_!Nh3U!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67b6f7c4-9eef-4e1c-bad2-87b37c4508cb_1999x1330.png 848w, https://substackcdn.com/image/fetch/$s_!Nh3U!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67b6f7c4-9eef-4e1c-bad2-87b37c4508cb_1999x1330.png 1272w, https://substackcdn.com/image/fetch/$s_!Nh3U!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67b6f7c4-9eef-4e1c-bad2-87b37c4508cb_1999x1330.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Nh3U!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67b6f7c4-9eef-4e1c-bad2-87b37c4508cb_1999x1330.png" width="1456" height="969" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/67b6f7c4-9eef-4e1c-bad2-87b37c4508cb_1999x1330.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:969,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;PRs_Jellyfish_OpenAI&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="PRs_Jellyfish_OpenAI" title="PRs_Jellyfish_OpenAI" srcset="https://substackcdn.com/image/fetch/$s_!Nh3U!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67b6f7c4-9eef-4e1c-bad2-87b37c4508cb_1999x1330.png 424w, https://substackcdn.com/image/fetch/$s_!Nh3U!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67b6f7c4-9eef-4e1c-bad2-87b37c4508cb_1999x1330.png 848w, https://substackcdn.com/image/fetch/$s_!Nh3U!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67b6f7c4-9eef-4e1c-bad2-87b37c4508cb_1999x1330.png 1272w, https://substackcdn.com/image/fetch/$s_!Nh3U!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67b6f7c4-9eef-4e1c-bad2-87b37c4508cb_1999x1330.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Increasing AI adoption also correlates with pushing code to production faster. 
In going from 0% to 100% AI adoption, an average company can expect its median cycle time to drop by four hours, from 16.7 to 12.7 hours &#8211; a 24% reduction.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!DPV-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd848f49d-15e3-4ee2-b7f5-f0749cb088d3_1999x1330.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DPV-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd848f49d-15e3-4ee2-b7f5-f0749cb088d3_1999x1330.png 424w, https://substackcdn.com/image/fetch/$s_!DPV-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd848f49d-15e3-4ee2-b7f5-f0749cb088d3_1999x1330.png 848w, https://substackcdn.com/image/fetch/$s_!DPV-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd848f49d-15e3-4ee2-b7f5-f0749cb088d3_1999x1330.png 1272w, https://substackcdn.com/image/fetch/$s_!DPV-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd848f49d-15e3-4ee2-b7f5-f0749cb088d3_1999x1330.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!DPV-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd848f49d-15e3-4ee2-b7f5-f0749cb088d3_1999x1330.png" width="1456" height="969" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d848f49d-15e3-4ee2-b7f5-f0749cb088d3_1999x1330.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:969,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Faster Cycle Times_Jellyfish_OpenAI&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Faster Cycle Times_Jellyfish_OpenAI" title="Faster Cycle Times_Jellyfish_OpenAI" srcset="https://substackcdn.com/image/fetch/$s_!DPV-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd848f49d-15e3-4ee2-b7f5-f0749cb088d3_1999x1330.png 424w, https://substackcdn.com/image/fetch/$s_!DPV-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd848f49d-15e3-4ee2-b7f5-f0749cb088d3_1999x1330.png 848w, https://substackcdn.com/image/fetch/$s_!DPV-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd848f49d-15e3-4ee2-b7f5-f0749cb088d3_1999x1330.png 1272w, https://substackcdn.com/image/fetch/$s_!DPV-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd848f49d-15e3-4ee2-b7f5-f0749cb088d3_1999x1330.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>But what about code quality?</h2><p>One of the things we looked at was the fraction of total PRs that are linked to bug tickets. We see that the highest levels of AI adoption correspond to a 26.8% increase in that fraction (9.5% versus a baseline of 7.5%). In other words, companies with high AI adoption are pushing more code, <em>and</em> a higher fraction of those PRs are bug fixes.</p><p>Higher AI use may be <em>causing</em> more bugs, it may be helping teams <em>fix</em> more bugs, or it may be a combination of both. 
One interesting observation is that we aren&#8217;t seeing a comparable increase in revert PRs (pull requests whose primary purpose is to undo previous changes, typically due to a critical failure), suggesting that AI doesn&#8217;t seem to be causing major quality issues across the board.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qzsb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb20dad6c-9003-4e00-b390-696adda4f2d9_1999x1330.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qzsb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb20dad6c-9003-4e00-b390-696adda4f2d9_1999x1330.png 424w, https://substackcdn.com/image/fetch/$s_!qzsb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb20dad6c-9003-4e00-b390-696adda4f2d9_1999x1330.png 848w, https://substackcdn.com/image/fetch/$s_!qzsb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb20dad6c-9003-4e00-b390-696adda4f2d9_1999x1330.png 1272w, https://substackcdn.com/image/fetch/$s_!qzsb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb20dad6c-9003-4e00-b390-696adda4f2d9_1999x1330.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qzsb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb20dad6c-9003-4e00-b390-696adda4f2d9_1999x1330.png" width="1456" height="969" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b20dad6c-9003-4e00-b390-696adda4f2d9_1999x1330.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:969,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Bug Fixes_Jellyfish_OpenAI&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Bug Fixes_Jellyfish_OpenAI" title="Bug Fixes_Jellyfish_OpenAI" srcset="https://substackcdn.com/image/fetch/$s_!qzsb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb20dad6c-9003-4e00-b390-696adda4f2d9_1999x1330.png 424w, https://substackcdn.com/image/fetch/$s_!qzsb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb20dad6c-9003-4e00-b390-696adda4f2d9_1999x1330.png 848w, https://substackcdn.com/image/fetch/$s_!qzsb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb20dad6c-9003-4e00-b390-696adda4f2d9_1999x1330.png 1272w, https://substackcdn.com/image/fetch/$s_!qzsb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb20dad6c-9003-4e00-b390-696adda4f2d9_1999x1330.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>AI tools still need human judgment to deliver quality at speed</h2><p>While errors are few and far between, they still diminish the time savings and productivity gains from AI and show just how vital human skill and oversight remain in the AI era. As AI models improve and training and usage increase, we expect quality and speed to further improve.</p>]]></content:encoded></item></channel></rss>