Created
April 8, 2026 22:32
GenerateDeckContinuous: Body Tag Early Stop Bug Investigation
| <!DOCTYPE html> | |
| <html lang="en"> | |
| <head> | |
| <meta charset="UTF-8"> | |
| <meta name="viewport" content="width=device-width, initial-scale=1.0"> | |
| <title>GenerateDeckContinuous: Body Tag Early Stop Bug</title> | |
| <style> | |
| :root { | |
| --bg: #0d1117; | |
| --surface: #161b22; | |
| --border: #30363d; | |
| --text: #c9d1d9; | |
| --text-muted: #8b949e; | |
| --accent: #58a6ff; | |
| --red: #f85149; | |
| --green: #3fb950; | |
| --yellow: #d29922; | |
| --orange: #db6d28; | |
| } | |
| * { box-sizing: border-box; margin: 0; padding: 0; } | |
| body { | |
| font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Helvetica, Arial, sans-serif; | |
| background: var(--bg); | |
| color: var(--text); | |
| line-height: 1.6; | |
| padding: 2rem; | |
| max-width: 900px; | |
| margin: 0 auto; | |
| } | |
| h1 { color: #fff; margin-bottom: 0.5rem; font-size: 1.75rem; } | |
| h2 { color: var(--accent); margin: 2rem 0 0.75rem; font-size: 1.3rem; border-bottom: 1px solid var(--border); padding-bottom: 0.5rem; } | |
| h3 { color: #fff; margin: 1.5rem 0 0.5rem; font-size: 1.1rem; } | |
| p { margin-bottom: 0.75rem; } | |
| .subtitle { color: var(--text-muted); margin-bottom: 2rem; font-size: 0.95rem; } | |
| code { | |
| background: var(--surface); | |
| border: 1px solid var(--border); | |
| border-radius: 4px; | |
| padding: 0.15em 0.4em; | |
| font-size: 0.9em; | |
| font-family: 'SF Mono', 'Fira Code', monospace; | |
| } | |
| pre { | |
| background: var(--surface); | |
| border: 1px solid var(--border); | |
| border-radius: 8px; | |
| padding: 1rem; | |
| overflow-x: auto; | |
| margin: 0.75rem 0 1rem; | |
| font-size: 0.85rem; | |
| line-height: 1.5; | |
| } | |
| pre code { background: none; border: none; padding: 0; } | |
| .card { | |
| background: var(--surface); | |
| border: 1px solid var(--border); | |
| border-radius: 8px; | |
| padding: 1.25rem; | |
| margin: 1rem 0; | |
| } | |
| .tag-red { color: var(--red); font-weight: 600; } | |
| .tag-green { color: var(--green); font-weight: 600; } | |
| .tag-yellow { color: var(--yellow); font-weight: 600; } | |
| .tag-orange { color: var(--orange); font-weight: 600; } | |
| .tag-blue { color: var(--accent); font-weight: 600; } | |
| table { | |
| width: 100%; | |
| border-collapse: collapse; | |
| margin: 1rem 0; | |
| font-size: 0.9rem; | |
| } | |
| th, td { | |
| border: 1px solid var(--border); | |
| padding: 0.5rem 0.75rem; | |
| text-align: left; | |
| } | |
| th { background: var(--surface); color: #fff; } | |
| .flow-diagram { | |
| background: var(--surface); | |
| border: 1px solid var(--border); | |
| border-radius: 8px; | |
| padding: 1.5rem; | |
| margin: 1rem 0; | |
| font-family: 'SF Mono', 'Fira Code', monospace; | |
| font-size: 0.8rem; | |
| white-space: pre; | |
| overflow-x: auto; | |
| line-height: 1.4; | |
| } | |
| .highlight { background: rgba(248, 81, 73, 0.15); border-left: 3px solid var(--red); padding-left: 0.75rem; margin: 0.5rem 0; } | |
| .highlight-green { background: rgba(63, 185, 80, 0.15); border-left: 3px solid var(--green); padding-left: 0.75rem; margin: 0.5rem 0; } | |
| ul { margin: 0.5rem 0 0.75rem 1.5rem; } | |
| li { margin-bottom: 0.25rem; } | |
| .commit { font-family: 'SF Mono', monospace; font-size: 0.85em; color: var(--accent); } | |
| hr { border: none; border-top: 1px solid var(--border); margin: 2rem 0; } | |
| .evidence { border-left: 3px solid var(--yellow); padding-left: 1rem; margin: 1rem 0; color: var(--text-muted); } | |
| </style> | |
| </head> | |
| <body> | |
| <h1>Bug: GenerateDeckContinuous Stops Early on <code></BODY></code></h1> | |
| <p class="subtitle">Investigation — April 8, 2026 · Affects all deck generation with specified card counts</p> | |
| <h2>Summary</h2> | |
| <div class="card"> | |
| <p>When a user requests 60 cards, the LLM generates ~40 cards, emits <code></BODY></code>, and the continuous generation component <span class="tag-red">stops immediately</span> — never retrying to reach the target count.</p> | |
| <p>This happens because a recent change (<span class="commit">20cf066ca4</span>) simultaneously:</p> | |
| <ul> | |
| <li>Added explicit <code><BODY>...</BODY></code> wrapping instructions to the prompt</li> | |
| <li>Changed the card count language from <strong>"create EXACTLY N cards"</strong> to <strong>"content-driven count, N is just a default"</strong></li> | |
| </ul> | |
| <p>The <code></BODY></code> tag is the <strong>primary unconditional stop signal</strong> in the continuous component. Once the LLM emits it, there is no retry — generation is done.</p> | |
| </div> | |
| <h2>Root Cause</h2> | |
| <h3>Before <span class="commit">20cf066ca4</span> (Feb 25, 2026)</h3> | |
| <div class="highlight-green"> | |
| <p><strong>Prompt said:</strong></p> | |
| <pre><code>**CRITICAL REQUIREMENT**: Create exactly {numCards} cards. | |
| The user explicitly specified this and would be disappointed | |
| if you deviated from it. | |
| // For numCards >= 12: | |
| It is critical that you generate all {numCards} cards. | |
| Do not stop early. It is UNACCEPTABLE to stop without | |
| returning the full {numCards} cards.</code></pre> | |
| <p><strong>No BODY tag wrapping instructions.</strong> The LLM rarely emitted <code></BODY></code>, so the continuous mechanism kept retrying until <code>numCards</code> was reached.</p> | |
| </div> | |
| <h3>After <span class="commit">20cf066ca4</span></h3> | |
| <div class="highlight"> | |
| <p><strong>Prompt now says:</strong></p> | |
| <pre><code>**Card Count**: The default target is 60 cards, but the actual | |
| count should be driven by the content structure. | |
| **Card count priority**: | |
| 1. Hard ceiling: Never produce more than 60 cards. | |
| 2. Content-driven: match content structure — this should | |
| WIN over the default target. | |
| 3. Default: Only aim for 60 when content doesn't dictate. | |
| **Output structure**: Your entire output MUST be wrapped in | |
| <BODY>...</BODY> tags. | |
| **CRITICAL**: Wrap ALL sections inside <BODY>...</BODY> tags. | |
| Your output MUST start with <BODY> and end with </BODY>.</code></pre> | |
| <p>The LLM decides 40 cards "fit the content", emits <code></BODY></code>, and generation ends.</p> | |
| </div> | |
| <h2>The Stop Logic</h2> | |
| <p>In <code>generate-deck-continuous.component.tsx</code>, the stop conditions are checked in this order:</p> | |
| <pre><code>// 1. PRIMARY STOP — unconditional, no retry possible | |
| if (id === 'BODY_CLOSE') { | |
| return // ← game over, no matter how many cards we have | |
| } | |
| // 2. SAFETY CAP — also unconditional | |
| if (hasMaxCards && id === 'SECTION' && yieldedCards >= maxCards) { | |
| return | |
| } | |
| // 3. POST-STREAM — only checked after stream ends naturally | |
| if (!doneOnBodyEnd && numCards && yieldedCards >= numCards) { | |
| break // stop retrying | |
| }</code></pre> | |
| <p><span class="tag-red">The problem:</span> Stop condition #1 fires on <code></BODY></code> regardless of how many cards have been yielded. There is no check like <code>yieldedCards >= numCards</code> before returning.</p> | |
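| <p>The in-stream checks above can be reproduced in isolation. The following is a minimal sketch, not the component's actual code: <code>TokenId</code>, <code>sections</code> and <code>consumeStream</code> are hypothetical names, and it assumes each <code>SECTION</code> token yields exactly one card. It shows that <code>numCards</code> is never consulted once <code></BODY></code> arrives.</p> | |

```typescript
// Hypothetical token ids; the real component's types are not shown in this write-up.
type TokenId = "SECTION" | "BODY_CLOSE";
type StopReason = "BODY_CLOSE" | "MAX_CARDS" | "STREAM_END";

interface StopOptions {
  numCards?: number; // count the user asked for
  maxCards?: number; // safety cap
}

// Build a fake stream of `n` card sections (illustration helper).
function sections(n: number): TokenId[] {
  const out: TokenId[] = [];
  for (let i = 0; i < n; i++) out.push("SECTION");
  return out;
}

// Mirrors the two in-stream checks. Note that opts.numCards is never read here.
function consumeStream(
  tokens: TokenId[],
  opts: StopOptions
): { yieldedCards: number; reason: StopReason } {
  let yieldedCards = 0;
  for (const id of tokens) {
    if (id === "BODY_CLOSE") {
      return { yieldedCards, reason: "BODY_CLOSE" }; // unconditional stop
    }
    if (opts.maxCards !== undefined && yieldedCards >= opts.maxCards) {
      return { yieldedCards, reason: "MAX_CARDS" }; // safety cap
    }
    yieldedCards++; // assumption: each SECTION token yields one card
  }
  return { yieldedCards, reason: "STREAM_END" };
}

// The LLM writes 40 sections and closes the body although 60 were requested.
const result = consumeStream([...sections(40), "BODY_CLOSE"], {
  numCards: 60,
  maxCards: 60,
});
console.log(result.yieldedCards, result.reason);
```

| <p>With these inputs the sketch stops at 40 cards with reason <code>BODY_CLOSE</code>, which matches the behavior seen in the trace.</p> | |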
| <h2>Why This Works for Remix (and Not for GenerateDeck)</h2> | |
| <table> | |
| <tr> | |
| <th></th> | |
| <th>Remix</th> | |
| <th>GenerateDeck (the bug)</th> | |
| </tr> | |
| <tr> | |
| <td><strong>Has <code>numCards</code>?</strong></td> | |
| <td><span class="tag-green">No</span> — only <code>maxCards</code></td> | |
| <td><span class="tag-red">Yes</span> — user specifies exact count</td> | |
| </tr> | |
| <tr> | |
| <td><strong>Card count intent</strong></td> | |
| <td>Inherently content-driven (adapting existing doc)</td> | |
| <td>User explicitly chose 60 cards</td> | |
| </tr> | |
| <tr> | |
| <td><strong><code></BODY></code> as stop signal</strong></td> | |
| <td><span class="tag-green">Correct</span> — LLM decides when content is done</td> | |
| <td><span class="tag-red">Wrong</span> — LLM stops before reaching user's target</td> | |
| </tr> | |
| <tr> | |
| <td><strong>Expected behavior</strong></td> | |
| <td>Stop when content is fully remixed</td> | |
| <td>Keep generating until 60 cards are produced</td> | |
| </tr> | |
| </table> | |
| <p>The <code><BODY></code> tag mechanism was designed for remix, where the LLM controls how many cards to produce. But it was added to GenerateDeck prompts without adjusting the continuous component to treat <code></BODY></code> differently when <code>numCards</code> is specified.</p> | |
| <h2>Trace Evidence</h2> | |
| <div class="evidence"> | |
| <p><strong>Request:</strong> 60 cards</p> | |
| <p><strong>Prompt sent to LLM:</strong> "The default target is 60 cards, but the actual count should be driven by the content structure... content's natural structure should WIN over the default target"</p> | |
| <p><strong>LLM output:</strong> 40 sections wrapped in <code><BODY>...</BODY></code></p> | |
| <p><strong>Continuous component:</strong> Saw <code></BODY></code> → returned immediately → no retry</p> | |
| <p><strong>User got:</strong> 40 cards instead of 60</p> | |
| </div> | |
| <h2>Suggested Fix</h2> | |
| <div class="card"> | |
| <p>The <code>BODY_CLOSE</code> stop should <strong>not</strong> be unconditional when <code>numCards</code> is specified. Two things need to change:</p> | |
| <h3>1. Continuous component: respect <code>numCards</code> over <code></BODY></code></h3> | |
| <pre><code>// Instead of unconditional return on BODY_CLOSE: | |
| if (id === 'BODY_CLOSE') { | |
| - return | |
| + if (!props.numCards || yieldedCards >= props.numCards) { | |
| + return // remix / target met → done | |
| + } | |
| + // numCards not met → treat as end-of-stream, allow retry | |
| + break | |
| }</code></pre> | |
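| <p>To see how the proposed check interacts with a retry loop, here is a simplified, synchronous sketch. It is an illustration under assumptions, not the component itself: <code>Attempt</code>, <code>generate</code>, <code>fakeLlm</code> and the <code>maxAttempts</code> limit are hypothetical, the <code>doneOnBodyEnd</code> flag and the <code>maxCards</code> cap are left out, and each <code>SECTION</code> is assumed to yield one card.</p> | |

```typescript
type TokenId = "SECTION" | "BODY_CLOSE";

// Hypothetical stand-in for one LLM call: returns the token ids of one attempt.
type Attempt = (cardsSoFar: number) => TokenId[];

function generate(
  attempt: Attempt,
  numCards: number | undefined,
  maxAttempts = 5 // assumption: some retry limit exists
): number {
  let yieldedCards = 0;
  for (let i = 0; i < maxAttempts; i++) {
    for (const id of attempt(yieldedCards)) {
      if (id === "BODY_CLOSE") {
        if (!numCards || yieldedCards >= numCards) {
          return yieldedCards; // remix, or target met: done
        }
        break; // target not met: treat as end of stream and allow a retry
      }
      yieldedCards++; // assumption: each SECTION token yields one card
    }
    // Post-stream check: stop retrying once the target is reached.
    if (numCards && yieldedCards >= numCards) break;
  }
  return yieldedCards;
}

// Fake model: writes at most 40 cards per call, stops once 60 exist in total,
// and always closes the body at the end of a call.
const fakeLlm: Attempt = (cardsSoFar) => {
  const batch = Math.min(40, Math.max(0, 60 - cardsSoFar));
  const out: TokenId[] = [];
  for (let i = 0; i < batch; i++) out.push("SECTION");
  out.push("BODY_CLOSE");
  return out;
};

const withTarget = generate(fakeLlm, 60); // GenerateDeck: user asked for 60
const remixStyle = generate(fakeLlm, undefined); // remix: no numCards
console.log(withTarget, remixStyle);
```

| <p>In this sketch the call with <code>numCards = 60</code> takes a second attempt and ends at 60 cards, while the call without <code>numCards</code> still stops at the first <code></BODY></code> with 40 cards, so the remix path keeps its current behavior.</p> | |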
| <h3>2. Prompt: restore forceful language when numCards is explicit</h3> | |
| <p>The "content-driven" language should be conditional — when the user explicitly chose a card count, the prompt should demand that count, not defer to "content structure".</p> | |
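| <p>One way to make the wording conditional is sketched below. The function name <code>cardCountInstruction</code> and the <code>userSpecified</code> flag are hypothetical; the two strings are shortened from the prompts quoted in the Root Cause section, and the real prompt files may be structured differently.</p> | |

```typescript
// Hypothetical helper: choose the card-count wording for the prompt.
// `userSpecified` is an assumed flag meaning the user explicitly chose numCards.
function cardCountInstruction(numCards: number, userSpecified: boolean): string {
  if (userSpecified) {
    // Forceful wording, as in the prompt before 20cf066ca4.
    return (
      `**CRITICAL REQUIREMENT**: Create exactly ${numCards} cards. ` +
      `It is critical that you generate all ${numCards} cards. Do not stop early.`
    );
  }
  // Content-driven wording, as in the prompt after 20cf066ca4.
  return (
    `**Card Count**: The default target is ${numCards} cards, but the actual ` +
    `count should be driven by the content structure.`
  );
}

const explicitPrompt = cardCountInstruction(60, true);
const defaultPrompt = cardCountInstruction(60, false);
console.log(explicitPrompt);
console.log(defaultPrompt);
```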
| </div> | |
| <hr> | |
| <h2>Commits Involved</h2> | |
| <table> | |
| <tr> | |
| <th>Commit</th> | |
| <th>Date</th> | |
| <th>Change</th> | |
| <th>Impact</th> | |
| </tr> | |
| <tr> | |
| <td><span class="commit">20cf066ca4</span></td> | |
| <td>Feb 25</td> | |
| <td>Added BODY tag instructions + content-driven count language</td> | |
| <td class="tag-red">Root cause</td> | |
| </tr> | |
| <tr> | |
| <td><span class="commit">9e539c4356</span></td> | |
| <td>Feb 27</td> | |
| <td>Added mid-stream return at numCards without body tag</td> | |
| <td class="tag-yellow">Introduced an issue, fixed in 58e376b3dc</td> | |
| </tr> | |
| <tr> | |
| <td><span class="commit">58e376b3dc</span></td> | |
| <td>Feb 28</td> | |
| <td>Moved numCards stop to post-stream break</td> | |
| <td class="tag-green">Fixed the issue from 9e539c4356</td> | |
| </tr> | |
| </table> | |
| <p class="subtitle" style="margin-top: 2rem; text-align: center;">Files: <code>generate-deck-continuous.component.tsx</code> · <code>generate-deck2_5.prompt.tsx</code> · <code>generate-deck2_5-preserve.prompt.tsx</code></p> | |
| </body> | |
| </html> |