@ingo-eichhorst
Created March 16, 2026 14:55
Myth vs Fact: Working with LLMs in Engineering

Duration: 10 min | Group: solo or pairs | Bloom: Understand

5 statements. Decide Myth or Fact. Reveal the evidence.

Pairs variant: One person calls it, the other argues the opposite before revealing.


1. "When Claude generates a Helm chart with the wrong values structure, the model is bad at Kubernetes."

Reveal

Myth. LLMs predict the most likely next token — they don't "know" your project. A Helm values.yaml has many valid structures; Claude picks the statistically common one. It writes the correct structure once it has read your existing file.

Takeaway: Wrong output = missing context. Show it an example file from your repo before asking for new code.

OpenAI — Why Language Models Hallucinate
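A minimal sketch of why the model guesses (the chart layouts are hypothetical, for illustration only): a Helm `values.yaml` can express the same intent in several equally valid shapes, and without seeing your repo's file the model can only pick the statistically common one — here, the nested layout that `helm create` scaffolds.

```python
# Two equally valid ways a values.yaml could express the same image settings
# (hypothetical chart, illustration only), modeled as Python dicts.
layout_flat = {
    "image": "nginx:1.27",
    "replicas": 3,
}
layout_nested = {  # the layout `helm create` scaffolds, hence the common guess
    "image": {"repository": "nginx", "tag": "1.27"},
    "replicaCount": 3,
}

def image_ref(values: dict) -> str:
    """Resolve the image reference from either layout."""
    img = values["image"]
    if isinstance(img, str):
        return img
    return f"{img['repository']}:{img['tag']}"

# Both layouts mean the same thing — only your existing file disambiguates.
assert image_ref(layout_flat) == image_ref(layout_nested) == "nginx:1.27"
```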


2. "If Claude's generated code compiles and passes lint, you can trust it's correct."

Reveal

Myth. LLMs are trained to produce plausible output. Training rewards confident answers — "I don't know" scores zero, so bluffing is optimal. The model states a wrong function signature with the same confidence as a correct one.

Takeaway: Treat AI output like a junior's PR — it compiles, but review it. Run tests against the actual system.

SignalFire — LLM Hallucinations Aren't Bugs; LLM Overconfidence (arXiv 2509.25498)
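A tiny illustration of "compiles ≠ correct" (hypothetical generated snippet): the function below parses, passes lint, and type-checks, yet is wrong for half its inputs — only a test against known values catches it.

```python
# A plausible-looking function an assistant might generate. It looks right,
# lints clean, and type-checks — and is still wrong for even-length input.
def median(xs: list[float]) -> float:
    """Intended: the statistical median. Actual: wrong for even-length lists."""
    xs = sorted(xs)
    return xs[len(xs) // 2]  # plausible, but ignores the two-middle-values case

# Only real tests expose the bug:
assert median([1, 2, 3]) == 2       # odd length: happens to be correct
assert median([1, 2, 3, 4]) == 3    # even length: should be 2.5 — it isn't
```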


3. "Spending 2 minutes reading your existing code with Claude before asking it to implement saves more time than writing a perfect prompt."

Reveal

Fact. Northeastern University (Riedl, 2026) found that understanding what the AI knows and doesn't know predicts better outcomes than prompt syntax. Quote: "There's no special AI skill. It's just good old-fashioned soft skills."

If Claude reads your middleware/chain.go first, it discovers your custom function signature and generates compatible code. Without that, it guesses — and you rewrite.

Takeaway: Context engineering > prompt engineering. Research your codebase with Claude before implementing.

Northeastern — Empathy in Human-AI Collaboration (2026)
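A minimal sketch of "research before implement": instead of letting the model guess a signature, extract the real ones from the existing file and put them in the prompt. The regex below is a rough heuristic for Go function declarations, not a parser, and the file contents are illustrative.

```python
import re

# Rough heuristic for Go func declarations (optional receiver, params, result).
# Not a real parser — a sketch of pulling ground truth into the prompt.
GO_FUNC = re.compile(r"^func\s+(?:\([^)]*\)\s*)?\w+\([^)]*\)[^{]*", re.MULTILINE)

def signatures(source: str) -> list[str]:
    return [m.strip() for m in GO_FUNC.findall(source)]

# Illustrative stand-in for reading middleware/chain.go from your repo:
source = """\
package middleware

func Chain(handlers ...Handler) Handler {
    return nil
}
"""
print(signatures(source))  # → ['func Chain(handlers ...Handler) Handler']
```

Feeding these signatures to the model up front is what lets it generate compatible code instead of a statistically likely guess.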


4. "When you paste a long CLAUDE.md, error logs, and 3 config files into your prompt, Claude uses all of it equally."

Reveal

Myth. LLMs attend heavily to the beginning and end of context, but miss the middle. Stanford research showed a 30%+ performance drop for middle-positioned information. More tokens also degrade quality — Chroma Research calls this "context rot."

Takeaway: Put critical constraints first. Keep context focused. Targeted research (specific files) beats "dump everything."

Lost in the Middle (Liu et al., 2023); Chroma — Context Rot
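The takeaway above can be sketched as position-aware context assembly (all names here are illustrative, not any tool's API): critical constraints go first, the task goes last, and bulky supporting files sit in the middle where attention is weakest.

```python
# Sketch of position-aware prompt assembly: constraints first, task last.
def build_prompt(constraints: list[str], files: dict[str, str], task: str) -> str:
    parts = ["# Constraints (read first)"]
    parts += [f"- {c}" for c in constraints]
    for name, body in files.items():       # supporting context in the middle
        parts.append(f"\n## {name}\n{body}")
    parts.append(f"\n# Task\n{task}")      # the ask goes at the end
    return "\n".join(parts)

prompt = build_prompt(
    constraints=["Never log secrets", "Target Go 1.22"],
    files={"middleware/chain.go": "func Chain(h ...Handler) Handler { ... }"},
    task="Add a timeout middleware compatible with Chain.",
)
assert prompt.startswith("# Constraints")
assert prompt.index("Never log secrets") < prompt.index("# Task")
```

Keeping `files` to a few targeted paths, rather than every config in the repo, is the "focused context" half of the takeaway.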


5. "A junior developer who understands how LLMs work gets better AI-assisted results than a senior developer who doesn't."

Reveal

Fact. In Riedl's study, solo accuracy was 56% for humans, 71% for GPT-4, and 39% for Llama 3. Human+AI teams exceeded all three — but the synergy was driven by theory-of-mind skills, not technical expertise. Even the weaker Llama 3 produced synergy when paired with empathetic users, and lower-skilled people with good AI collaboration skills benefited the most.

Takeaway: Understanding the tool (token prediction, context windows, confidence mechanics) beats raw engineering skill when working with AI.

Riedl et al., Northeastern 2026


3 Takeaways

  1. Wrong output = missing context, not a bad model. Show existing code before asking for new code.
  2. Plausible ≠ correct. Verify against the actual system, not just the compiler.
  3. Understanding how LLMs work is the highest-leverage engineering skill for AI-assisted development.