The comment paradox: clean code in the age of AI agents
Comments no longer speak only to developers. In AI-assisted coding, every stale TODO, misleading note, and commented-out block becomes part of the prompt. Clean comments help humans understand the code, but they now also protect AI assistants from repeating old mistakes.
Comments used to be a private conversation between developers. A note to the next maintainer. A warning about a strange edge case. A little apology for code that nobody had time to clean up.
That changed when AI coding assistants entered the editor.
Today, comments are no longer passive. They are part of the prompt. Every stale TODO, every commented-out block, every "temporary" workaround from three years ago gets fed into the model as context. The assistant may not treat it as dead text. It may treat it as intent.
That is the comment paradox: comments can explain code to humans, but they can also mislead the machine now helping write it.
Code is now read by two audiences
Good code still has to serve humans first. A developer should be able to read a function and understand what it does from the names, structure, and tests around it.
But AI agents add a second reader. They do not skim like humans. They consume the whole buffer: imports, functions, docstrings, TODOs, commented-out experiments, old examples, broken snippets, and emotional outbursts.
To an LLM, the difference between this:
ssl.wrap_socket(sock)
and this:
# ssl.wrap_socket(sock, ssl_version=ssl.PROTOCOL_SSLv3)
is not as clean as we want it to be. One executes. One does not. But both shape the model's next suggestion.
That matters because commented-out code often contains exactly the things we do not want copied: deprecated APIs, old assumptions, insecure defaults, half-finished refactors, and patterns we meant to delete.
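For contrast, here is a minimal sketch of what the live code path could look like today. The module-level ssl.wrap_socket function was deprecated in Python 3.7 and removed in Python 3.12, so the commented-out line above advertises an API that no longer exists in current interpreters. The helper names make_client_context and wrap_client_socket are hypothetical, and the TLS 1.2 floor is an assumption chosen for illustration.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client-side TLS context with certificate and hostname checks on."""
    # create_default_context() turns on certificate verification and
    # hostname checking for client connections by default.
    context = ssl.create_default_context()
    # Assumption for this sketch: refuse anything older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def wrap_client_socket(sock, hostname: str) -> ssl.SSLSocket:
    """Wrap an already-connected socket; hostname drives SNI and verification."""
    return make_client_context().wrap_socket(sock, server_hostname=hostname)
```

If the stale SSLv3 line stays in the file next to code like this, the assistant sees two competing patterns instead of one.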
Zombie code is no longer harmless
Commented-out code used to be mostly embarrassing. Now it is risky.
Recent research on "comment traps" found that defective commented-out code can push LLM-generated defect rates sharply upward. In some cases, defect rates reached 58.17 percent when bad commented code appeared in the model's context. [1]
The scary part is not that assistants copy bad snippets verbatim. Often they do something worse: they reconstruct the bad idea.
Even when defective fragments are incomplete or partially removed, models can infer the pattern and generate executable code that preserves the mistake. A human sees a dead block and thinks, "Ignore that." The model sees another signal.
Placement matters too. Some assistants are especially sensitive to code that appears after the cursor because of fill-in-the-middle completion. If stale code sits below the line you are editing, it may pull the generated code toward old behavior.
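To see why text below the cursor matters, it helps to sketch how a fill-in-the-middle prompt might be assembled. This is an illustration only: the sentinel strings are placeholders rather than any particular model's tokens, and build_fim_prompt is a hypothetical helper, not an assistant's real implementation.

```python
# Placeholder sentinels; real models define their own special tokens.
PREFIX_TOKEN = "<PRE>"
SUFFIX_TOKEN = "<SUF>"
MIDDLE_TOKEN = "<MID>"


def build_fim_prompt(source: str, cursor: int) -> str:
    """Split the buffer at the cursor and arrange it as prefix, suffix, middle."""
    prefix, suffix = source[:cursor], source[cursor:]
    return f"{PREFIX_TOKEN}{prefix}{SUFFIX_TOKEN}{suffix}{MIDDLE_TOKEN}"


buffer = (
    "def connect(sock):\n"
    "    \n"
    "    # return ssl.wrap_socket(sock, ssl_version=ssl.PROTOCOL_SSLv3)\n"
)
# Put the cursor on the empty line, just after its indentation.
cursor = buffer.index("    \n") + 4
prompt = build_fim_prompt(buffer, cursor)
```

In this arrangement the stale comment lands in the suffix, which sits right before the point where the model starts generating.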
Prompting helps, but only a little. Telling the assistant "do not use the commented-out code" reduces some defects, but it does not erase the poisoned context. Clean input beats clever instruction.
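One way to clean the input is to flag commented-out code mechanically before it reaches the assistant. The sketch below is a rough heuristic for Python files, not a complete linter: it treats a comment as suspicious when its text parses as a Python statement that is more than a bare name or literal. The function names are hypothetical, and the heuristic can misfire on prose that happens to be valid syntax (for example "note: x").

```python
import ast
import io
import tokenize


def _is_trivial(node: ast.stmt) -> bool:
    """True for expression statements that are just a name or a constant."""
    return isinstance(node, ast.Expr) and isinstance(
        node.value, (ast.Name, ast.Constant)
    )


def find_commented_out_code(source: str) -> list[tuple[int, str]]:
    """Return (line_number, text) for comments whose body parses as Python code."""
    hits = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.COMMENT:
            continue
        body = tok.string.lstrip("#").strip()
        if not body:
            continue
        try:
            tree = ast.parse(body)
        except SyntaxError:
            continue  # ordinary prose usually fails to parse
        # Single words such as "TODO" parse as a bare name, so skip those.
        if any(not _is_trivial(node) for node in tree.body):
            hits.append((tok.start[0], body))
    return hits


sample = (
    "import ssl\n"
    "# ssl.wrap_socket(sock, ssl_version=ssl.PROTOCOL_SSLv3)\n"
    "# Users are sorted by signup date\n"
    "context = ssl.create_default_context()  # TODO\n"
)
flagged = find_commented_out_code(sample)
```

Here only the commented-out call on line 2 is flagged; the prose comment and the bare TODO are left alone.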
Bad comments have types
Most teams know comments can rot. AI just makes the cost easier to see.
The obvious comment repeats the code:
count++; // increment count
It adds noise without adding knowledge.
The lying comment is worse. It describes code that used to exist:
// Users are always sorted by signup date
return users.sort(byLastActiveAt);
This wastes human time and gives the model a false premise.
The hoarder comment preserves old code "just in case":
// const legacyClient = new LegacyBillingClient(apiKey);
const billingClient = new BillingClient(apiKey);
This is the dangerous one in LLM-assisted development. It tells the assistant that the old path still matters.
The rant comment records frustration instead of intent:
// This is garbage fix later
That may feel honest, but it gives nobody a next step. Open an issue, refactor it, or explain the constraint.
The best comment is often a better name
Before writing a comment, try to make the code need less explanation.
# Instead of this:
d = 86400 # elapsed time
# write this:
SECONDS_PER_DAY = 86400

# Instead of this:
if (status == 4):
# write this:
if status == STATUS_PUBLISHED:

Instead of explaining a dense conditional, extract it:
// Instead of this:
if (user.current.isActive && user.current.emailVerified && !user.current.suspended) {
  sendWelcomeBackEmail(user);
}
// Write this:
if (canReceiveWelcomeBackEmail(user)) {
  sendWelcomeBackEmail(user);
}
The goal is not silent code. The goal is code whose meaning lives in executable structure, not fragile side notes.
When comments still matter
Some comments earn their place.
Use comments to explain why the code is surprising. If a strange branch exists because a payment provider returns the wrong status code on Tuesdays, say so. If a slow-looking algorithm beats the clever one because of cache behavior, explain the tradeoff. If a public API has strict input and error contracts, document them.
Comments are also useful for dense syntax that resists naming, such as regular expressions or bit manipulation.
Useful comment:
// Matches invoice IDs from the legacy billing system, for example INV-2024-000193.
const legacyInvoiceIdPattern = /^INV-\d{4}-\d{6}$/;
That comment adds a constraint the code cannot express on its own.
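For dense syntax, some languages let the explanation live inside the expression itself. As a sketch, here is a Python counterpart of the invoice pattern above using re.VERBOSE, which ignores whitespace and allows inline comments. The constant name is hypothetical, and reading the four digits as a year is an assumption based on the sample ID.

```python
import re

# Hypothetical Python counterpart of the legacy invoice ID pattern.
LEGACY_INVOICE_ID = re.compile(
    r"""
    ^INV-       # fixed prefix used by the legacy billing system
    \d{4}-      # four digits, assumed to be the year (2024 in the sample)
    \d{6}$      # six-digit, zero-padded number (000193 in the sample)
    """,
    re.VERBOSE,
)

is_legacy = LEGACY_INVOICE_ID.match("INV-2024-000193") is not None
```

Each comment sits next to the fragment it explains, so it is harder for the note and the pattern to drift apart.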
LLM rules are not a substitute for hygiene
Many teams now add editor rules for LLM assistants. That is fine. Specific rules help:
Use ReadonlyArray for collection parameters.
Use branded types for UserId and AccountId.
Do not introduce commented-out code.
Prefer named constants over magic numbers.
Vague rules do not help much:
Write clean code.
Follow best practices.
Be type safe.
No mistakes.
Agents follow examples and concrete patterns better than abstract advice.
Still, rules cannot save a messy file. If the surrounding context contains obsolete APIs, zombie code, and misleading comments, the model will absorb them. The better fix is to remove the traps.
A practical comment policy for LLM-assisted teams
Treat comments as part of the prompt surface.
Delete commented-out code. Git keeps history better than your source files do.
Replace comments that only restate the code with better names, constants, guard clauses, and extracted functions.
Keep comments that explain constraints, tradeoffs, external bugs, public contracts, and examples that prevent misuse.
Run static analysis instead of trusting the assistant to repair itself. LLMs can generate plausible code with fake APIs, wrong parameters, and inconsistent identifiers. Linters, type checkers, tests, and API-aware analysis catch what fluent prose hides.
Most of all, review the context before you accept the completion. The assistant is not only responding to your prompt. It is responding to the file.
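The static-analysis step in this policy can start small. As a sketch of an API-aware check, the snippet below parses generated Python code and reports module.attribute references that do not exist on the real module, which is one way invented APIs show up. The function name is hypothetical, it only handles plain "import x" and "import x as y" statements, and it imports the referenced modules, so it should only be pointed at modules you already trust.

```python
import ast
import importlib


def find_unknown_attributes(source: str) -> list[str]:
    """List module.attr references whose attr is missing from the real module."""
    tree = ast.parse(source)
    # Map local alias -> real module name for plain imports.
    modules = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                modules[alias.asname or alias.name] = alias.name
    unknown = []
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Attribute)
            and isinstance(node.value, ast.Name)
            and node.value.id in modules
        ):
            module = importlib.import_module(modules[node.value.id])
            if not hasattr(module, node.attr):
                unknown.append(f"{node.value.id}.{node.attr}")
    return unknown


generated = "import math\nx = math.sqrt(2)\ny = math.fast_sqrt(2)\n"
problems = find_unknown_attributes(generated)
```

A type checker or a full linter covers far more ground than this; the point is that the check runs against the real library rather than against the model's memory of it.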
Clean code has always helped humans. Now it also protects the machine from learning the wrong lesson.
References
[1] Comment Traps: How Defective Commented-out Code Augments Defects in AI-Assisted Code Generation
[2] Investigating the Impact of Code Comment Inconsistency on Bug Introducing
[3] A Philosophy of Software Design vs Clean Code