A detail that went almost unnoticed on an AI-focused subreddit has put a spotlight on a practice that is both ingenious and potentially troublesome: according to a user report, Claude Code, Anthropic’s coding assistant, is allegedly using steganography to mark every request it sends. Anthropic has not issued an official statement, but the report lifts the curtain on a mechanism that, if confirmed, could rewrite the rules of trust in LLM-based coding tools.

The technique: hiding data where no one looks

Steganography, unlike cryptography, does not make a message unreadable; it hides it in plain sight, embedding it within an innocent carrier. In the context of a code assistant, this can translate into the insertion of zero-width characters, anomalous spacing inside comments, or seemingly pointless sequences within strings and variable names. The stated goal of similar techniques is often watermarking, to identify the origin of a text or, in this case, the prompt that generated a portion of source code. Claude Code, according to the report, applies such marking to requests even before the code is produced, creating an invisible yet persistent trail.
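To make the idea of "hiding in plain sight" concrete, here is a minimal Python sketch that lists invisible characters in a piece of text. It assumes that a marker would rely on Unicode format characters (general category Cf, which includes the zero-width space and the zero-width joiner); the report does not specify which characters, if any, are involved, and the function name `find_invisible` is purely illustrative.

```python
import unicodedata


def find_invisible(text):
    """Return (index, code point, Unicode name) for every format character.

    Assumption: hidden markers use characters in Unicode category "Cf"
    (zero-width space, zero-width joiner, word joiner, and similar).
    """
    hits = []
    for index, ch in enumerate(text):
        if unicodedata.category(ch) == "Cf":
            name = unicodedata.name(ch, "UNKNOWN")
            hits.append((index, f"U+{ord(ch):04X}", name))
    return hits


# A string that looks like "abcd" on screen but carries one hidden character.
print(find_invisible("ab\u200bcd"))  # [(2, 'U+200B', 'ZERO WIDTH SPACE')]
```

A hit is not proof of marking: the same characters have legitimate uses, for example in emoji sequences and in some scripts, so results of this kind need manual review.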

From a technical standpoint, this kind of marking is computationally cheap and requires no changes to the model architecture. An identifier of just a few bytes is enough, and it can go completely unnoticed by a developer who does not inspect the raw text. For an LLM during inference, the additional signal does not alter the quality of the completion, but it is a piece of information that could persist in the generated code and, potentially, in every system that adopts that code.
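How little it takes to carry a mark can be shown with a toy scheme. The sketch below is hypothetical and is not a description of what Claude Code does: it assumes one zero-width character per bit, appended to an ordinary comment, and the names `embed` and `extract` are invented for illustration.

```python
ZERO = "\u200b"  # zero width space, stands for bit 0 (assumed mapping)
ONE = "\u200c"   # zero width non-joiner, stands for bit 1 (assumed mapping)


def embed(carrier: str, payload: bytes) -> str:
    """Append the payload to the carrier as invisible characters, one per bit."""
    bits = "".join(f"{byte:08b}" for byte in payload)
    return carrier + "".join(ONE if bit == "1" else ZERO for bit in bits)


def extract(text: str) -> bytes:
    """Recover the payload by reading only the two marker characters."""
    bits = "".join("1" if ch == ONE else "0" for ch in text if ch in (ZERO, ONE))
    usable = len(bits) - len(bits) % 8  # ignore an incomplete trailing byte
    return bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))


carrier = "# TODO: refactor"
marked = embed(carrier, b"\x2a\x07")      # a 2-byte identifier
print(extract(marked) == b"\x2a\x07")     # True
print(len(marked) - len(carrier))         # 16 hidden characters
```

In this toy encoding a 2-byte identifier costs 16 invisible characters, or 48 bytes in UTF-8, since each of the two characters encodes to three bytes. The overhead is still negligible next to a typical source file, which is why this kind of mark is easy to miss.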

Data sovereignty and the trap of assisted code

For those working in environments where digital sovereignty is a non-negotiable requirement, such as banks, defense, healthcare, and public administration, this report carries significant weight if confirmed. Code produced with the help of cloud tools, if secretly marked, could act as a vector for the leakage of sensitive metadata. This is not just a matter of intellectual property: a hidden identifier would make it possible to trace a request back to the user or organization that originated it, undermining whatever anonymity the service's privacy policy promises and potentially conflicting with GDPR requirements.

The use of steganographic markers shifts the focus from protecting data in transit to protecting data “at rest” inside the code, an aspect often overlooked in audit processes. If a piece of software goes into production in an air-gapped on-premise environment, any marks travel with it: nothing flows back out on its own, but an identifier issued in the cloud now sits inside the isolated system, forming an information bridge that network-level controls such as firewalls never inspect. In such a scenario, transparency is no longer a communication choice but a structural variable of the architecture.
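An at-rest check of this kind can be added to an audit with very little code. The following Python sketch walks a source tree and counts Unicode format characters (general category Cf) per file. The function name `audit_tree`, the list of file extensions, and the choice of Cf as the signal are all assumptions made for illustration, not a known detection rule for any specific tool.

```python
import unicodedata
from pathlib import Path


def audit_tree(root, suffixes=(".py", ".js", ".ts", ".go", ".java")):
    """Map each matching file under root to its count of format (Cf) characters.

    Files with a count of zero are left out of the report.
    """
    report = {}
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        # Undecodable bytes become U+FFFD, which is not category Cf.
        text = path.read_text(encoding="utf-8", errors="replace")
        count = sum(1 for ch in text if unicodedata.category(ch) == "Cf")
        if count:
            report[str(path)] = count
    return report


if __name__ == "__main__":
    for file_name, count in audit_tree(".").items():
        print(f"{file_name}: {count} invisible format character(s)")
```

Note that a UTF-8 byte order mark at the start of a file is also counted, so a single hit per file may be harmless. A scan like this only covers invisible characters; marks carried by spacing patterns or identifier choices would need different checks.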

Watermarking tools: between security and hidden lock-in

The LLM industry has long been exploring watermarking techniques to distinguish synthetic content from human-generated content. However, applying steganography to code requests introduces a novel element: the tracking mechanism operates upstream of the completion, before any output has been generated. This upends the usual perspective on anti-abuse defenses, which tend to focus on the output, and highlights the potential for granular session-level tracking during development.

There is also the risk of a subtle lock-in. A company that bases a significant portion of its codebase on marked suggestions might find itself dependent on the provider for any future compliance verification or for the removal of the marks themselves. Without a prior declaration of hidden metadata, the service provider holds an asymmetric information advantage that conflicts with the control principles required by self-hosted deployments.

Beyond a single tool: rethinking the on-premise development pipeline

The Claude Code episode is a starting point for a broader reflection on how to integrate AI assistants into development pipelines that aim for total data control. For those evaluating on-premise deployment of LLMs, whether marks are disclosed and whether they can be disabled becomes as central as tokens-per-second throughput or context window size. It is not about demonizing a tool, but about aligning technology choices with the security requirements that, in certain sectors, mandate the absence of any undocumented information channel to the outside world.

AI-RADAR has previously analyzed the trade-offs between cloud solutions and self-hosted stacks, highlighting how metadata management is often the weak link in the chain. This new element, steganographic marking inside requests, adds a further dimension: it is no longer enough to audit the model; every single interaction with the service must be scrutinized. The alternative is not to reject innovation, but to demand that its architecture be inspectable and that any marking be declared, possible to disable, and free of unauthorized tracking vectors.