LLMs Pose Document Corruption, Security Risks in Delegated Tasks
Delegating to AI Comes with a Hidden Cost: Document Corruption
A recent arXiv paper titled "LLMs Corrupt Your Documents When You Delegate" highlights a fundamental, often overlooked risk in the rush to automate workflows with large language models. The research suggests that when users delegate tasks like summarization or editing to LLMs, the models can introduce subtle but persistent errors or changes into the source material.
This corruption isn't necessarily malicious; it can stem from the model's inherent tendency to "hallucinate" or confidently fill in gaps with plausible but incorrect information. The paper, designated arXiv:2604.15597, posits that this creates a systemic integrity issue for documents managed or processed by AI agents. The silent nature of these alterations makes them particularly dangerous.
The implications are vast for any professional or business relying on AI for document handling. A corrupted contract clause, a subtly altered financial figure, or a misrepresented technical specification could have serious downstream consequences. This research provides a formal academic foundation for a growing body of real-world security incidents.
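One practical illustration: before accepting an LLM-edited version of a document, the figures it contains can be mechanically compared against the source. The TypeScript sketch below is purely illustrative and is not drawn from the paper; the regular expression, function names, and sample strings are assumptions made for the example.

```typescript
// Extract numeric figures (amounts, counts, percentages) from a document body.
function extractFigures(text: string): string[] {
  return text.match(/\d[\d,]*(?:\.\d+)?%?/g) ?? [];
}

// Report any figure present in the source that no longer appears in the
// LLM-edited version, i.e. a candidate silent alteration.
function findAlteredFigures(source: string, edited: string): string[] {
  const editedFigures = new Set(extractFigures(edited));
  return extractFigures(source).filter((figure) => !editedFigures.has(figure));
}

const source = "Payment of $12,500.00 is due within 30 days at 4.5% interest.";
const edited = "Payment of $12,500.00 is due within 30 days at 4.25% interest.";

console.log(findAlteredFigures(source, edited)); // ["4.5%"]
```

A check this crude would not catch reworded clauses, but it makes the silent numeric alteration described above visible before the output is trusted.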
From Theory to Reality: A Cascade of Security Breaches
The theoretical risk of document corruption is compounded by a series of practical security vulnerabilities discovered in popular AI tools and frameworks. These incidents demonstrate how the delegation of tasks opens doors for data exposure and system compromise.
Earlier this year, as reported by the Insurance Journal, a leading office software and GenAI provider disclosed a flaw where its AI tool could access, read, and summarize information labeled as confidential. The bug allowed the AI tool to bypass data security protocols designed to ring-fence sensitive data, highlighting a critical failure at the "delegation" boundary.

Separately, browser security firm LayerX discovered a critical flaw in the Chrome extension for Anthropic's Claude. The vulnerability allowed any other browser extension, even those without special permissions, to embed hidden instructions and hijack the AI agent. This flaw stems from improper origin verification in the extension's code.
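The general defense against this class of flaw is strict sender verification on any externally reachable message channel. The sketch below shows the pattern for a Chrome extension background script in TypeScript; the allowlisted ID and the handler function are hypothetical and are not taken from Anthropic's code.

```typescript
// Only extensions whose IDs appear here may send messages to the agent.
const ALLOWED_SENDER_IDS = new Set(["<trusted-extension-id>"]);

// Hypothetical handler for commands that have passed the sender check.
function handleAgentCommand(message: unknown): void {
  console.log("handling vetted command", message);
}

chrome.runtime.onMessageExternal.addListener((message, sender, sendResponse) => {
  // Deny by default: a message from an unknown extension is never treated
  // as an instruction to the agent.
  if (!sender.id || !ALLOWED_SENDER_IDS.has(sender.id)) {
    sendResponse({ error: "unauthorized sender" });
    return;
  }
  handleAgentCommand(message);
  sendResponse({ ok: true });
});
```

Handling `onMessageExternal` on a deny-by-default basis means an arbitrary co-installed extension cannot smuggle instructions to the agent simply by sending it a message.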
Open-Source Tools Under Fire: The Ollama Vulnerability
The risks are not confined to closed-source, commercial AI services. The open-source framework Ollama, which allows LLMs to be run locally, was found to contain a severe heap out-of-bounds read vulnerability (CVE pending). With over 171,000 stars on GitHub, Ollama's popularity makes this a wide-reaching issue.
According to The Hacker News, the flaw in versions before 0.17.1 exists in the GGUF model loader. An attacker could supply a malicious GGUF file where the declared tensor offset and size exceed the file's actual length. During quantization, the server reads past the allocated heap buffer.
This vulnerability could lead to remote process memory leakage. As researchers noted, it enables "persistent, silent code execution at the privilege level of the user running Ollama," with realistic payloads including reverse shells or info-stealers. This directly threatens the integrity and confidentiality of any documents or data on the compromised system.
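At its core, the reported defect is a missing bounds check: the offset and size declared in a tensor header are trusted even when they point past the end of the file. The TypeScript sketch below shows the kind of validation involved; the field names are illustrative assumptions, and Ollama's actual loader (which is not written in TypeScript) will differ.

```typescript
interface TensorHeader {
  name: string;
  offset: number; // byte offset of the tensor data, as declared in the file
  size: number;   // byte length of the tensor data, as declared in the file
}

// Reject any tensor whose declared range does not fit inside the file that was
// actually loaded, before any read is attempted. In a native-code loader the
// sum offset + size must also be checked for integer overflow first.
function validateTensorBounds(header: TensorHeader, fileLength: number): void {
  if (!Number.isInteger(header.offset) || !Number.isInteger(header.size) ||
      header.offset < 0 || header.size < 0) {
    throw new Error(`tensor ${header.name}: invalid offset or size`);
  }
  if (header.offset + header.size > fileLength) {
    throw new Error(
      `tensor ${header.name}: declared ${header.size} bytes at offset ` +
      `${header.offset}, but the file is only ${fileLength} bytes long`
    );
  }
}

// A header claiming 1 MiB of data near the end of a 4 KiB file is rejected.
try {
  validateTensorBounds({ name: "blk.0.attn_q.weight", offset: 4000, size: 1_048_576 }, 4096);
} catch (err) {
  console.error((err as Error).message);
}
```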
LLMs Weaponized in Critical Infrastructure Attacks
The potential consequences of these vulnerabilities came into sharper focus with a report from cybersecurity firm Dragos. The firm detailed a cyber-attack against a municipal water and drainage utility in Mexico in which threat actors actively used commercial LLMs from OpenAI and Anthropic.
According to Infosecurity Magazine, the attackers used Claude to analyze vendor documentation around the facility's SCADA (Supervisory Control and Data Acquisition) systems. More alarmingly, the LLM was tasked with generating lists of default and known login credentials for brute-force attacks against these critical operational technology (OT) systems.
While the breach of the OT system was ultimately unsuccessful, Dragos emphasized this as a stark warning. The incident shows how commercially available AI models lower the barrier to entry for targeting critical infrastructure, enabling attackers with no prior OT experience to mount sophisticated campaigns. The delegation of research and code generation to LLMs directly empowered this attack.
Analysis: Why These Risks Are Systemic and Growing
The convergence of these reports paints a concerning picture. The core issue is a trust boundary problem. When users delegate tasks, they inherently trust the AI system to operate within defined constraints regarding data access, output integrity, and instruction following. Multiple layers of this trust are being violated.
- Integrity Trust: The arXiv research shows LLMs can corrupt the very documents they are asked to process.
- Confidentiality Trust: The office software bug and Ollama memory leak show systems failing to protect data from unauthorized access or exfiltration.
- Control Trust: The Claude extension flaw and the weaponization of LLMs for attacks show agents can be hijacked or repurposed for malicious intent.
The underlying driver is the rapid, mass adoption of generative AI tools into complex software ecosystems—office suites, browsers, local frameworks—without commensurate maturity in security design. The Insurance Journal article correctly frames this as creating potential for "systemic liability exposure" across cyber insurance lines.
Furthermore, as noted by CyberScoop, the rise of "agentic AI"—AI that can autonomously perform tasks and access the internet—dramatically amplifies these risks. A hijacked or corrupted agent can act at scale and speed, making the silent document corruption hypothesized in the arXiv paper a potential precursor to large-scale, automated data poisoning or fraud.
Mitigation and the Path Forward
Addressing these intertwined risks requires a multi-faceted approach from developers, organizations, and end-users. The technical vulnerabilities, like those in Ollama and the Claude extension, necessitate prompt patching and more rigorous security audits of AI-integrated code, especially around file parsing and inter-process communication.
For the broader issue of document corruption and data leakage, mitigation is more complex. Organizations must:
- Implement strict data governance: Define clear policies on what data can be processed by which AI tools, employing robust data loss prevention (DLP) and encryption.
- Adopt a principle of least privilege: AI agents and tools should have the minimum necessary access to systems and data to perform their function (a minimal sketch of this idea follows the list).
- Maintain human oversight and verification: Critical documents, especially legal, financial, or operational ones, must go through a human-in-the-loop review process after AI delegation.
- Demand transparency: Vendors should provide clearer documentation on how their AI tools handle data and where potential integrity risks may lie.
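As a concrete example of the least-privilege principle above, an agent's tool calls can be gated by an explicit per-task allowlist that refuses anything not granted. The tool names, task labels, and policy structure in this TypeScript sketch are hypothetical, not taken from any particular product.

```typescript
// Hypothetical tool names an agent might expose.
type Tool = "read_document" | "summarize" | "edit_document" | "send_email" | "browse_web";

// Per-task policy: only the tools this delegation actually needs.
const TASK_POLICY: Record<string, Set<Tool>> = {
  summarize_contract: new Set<Tool>(["read_document", "summarize"]),
};

function invokeTool(task: string, tool: Tool): void {
  const allowed = TASK_POLICY[task];
  // Deny by default: anything not explicitly granted for this task is refused.
  if (!allowed || !allowed.has(tool)) {
    throw new Error(`tool "${tool}" is not permitted for task "${task}"`);
  }
  console.log(`executing ${tool} for ${task}`);
}

invokeTool("summarize_contract", "summarize");    // allowed
try {
  invokeTool("summarize_contract", "send_email"); // refused by the policy
} catch (err) {
  console.error((err as Error).message);
}
```

The design choice here is that access is scoped to the task, not the agent: even a hijacked agent working on a summarization job has no route to email or the open web.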
The arXiv paper and the subsequent security incidents serve as a crucial wake-up call. Delegating tasks to LLMs offers immense productivity gains, but it is not a risk-free transaction. The integrity of our documents and the security of our systems are now inextricably linked to the reliability and safety of the AI tools we invite into our workflows. Recognizing and mitigating these "corruption" risks, whether silent alteration of documents or outright compromise of the systems around them, is imperative for the responsible adoption of artificial intelligence.