Researchers have identified a worrying class of remote code execution (RCE) vulnerabilities across multiple AI inference engines. These flaws affect major AI serving platforms, from Meta's Llama to NVIDIA Triton and open-source inference systems, raising serious risks of model theft, persistent compromise, and infrastructure hijacking.
What’s the Core Issue?
- The root cause is a pattern dubbed ShadowMQ, found in several AI inference frameworks.
- This pattern involves ZeroMQ (ZMQ) sockets receiving serialized Python objects and deserializing them with Python's pickle module, a format that is unsafe for untrusted data because deserialization can execute arbitrary code. A minimal sketch of the pattern follows this list.
- Because this flawed logic was copied between projects, the same insecure pattern appears in several widely used engines.
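To make the risk concrete, here is a minimal, illustrative sketch of the ShadowMQ-style pattern the researchers describe. It is not code from any affected project; the endpoint address and message handling are assumptions for illustration. Any client that can reach the socket controls what gets deserialized, and pickle will run attacker-chosen code during deserialization.

```python
# Illustrative only: the insecure pattern described above, NOT code from any affected project.
import pickle
import zmq

ctx = zmq.Context.instance()
sock = ctx.socket(zmq.REP)
# Binding to all interfaces exposes the socket to anyone who can reach the host.
sock.bind("tcp://0.0.0.0:5555")

while True:
    raw = sock.recv()            # untrusted bytes straight off the network
    request = pickle.loads(raw)  # DANGEROUS: pickle can execute arbitrary code here
    # ... dispatch the request to the inference engine ...
    sock.send(pickle.dumps({"status": "ok"}))
```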
Affected Platforms
According to Oligo Security and other researchers, the following inference engines are impacted:
- vLLM – CVE-2025-30165
- NVIDIA TensorRT-LLM – CVE-2025-23254
- Modular Max Server – CVE-2025-60455
- Meta Llama-Stack – previously reported CVE-2024-50050
- NVIDIA Triton Inference Server – multiple RCE issues; e.g., CVE-2025-23319, CVE-2025-23320, CVE-2025-23334
- Ollama – multiple issues, including DoS, authentication bypass, and arbitrary file copy vulnerabilities
- PyTorch – a bug in torch.load() (CVE-2025-32434) that enables code execution when loading serialized models (a hedged loading sketch follows this list).
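For the PyTorch case, loading a model file is itself a deserialization step. Below is a hedged sketch of the safer loading pattern, assuming a PyTorch release that already includes the fix for CVE-2025-32434; the file name and the commented model object are placeholders, not references to any specific project.

```python
# Sketch, not an official PyTorch recommendation: load checkpoints as plain tensors only,
# on a PyTorch version patched for CVE-2025-32434.
import torch

# weights_only=True restricts unpickling to tensors and primitive types instead of arbitrary
# objects; it is a meaningful mitigation only on patched releases, since the CVE affected
# this code path as well.
state_dict = torch.load("model.pt", map_location="cpu", weights_only=True)

# model.load_state_dict(state_dict)  # hypothetical model object, shown only for context
```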
Why These Flaws Are Dangerous
- Model Theft: Inference servers often host proprietary models that are valuable IP. RCE here could enable an attacker to exfiltrate them.
- Persistence: Once an attacker gains code execution, they can deploy backdoors, cryptominers, or other tools inside the inference environment.
- Lateral Movement: AI inference nodes are now part of the attack surface; compromising them can give attackers pivot points deeper into the infrastructure.
- Unsafe Defaults: The vulnerabilities stem from insecure patterns (like pickle deserialization) that are often copied between projects, making this a systemic issue.
Recommended Mitigations
- Patch Immediately: Apply all available updates from vendors (e.g., Triton 25.07 or later, a patched vLLM release).
- Restrict Network Exposure: Do not expose ZMQ or other inference sockets publicly – bind them only to localhost or private networks.
- Use Safe Serialization: Replace pickle.loads() with safer formats such as JSON or Protobuf wherever possible.
- Enable ZMQ Security: Use ZMQ's built-in security mechanisms (e.g., CURVE) or proxy traffic over TLS. A hardened sketch combining these mitigations appears after this list.
- Harden Runtime Environment: Run inference processes with least privilege, enable container isolation, and monitor for unusual child-process creation.
- Audit Code Reuse: Review open-source or third-party frameworks for recycled, unsafe patterns; perform static analysis or code review especially for deserialization logic.
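As a rough illustration of the exposure and serialization mitigations above, the sketch below binds the inference socket to localhost, encrypts traffic with ZMQ CURVE, and exchanges JSON instead of pickled objects. The endpoint address, key handling, and message fields are assumptions for illustration; a production deployment would load persisted keys, restrict which client keys are accepted, and add proper error handling.

```python
# Sketch under stated assumptions: localhost-only binding, CURVE encryption, and JSON
# messages instead of pickle. Requires pyzmq with libzmq built against libsodium.
import json
import zmq

ctx = zmq.Context.instance()

# In practice keys are generated once and stored securely; generated inline here for brevity.
server_public, server_secret = zmq.curve_keypair()

server = ctx.socket(zmq.REP)
server.curve_secretkey = server_secret
server.curve_publickey = server_public
server.curve_server = True               # this socket requires CURVE from connecting peers
server.bind("tcp://127.0.0.1:5555")      # never 0.0.0.0 for internal inference traffic

while True:
    request = json.loads(server.recv().decode("utf-8"))  # plain data, no object deserialization
    # ... validate fields and dispatch to the inference engine ...
    server.send(json.dumps({"status": "ok"}).encode("utf-8"))
```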
Conclusion
As AI becomes more deeply integrated into enterprise infrastructure, the security of inference engines is no longer just about model integrity – it’s also about platform trust. These remote code execution vulnerabilities expose a critical layer of the AI stack to severe risk. Organizations must treat inference infrastructure as a first-class security concern: patch quickly, isolate aggressively, and review code patterns to protect against advanced compromise.
About COE Security
COE Security partners with organizations in financial services, healthcare, retail, manufacturing, and government to secure AI-powered systems and ensure compliance. Our offerings include:
- AI-enhanced threat detection and real-time monitoring
- Data governance aligned with GDPR, HIPAA, and PCI DSS
- Secure model validation to guard against adversarial attacks
- Customized training to embed AI security best practices
- Penetration Testing (Mobile, Web, AI, Product, IoT, Network & Cloud)
- Secure Software Development Consulting (SSDLC)
- Customized CyberSecurity Services
In response to these inference-engine vulnerabilities, COE Security also provides AI infrastructure risk assessments, code-reuse auditing, secure serialization reviews, and network-segmentation consulting for AI deployment environments.
Follow COE Security on LinkedIn for ongoing insights into secure, compliant AI adoption and to stay updated and cyber safe.