Deobfuscating Caesar+

The HUMAN Research Team has been tracking digital skimming toolkits. While following reports of Magecart infections, we discovered a server connected to a very loud operation named Inter. While analyzing the Inter skimming kit’s command and control server, we found the source code of a malware obfuscator, which we quickly connected to Caesar+, an obfuscation tool sold online and aimed at fraudsters. This post is a technical analysis of Caesar+.

Why Obfuscate?

Obfuscation is a process that aims to hide the true function of a program. When used by legitimate parties, obfuscation makes code hard to reverse-engineer and safeguards intellectual property. When used by illegitimate parties (i.e., fraudsters), it is meant to hide malicious actions from the prying eyes of end users and security researchers alike (read more about obfuscation and Magecart obfuscation here).

Common obfuscation methods include techniques such as:

for (let i=0,p=document.getElementsByTagName('button')[0].addEventListener('click', maliciousFunc); i < arr.length; i++) {}

Anyone can write their own obfuscation, but as with any other process in software development, it is often better to use an existing solution, and plenty of free obfuscation tools are available online. In this post we explore one such solution, the Caesar+ obfuscator. Caesar+ is sold online, and it came to our attention again because of its use in the OlympicTickets Magecart attack and because it is part of a skimming kit sold to fraudsters.

Figure: Screenshot of the Caesar+ obfuscator tool being sold online.

Obfuscation and Deobfuscation Example

We wrote a short and unimaginative script to obfuscate and then deobfuscate, in order to show how the obfuscator works.

Here’s the original script:

function printMsg(msg) {
  console.log(msg);
}

function hello() {
  return "Hello";
}

function world() {
  return "world";
}

function getHelloWorld() {
  return hello() + " " + world();
}

printMsg(getHelloWorld());

Running this script in the browser would print “Hello world”. Let’s obfuscate this script using Caesar+ with the default settings:

$ python2 caesarp.py script.js obfuscated_script.js
Gen namespace...
Outside guard level is 1 [MEDIUM]
Inside guard level is 1 [MEDIUM]
Document codepage set to utf8
Parsing...
Make...
Save to obfuscated_script.js
CRC code:208
Done.
$

The resulting JS code:

(function w8g(){yv1="0a0w0w0w0w0w0w0w0w0w0w0 w0w2u39322r38. 2x33320w2w2w14382t3c38153f0a0w0w0w0w0w0w0w0w0w0w0w0w0w0w0w0w2x2u0w14382t3c381a302t322v382w0w1p1p0w1c150w362t383936320w1c1n3a2p360w2w2p372w0w1p0w1c1n0a0w0w0w0w0w0w0w0w0w0w0w0w0w0w0w0w2u33360w143a, 2p360w2x0w1p0w1c1n0w2x0w1o0w382t3c381a302t322v382w1n0w2x1717150w3f2w2p372w0w1p0w14142w2p372w1o1o1h15192w2p372w1517382t"+"3c381a2r2. w2p361v332s2t1t38142x151n0a0w0w0w0w0w0w0w0w0w0w0w0w0w0w0w0w0w0w0w0w2w2p372w0w1p0w2w2p372w0w120w2w2p372w1n0a0w0w0w0w0w0w0w0w0w0w0w0w0w0w0w0w3h0a0w0w0w0w0w0w0w0w0w0w0w0w0w0w0w0w362t383936320w2w2p372w111e1h1h1 n0a0w0w0w0w0w0w0w0w0w0w0w0w3h0a0w0w0w0w0w0w0w0w. 093a2p360w2q332s3d1p3b2x322s333b1a3b1k2v1a38332b38362x322v14151a362t34302p2r2t141b2j2m2p193e1t192i1c191l2k190y2l171b2v180y0y151n0a0w0w0w0w0w0w0w0w093a2p360w2r362r1p2q332s3d1a312p382r2w14"+"1b2t323c3e1g3c331i303b2z391k322p30321c142j2k3b2k2s2k192l17150y1b2v1, 52j1c2l1a362t34302p2r2t140y2t323c3e1g3c331i303b2z391k322p303. 21c0y180y0y151n0, a0w0w0w0w0w0w0w0w092r362r. 1p2r362r1a37392q373836141c182r362r1a 302t322v382w191d151n0a0w0w0w0w0w, 0w0w0w092q332s3d1p2w2w142q332s3 d1a362t34302p2r2t140y2t323c3e1g3c331i303b2z391k322p30321c0y172r362r180y2t323c3e1g3c331i303b2z391k322p30321c0y15151p1p2r362r1r1d1m3b2x322s333b2j0y3, 0 3b1t0y2l140y0y151n0a0w0w0w0w0w0w0w0w1n2u39322r382x33320w2z2e281431372v150w3f0w2r33323733302t2j0y300y170y0y172b38362x322v1a2u3633311v2w2p361v332s2t141d1d1d15170y2v0y2l1431372v151n3h0w2u39322r382x33320w2f3d2h14150w3f0w362t383936320y1s20172t361l302c2i302u330y2j141f1e1h1d1i"+"1d1j1g1k160y223i252j3f332s1k11231w1a2k3c 1k1h281h2k3, c1k 1d380y 2j0y2r2w2p361v332s2t1t380y2l141l15171f1f1a1c152j0y38332b38362x322v0y2l14140y2z1w2u3c19392k3c1k1g163e381h2j1v2g330y2j0y302t322v382w0y2l161e171d1a, 1c15152l141b2j2u2i1l2. c2k1s2k17362l1b2v180y0y151n3h0w2u39322r382x33320w3c391v14150w3f0w362t383936320y233b1. 
y332a361l301e2s0y2j0y362t34302p2r2t0y2l141b2j232a1e1l1y2l1b2v180y0y151n3h0w2u39322r382x33320w3b3c2r14150 w3f0w362t3839 36320w2f3d2h14150w170y1s0w0y2j0y362t34302p2r2t0y2l141b2j2k1s2l1b2v180y0y15170w3c391v14151n3h0w2z2e28143b3c2r1415151n";
var xIN={};
soA="1d1j1g1k160y223i252j3f332s1k11231w1a2k3c 1k";
Z8W="enxz4xo6lwku8naln0208";
var Rjn="";
var gJQ="1d1j1g1k160y223i252j3f332s1k11231w1a2k3c";
window["w8g"]=w8g;
for(var J40=(0*"k\x89$T7\x85_Lc6Ghr}fBI@^"["charCodeAt"](3)+0.0);J40<gJQ["length"];J40+=("\x8boQOLn){:.j\x88PvY\x8ag="["charCodeAt"](14)*0+1.0)){gJQ=String["fromCharCode"](gJQ["charCodeAt"](J40))};
var J1y = document["createElement"]("div");
var jIC="1d1j1g1k160y223i252j3f332s1k11231".constructor;
var JiS=(6*"?njzy\x80"["length"]+0.0);
J1y["appendChild"](document["createTextNode"](yv1));
J1y=J1y["innerHTML"];J1y=J1y["replace"](/[\s+\.\,]/g,"");
GFk="1b2t323c3e1g3c331i303b2z391k322p30321c142j2";
for(var J40=(0*"6Al)Zb3-@\x852qJ"["length"]+0.0);J40<J1y["length"];J40+=(0*"[\x85D,BN:qP"["length"]+2.0)){Rjn+=String["fromCharCode"](parseInt(J1y["substr"](J40,(2.0+"[ps4'\x60r$>\x84SqI.,\x80vQ\x89"["charCodeAt"](17)*0)),JiS))};
soA="1d1j1g1k160y223i252j3f332s1k11231w1a2k3c 1k";
xIN["toString"]=jIC["constructor"](Rjn);
GFk="1b2t323c3e1g3c331i303b2z391k322p30321c142j2";
Rjn=xIN+"1d1j1g1k160y223i252j3f332s1k11231w1a2k3c 1k1h281h2k3, c1k 1d380y 2j0y2r2w2p3";
Ngr="1d1j1g1k160y223i252j3f332s1k11231w1a2k";
J1y["innerHTML"]="1d1j1g1k160y223i252j3f332s1k11231w1a2k3c 1k1h28";})();

Running the obfuscated code in the browser would again print “Hello world” in the console.

The Two Layers of The Obfuscation

Here’s a quick, graphical overview of the execution process:

The Outer Layer - Averting Analysis

The outer layer is the most basic premise of this obfuscator and its first line of defense, meant to hinder reverse engineering by employing two methods:

We found that both methods, once identified, are easy to circumvent.
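One part of the outer layer can be read straight from the obfuscated listing above: the payload string yv1 is padded with junk characters (whitespace, "+", "." and ","), and once those are stripped, every pair of characters is parsed as a base-36 number and turned into a character. The following is a minimal sketch of that decoding step as a standalone function. The name decodeOuterLayer is ours, and the sketch skips the detour through a div element’s innerHTML that the real code takes.

```javascript
// Sketch of the outer-layer decoding loop (function name is hypothetical).
function decodeOuterLayer(encoded) {
  // Same character class the sample uses to drop the junk padding.
  const cleaned = encoded.replace(/[\s+\.\,]/g, "");
  let decoded = "";
  // Each pair of characters is one character code written in base 36.
  for (let i = 0; i < cleaned.length; i += 2) {
    decoded += String.fromCharCode(parseInt(cleaned.slice(i, i + 2), 36));
  }
  return decoded;
}

// A short fragment copied from the yv1 string in the sample.
const fragment = "2u39322r38. 2x33320w2w2w14382t3c38153f";
console.log(decodeOuterLayer(fragment)); // function hh(text){
```

The radix of 36 comes from the sample itself, where JiS evaluates to 6 times the length of a six-character string.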

Deobfuscating The Outer Layer

Finding The Entry Point

The first thing we did when deobfuscating our Hello World script was to look for its execution entry point. The script is a self-invoking function, which we expected to be anonymous; we later found out why it wasn’t.

Usually, we would find the entry point at the bottom of the script, but it wasn’t there. Working upwards from the last line, we looked for something resembling execution and found it on the fifth line from the end:

xIN["toString"]=jIC["constructor"](Rjn);

But that wasn’t it. It was just an assignment. An assignment to a variable’s toString method? This technique is not unheard of: an object’s toString is called implicitly when JavaScript’s (in)famous type coercion is applied to it, for example when it is concatenated with a string. Finding where the variable xIN was used required looking only two lines down:

Rjn=xIN+ "1d1j1g1k160y223i252j3f332s1k11231w1a2k3c 1k1h281h2k3, c1k 1d380y 2j0y2r2w2p3" ;
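To make the trick concrete, here is a minimal sketch of the same pattern with readable names. The payload string, the record helper and all variable names are hypothetical stand-ins; in the real sample the payload is the decoded Rjn string.

```javascript
// Hypothetical stand-ins so the effect of the payload can be observed.
const executed = [];
globalThis.record = (msg) => executed.push(msg);
const payload = 'record("payload ran"); return "";';

// "any string".constructor is String, and String.constructor is Function,
// so this reaches the Function constructor without ever naming it.
const StringCtor = "any string".constructor;
const FunctionCtor = StringCtor.constructor;

const holder = {};
holder.toString = FunctionCtor(payload); // only an assignment, nothing runs yet

// Concatenating the object with a string coerces it to a primitive,
// which calls holder.toString() and therefore runs the payload.
const result = holder + "junk";

console.log(executed); // [ 'payload ran' ]
console.log(result);   // junk
```

Nothing in the listing looks like a function call, which is presumably why the entry point is hard to spot at first glance.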

Now we could replace this line with “console.log(xIN.toString)” to print the code that would have been executed, but let’s take a step back and read the Rjn variable instead.

Replacing this code fragment:

xIN["toString"]=jIC["constructor"](Rjn);
GFk="1b2t323c3e1g3c331i303b2z391k322p30321c142j2";
Rjn=xIN+"1d1j1g1k160y223i252j3f332s1k11231w1a2k3c 1k1h281h2k3, c1k 1d380y 2j0y2r2w2p3";
Ngr="1d1j1g1k160y223i252j3f332s1k11231w1a2k";
J1y["innerHTML"]="1d1j1g1k160y223i252j3f332s1k11231w1a2k3c 1k1h28";})();

With this code (not forgetting to close and call the main function):

console.log(Rjn);})();

Pasting the edited code into a browser’s console and running it prints the following code (beautified here):

function hh(text) {
    if (text.length == 0) return 0;
    var hash = 0;
    for (var i = 0; i < text.length; i++) {
        hash = ((hash << 5) - hash) + text.charCodeAt(i);
        hash = hash & hash;
    }
    return hash % 255;
}
var body = window.w8g.toString().replace(/[^a-zA-Z0-9\-"]+/g, "");
var crc = body.match(/enxz4xo6lwku8naln0([\w\d\-]+)"/g)[0].replace("enxz4xo6lwku8naln0", "");
crc = crc.substr(0, crc.length - 1);
body = hh(body.replace("enxz4xo6lwku8naln0" + crc, "enxz4xo6lwku8naln0")) == crc ? 1 : window["lwA"]("");;

function kVP(msg) {
    console["l" + "" + String.fromCharCode(111) + "g"](msg);
}

function WyY() {
    return "@H+er9lTZlfo" [(325161748 * "J~M[{od8%KD.\x85P5\x81t" ["charCodeAt"](9) + 33.0)["toString"](("kDfx-u\x84*zt5[CXo" ["length"] * 2 + 1.0))](/[fZ9T\@\+r]/g, "");
}

function xuC() {
    return "KwFoRr9l2d" ["replace"](/[KR29F]/g, "");
}

function wxc() {
    return WyY() + "@ " ["replace"](/[\@]/g, "") + xuC();
}
kVP(wxc());

The first thing we noticed here, besides the CRC check, which we will get to in a minute, is that the main structure is kept intact:

What about the hh function and the body and crc variables? Those are part of the inner layer.
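Before moving on, the string tricks in the beautified listing can be unwound by hand. The short sketch below evaluates the expressions from WyY, xuC and kVP in isolation; the variable names are ours, and the constants 75 and 31 are the values of the charCodeAt(9) and length-based expressions in the sample.

```javascript
// String literals carry junk characters that are stripped at run time.
const hello = "@H+er9lTZlfo".replace(/[fZ9T\@\+r]/g, "");
const world = "KwFoRr9l2d".replace(/[KR29F]/g, "");

// Method names are hidden as numbers rendered in an unusual radix.
// In the sample, "J~M[{od8%KD.\x85P5\x81t".charCodeAt(9) is 75 ("K")
// and the radix is 15 * 2 + 1 = 31.
const methodName = (325161748 * 75 + 33).toString(31);

// Property names are also split into pieces and character codes.
const logName = "l" + "" + String.fromCharCode(111) + "g";

console.log(hello, world); // Hello world
console.log(methodName);   // replace
console.log(logName);      // log
```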

The Inner Layer - Limiting Execution

By default, the obfuscator’s inner layer adds a CRC check that verifies the script hasn’t been tampered with and stops execution if it has. The inner layer is also where we would find the other execution scope limiters:

We found that for research purposes, simply removing those lines, up to where the original script’s code starts, does the trick!

Deobfuscating The Inner Layer

This check brings us back to why the main self-invoking function wasn’t anonymous: it uses the reference to itself to ensure the script is intact. It saves the CRC value in its outer layer, right after a unique string (in this example, in a dead variable, one which isn’t referenced anywhere else in the script):

Z8W="enxz4xo6lwku8naln0208"; // The 208 is the CRC

It then reads the outer layer’s code, removing every character that is not a letter, a digit, a hyphen or a double quote:

var body = window.w8g.toString().replace(/[^a-zA-Z0-9\-"]+/g, "");

…leaving us with the following string assigned to body:

` functionw8gyv1"0a0w0w0w0w0w0w0w0w0w0w0w0w2u39322r382x33320w2w2w14382t3c38153f0a0w0w0w0w0w0w0w0w0w0w0w0w0w0w0w0w2x2u0w14382
t3c381a302t322v382w0w1p1p0w1c150w362t383936320w1c1n3a2p360w2w2p372w0w1p0w1c1n0a0w0w0w0w0w0w0w0w0w0w0w0w0w0w0w0w2u33360w143a2
p360w2x0w1p0w1c1n0w2x0w1o0w382t3c381a302t322v382w1n0w2x1717150w3f2w2p372w0w1p0w14142w2p372w1o1o1h15192w2p372w1517382t""3c381
a2r2w2p361v332s2t1t38142x151n0a0w0w0w0w0w0w0w0w0w0w0w0w0w0w0w0w0w0w0w0w2w2p372w0w1p0w2w2p372w0w120w2w2p372w1n0a0w0w0w0w0w0w0
w0w0w0w0w0w0w0w0w0w3h0a0w0w0w0w0w0w0w0w0w0w0w0w0w0w0w0w362t383936320w2w2p372w111e1h1h1n0a0w0w0w0w0w0w0w0w0w0w0w0w3h0a0w0w0w0
w0w0w0w0w093a2p360w2q332s3d1p3b2x322s333b1a3b1k2v1a38332b38362x322v14151a362t34302p2r2t141b2j2m2p193e1t192i1c191l2k190y2l171
b2v180y0y151n0a0w0w0w0w0w0w0w0w093a2p360w2r362r1p2q332s3d1a312p382r2w14""1b2t323c3e1g3c331i303b2z391k322p30321c142j2k3b2k2s2
k192l17150y1b2v152j1c2l1a362t34302p2r2t140y2t323c3e1g3c331i303b2z391k322p30321c0y180y0y151n0a0w0w0w0w0w0w0w0w092r362r1p2r362
r1a37392q373836141c182r362r1a302t322v382w191d151n0a0w0w0w0w0w0w0w0w092q332s3d1p2w2w142q332s3d1a362t34302p2r2t140y2t323c3e1g3
c331i303b2z391k322p30321c0y172r362r180y2t323c3e1g3c331i303b2z391k322p30321c0y15151p1p2r362r1r1d1m3b2x322s333b2j0y303b1t0y2l1
40y0y151n0a0w0w0w0w0w0w0w0w1n2u39322r382x33320w2z2e281431372v150w3f0w2r33323733302t2j0y300y170y0y172b38362x322v1a2u3633311v2
w2p361v332s2t141d1d1d15170y2v0y2l1431372v151n3h0w2u39322r382x33320w2f3d2h14150w3f0w362t383936320y1s20172t361l302c2i302u330y2
j141f1e1h1d1i""1d1j1g1k160y223i252j3f332s1k11231w1a2k3c1k1h281h2k3c1k1d380y2j0y2r2w2p361v332s2t1t380y2l141l15171f1f1a1c152j0
y38332b38362x322v0y2l14140y2z1w2u3c19392k3c1k1g163e381h2j1v2g330y2j0y302t322v382w0y2l161e171d1a1c15152l141b2j2u2i1l2c2k1s2k1
7362l1b2v180y0y151n3h0w2u39322r382x33320w3c391v14150w3f0w362t383936320y233b1y332a361l301e2s0y2j0y362t34302p2r2t0y2l141b2j232
a1e1l1y2l1b2v180y0y151n3h0w2u39322r382x33320w3b3c2r14150w3f0w362t383936320w2f3d2h14150w170y1s0w0y2j0y362t34302p2r2t0y2l141b2
j2k1s2l1b2v180y0y15170w3c391v14151n3h0w2z2e28143b3c2r1415151n"varxINsoA"1d1j1g1k160y223i252j3f332s1k11231w1a2k3c1k"Z8W"enxz4
xo6lwku8naln0"varRjn""vargJQ"1d1j1g1k160y223i252j3f332s1k11231w1a2k3c"window"w8g"w8gforvarJ400"kx89T7x85Lc6GhrfBI""
charCodeAt"300J40gJQ"length"J40"x8boQOLnjx88PvYx8ag""charCodeAt"14010gJQString"fromCharCode"gJQ"charCodeAt"J40varJ1ydocument"
createElement""div"varjIC"1d1j1g1k160y223i252j3f332s1k11231"constructorvarJiS6"njzyx80""length"00J1y"appendChild"document"
createTextNode"yv1J1yJ1y"innerHTML"J1yJ1y"replace"sg""GFk"1b2t323c3e1g3c331i303b2z391k322p30321c142j2"forvarJ400"6AlZb3-x852qJ"
"length"00J40J1y"length"J400"x85DBNqP""length"20RjnString"fromCharCode"parseIntJ1y"substr"J4020"ps4x60rx84SqIx80vQx89""
charCodeAt"170JiSsoA"1d1j1g1k160y223i252j3f332s1k11231w1a2k3c1k"xIN"toString"jIC"constructor"RjnGFk"1b2t323c3e1g3c331i303b2z391
k322p30321c142j2"RjnxIN"1d1j1g1k160y223i252j3f332s1k11231w1a2k3c1k1h281h2k3c1k1d380y2j0y2r2w2p3"Ngr"1d1j1g1k160y223i252j3f332s1
k11231w1a2k"J1y"innerHTML""1d1j1g1k160y223i252j3f332s1k11231w1a2k3c1k1h28" `

Looking for the unique string enxz4xo6lwku8naln0, it finds it in the body using a regex and then removes it from the match, leaving only the CRC value:

var crc = body.match(/enxz4xo6lwku8naln0([\w\d\-]+)"/g)[0].replace("enxz4xo6lwku8naln0", "");

We are now left with “208” (plus a double quote character that is removed in the next line), which, as you may have noticed, is the CRC that was reported when we obfuscated the script. Now body is hashed using the hh function, which we found, after a quick Google search, to be a version of a JavaScript implementation of Java’s String.hashCode method, with the result reduced modulo 255 (you can read about it in this StackOverflow answer).

If the body’s hash matches the CRC, the script continues to execute (assigning 1 to body). If it doesn’t, it tries to call a non-existent function (in this case window.lwA) to break execution and confuse the people trying to analyze it:

body = hh(body.replace("enxz4xo6lwku8naln0" + crc, "enxz4xo6lwku8naln0")) == crc ? 1 : window["lwA"]("");;
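The whole check can be reproduced outside the browser. The sketch below reuses the hh routine and the marker string from the sample; embedCrc, checkCrc, strip and the tiny demo function are hypothetical helpers of ours. In particular, embedCrc is only our assumption of what the obfuscator does at build time, and checkCrc returns a boolean and uses a capture group rather than crashing the script the way the real check does.

```javascript
// Hash routine copied from the deobfuscated inner layer.
function hh(text) {
  if (text.length == 0) return 0;
  var hash = 0;
  for (var i = 0; i < text.length; i++) {
    hash = ((hash << 5) - hash) + text.charCodeAt(i);
    hash = hash & hash; // force a 32-bit signed integer
  }
  return hash % 255;
}

const MARKER = "enxz4xo6lwku8naln0"; // unique string from the sample
// Keep only letters, digits, hyphens and double quotes, as the sample does.
const strip = (source) => source.replace(/[^a-zA-Z0-9\-"]+/g, "");

// Hypothetical: hash the stripped source and store the result after the marker.
function embedCrc(source) {
  return source.replace(MARKER, MARKER + hh(strip(source)));
}

// Hypothetical: same comparison as the inner layer, reported as a boolean.
function checkCrc(source) {
  const body = strip(source);
  const match = body.match(new RegExp(MARKER + '([\\w\\d\\-]+)"'));
  if (!match) return false;
  const crc = match[1];
  return hh(body.replace(MARKER + crc, MARKER)) == crc;
}

const template = 'function demo(){var Z8W="' + MARKER + '";return 1;}';
const signed = embedCrc(template);
const tampered = signed.replace("return 1", "return 2");

console.log(checkCrc(signed));   // true
console.log(checkCrc(tampered)); // false
```

Because the hash is reduced modulo 255, many different edits would go unnoticed by this check; the edit used here changes only the last hashed character, which always shifts the result.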

This hashCode function is hardcoded into the obfuscator:

self.crc_check = """
            function hh(text){
                if (text.length == 0) return 0;var hash = 0;
                for (var i = 0; i < text.length; i++) {hash = ((hash<<5)-hash)+text.charCodeAt(i);
                    hash = hash & hash;
                }
                return hash%%255;
            }
        	var body=window.%(main_func_name)s.toString().replace(/[^a-zA-Z0-9\-\"]+/g,"");
        	var crc=body.match(/%(crc_a)s([\w\d\-]+)\"/g)[0].replace("%(crc_a)s","");
        	crc=crc.substr(0,crc.length-1);
        	body=hh(body.replace("%(crc_a)s"+crc,"%(crc_a)s"))==crc?1:window["%(trash)s"]("");
        """ % {"main_func_name": self.main_func_name, "crc_a": self.crc_a, "crc_b": self.crc_b, "trash": self.ns.gen()}

Further Anti-Tampering Capabilities

Embedded into the inner layer are other anti-reverse-engineering measures:

The Leftover Code

The obfuscator keeps on getting updated. The version we described here is 2.1, and there are probably more advanced versions out there. In the source code we found some unused code with undocumented functionality, probably work in progress:

def build_unpack_stop(self):
        code = """
            var t1='\\v'=='v';
            var t2=document.all;
            var t3=document.querySelector;
            var t4=document.addEventListener;
            var t6=window.navigator.userAgent;
            var t7=t6.search("SIE 7");
            var t8=t6.search("SIE 8");
            var t9=t6.search("SIE 9");
            var b7=t1&&!t3&&t2;
            var b8=t1&&t2&&t3&&!t4;
            var b9=t2&&!t1&&t4;
            t7=t7>0?(b7?1:window["sfgbfg"]["wtrgw"]):1;
            t8=t8>0?(b8?1:window["sfgbfg"]["wtrgw"]):1;
            t9=t9>0?(b9?1:window["sfgbfg"]["wtrgw"]):1;
            function hh(text){if (text.length == 0) return 0;var hash = 0;
                                    for (var i = 0; i < text.length; i++) {hash = ((hash<<5)-hash)+text.charCodeAt(i);
                                        hash = hash & hash;
                                    }
                                    return hash%255;
                                }
            hh(t6)==-56?window["sfgbfg"]["wtrgw"]:0;
            hh(t6)==85?window["sfgbfg"]["wtrgw"]:0;
        """
        parser = Parser()
        parser.go(code)
        deviator = Deviator(self.ns, parser)
        deviator.go()
        self.unpack_stop = parser.back_replace()

Perhaps we’ll get to see it in action, perhaps not.

Detection

Unfortunately for fraudsters using the obfuscator, the structure of its output is rather unique, and therefore easy to spot in the wild using regular expressions like:

/(\w{3})\[.*\]=.*\(\w{3}\).*\w{3}=\1\+"/gms

or the more complex

/(\w{3})=(\1\+\w{3}\+\w{3}\+\w{3}\+\w{3}|\[\1,\w{3},\w{3},\w{3}).*(\w{3})=(\3\+\w{3}\+\w{3}\+\w{3}\+\w{3}|\[\3,\w{3},\w{3},\w{3}).*(\w{3})\[.*\]=.*=\5\+"/gms
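As a quick illustration, the simpler of the two expressions can be run against the tail of our obfuscated Hello World sample and against the original, unobfuscated script. This is a minimal sketch: the variable names are ours, and the g flag is dropped because RegExp.prototype.test keeps state between calls on a global regex.

```javascript
// The simpler detection pattern, without the g flag (see note above).
const caesarPattern = /(\w{3})\[.*\]=.*\(\w{3}\).*\w{3}=\1\+"/ms;

// Three lines from the end of the obfuscated sample in this post.
const obfuscatedTail = [
  'xIN["toString"]=jIC["constructor"](Rjn);',
  'GFk="1b2t323c3e1g3c331i303b2z391k322p30321c142j2";',
  'Rjn=xIN+"1d1j1g1k160y223i252j3f332s1k11231w1a2k3c 1k1h281h2k3, c1k 1d380y 2j0y2r2w2p3";',
].join("\n");

// Part of the original, unobfuscated script.
const plainScript = [
  "function hello() {",
  '  return "Hello";',
  "}",
].join("\n");

console.log(caesarPattern.test(obfuscatedTail)); // true
console.log(caesarPattern.test(plainScript));    // false
```

The pattern keys on the toString assignment followed by the three-letter variable being concatenated with a string, which is the hidden execution point described earlier.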

Deobfuscating it, now that we understand how, takes about a minute by hand and can easily be automated.

Conclusions

Deobfuscating attacks is a fun challenge in our opinion, and in this instance we liked the clever use of type coercion to hide the point of execution, and the idea of keeping a reference to itself in order to hash-check for changes. We have seen it used in the wild both for malicious purposes, such as the OlympicTickets attack, and for general purposes on illicit websites, such as obfuscating generic scripts on markets that sell stolen credit cards. However, as we have shown, this kind of obfuscation can be quickly deobfuscated, and its unique structure makes it easy to detect.
