Throughout this series of articles we will showcase some of the techniques used by the ransomware Maze to make its analysis more difficult. Additionally, a series of scripts will be provided to deobfuscate and better follow the execution flow.
Usually the ransomware Maze is in DLL form, which is loaded into memory through a loader containing the encrypted DLL.
Therefore, there are two components:
- Loader/packer that contains the Maze encrypted DLL and performs a series of checks before launching the ransomware component.
- Maze (DLL), which is highly obfuscated and performs a series of checks before running that will be explained in the next article.
You can find the sample used for this article in the IOC section.
- Not all Maze loaders/packers make use of this technique, but the purpose of this article is to understand it and be able to solve the problem.
- Some samples of Maze (in the DLL itself) have been found making use of the same control flow flattening technique that we found in the loader/packer explained below.
Using control flow flattening and opaque predicates
Should researchers wish to follow this part of the article using a debugger, it is recommended to do the following in IDA:
Edit -> Segments -> Rebase… -> 0x400000
This eliminates the possibility of problems arising due to having a different base address in the debugger and allows researchers to directly compare the memory addresses of the loader that appears in their debugger with those in this article.
At the beginning of the program (0x0405DC0), the following graph is found:
It may seem complicated at first, but let’s take a step-by-step approach to identify what is going on.
It is making use of control flow flattening to break the original flow of the program and complicate its analysis. In addition, it adds opaque predicates and dead code in order to make the identification of the relevant blocks more difficult.
Before continuing, let’s give a brief introduction of some concepts:
Dead code is code that is inserted into a program to make the analysis tasks more complicated, but even if it is executed it does not change the original program flow.
Opaque predicate is a technique used to add complexity to the original program flow. It is usually implemented by generating new branches in the control flow, for example, using conditional jumps.
Although these conditional jumps exist, the implementation of the program remains deterministic.
Control flow flattening is a technique used to break the flow of a program by flattening it.
This diagram represents the control flow of a program.
When the control flow is flattened, the program is divided into blocks, all of which are at the same level. Therefore, the order of execution cannot be identified at first glance.
This order of execution is relegated to the new structure in the form of a switch.
Said switch will change the program flow based on a predetermined value, which, for the Maze loader, is stored in the EBX register.
At the end of each block a new value will be inserted in the EBX register to redirect the program flow to another branch, thus maintaining a deterministic flow.
Below is an illustration by Qarkslab that provides a detailed overview of the different components of a control flow graph that has been obfuscated using control flow flattening.
When dealing with this type of obfuscation, the first thing to do is locate the main dispatcher, the relevant blocks and the predispatcher.
The main dispatcher is the program block to which the execution of the program will always return after finishing the execution of one of the branches.
The relevant blocks are those blocks that are part of the functional program, as we could see in the first representation Block 1, Block 2.
The predispatcher is in charge of modifying the value of the parameter that will be checked by the main dispatcher to redirect the execution flow through one branch or another.
The first thing it does is a mov ebx,0x813, the prologue has been identified. Furthermore, by looking a little around the nodes you can see how the flow depends on the value of the EBX register as well:
If looking at the cross references of the instruction following the mov (at 0x405DC5), it is referenced 208 times.
The main dispatcher (0x405DC5) has been located, to which the execution will return again every time the branch is executed.
To simplify, the execution would be something similar to this:
- Main dispatcher checks EBX, redirects the execution to a specific branch based on the value of the register.
- The branch is executed until it reaches the predispatcher.
- The predispatcher sets a new EBX value and jumps to the main dispatcher and back to step 1.
In this specific case, if the execution of the program is followed until it returns to the main dispatcher (0x405DC5), it is possible to find the first predispatcher block.
It can be noticed that in the predispatcher (0x4068B2) a new value is moved to EBX, specifically 0x25B, and then it jumps again to the main dispatcher (0x405DC5) and continues the execution through a new branch.
This Maze loader does not use just one predispatcher; instead, each branch has a predispatcher that, after modifying the EBX value, will jump back to the main dispatcher block. So, in this implementation, the predispatcher blocks are all those that refer to the main dispatcher (the 208 blocks that appear in the XREFs), and the relevant blocks, which are those that matter, are usually those that precede the predispatcher blocks.
But before identifying relevant blocks let’s deal with other obfuscation techniques besides control flow flattening.
Opaque predicates with the same jz and jnz address
By analyzing the code, we find that in addition to the control flow flattening technique, opaque predicates are being applied, which are implemented in different ways:
You will notice how the last instructions with conditional jumps will always jump to the same address 0x405DC5:
.text:00407A1E 0F 85 A1 E3 FF FF jnz loc_405DC5
.text:00407A24 0F 84 9B E3 FF FF jz loc_405DC5
In this case 0x405DC5, which again is the main dispatcher’s address.
The following patterns based on the jnz and jz instruction opcodes can be used to identify these opaque predicates:
You can search for these patterns in IDA by using Search -> Sequence of bytes (Check Find all occurrences). This search mechanism will be used several times throughout this article.
0F 85 ?? ?? ?? ?? 0F 84 ?? ?? ?? ??
0F 84 ?? ?? ?? ?? 0F 85 ?? ?? ?? ??
The first pattern, which searches first for a jnz and then a jz, has multiple occurrences; the second one (jz then jnz) has none:
We have created a script in IDA that uses the aforementioned patterns to patch out the obfuscation:
mazeloader_patcher.py -> resolve_opaque_jnz_jz()
Once the script is executed, the opaque predicates implemented with jz and jnz will be removed.
Now the CFG is beginning to look a little clearer:
Opaque predicates with EBX value
Besides using the jnz + jz opaque predicates, the loader also implements a second variation in which it will attempt to interfere with our analysis by modifying the value of EBX.
There are multiple permutations of opaque predicates modifying EBX such as:
Using mov ebx, value :
00406D25 BB 01 00 00 00 mov ebx, 1 <– Relevant code
00406D2A 34 3B xor al, 3Bh
00406D2C 66 81 C1 AF 0C add cx, 0CAFh
00406D31 66 01 C9 add cx, cx
00406D34 85 DB test ebx, ebx <– Relevant code
00406D36 74 0B jz short loc_406D43 <– Relevant code
Using push value pop ebx:
00406D4D 6A 05 push 5 <– Relevant code
00406D4F 5B pop ebx <– Relevant code
00406D50 66 81 EE 68 1B sub si, 1B68h
00406D55 88 D2 mov dl, dl
00406D57 09 FF or edi, edi
00406D59 85 DB test ebx, ebx <– Relevant code
00406D5B 74 1F jz short loc_406D7C <– Relevant code
In both cases, the value of EBX will be in charge of directing the flow of the program, but this value (push[value]) is fixed so the flow of these blocks will never change (deterministic) – therefore it is using opaque predicates.
In order to remove these opaque predicates it is necessary to find all these blocks, interpret them, and patch them.
The first case (mov ebx, value) appears to be more difficult to identify, but by looking at similar blocks, it can be observed that the value of EBX only varies from 0-9 so we can use multiple patterns to find them.
BB 01 00 00 00
BB 02 00 00 00
BB 03 00 00 00
BB 04 00 00 00
BB 05 00 00 00
BB 06 00 00 00
BB 07 00 00 00
BB 08 00 00 00
BB 09 00 00 00
In the second case, it is simpler. With a small pattern, all the blocks with those characteristics can be found inside the function, since the script will perform some more checks.
6A ?? 5B
Using these patterns we can, again, search and patch using idapython.
Original code with EBX > 0:
In this case the value of EBX is 7, so when the test ebx, ebx operation is performed, EBX will never satisfy the jz condition, so it can be removed.
Original code with EBX == 0:
In this case, the value of EBX is 0, so when the test ebx,ebx operation is performed, the ZF will be set to 1 and therefore the condition of the conditional jump jz is satisfied, so it is possible to modify the block patching it with a non-conditional jmp.
mazeloader_patcher.py -> resolve_opaque_mov_push()
In a similar vein to the opaque predicates that we discussed before, there are loops that increase until a specific EBX value is reached:
Right after the program leaves the loop, however, the value of EBX is modified, making all the operations that modified the EBX register useless: this is dead code.
In theory these loops could be directly removed from the code and would not affect the execution flow.
By searching for the following pattern in IDA, all loops of this type can be found:
81 FB ?? ?? ?? ?? 75
This sample contains 41, and it can be seen how the EBX value is always modified after the loop.
The script will remove the cmp and conditional jmp instructions of the loop block.
mazeloader_patcher.py -> resolve_loops()
There are other dead code blocks, whose code does not change the value of EBX, but it is not easy to find an expression to detect them:
Some of the blocks also present checks of the BeingDebugged flag. In this sample, there are 8 such blocks.
Some of these blocks precede a predispatcher block and are not deadcode because they affect the behavior of the loader. Because of this, these blocks could be considered relevant blocks:
In cases where it finds the BeingDebugged flag active, it will enter an infinite loop.
Since the purpose of these relevant blocks is to detect if the program is being debugged, we can simply patch the condition to make the check have no consequence by changing the jz to jmp.
The pattern used to identify these blocks is the following:
64 ?? 30 00 00 00
64 ?? ?? 30 00 00 00
The conditional jump has been patched to a non-conditional jump.
mazeloader_patcher.py -> resolve_fs30()
As we have seen, it is not so easy to identify relevant blocks, and even more so in this case where, in addition to the control flow flattening technique, other obfuscation techniques have been introduced.
The main objective of this obfuscated function is to detect whether the sample is being analyzed through the BeingDebugged flag. Only in cases where this flag is not activated will the ransomware component (DLL) be executed.
The approach that has been taken in order to understand all the functionality has been to remove the obfuscation layers such as opaque predicates and dead code as well as identifying the anti-debug techniques.
With all the previous patches applied, reanalyzing the code with IDA results in the following flow graph control:
Now the function is correctly patched and it is easy to identify the blocks that are dead code.
By following the resulting CFG, it can be seen that, at the bottom, there are blocks with a larger size and with calls to Windows API functions.
Once these blocks are executed, a callback function will be invoked through CreateTimerQueueTimer when Sleep is called, which will load the Maze DLL into memory and run it.
At this point, the obfuscation techniques used have been understood and the anti-debug checks of the Maze loader have been avoided.
In the next article the obfuscation and anti-debug methods used in the Maze DLL will be explained in detail.
Throughout this article, the basic concepts of control flow flattening have been explained, along with the obfuscation techniques used by this Maze loader sample with the goal that the researcher will be able to identify them and apply the scripts provided in order to facilitate the analysis of Maze ransomware samples.
In the following article we will analyze the techniques used by the Maze DLL in detail.
We want to acknowledge another very good blogpost regarding this topic which was published by Crowdstrike a couple of days ago:
- Maze Ransomware Loader version 1
Links to script repository