In the previous article, we covered the obfuscation techniques used by one of the loaders of the Maze ransomware. We recommend reading it before starting with the Maze DLL.
In this article we will analyze in detail the obfuscation techniques used by the Maze DLL. Additionally, a series of scripts will be provided to deobfuscate and better follow the execution flow.
Maze ransomware is usually distributed as a DLL, which is loaded into memory by a loader that contains the encrypted DLL.
Therefore, there are two components:
- Loader/packer that contains the Maze encrypted DLL and performs a series of checks before launching the ransomware component, which we discussed in our last article.
- Maze (DLL), which is highly obfuscated and performs a series of checks before running that will be explained throughout this article.
You can find the samples used for this article in the IOC section.
IMPORTANT
- Some samples of Maze (in the DLL itself) have been found making use of the same control flow flattening technique that we found in the loader/packer explained in the last article.
Maze DLL
Obfuscation techniques
The Maze DLL presents the following obfuscation techniques:
- Hidden API calls with opaque predicates
- Hidden return with push + JMPs eax (WinApi)
- Hidden memcpy calls
- Fake calls
- Opaque predicates with intermediate jumps and junk code
Below, we will detail how these techniques work and how they can be patched.
Hidden API calls with opaque predicates
The real API calls in the Maze DLL are hidden behind opaque predicates and work as follows:
There are two conditional jumps, jz and jnz, pointing to the same address, 0x100371A2. Before the jump to the address where jmp <WindowsAPIAddr> will be executed, a push operation pushes the return address, in this case 0x10001571.
This address will be the one the program will return to once the Windows API function is executed.
It is important to note that, in this case, the opaque predicate points to a part of the Maze code that contains several jmp instructions, each pointing to a function of the Windows API.
This hides the Windows API calls behind an intermediate step of jmp instructions, making it more difficult to follow the program execution flow.
OpaquePredicate jz/jnz -> jmp WinAPIFuncAddr -> Execute WinAPIFunc -> return to pushed address
In this case, once the return address has been pushed (push offset loc_10001571), execution jumps to jmp lstrlenA.
68 71 15 00 10       push offset loc_10001571    <- Push return Addr loc_10001571
0F 84 57 5C 03 00    jz loc_100371A2             <- if jz jmp to JMP FuncWinAPI
0F 85 51 5C 03 00    jnz loc_100371A2            <- if jnz jmp to JMP FuncWinAPI

loc_100371A2:
FF 25 9C 90 03 10    jmp ds:lstrlenA             <- Jump here and lstrlenA

loc_10001571:
40                   inc eax                     <- Then return here (pushed value)
89 E3                mov ebx, esp
50                   push eax
55                   push ebp
53                   push ebx
68 94 15 00 10       push offset dword_10001594
0F 84 CE 5E 03 00    jz sub_10037450
0F 85 C8 5E 03 00    jnz sub_10037450
AE                   scasb
Now that it is understood how it works, let’s eliminate the opaque predicates.
To remove the conditional jumps, all opaque predicates of this type must be detected.
For this purpose the following expression will be used:
68 ?? ?? ?? ?? 0F 84 ?? ?? ?? ?? 0F 85 ?? ?? ?? ??
This expression, along with some script checks, will find the opaque predicates, but we need to differentiate those that jump to a jmp to the Windows API from those that do not:
- If the opaque predicate jumps to an address that does not point to a jmp to a Windows API function, it will be changed to an unconditional jmp.
- If the conditional jumps point to a jmp to a Windows API function, they will be patched by replacing the opaque predicate with the call push ret formula.
Thus it is necessary to check if the address being pointed to starts with:
FF 25
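The detection step can be sketched in plain Python, outside of IDA. This is a minimal sketch and not the code in maze_patcher.py: the function name find_push_jz_jnz, the flat image buffer and the base argument are assumptions made for illustration. It matches the expression above, resolves both relative jumps, keeps only the hits where jz and jnz land on the same address, and flags those whose target starts with FF 25.

```python
import re
import struct

# 68 ?? ?? ?? ?? 0F 84 ?? ?? ?? ?? 0F 85 ?? ?? ?? ??
PUSH_JZ_JNZ = re.compile(rb"\x68(.{4})\x0f\x84(.{4})\x0f\x85(.{4})", re.DOTALL)


def find_push_jz_jnz(image, base):
    """Return (address, return_address, target, jumps_to_api) for each hit.

    `image` is a flat copy of the mapped code and `base` the address of its
    first byte (both are hypothetical inputs for this sketch).
    """
    hits = []
    for match in PUSH_JZ_JNZ.finditer(image):
        addr = base + match.start()
        ret_addr = struct.unpack("<I", match.group(1))[0]
        jz_rel = struct.unpack("<i", match.group(2))[0]
        jnz_rel = struct.unpack("<i", match.group(3))[0]
        # A near conditional jump is relative to the end of its 6 bytes.
        jz_target = (addr + 5 + 6 + jz_rel) & 0xFFFFFFFF
        jnz_target = (addr + 11 + 6 + jnz_rel) & 0xFFFFFFFF
        if jz_target != jnz_target:
            continue  # not an opaque predicate of this type
        off = jz_target - base
        jumps_to_api = off >= 0 and image[off:off + 2] == b"\xff\x25"
        hits.append((addr, ret_addr, jz_target, jumps_to_api))
    return hits


# Rebuild the lstrlenA example shown earlier inside an otherwise empty buffer.
BASE = 0x10001000
image = bytearray(0x37000)
image[0x540:0x551] = bytes.fromhex(
    "68 71 15 00 10 0F 84 57 5C 03 00 0F 85 51 5C 03 00"
)
image[0x361A2:0x361A8] = bytes.fromhex("FF 25 9C 90 03 10")

hits = find_push_jz_jnz(bytes(image), BASE)
for addr, ret_addr, target, jumps_to_api in hits:
    print(hex(addr), hex(ret_addr), hex(target), jumps_to_api)
```

Hits with jumps_to_api set are candidates for the call push ret patch; the remaining ones are candidates for an unconditional jmp. A byte-level match can also land in data or in the middle of another instruction, so each hit still needs the additional script checks mentioned above.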
For example, this is a case where there are calls to the Windows API:
Because loc_100371A2 points to jmp lstrlenA:
Once patched, it presents the Windows API call correctly and the return has been fixed:
Now the program flow is much easier to follow.
During the patch, an extra byte was used for retn because the code that precedes this type of call is junk code. If this caused a problem, it could always be fixed, as there are 4 bytes of preceding NOPs that can be used for patching.
maze_patcher.py -> obfuscated_jz_jnz()
Hidden return with push + JMPs eax (WinApi)
Similar to the previous case, and to further complicate the tracking of the program flow, push and jmp instructions are combined to simulate the behavior of a call. The code jumps to the address held in a register and returns to the address pushed onto the stack.
push addr_1 -> jmp WindowsApiFunc -> Execute WindowsApiFunc -> return to addr_1
Example:
push 0x100028C9 -> jmp lstrcatW -> Execute lstrcatW -> return to 0x100028C9
Usually it is not possible to determine that EAX contains the address of a Windows API function without first observing it during dynamic analysis, but in the analyzed samples the behavior is always the same, so it can be patched as explained below.
Note that in certain cases a fake API call can be found after jmp eax. In this example, the call to LsaAddAccountRights is junk code, since that code fragment will never be executed.
In order to patch it up, we first have to identify all the calls of this type. We can use the following expression:
68 ?? ?? ?? ?? FF E0
Which is based on the following code fragment:
.text:100028B8 68 C9 28 00 10       push offset loc_100028C9
.text:100028BD FF E0                jmp eax
.text:100028BF FF 15 08 90 03 10    call ds:LsaAddAccountRights
After checking that the block is correct, the jmp is replaced by a call instruction that calls the Windows API function held in EAX. The push is then placed after the call, followed by a ret, so that the code jumps to the memory address passed in the push instruction:
.text:100028B8 FF D0             call eax ; lstrcatW
.text:100028BA 68 C9 28 00 10    push 100028C9h
.text:100028BF C3                retn
.text:100028C0 90                nop
.text:100028C1 90                nop
.text:100028C2 90                nop
.text:100028C3 90                nop
.text:100028C4 90                nop
If a fake call instruction is found after jmp eax, it can be patched with nop instructions.
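The rewrite shown in the two listings can be sketched as a byte-level patch. This is a minimal sketch, not the patch_jmp_eax() routine of maze_patcher.py, and the function name patch_push_jmp_eax is hypothetical. Because call eax + push + retn takes 8 bytes while push + jmp eax takes 7, the sketch assumes it is only safe to patch when a 6-byte fake call (FF 15 imm32) follows, as in the example above, and it leaves every other occurrence untouched.

```python
import re

# 68 ?? ?? ?? ?? FF E0, followed here by a 6-byte fake call (FF 15 imm32).
PUSH_JMP_EAX_FAKE_CALL = re.compile(rb"\x68(.{4})\xff\xe0\xff\x15.{4}", re.DOTALL)


def patch_push_jmp_eax(code):
    """Rewrite matching blocks of a bytearray in place; return the patch count."""
    count = 0
    for match in PUSH_JMP_EAX_FAKE_CALL.finditer(bytes(code)):
        ret_addr = match.group(1)
        patched = (
            b"\xff\xd0"            # call eax
            + b"\x68" + ret_addr   # push <return address>
            + b"\xc3"              # retn
        )
        size = match.end() - match.start()
        # Fill what is left of the block (the fake call) with nop.
        patched += b"\x90" * (size - len(patched))
        code[match.start():match.end()] = patched
        count += 1
    return count


# Bytes of the original fragment at 0x100028B8.
block = bytearray.fromhex("68 C9 28 00 10 FF E0 FF 15 08 90 03 10")
patched_count = patch_push_jmp_eax(block)
print(patched_count, block.hex(" "))
```

The block keeps its original size of 13 bytes, so no other address in the binary has to be adjusted.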
The final code is much cleaner:
Moreover, during the analysis, the execution will be easier to follow.
maze_patcher.py -> patch_jmp_eax()
Hidden memcpy
There is a particular function that is called repeatedly throughout the execution of Maze and that may complicate the analysis if its purpose is not understood.
The function can be identified by its prologue:
.text:10037450 57             push edi
.text:10037451 56             push esi
.text:10037452 8B 74 24 10    mov esi, [esp+8+arg_4]
.text:10037456 8B 4C 24 14    mov ecx, [esp+8+arg_8]
.text:1003745A 8B 7C 24 0C    mov edi, [esp+8+arg_0]
.text:1003745E 8B C1          mov eax, ecx
.text:10037460 8B D1          mov edx, ecx
.text:10037462 03 C6          add eax, esi
.text:10037464 3B FE          cmp edi, esi
.text:10037466 76 08          jbe short loc_10037470
Because of this unique prologue, the function can be quickly identified with the following bytes:
57 56 8B 74 24 10 8B 4C 24 14 8B 7C 24 0C 8B C1 8B D1 03 C6 3B FE 76 08
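Since the prologue has no wildcard bytes, a plain byte search is enough to locate it. The following is a minimal sketch under assumed names (find_memcpy, image, base); the renaming itself is done inside IDA by the script referenced at the end of this section and is not reproduced here.

```python
MEMCPY_PROLOGUE = bytes.fromhex(
    "57 56 8B 74 24 10 8B 4C 24 14 8B 7C 24 0C "
    "8B C1 8B D1 03 C6 3B FE 76 08"
)


def find_memcpy(image, base):
    """Return the address of the first prologue match, or None if absent."""
    offset = image.find(MEMCPY_PROLOGUE)
    return None if offset < 0 else base + offset


# Hypothetical buffer: 0x50 filler bytes followed by the prologue.
BASE = 0x10037400
image = b"\xcc" * 0x50 + MEMCPY_PROLOGUE
address = find_memcpy(image, BASE)
print(hex(address))
```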
The values of the ESI and EDI registers are saved on the stack at the beginning and restored when the function exits.
The purpose of the function is to copy the string/data passed as an argument, as follows:
memcpy( DestinationString = edi, SourceString = esi, NumberOfBytesToWrite = ecx )
The memcpy operations are performed in several ways, one of which uses the rep movsd instruction.
Most branches restore the value passed by parameter in the EAX register, along with the values that the ESI and EDI registers initially had. It is possible, however, to find other blocks that suggest a different behavior.
But if we analyze the function in depth, we’ll see that most blocks perform similar operations, only that they are carried out through other instructions.
Through the script that has been created, the function is identified and patched so that it can be skipped, facilitating its analysis.
Original code:
Patched code:
maze_patcher.py -> find_and_rename_memcpy_function
Fake calls
Fake calls are API calls that will never happen. Their sole purpose is to make the analyst think that the binary has a functionality that it does not have; they are junk code.
To remove these fake calls, the opaque predicates that precede them will have to be removed. Once removed, the fake function calls (if any) should be removed as well.
.text:1000166B 74 9C                jz short loc_10001609       <- Opaque Predicate
.text:1000166D 75 0A                jnz short loc_10001679      <- Opaque Predicate
.text:1000166F FF 15 80 90 03 10    call ds:LocalAllocx         <- Fake Call

.text:10021CB9 0F 84 9D 00 00 00    jz loc_10021D5C             <- Opaque Predicate
.text:10021CBF 75 0A                jnz short loc_10021CCB      <- Opaque Predicate
.text:10021CC1 FF 15 04 92 03 10    call ds:LsaConnectUntrusted <- Fake Call

.text:10021C11 0F 84 09 F9 FD FF    jz loc_10001520             <- Opaque Predicate
.text:10021C17 0F 85 03 F9 FD FF    jnz loc_10001520            <- Opaque Predicate
.text:10021C1D FF 15 00 91 03 10    call ds:CreateFileW         <- Fake Call
We can locate these kinds of calls using the following expressions:
74 ?? 75 ?? FF 15
75 ?? 74 ?? FF 15
0F 84 ?? ?? ?? ?? 75 ?? FF 15
0F 85 ?? ?? ?? ?? 74 ?? FF 15
0F 84 ?? ?? ?? ?? 0F 85 ?? ?? ?? ?? FF 15
0F 85 ?? ?? ?? ?? 0F 84 ?? ?? ?? ?? FF 15
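These wildcard expressions can be turned into byte-level regular expressions with a small helper. The sketch below is illustrative only: compile_pattern and find_fake_calls are hypothetical names, and the search runs over a raw byte buffer rather than through IDA's own search functions.

```python
import re

FAKE_CALL_PATTERNS = [
    "74 ?? 75 ?? FF 15",
    "75 ?? 74 ?? FF 15",
    "0F 84 ?? ?? ?? ?? 75 ?? FF 15",
    "0F 85 ?? ?? ?? ?? 74 ?? FF 15",
    "0F 84 ?? ?? ?? ?? 0F 85 ?? ?? ?? ?? FF 15",
    "0F 85 ?? ?? ?? ?? 0F 84 ?? ?? ?? ?? FF 15",
]


def compile_pattern(pattern):
    """Turn 'AA ?? BB' into a compiled bytes regex ('??' matches any byte)."""
    parts = []
    for token in pattern.split():
        parts.append(b"." if token == "??" else re.escape(bytes.fromhex(token)))
    return re.compile(b"".join(parts), re.DOTALL)


def find_fake_calls(code):
    """Return sorted (offset, pattern) pairs for every match in `code`."""
    hits = []
    for pattern in FAKE_CALL_PATTERNS:
        for match in compile_pattern(pattern).finditer(code):
            hits.append((match.start(), pattern))
    return sorted(hits)


# Bytes of the LsaConnectUntrusted example at 0x10021CB9.
sample = bytes.fromhex("0F 84 9D 00 00 00 75 0A FF 15 04 92 03 10")
fake_calls = find_fake_calls(sample)
print(fake_calls)
```

The short patterns in particular can match in the middle of unrelated instructions, so every hit should be confirmed as a real jz/jnz pair before it is patched.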
And then we simply have to patch them:
Original 74 ?? 75 ?? FF 15:
Patched:
Original 0F 84 ?? ?? ?? ?? 75 ?? FF 15:
Patched:
Original 0F 84 ?? ?? ?? ?? 0F 85 ?? ?? ?? ?? FF 15:
Patched:
Script:
maze_patcher.py -> delete_fake_calls_before_jz_jnz()
Opaque predicates with intermediate jumps and junk code
In the code you can find opaque predicates where each instruction jumps to a different address, but finally they end up in the same place, as you can see in the following example:
It can be identified by the following expressions:
68 ?? ?? ?? ?? 0F 84 ?? ?? ?? ?? 75 ??
68 ?? ?? ?? ?? 0F 85 ?? ?? ?? ?? 74 ??
Through these expressions you will find the blocks that have opaque predicates, but it is necessary to check if they also present this behavior with intermediate jumps.
It is assumed, after analysis, that the following conditions are met:
- The first jz or jnz instruction is the one that contains the final address.
- The second jz or jnz instruction contains the intermediate address. This intermediate address contains another conditional jmp instruction that points to the same final address as the first conditional jmp instruction.
.text:1002AFE3 68 1D B0 02 10       push offset loc_1002B01D
.text:1002AFE8 0F 84 12 6E 00 00    jz loc_10031E00            <- Final Addr
.text:1002AFEE 75 04                jnz short loc_1002AFF4     <- Intermediate Addr
.text:1002AFF0 E2 1B                loop loc_1002B00D
.text:1002AFF2 00                   db 0
.text:1002AFF3 00                   db 0
.text:1002AFF4
.text:1002AFF4 loc_1002AFF4:
.text:1002AFF4 0F 85 06 6E 00 00    jnz loc_10031E00           <- Final Addr
.text:1002AFFA 74 04                jz short loc_1002B000      <- Junk Code
.text:1002AFFC 13 1A                adc ebx, [edx]
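The two conditions can be checked mechanically by resolving the jump targets. The following is a minimal sketch with hypothetical names (resolve_final_target, image, base), not the obfuscated_jz_jnz_2() implementation: it returns the final address only when the near jump and the jump found at the intermediate address agree.

```python
import struct

# Near conditional jump -> short conditional jump expected right after it.
PAIRS = {b"\x0f\x84": b"\x75", b"\x0f\x85": b"\x74"}


def resolve_final_target(image, base, addr):
    """Return the final address if the block at `addr` meets both conditions.

    `addr` is the address of the leading push (68 ?? ?? ?? ??).
    """
    off = addr - base
    if off < 0 or image[off:off + 1] != b"\x68":
        return None
    first = image[off + 5:off + 11]     # near jz/jnz holding the final address
    second = image[off + 11:off + 13]   # short jnz/jz to the intermediate address
    if len(second) < 2 or PAIRS.get(first[:2]) != second[:1]:
        return None
    final = addr + 11 + struct.unpack("<i", first[2:])[0]
    intermediate = addr + 13 + struct.unpack("<b", second[1:])[0]
    ioff = intermediate - base
    third = image[ioff:ioff + 6] if ioff >= 0 else b""
    if len(third) < 6 or third[:2] not in PAIRS:
        return None
    third_target = intermediate + 6 + struct.unpack("<i", third[2:])[0]
    return final if third_target == final else None


# Bytes of the example block starting at 0x1002AFE3.
BASE = 0x1002AFE3
image = bytes.fromhex(
    "68 1D B0 02 10 0F 84 12 6E 00 00 75 04 E2 1B 00 00 "
    "0F 85 06 6E 00 00 74 04 13 1A"
)
final = resolve_final_target(image, BASE, 0x1002AFE3)
print(hex(final))
```

When the function returns an address, the whole construct can be treated as a single unconditional jump to it; when it returns None, the block does not follow the intermediate-jump layout and should be left alone.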
Once the code is patched, it looks much better and easier to follow:
maze_patcher.py -> obfuscated_jz_jnz_2()
Anti-debug techniques
As if the obfuscation did not already make the sample hard enough to analyze, it also implements the following anti-debug techniques:
- Checking BeingDebugged flag
- DbgUiRemoteBreakin patch
- Windows Language Code Identifier (LCID) check
- Process blacklist
Checking BeingDebugged flag
Some of the blocks feature checks of the BeingDebugged flag. At this point there are several solutions:
- Use a plugin that hides the debugger so that even if the PEB is checked it does not have the BeingDebugged flag set to 1.
- Modify the BeingDebugged flag the first time it is checked during the dynamic analysis.
- Patch all blocks that perform the BeingDebugged check and force them to take the appropriate path regardless of the BeingDebugged value.
DbgUiRemoteBreakin patch
Another anti-debug technique used by Maze is patching the DbgUiRemoteBreakin function to prevent the debugger from following the execution, which involves using VirtualProtect.
To counter this technique, set a breakpoint on the kernel32->VirtualProtect function:
During execution, Maze changes the permissions of the first byte of the function to 0x40 (PAGE_EXECUTE_READWRITE) in order to patch it.
Accessing the address where the DbgUiRemoteBreakin function is located, we can see that it is still unmodified:
Following the execution, we will see how Maze patches the first byte with the C3 opcode (ret).
Then Maze runs VirtualProtect again to reset the permissions of the first byte to 0x20 (PAGE_EXECUTE_READ). At this point the byte can be patched back to its original value, replacing C3 with 6A and restoring the function to its original form. You can use a conditional breakpoint for this.
Windows Language Code Identifier (LCID) check
Maze calls the GetUserDefaultUILanguage function to obtain the language identifier and compares it against a hardcoded list. If it matches any of the values that belong to countries of the Commonwealth of Independent States (CIS) or Ukraine, it will not encrypt any files.
Some examples:
Russia
Ukraine
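The logic of this check can be sketched as a simple lookup. This is only an illustration: should_skip_encryption is a hypothetical name, and the table holds just the two standard Microsoft language identifiers for the examples above, not the hardcoded list found in the sample.

```python
# Illustrative subset of Microsoft language identifiers; the real hardcoded
# list inside Maze is not reproduced here.
EXCLUDED_LANGIDS = {
    0x0419: "Russian (Russia)",
    0x0422: "Ukrainian (Ukraine)",
}


def should_skip_encryption(langid):
    """Mimic the check: True when the UI language is on the exclusion list."""
    return (langid & 0xFFFF) in EXCLUDED_LANGIDS


print(should_skip_encryption(0x0419))  # Russian (Russia)
print(should_skip_encryption(0x0409))  # English (United States)
```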
Process blacklist
Using the kernel32 functions CreateToolhelp32Snapshot and Process32NextW, Maze lists the running processes.
For each process name listed, it performs the following actions:
- It converts the process name to lowercase.
- It converts the original process name into one generated according to some constants. For example, smss.exe is converted to QOQQ/GZG.
- It calculates the checksum of the intermediate name with the Adler-32 algorithm.
- Finally, it compares the generated checksum against the hardcoded values, which correspond to the blacklisted processes.
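The last two steps can be sketched with a reference Adler-32 implementation. This sketch makes several assumptions: the name transformation and its constants are not reproduced, the intermediate name is hashed as plain ASCII bytes, Maze is assumed to use the standard Adler-32 parameters, and the blacklist is a placeholder built from the example above rather than the values hardcoded in the sample. The names adler32 and is_blacklisted are hypothetical.

```python
import zlib

MOD_ADLER = 65521  # largest prime below 2**16


def adler32(data):
    """Reference Adler-32: two running sums packed into 32 bits."""
    a, b = 1, 0
    for byte in data:
        a = (a + byte) % MOD_ADLER
        b = (b + a) % MOD_ADLER
    return (b << 16) | a


def is_blacklisted(intermediate_name, blacklist):
    """Compare the checksum of an already transformed name with known values."""
    return adler32(intermediate_name) in blacklist


checksum = adler32(b"QOQQ/GZG")
# Cross-check against the standard library implementation.
assert checksum == zlib.adler32(b"QOQQ/GZG")

# Hypothetical blacklist built from the example, not taken from the sample.
blacklist = {checksum}
print(hex(checksum), is_blacklisted(b"QOQQ/GZG", blacklist))
```

Working with checksums instead of strings means the process names never appear in the binary, which is why the blacklist has to be recovered by hashing candidate names and comparing the results.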
ida.exe is among the process names it checks.
To facilitate the analysis, and to avoid having to modify the name during the process enumeration, it is recommended to rename ida.exe to a random name.
New Versions
Some recent samples of Maze incorporate the technique analyzed in the last article about the Maze loader, Control Flow Flattening, along with different obfuscation techniques directly in the DLL. In addition, those graphs appear contiguously and it is possible to compare the blocks previously analyzed and observe that it is the same technique.
Sample:
dee863ffa251717b8e56a96e2f9f0b41b09897d3c7cb2e8159fcb0ac0783611b
The upper part of the image below is the first graph showing the Maze loader obfuscated with control flow flattening.
In the lower part, there is another graph, corresponding to the part of the DLL that was analyzed, which shows that it has been obfuscated with control flow flattening just like the Maze loader. Although it looks smaller than the graph above, this is simply because IDA is not able to represent it correctly due to the obfuscation and cuts the graph at that point.
Conclusion
Throughout this article, the most relevant techniques used by Maze DLL have been explained, with the goal that the researcher will be able to identify them and apply the scripts provided in order to facilitate the analysis of Maze ransomware samples.
Creating a script that supports changes in the obfuscation methods used by Maze as a whole is complex. Instead of attacking the whole problem at the same time, we’ve tried to isolate each problem and solve each type of obfuscation one by one, supporting different versions, as well as allowing the researcher to understand the techniques used. This way, if there are possible changes in future versions, the researcher will be able to find an appropriate solution to the new problems.
IOCs
- Maze Ransomware DLL version 1
1e3c7bce7eac2516c68e5586f1c22ba06e9e4bad649c5e8117393208f2eaa7bf
- Maze Ransomware DLL to EXE version 1
e35ffe111c62d9b05048518659b2b462d8124691bf63b8c34513ec4433d21d80
- Maze Ransomware DLL version 2.1.1
dee863ffa251717b8e56a96e2f9f0b41b09897d3c7cb2e8159fcb0ac0783611b
- Maze Ransomware DLL version 2.3
719d18f210c459f53071906675725175a65ed43c11d4008607b9ee13dcfeef2b
Links to script repository
https://github.com/Blueliv/maze-deobfuscation/blob/master/maze_patcher.py
This blog post was authored by the Blueliv Labs team.