
Old tricks still work

Targeted malware uses many well-known anti-VM and anti-sandbox tricks. Most up-to-date sandboxes already counter them, and the rest can usually be defeated by modifying the VM in which the sample runs. In this article we examine a particular technique that exploits a design flaw in Cuckoo Monitor.

While analyzing a malware sample (AgentTesla, in this case), we found a Visual Basic packer that successfully hid a malicious process when executed in a Cuckoo Sandbox using the default monitor, which runs in user mode (ring 3).

Looking at the sample's behavior, the trace appears to stop once it reaches the Windows API EnumWindows. The Windows API reference explains well what this function does:

Enumerates all top-level windows on the screen by passing the handle to each window, in turn, to an application-defined callback function. EnumWindows continues until the last top-level window is enumerated or the callback function returns FALSE.

The function, found in the user32 module, relies on an unusual mechanism: it is one of the few functions during which the kernel calls into user mode rather than the other way around. You can find more information about this behavior in this blog post by Ken Johnson (Skywing).

Cuckoo loses traceability of the analysis during the callback execution partly due to this behavior. During this unmonitored time, the packer can run anti-VM, anti-sandbox, or any kind of checks “hidden” from the sandbox.

One of the mechanisms used by the packer is to execute a very long sleep inside one of the callbacks in order to force the sandbox to finish the analysis before unpacking the actual payload.

To illustrate this, we created a small PoC:

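#include <windows.h>
#include <stdio.h>

/* Pointer types matching the documented prototypes:
   BOOL EnumWindows(WNDENUMPROC, LPARAM) and VOID Sleep(DWORD). */
typedef BOOL (WINAPI *LPFNDLLFUNC1)(WNDENUMPROC, LPARAM);
typedef VOID (WINAPI *LPFNDLLFUNC2)(DWORD);

/* Called by EnumWindows once for every top-level window. */
BOOL CALLBACK EnumWindowsProc(HWND hWnd, LPARAM lParam) {
    HINSTANCE hDLL;
    LPFNDLLFUNC2 p;

    hDLL = LoadLibraryA("kernel32.dll");
    p = (LPFNDLLFUNC2) GetProcAddress(hDLL, "Sleep");
    /* %x matches the 32-bit traces in this article; use %p on 64-bit builds. */
    printf("%x\n", (unsigned int)(UINT_PTR) p);
    p(3000);      /* stall for 3 seconds per window */
    return TRUE;  /* keep enumerating */
}

int main() {
    HINSTANCE hDLL;
    LPFNDLLFUNC1 p;

    hDLL = LoadLibraryA("user32.dll");
    p = (LPFNDLLFUNC1) GetProcAddress(hDLL, "EnumWindows");
    printf("Resolved EnumWindows at addr: %x\n", (unsigned int)(UINT_PTR) p);
    p(EnumWindowsProc, 0);
    printf("EnumWindows finished executing\n");
    return 0;
}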

In the code above, the main function loads the user32 module, resolves the address of EnumWindows dynamically, prints that address, and calls EnumWindows with EnumWindowsProc as the callback.

Inside EnumWindowsProc, the code sleeps for 3000 milliseconds. Simple stuff. If we examine the behavior of the sample when executing it with the User Monitor (the ring 3 monitor), we can see how the sample loads the DLL, resolves the address of EnumWindows, prints it, and then calls EnumWindows.

After the call, nothing is logged until 333 seconds later. During that gap, EnumWindowsProc ran once for every top-level window in the VM, sleeping 3 seconds each time, but none of that execution appears in the trace:

(time) call(arguments) return_code
(000.00)  LdrLoadDll(module_name[user32.dll],basename[user32],flags[0],module_address[0x760e0000]): 0x0
(000.00)  LdrGetProcedureAddress(ordinal[0],function_address[0x760f375b],module_address[0x760e0000],module[USER32],function_name[EnumWindows]): 0x0
(000.01)  WriteConsoleA(buffer[Resolved EnumWindows at addr: 760f375b  ],console_handle[0x7]): 0x1
(000.01)  EnumWindows(): 0x1 
(333.49)  WriteConsoleA(buffer[EnumWindows finished executing  ],console_handle[0x7]): 0x1  

When, instead, the analysis is launched with a Kernel Monitor (working from ring 0), the sandbox is capable of following the execution inside the callback function, tracing all syscalls used inside:

parentfunction+offset_loggedfunction(arguments): returnvalue
WriteConsoleA+0x18_NtRequestWaitReplyPort(24,2809540,2809540): 0x0
EnumWindows+0x16_NtDelayExecution(0,2816460): 0x0
WriteConsoleA+0x18_NtRequestWaitReplyPort(24,2809540,2809540): 0x0
EnumWindows+0x16_NtDelayExecution(0,2816460): 0x0
WriteConsoleA+0x18_NtRequestWaitReplyPort(24,2809540,2809540): 0x0
EnumWindows+0x16_NtDelayExecution(0,2816460): 0x0
WriteConsoleA+0x18_NtRequestWaitReplyPort(24,2809540,2809540): 0x0
EnumWindows+0x16_NtDelayExecution(0,2816460): 0x0
WriteConsoleA+0x18_NtRequestWaitReplyPort(24,2809540,2809540): 0x0
EnumWindows+0x16_NtDelayExecution(0,2816460): 0x0 

But why does the User Monitor behave this way? Even though the call goes from kernel space to user space, the callback uses the same libraries as the main function, and those are hooked by the monitor, so its calls should still be visible. The following check in the monitor's hook template is responsible for this behavior:

    log_debug("Entered %s\n", "{{ hook.apiname }}");
 
    {%- if not hook.signature.special: %}
 
>   if(hook_in_monitor() != 0) {
        log_debug("Early leave of %s\n", "{{ hook.apiname }}");
 
        {{ call_old(hook, replace_args=False, lasterr=False)|indent }}
 
        {%- if hook.signature.return_value != 'void' %}
        return ret;
        {%- else %}
        return;
        {%- endif %}
    }

To produce behavior reports that actually make sense, the hooking engine of Cuckoo was designed so that a hook triggered while another hook is already executing is ignored. This prevents regular API calls that make use of other APIs from generating very large reports (which might crash the analysis) and from introducing a lot of noise into the behavior. EnumWindows is itself hooked, so its callback runs while that hook is still active, and every API call made inside the callback is treated as nested and skipped.

Working from user space has many advantages, but because the monitor always runs in the same context as the malware being analyzed, the malware will always have some degree of control over it. The best way to fix this is to switch to a hypervisor or kernel monitor, which avoids the issue completely.

For more details about how we reverse engineer and analyze malware, visit our targeted malware module page here.

This blog post was authored by Blueliv Labs team.
