Threat Bulletin: Dissecting GuLoader’s Evasion TechniquesJuly 9, 2020 | Featured PostsMalware Analysis
Editor’s Note: This blog post was updated on August 10, 2020.
Over the last couple of months, we observed a new downloader called GuLoader (also known as CloudEyE) that has been actively distributed in 2020. In contrast to prototypical downloaders, GuLoader is known to use popular cloud services such as Google Drive, OneDrive and Dropbox to host its encrypted payloads. So far we have seen that GuLoader is being used to deliver Formbook, NanoCore, LokiBot and Remcos among others. We’ve observed that GuLoader uses a combination of evasion techniques that evade sandboxes and slow down (manual) analysis.
On June 6th, 2020 the developers of GuLoader informed the public that they have shut down their service (Figure 1). Despite the suspension of service, we anticipate other malware families will evolve and adapt some of these techniques in the near future. In this post, we will highlight GuLoader’s techniques with a focus on sandbox evasion and anti-analysis.
Overview and Shellcode
In our analysis, we can see that GuLoader creates another instance (in the following referenced as the second instance) of itself and modifies its execution (Figure 2 and Figure 3).
The second instance then performs further malicious activities, which includes network activity to download the payload and the memory modification of other processes (Figure 4).
Other reports about GuLoader revealed the main functionality is implemented as shellcode, whereby the sample is a 32-bit executable written in VB6 that contains the shellcode in encrypted form.
During execution, the embedded shellcode is decrypted, executed and even injected as seen before (Figure 2).
By loading the shellcode in IDA Pro (we loaded the shellcode at offset 0x001A0000) or a similar disassembler, we can see that the code is heavily obfuscated. The code is split into smaller code parts containing additional junk code (Figure 6) that are connected with control-flow changing instructions such as call, return and (indirect) jump. In contrast to compiler-generated code, the shellcode combines code instructions and data such as strings, which is typical for position-independent code.
This makes the static control-flow analysis more difficult and causes the automatic analysis of IDA Pro to fail.
For example, the addresses of library names are pushed on the stack by using the call instruction (Figure 5). In compiler-generated code, this instruction is used to transfer the control flow to another function, and the return instruction transfers it back to the caller.
GuLoader resolves the required functions during runtime and uses the hash algorithm djb2 to find the desired functions.
Anti-Analysis and Evasion Techniques
Expanding on the techniques mentioned above, the shellcode contains more techniques to obstruct automatic analysis. One of these techniques is the search for virtual machine artifacts, which are embedded as djb2 hash values. In Figure 6, we can see that these hash values are pushed on top of the stack and the successive call to the function tries to find the corresponding artifacts in memory.
Since these values are calculated by a one-way function (djb2), their preimages are unknown. So far, the strings in Table 1 have been found to be possible preimages.
|7F21185B||“HookLibraryx86.dll”||ScyllaHide Plugin for x64dbg|
|A7C53F01||“VBoxTrayToolWndClass”||VirtualBox Guest Additions|
|B314751D||“vmtoolsdControlWndClass”||VMWare, see |
If one of these hashes is found in memory, the sample displays an error message (Figure 7) and terminates the process. Therefore, the sample shows no further malicious behavior, and it does not download the payload.
In addition to the virtual machine artifacts, GuLoader verifies the number of top-level Windows displayed on the current screen to exclude running in a sandbox (Figure 8.).
For each top-level Window, the callback function (Figure 9) increases a counter by one, which leads to the overall number of top-level Windows. This counter is used in the check at 0x1A01A6 that validates if at least 12 top-level Windows are present.
If the number is lower, the process terminates in which case no error message is displayed.
To further prevent the manual analysis with a debugger, GuLoader modifies functions related to debugging (Figure 10).
GuLoader modifies the two functions DbgBreakPoint and DbgUiRemoteBreakin. For the first function, the first byte is replaced by a NOP instruction, and for the second function, the code is replaced by a call to ExitProcess (Figure 11).
After the code modifications of DbgUiRemoteBreakin, the attaching of a debugger to the running process results in its termination.
In addition to the modifications of the two functions mentioned above, GuLoader modifies further functions exported by Ntdll.dll (Figure 12). These functions are well-known candidates for function hooking which allows intercepting function calls by redirecting the control flow. Some Antivirus Software and Sandboxes use function hooking to monitor the behavior of a given program.
Verifying this suspicion in IDA Pro, GuLoader iterates through the code section of Ntdll.dll. While iterating GuLoader tries to undo modifications introduced through function hooking as mentioned in Crowdstrike’s analysis and disables Turbo Thunks, see WoW64 Internals.
To find candidates for modification, GuLoader uses various byte patterns including “B8 00 00 00 00 BA” (Figure 13).
Disabling of Turbo Thunks is reported (Figure 12) and calls to these functions are still monitored because VMRay Analyzer does not rely on hooking.
Furthermore, GuLoader hides threads by calling the function NtSetInformationThreadwith the value HideFromDebugger (0x11) for the parameter ThreadInformationClass(Figure 14).
In addition to the previously mentioned hash values of virtual machine artifacts, GuLoader checks the presence of the Qemu Guest Agent on the filesystem. Both filesystem strings are visible in the shellcode (Figure 15) and in the function log (Figure 14).
Before the second instance is created, or, in case of the second instance, before the payload is downloaded, it delays its execution by using the instructions cpuid and rdtsc frequently in a loop (Figure 16).
The instruction cpuid provides information about the processor and available features and can be used to detect the presence of a hypervisor. In addition, rdtsc provides the number of CPU cycles since the last reset.
If cpuid is executed in a virtual machine, the instruction causes the control flow to be transferred to the hypervisor which resolves the request. Switching from the virtual machine to the hypervisor and back again introduces an overhead that can be used to detect a virtual machine.
In case that a sandbox patches the rdtsc instruction to return a fixed value, the loop in Figure 16 is an infinite loop since the register edx at 0x001A2506 has always the value 0 and the subsequent conditional jump is always taken.
Next, the sample performs the actions related to its stage. In the first stage, it creates a new process of itself, tries to unmap its base image, maps msvbvm60.dll instead, followed by the previously mentioned code injection.
In the second stage, it downloads the payload using WinINet’s functions InternetOpenURLA and InternetReadFile. We inspected the behavior of both stages in the VMRay function log (Figure 17). We highlighted the fuction calls to NtGetContextThread in both figure because calls to some specific functions including CreateProcessInternalW, NTAllocateVirtualMemory, NTWriteVirtualMemory and NTResumeThread are preceded by a call to NtGetContextThreat.
These functions are well-known candidates for breakpoints during manual dynamic analysis, and GuLoader tries to detect the presence of these breakpoints (Figure 18). After a call to NtGetContextThread, the values of the debug registers DR0, DR1, DR3, DR6, DR7 are investigated to detect hardware breakpoints (see the structure CONTEXT). Next, the code of the desired function is checked against interrupts/software breakpoints (0xCC, 0x3CD, 0x0B0F), which are typically set by debuggers, before the function is finally called (offset 0x1A2E66).
After all of these evasion and anti-analysis attempts, the second instance decrypts the received payload, maps it into memory, and transfers execution.
With the help of VMRay Analyzer, we can observe the complete behavior GuLoader, which automates and accelerates the identification of important behavior for further analysis (Figures 19 & 20). This analysis is a good example of how malware evolves and adapts very technical sandbox evasion and anti-analysis techniques. The quick and widespread adoption of GuLoader confirms a growing demand for evasive malware loaders in the criminal underground.