Move Fast and Don’t Break Things (Part 1): Accurate API Monitoring at High Performance

Apr 16th 2020

In designing systems, engineers often must navigate between two extremes. Resources are finite, and compromises must be made between a system that operates slowly but thoroughly and one that is fast but reckless.

But what if a system could be both fast and accurate? Because of VMRay’s entirely hypervisor-based technology, it has the ability to be both. While traditional sandboxing technology needs to choose one or the other, VMRay has the unique ability to monitor all the API calls made by a code sample (unlike API hooking sandboxes), and do it at high performance (unlike emulators).

With this approach, VMRay can monitor API calls of secondary importance, such as string manipulation functions like strlen (which just returns the length of a string). Traditional sandboxes cannot, because they must conserve resources. This ability is useful because the parameters of these calls can reveal the internal workings of malware: exactly the type of information that helps with family classification, configuration extraction, and a deeper understanding of how the malware works.


More accurate behavior monitoring

The majority of sandboxes monitor behavior with a technique called API hooking. This approach saves a lot of time compared to manual analysis and is relatively simple to implement compared to a hypervisor-based sandbox or emulator. However, it only monitors calls to a few carefully selected API functions. Whenever the sample calls one of the hooked functions, the hook redirects execution to the sandbox’s agent inside the VM, which logs the call. This redirection and logging is slow, so monitored functions execute far more slowly than non-monitored ones. Therefore, to achieve acceptable performance, API hooking-based sandboxes must trade away accuracy and intentionally monitor only a small number of API functions.
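
The hooking principle described above can be illustrated with a minimal Python sketch. It is only an analogy, not how Cuckoo or any other real sandbox is implemented: the hook decorator, the call_log list, and the functions create_file and string_length are hypothetical names invented for this illustration. The point is that only functions explicitly selected for hooking leave a trace, and every hooked call pays for an extra logging step.

```python
import functools

# Stands in for the log kept by the sandbox agent (hypothetical).
call_log = []


def hook(func):
    """Wrap func so each call is first redirected through a logging step."""
    @functools.wraps(func)
    def wrapper(*args):
        call_log.append((func.__name__, args))  # extra work on every hooked call
        return func(*args)
    return wrapper


# Hypothetical stand-ins for OS API functions.
def create_file(path):
    return "handle:" + path


def string_length(text):
    return len(text)


# Only a selected function is hooked; string_length is left unmonitored.
create_file = hook(create_file)


def sample():
    """Plays the role of the analyzed code sample."""
    path = "C:\\temp\\a.txt"
    return string_length(path), create_file(path)
```

After sample() runs, call_log contains the create_file call and its argument, while the string_length call is absent even though the sample made it.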

The second downside of API hooking technology is that it’s unable to distinguish between calls made directly by the sample and calls made internally by an OS library. The combination of these two properties leads to two major drawbacks: the monitoring becomes incomplete and noisy.


More complete monitoring

As API hooking does not monitor most of the API, a large part of the behavior is invisible to it. API calls such as string operations are not covered by API hooking. Because VMRay operates in the hypervisor layer, it can monitor these calls and provide a complete list of the calls made by the sample.

This Ursnif sample calls wsprintfA to dynamically generate the user agent. The call is visible in the VMRay function log, just as it would be seen during manual debugging, but it’s missing from sandboxes based on API hooking.


Figure 1 – Calling wsprintfA to create the user agent as seen in the IDA disassembly of the unpacked payload


Figure 2 – The wsprintfA call as it appears in the VMRay function log
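
As a rough analogy for what a complete function log offers, the following Python sketch records every function call made by a piece of code from outside that code, without selecting or modifying individual functions. It uses Python's standard sys.setprofile hook, which is unrelated to how VMRay's hypervisor-based monitoring works. The functions format_string and string_length, the helper run_with_full_log, and the format template are all hypothetical and not taken from the Ursnif sample.

```python
import sys


def string_length(text):
    return len(text)


def format_string(template, *values):
    # Loosely plays the role of a wsprintfA-style formatting call.
    return template % values


def sample():
    """Plays the role of the analyzed code; the template is made up."""
    user_agent = format_string("ExampleAgent/%d.%d (id=%s)", 1, 0, "abc")
    return user_agent, string_length(user_agent)


def run_with_full_log(target):
    """Run target() and record every Python-level call with its arguments."""
    log = []

    def on_event(frame, event, arg):
        if event == "call":  # ignore returns and C-level events
            log.append((frame.f_code.co_name, dict(frame.f_locals)))

    sys.setprofile(on_event)
    try:
        result = target()
    finally:
        sys.setprofile(None)
    return result, log
```

Calling run_with_full_log(sample) yields a log with entries for sample, format_string, and string_length. The format_string entry carries both the template and the values passed in, which is the kind of parameter-level detail that helps with configuration extraction.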


Noise-free monitoring

Because API hooking can only monitor a very limited set of calls, it attempts to select low-level ones to cover more high-level functions. This means that the sandbox misses API functions that the sample calls directly and instead logs the lower-level calls made by the OS implementation of those functions. The result is a noisy API log that makes it difficult to understand what the sample actually did. In other words: API hooking sandboxes replace direct calls with low-level calls when the directly called function is not hooked.


Let’s look at an example where an API hooking sandbox (Cuckoo) misses a direct call (NtQueryInformationProcess), and instead replaces it with low-level calls (NtOpenProcess and NtClose).

The sandbox used in the example is Cuckoo, an open-source, API hooking-based sandbox. We picked it as an example because most commercial sandboxes use it as a base or at least use the same principle for monitoring. Although some Cuckoo forks may have a hook for this specific API call, the same issue comes up with other, non-hooked API functions.

The example sample calls the NtQueryInformationProcess function to query whether it’s running on a 32-bit system. In the behavior analysis produced by Cuckoo, the NtQueryInformationProcess call is missing because that function is not hooked. Instead, the analysis shows two extra function calls that the sample never made itself: NtOpenProcess and NtClose. This noise is generated because the implementation of NtQueryInformationProcess inside Windows internally calls these two monitored functions. From this output, it is impossible to tell which function the sample really called and why.

VMRay, by contrast, logs the API functions that the sample called directly, so the NtQueryInformationProcess call appears in the function log.


Figure 3 – Actual call seen in the IDA disassembly of the manually unpacked malware


Figure 4 – VMRay function log (formatted) shows the call


Figure 5 – Cuckoo output for the same call, showing a crude approximation


Many vendors make broad claims about their analysis and detection capabilities, but a look at the underlying architecture makes it clear that these vendors have been forced to accept painful tradeoffs as they attempt to bridge the divide between high speed and deep visibility into malware behavior.

Because of our unique product architecture, VMRay’s dynamic analysis engine sees every interaction between malware and the target system. Our software logs and analyzes everything from simplistic, easily defeated attacks to advanced threats that “good enough” sandboxes aren’t good enough to catch. This deep insight provides precise, actionable results that guide security measures across the enterprise.