An in-depth look at the technology behind CyberCapture

Ondrej Vlcek 17 Nov 2016

Avast CTO and executive vice president Ondrej Vlcek explains CyberCapture's advanced malware detection methods used in every Avast product.

Earlier this summer, we told you about our proprietary CyberCapture technology. CyberCapture is a vital component of the Avast Antivirus Nitro Update, providing users with increased speed and a higher level of protection against zero-second attacks. In this post, I’d like to dive deeper into the engineering behind CyberCapture and explain the components that give the feature its technical integrity.

In essence, CyberCapture is a cloud-based smart file scanner. To provide immediate analysis, CyberCapture automatically establishes a two-way channel of communication with the Avast Threat Labs while securing suspicious files on the user’s PC until the analysis is complete. Once a file has been isolated, our team can clear away the false code, misdirection, obfuscation, and other techniques malware creators use to mask the malware’s true intentions. By doing so, CyberCapture is able to dissect a malicious file, observe the binary-level instructions inside it, and understand the true purpose hidden within it.

What goes on behind the scenes of protection against zero-second attacks?

When examining potentially malicious files, we peel away layers of obfuscated code and then observe the binary-level instructions inside the object. We then analyze the files using a number of techniques, including behavioral analysis and various similarity checks.

We perform similarity checks while running a file in a virtual environment and monitoring various components such as its registry access, file operations, and memory operations. An example of a behavioral pattern that we detect could read “Writes to registry to achieve persistence and saves itself to C:\Windows,” which we would then use to scan the dropped files and create memory blocks from. Previously, these kinds of detections were mainly performed as part of the DeepScreen module in the Avast client. However, this approach hasn’t always led to optimal results, mainly because we couldn’t make the virtualization layer needed for DeepScreen work on all system configurations, and also because the time given to DeepScreened samples to show their true malicious behavior typically wasn’t long enough.
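To make the quoted behavioral pattern concrete, here is a minimal sketch of how such a rule could be checked against a recorded sandbox trace. The `Event` type, the event labels, and the function name are assumptions made for illustration; the post does not describe how the real patterns are encoded. The registry path is the standard Windows autorun key, chosen here as one example of persistence.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    kind: str    # hypothetical labels: "registry_write" or "file_write"
    target: str  # registry key or file path touched by the sample

# Standard Windows autorun key, used here as one example of persistence.
RUN_KEY = r"HKLM\Software\Microsoft\Windows\CurrentVersion\Run"
WINDOWS_DIR = "c:\\windows\\"

def matches_persistence_pattern(events):
    # True only if the trace writes an autorun key AND drops a file under C:\Windows.
    wrote_run_key = any(
        e.kind == "registry_write" and e.target.lower().startswith(RUN_KEY.lower())
        for e in events
    )
    dropped_in_windows = any(
        e.kind == "file_write" and e.target.lower().startswith(WINDOWS_DIR)
        for e in events
    )
    return wrote_run_key and dropped_in_windows
```

A trace that contains both an autorun registry write and a file write under C:\Windows matches; either event alone does not.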

How does CyberCapture identify malicious files?

Several years ago, our team developed a technology that we named MDE. Essentially, MDE is a large in-memory database that works on top of indexed data and allows our team to carry out rapid similarity checks and clustering queries. MDE’s custom distance function helps us calculate the similarity between two files, and was specifically crafted to work well for identifying malicious files. With the newest version of CyberCapture, we continue to rely heavily on the intelligence behind MDE. What’s more, we have recently released the second version of MDE, which contains more features and improves the vector’s accuracy. Because the CyberCapture similarity matching runs right in our cloud, it also has access to the very latest version of the database, which is very important.
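The post does not disclose MDE's actual distance function, so the following is only a sketch of the general idea: a distance between two files computed from extracted features, with some features counting more than others. It uses a weighted Jaccard distance; the feature names and weights are hypothetical.

```python
def weighted_jaccard_distance(a, b, weights):
    """Distance in [0, 1] between two feature sets; 0 means identical.

    `weights` maps a feature to a positive weight; unlisted features weigh 1.0.
    """
    union = a | b
    if not union:
        return 0.0  # two empty feature sets are treated as identical
    total = sum(weights.get(f, 1.0) for f in union)
    shared = sum(weights.get(f, 1.0) for f in a & b)
    return 1.0 - shared / total
```

Giving a high weight to a feature that is rare in clean files pulls samples that share it closer together, which is one way a distance function can be tuned toward malware detection.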

The second technology that helps CyberCapture provide top-notch protection against new malicious files is Evo-Gen. At its core, Evo-Gen is a genetic algorithm developed by the Avast Threat Labs that efficiently locates and reveals short, generic descriptions of large sets of malware samples. Evo-Gen is an especially vital tool when our team is analyzing millions (and sometimes even tens of millions) of files at once. Using the algorithm, we can pick out malicious files that have been randomly scattered across various virus sets. Evo-Gen also benefits from a set of notable improvements that were added to it as part of the work on the CyberCapture project.
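As a toy illustration of the genetic-algorithm idea, the sketch below evolves a short set of features (a "description") that matches as many malware samples as possible while matching no clean ones. Everything here is an assumption made for the example: the representation of samples as feature sets, the fitness formula, and the parameters. It is not how Evo-Gen itself works internally, which the post does not detail.

```python
import random

def fitness(description, malware, clean):
    # Reward malware hits, punish clean hits heavily, prefer shorter descriptions.
    if not description:
        return float("-inf")  # an empty description would match every file
    hits = sum(description <= sample for sample in malware)
    false_hits = sum(description <= sample for sample in clean)
    return hits - 10 * false_hits - 0.01 * len(description)

def evolve(malware, clean, generations=40, population_size=20, seed=0):
    # `malware` must be a non-empty list of feature sets.
    rng = random.Random(seed)
    features = sorted(set().union(*malware))
    score = lambda d: fitness(d, malware, clean)

    # Start from every single-feature description, then pad with random ones.
    population = [frozenset({f}) for f in features]
    while len(population) < population_size:
        size = rng.randint(1, min(3, len(features)))
        population.append(frozenset(rng.sample(features, size)))

    for _ in range(generations):
        population.sort(key=score, reverse=True)
        elite = population[: len(population) // 2]  # elitism: best half survives
        children = []
        while len(elite) + len(children) < len(population):
            a, b = rng.choice(elite), rng.choice(elite)
            # Crossover: keep each parental feature with probability 0.5.
            child = frozenset(f for f in sorted(a | b) if rng.random() < 0.5)
            # Mutation: toggle one randomly chosen feature.
            child = child ^ {rng.choice(features)}
            children.append(child)
        population = elite + children

    return max(population, key=score)
```

Because the best half of each generation is carried over unchanged, the returned description never scores worse than the best one in the starting population.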

Behavioral analysis of the files is performed in virtual containers based on the proprietary NG technology we introduced in Avast 2014. NG has been an important project for Avast, not necessarily because it brought us better detection rates per se, but because it allowed us to focus more on the behavioral study of malware and gradually create a rich database on which today’s CyberCapture could build. The advantage of running CyberCaptured samples in a controlled, “clean-room” environment for extended periods of time is massive and is something that is truly difficult for the bad guys to fight.

Last but not least, signatures – in their generalized form – continue to play a role in how we detect malware. They provide offline protection and greatly reduce the workload of cloud systems (by means of whitelists). Signatures also support our behavioral analysis, memory scanning, clustering and metadata extraction (more on that below). Finally, they’re invaluable in locations with limited broadband Internet access, where programs are mostly distributed via removable media and connections are unreliable.

What to look for in the new MDE version of CyberCapture

In addition to the improvements to our MDE, Evo-Gen, and behavioral analysis technologies, our team has implemented a new system called Simzilla, which takes a different approach to binary similarity with custom similarity vectors. Put simply, Simzilla gives us ways to calculate the distance from a file to nearby files and to determine whether it should be classified as clean or malicious. On top of using similarity checks, we continue to compare and scan outputs of behavioral analysis for malicious behavior and content.
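The idea of classifying a file by its distance to nearby files can be sketched with a plain k-nearest-neighbors vote. This is a generic technique swapped in for illustration; Simzilla's custom similarity vectors and decision logic are not described in this post, and the two-dimensional vectors below are made up.

```python
import math
from collections import Counter

def classify_by_neighbors(vector, labeled, k=3):
    """Label `vector` by majority vote among its k nearest labeled vectors.

    `labeled` is a list of (vector, label) pairs; distance is Euclidean.
    """
    nearest = sorted(labeled, key=lambda item: math.dist(vector, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Usage with made-up similarity vectors:
known = [
    ((0.0, 0.0), "clean"), ((0.1, 0.2), "clean"), ((0.2, 0.1), "clean"),
    ((5.0, 5.0), "malicious"), ((5.1, 4.9), "malicious"), ((4.8, 5.2), "malicious"),
]
verdict = classify_by_neighbors((5.0, 5.1), known)
```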

Finally, the newest version of CyberCapture also comes with a handy new real-time clustering and classification algorithm up its sleeve, which creates more accurate clusters and covers more specific parameters. This algorithm is also a similarity check, but what makes it stand out is its reliance on machine learning. By combining features from multiple systems, the algorithm allows us to gather useful behavioral information from malicious samples and to react to new families and subtypes of samples more quickly. This is especially important in cases where samples have a very short time to live.
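To show what "real-time" clustering means in practice, here is a minimal sketch of incremental clustering, in which each incoming sample is assigned on arrival instead of waiting for a batch run. It is a simple distance-threshold scheme, not the machine-learning algorithm the post refers to, and the class name, `radius` parameter, and vectors are assumptions.

```python
import math

class OnlineClusterer:
    """Assigns each incoming vector to the nearest cluster or starts a new one."""

    def __init__(self, radius):
        self.radius = radius   # maximum distance to join an existing cluster
        self.centroids = []    # running mean of each cluster
        self.counts = []       # number of members in each cluster

    def add(self, vector):
        """Place `vector` in a cluster and return that cluster's index."""
        if self.centroids:
            idx = min(
                range(len(self.centroids)),
                key=lambda i: math.dist(vector, self.centroids[i]),
            )
            if math.dist(vector, self.centroids[idx]) <= self.radius:
                n = self.counts[idx] + 1
                # Update the running mean without storing past members.
                self.centroids[idx] = [
                    c + (x - c) / n for c, x in zip(self.centroids[idx], vector)
                ]
                self.counts[idx] = n
                return idx
        self.centroids.append(list(vector))
        self.counts.append(1)
        return len(self.centroids) - 1
```

A sample that lands far from every existing centroid opens a new cluster immediately, which is the property that matters when a new family appears and its samples are short-lived.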

The continued development of CyberCapture is an important innovation within the cybersecurity field, and we’re proud to provide you with its newest and greatest updates. In the next article, I will tell you more about how the AVG technology will help us make CyberCapture and associated technologies even stronger. 
