In 2008, Eva Chen, Trend Micro’s CEO, declared: “In the antivirus business, we have been lying to customers for 20 years.” Six years later, Brian Dye, Symantec’s senior vice president for information security, stated that “[antivirus] is dead”. Despite these declarations, antivirus remains a billion-dollar industry with millions of users, whether they are individuals, companies or governments. So why is that?
This series aims to help us understand this paradox by reviewing the forty-year-old cat and mouse game that has been going on between antiviruses and malware programs.
Once upon a time in ARPANET
The Creeper is generally considered to be the first computer worm. It was let loose in 1971 on the ARPANET, displaying the following message: “I’m the creeper: catch me if you can” before propagating itself. Even though it was harmless, it paved the way for a new kind of program: computer viruses.
More than a decade later, the first antiviruses were developed. An antivirus (AV) is a program designed to protect a computer against malware. It has two main missions: scanning a system for malware programs to purge, and repairing the elements damaged or modified by malware. This blogpost will mainly focus on the first part: search & destroy.
For a long time after their birth in the late ’80s, AVs relied mainly on static analysis techniques. Static analysis means that the file tested by the AV is not executed; the verdict relies instead on the content of the file.
The most common of these techniques is the signature-based approach. The principle is simple: when an AV editor discovers a new malware family, it creates a signature from one or more samples of this malware. A signature is designed to identify characteristic patterns found in the code of several versions of a malware program. This is a delicate task: a signature that is too broad would raise many false positives and could even damage the system on which the AV is running, whereas a signature that is too narrow would simply miss some samples.
After its creation, the signature is uploaded to the AV database. From then on, when the AV scans a file to determine whether it is malicious, it will try to match the sample against the signatures stored in its database. If one or more signatures match, the file is treated as malicious and is likely to be quarantined or destroyed.
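The matching step described above can be sketched in a few lines of Python. This is only a minimal illustration: the `Signature` class, the `SIGNATURE_DB` list, the family names and the byte patterns are all hypothetical placeholders, and real AV engines use far richer signature formats than a plain substring search.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signature:
    name: str       # hypothetical family name
    pattern: bytes  # byte sequence assumed to be characteristic of the family


# Hypothetical signature database; the patterns are harmless placeholders.
SIGNATURE_DB = [
    Signature("Toy.FamilyA", b"\xde\xad\xbe\xef\x13\x37"),
    Signature("Toy.FamilyB", b"TOY-PAYLOAD-MARKER"),
]


def scan(data: bytes, database=SIGNATURE_DB) -> list[str]:
    """Return the names of all signatures whose pattern occurs in data."""
    return [sig.name for sig in database if sig.pattern in data]


clean_file = b"just an ordinary document"
infected_file = b"header" + b"TOY-PAYLOAD-MARKER" + b"trailer"

print(scan(clean_file))     # no signature matches: treated as clean
print(scan(infected_file))  # one signature matches: would be quarantined
```

The file is never executed here, which is what makes this a static technique: the verdict depends only on the bytes it contains.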
For a while, AVs were a satisfactory answer to the proliferation of amateur malware. But the professionalization of cybercrime began, and the first AV countermeasures appeared.
The easiest way to avoid malware signatures: Tales from the crypt
Malware encryption was among the first techniques employed by malware authors to avoid malware detection. The theory is pretty simple: encrypt your payload and use a decryptor at the beginning of the code. When the code is executed, the decryptor will decrypt the payload, which will carry out its malicious mission. After that, the decryptor will re-encrypt the payload with a different key.
For the encryption / decryption process, malware authors can use a wide range of technologies, going from a simple XOR to modern encryption processes like AES.
This technique neutralizes all signatures that were created based on patterns found in the payload, since the payload is only decrypted when running. Of course, an AV could simply scan the system’s memory to look for it, and while some may do that, it is generally avoided because of the colossal resource cost. The main approach to counter classic encrypted malware is a signature based on the decryptor, which remains the same throughout the sample’s activities. According to Bashari Rad, Masrom, & Ibrahim (2012), the first known encrypted virus was Cascade, released in 1987.[i]
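To make the argument concrete, here is a small sketch using a harmless byte string in place of a payload and a single-byte XOR as the cipher. The layout (stub, then key, then encrypted payload), the `build_sample` helper and the stub bytes are assumptions made for the illustration, not a description of any real sample.

```python
PAYLOAD = b"TOY-PAYLOAD-MARKER"          # harmless stand-in for the payload
DECRYPTOR_STUB = b"<toy-xor-decryptor>"  # stand-in for the constant decryptor code


def xor(data: bytes, key: int) -> bytes:
    """XOR every byte with a one-byte key (the same call encrypts and decrypts)."""
    return bytes(b ^ key for b in data)


def build_sample(key: int) -> bytes:
    """Toy layout: constant stub, then the key, then the encrypted payload."""
    return DECRYPTOR_STUB + bytes([key]) + xor(PAYLOAD, key)


sample_a = build_sample(0x5A)
sample_b = build_sample(0xC3)  # same payload, re-encrypted with another key

# A signature built on the payload misses both samples...
print(PAYLOAD in sample_a, PAYLOAD in sample_b)
# ...while a signature built on the unchanged decryptor catches both.
print(DECRYPTOR_STUB in sample_a, DECRYPTOR_STUB in sample_b)
```

The two samples differ on disk wherever the payload is stored, yet they share the stub byte for byte, which is exactly the weakness a decryptor-based signature exploits.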
Oligomorphism: The sequel to encryption
Remember the movie Alien? It scared everyone in 1979 with its xenomorph (the titular alien) creeping around the ship and slaughtering the crew members one by one. When the time came to make a second Alien movie, James Cameron’s film did not feature one alien, but an entire hive. Just as spectators had already experienced the fear caused by a single alien and were ready for it, the AV industry had learned to counter encrypted malware effectively. And that is why, like James Cameron, malware authors had to come up with the next step: oligomorphism.
Oligomorphic malware is quite similar to encrypted malware, but before re-encrypting the payload, a different decryptor is chosen from a set embedded in the malware. The payload is then encrypted so that the newly chosen decryptor can decrypt it. The end result is that the decryptor is no longer constant, which is the main improvement over encrypted malware.
The main drawback of this technique is that the decryptor pool is a finite set (a few hundred elements for the largest ones). That means that writing signatures to match every decryptor would be costly (in terms of both the human effort to write them and the computing time to match them), but it would ultimately work. The first malware known to use this technique was the Whale virus in 1990.
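The trade-off can be illustrated with a toy model. The three-entry `POOL`, the stub byte strings and the `build_sample` helper are hypothetical, and the samples are generated by cycling through the pool rather than picking at random so that the result is reproducible.

```python
PAYLOAD = b"TOY-PAYLOAD-MARKER"  # harmless stand-in for the payload

# Each entry pairs a stand-in decryptor with the matching encryption routine.
# Three entries keep the example short; real pools are larger but still finite.
POOL = [
    (b"<stub-xor>", lambda data, key: bytes(b ^ key for b in data)),
    (b"<stub-add>", lambda data, key: bytes((b + key) % 256 for b in data)),
    (b"<stub-sub>", lambda data, key: bytes((b - key) % 256 for b in data)),
]


def build_sample(index: int, key: int) -> bytes:
    """Encrypt the payload so that the chosen decryptor could decrypt it."""
    stub, encrypt = POOL[index]
    return stub + bytes([key]) + encrypt(PAYLOAD, key)


def detected(sample: bytes, signatures: list[bytes]) -> bool:
    return any(sig in sample for sig in signatures)


# Thirty generations, cycling through the pool with a new key each time.
samples = [build_sample(i % len(POOL), key) for i, key in enumerate(range(1, 31))]

one_signature = [POOL[0][0]]                 # covers a single decryptor
all_signatures = [stub for stub, _ in POOL]  # covers the whole pool

print(sum(detected(s, one_signature) for s in samples), "of", len(samples))
print(sum(detected(s, all_signatures) for s in samples), "of", len(samples))
```

A single decryptor signature only catches the generations that happen to use that decryptor, whereas one signature per pool entry catches everything. The cost for the defender grows with the pool size, but it stays bounded.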
Polymorphism: more than a fistful of decryptors
As reported by Szor (2005), the first polymorphic virus (V2PX or 1260) was written by Mark Washburn in 1990 to prove to the AV community that static analysis alone was not an efficient way to detect malware. The general principle is the same as oligomorphism. The main difference is that polymorphic malware creates a new decryptor from an embedded polymorphic engine, which is able to generate a very large number of decryptors. To put things in perspective, the largest number of embedded decryptors observed in an oligomorphic malware in the wild is a few hundred, whereas Washburn’s virus could generate about a million different decryptors despite its small size (1260 bytes).
In 1992, two years after Washburn’s virus, a malware author known as “Dark Avenger” released a polymorphic toolkit (MtE: Mutation Engine) to allow less sophisticated coders to produce polymorphic malware.
A polymorphic engine is able to create a new decryptor through the use of several obfuscation techniques. Borello & Mé (2008) mentioned some of these techniques[ii]:
- Instruction substitution: replacing an instruction by an equivalent one.
- Instruction permutation: changing the order of the instructions.
- Variable substitution: swapping the use of registers or using variables instead.
- Dead code insertion: adding junk code which won’t strictly serve the purpose of the decryptor but will make its detection more difficult, for example: adding 0 to a register or moving the value of a register to the same register.
- Changing the control flow by adding branching instructions while preserving the program objective.
- Using opaque constants: as explained by Moser, Kruegel & Kirda in 2008, some malware samples hide their constants by computing them at runtime, which makes them more difficult to track than hardcoded values.[iii]
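Several of the techniques listed above can be demonstrated on a toy register machine. Everything in this sketch is an assumption made for illustration: the three-field instruction tuples, the 8-bit registers and the helper names are not taken from any real engine. The point is only that the transformed program shares no instruction with the original yet computes the same value.

```python
# A toy instruction is (opcode, destination register, operand), where the
# operand is either an integer literal or the name of another register.
def run(program, registers=("a", "b", "c", "d")):
    """Interpret a toy program on 8-bit registers and return their final values."""
    regs = {name: 0 for name in registers}
    for op, dst, src in program:
        value = regs[src] if isinstance(src, str) else src
        if op == "mov":
            regs[dst] = value
        elif op == "add":
            regs[dst] = (regs[dst] + value) & 0xFF
        elif op == "sub":
            regs[dst] = (regs[dst] - value) & 0xFF
        elif op == "xor":
            regs[dst] ^= value
        else:
            raise ValueError(f"unknown opcode: {op}")
    return regs


def substitute(program):
    """Instruction substitution: 'add r, n' becomes the equivalent 'sub r, -n mod 256'."""
    return [("sub", dst, (-src) & 0xFF) if op == "add" and isinstance(src, int)
            else (op, dst, src)
            for op, dst, src in program]


def insert_dead_code(program):
    """Dead code insertion: follow each instruction with one that has no effect."""
    out = []
    for index, (op, dst, src) in enumerate(program):
        out.append((op, dst, src))
        out.append(("add", dst, 0) if index % 2 == 0 else ("mov", dst, dst))
    return out


def rename_registers(program, mapping):
    """Variable substitution: swap the registers used by the program."""
    def rename(x):
        return mapping.get(x, x) if isinstance(x, str) else x
    return [(op, rename(dst), rename(src)) for op, dst, src in program]


original = [("mov", "a", 0x41), ("add", "a", 1), ("xor", "a", 0x20)]
variant = rename_registers(insert_dead_code(substitute(original)), {"a": "c", "c": "a"})

print(run(original)["a"], run(variant)["c"])  # same result, held in another register
print(set(original) & set(variant))           # no instruction in common
```

A signature built from the exact instruction sequence of `original` has nothing left to match in `variant`, even though both compute the same result.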
The answer from AV editors was to execute the first instructions of a scanned file in order to trigger the decryption process and then use the decrypted payload to match a signature. Although it worked at first, malware authors quickly responded with countermeasures such as long and costly loops placed at the beginning of the malware to prevent the AV from reaching the decryption process.
But that was not the only type of countermeasure. Consider W95/HPS and W95/Marburg: both written by GriYo in 1998, they were the first two viruses to use a 32-bit polymorphic engine. W95/HPS in particular was the first virus capable of infecting Windows 98 (and was released before the OS). It was a “slow polymorphic” virus, which means that it does not generate a new encrypted version of itself every time a file is infected. This feature targeted AV editors, which often produce a signature by generating a large number of samples from a single polymorphic sample and spotting similarities between these samples.
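The cat and mouse game around emulation can be sketched with a toy model in which a sample is a Python generator: each step yields either nothing or the memory content revealed by the decryptor. The `toy_sample` and `emulate_and_scan` names, the step budget and the XOR cipher are all assumptions made for this illustration.

```python
from typing import Iterator, Optional

PAYLOAD = b"TOY-PAYLOAD-MARKER"  # harmless stand-in for the payload


def toy_sample(key: int, delay_iterations: int) -> Iterator[Optional[bytes]]:
    """Model a sample as a stream of emulated steps.

    Each step yields None, except the last one, which yields the decrypted payload.
    """
    encrypted = bytes(b ^ key for b in PAYLOAD)
    for _ in range(delay_iterations):        # costly loop placed before decryption
        yield None
    yield bytes(b ^ key for b in encrypted)  # the decryptor finally runs


def emulate_and_scan(sample, budget: int, signature: bytes = PAYLOAD) -> bool:
    """Emulate at most `budget` steps and match the signature on revealed memory."""
    for steps, memory in enumerate(sample, start=1):
        if memory is not None and signature in memory:
            return True
        if steps >= budget:
            break  # the scanner cannot afford to emulate any further
    return False


print(emulate_and_scan(toy_sample(0x5A, delay_iterations=0), budget=1_000))
print(emulate_and_scan(toy_sample(0x5A, delay_iterations=1_000_000), budget=1_000))
```

The first sample is caught because decryption happens within the budget. The second one spends the whole budget in its delay loop, so the scanner gives up before the payload ever appears in memory.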
Ultimately, polymorphic viruses were still suffering from the fact that once decrypted, the payload was easy to spot. To fix this weakness, metamorphic malware programs were developed.
Metamorphism: the new batch
Instead of being a new way to exploit encryption like the three previous techniques, metamorphic malware programs have the ability to generate a different (but functionally equivalent) version of themselves for each infection. To propagate their code and avoid detection, they use a metamorphic engine like the one represented in the diagram below:
In the first step, the metamorphic engine locates itself and the code that will be transformed. The next step is to disassemble that code in order to allow the code analyzer to acquire a logical understanding of it. Then, the engine performs the most crucial step: the actual transformation of the code. This is done by using the same obfuscation techniques that are used by polymorphic engines (listed above). The only thing left to do is to compile the code (which is usually done by a legitimate compiler installed on the victim system) and to embed it into a host file.
Every time a metamorphic virus propagates, a new version of the virus is created which will keep the same effect and general behavior. A good metamorphic engine is supposed to create an infinite number of versions without common identifiable string patterns, thereby making its detection by AV signatures virtually impossible.
The first metamorphic malware was called Win95/Regswap, written in 1998. This virus did not take advantage of all the obfuscation techniques discussed above; it mainly focused on register swapping.
It is also worth mentioning that Win95/Zmist, written by Z0mbie (from the virus writing group 29A), introduced the code integration obfuscation technique. Zmist is capable of decompiling Portable Executable (PE) files into logical elements, inserting parts of its code into the file and linking them via JMP instructions. After doing so, it regenerates the code and rebuilds the PE. Another noteworthy piece of malware is W32/Simile, released in 2002 by the Mental Driller (also from 29A). This virus uses several advanced techniques such as entry-point obscuring, metamorphism, and polymorphic decryption.
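Register swapping on its own leaves the sequence of operations intact, which suggests one way a scanner could still cope with it: a signature with wildcards in place of the register-dependent bytes. The sketch below is an assumption offered for illustration, using a made-up two-byte instruction encoding (not real x86) and Python's `re` module as the wildcard matcher.

```python
import re


def encode(program) -> bytes:
    """Toy encoding: each instruction is an opcode byte followed by a register byte."""
    return bytes(b for opcode, register in program for b in (opcode, register))


# Two generations with the same opcodes but different registers.
generation_1 = encode([(0x10, 0), (0x22, 1), (0x31, 0)])
generation_2 = encode([(0x10, 3), (0x22, 2), (0x31, 3)])

# An exact signature taken from the first generation.
exact_signature = generation_1

# A wildcard signature: fixed opcodes, any byte in the register positions.
wildcard_signature = re.compile(rb"\x10.\x22.\x31.", re.DOTALL)

print(exact_signature in generation_2)                      # exact match fails
print(wildcard_signature.search(generation_2) is not None)  # wildcard still matches
```

This only works because the opcodes do not change between generations. Once an engine also substitutes instructions, inserts dead code or reorders blocks, no fixed opcode sequence remains for the wildcards to hang on.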
The phantom menace
Even today, AV editors rely on signatures. They are created from the analysis of malware samples to characterize a malware family and are matched against files entering the system. As time went by, these signatures began to include behavioral indicators, which metamorphic and polymorphic malware do not counter, because they both keep the same global semantics. Moreover, writing polymorphic or metamorphic malware requires solid expertise, which not all malware authors have. Thus, it is no surprise that malware authors began writing fileless malware: programs that do not write themselves to disk, exist only in memory, disappear as soon as the machine is turned off and leave little or no trace.
A good example is SQL Slammer, a 2003 worm that caused a massive Internet slowdown. This sort of malware is problematic in two ways. First of all, a traditional AV expects malware to be a file (or part of a file) entering a computer: it matches its signatures against this file and then quarantines or destroys the file if it is deemed malicious. Due to the fileless nature of the threat, this procedure cannot be applied here, which makes this kind of malware quite difficult to detect. In 2017, a Meterpreter running on a domain controller was detected by analyzing the machine’s physical memory; without this security scan, how long would the attack have gone unnoticed?
Second, AV editors and researchers in general need forensic evidence to learn about a threat, study it and come up with an effective response. Because of the volatile nature of fileless malware, it is often difficult to obtain material which analysts can study, which means that it is much harder to develop a signature against this kind of malware.
Modern fileless malware often uses Living Off the Land binaries (LOLBins). LOLBins are legitimate binaries that malware authors leverage for malicious purposes. The tools most commonly abused by malware are PowerShell and Windows Management Instrumentation, which the recent Astaroth campaign employed.
The next part of this series will examine how antiviruses strike back. Stay tuned for more.
This post was authored by Mathieu Gaucheler with the support of the Blueliv Labs team.
[i] Bashari Rad, Babak & Masrom, Maslin & Ibrahim, Suhaimi. (2012). Camouflage in Malware: From Encryption to Metamorphism. International Journal of Computer Science and Network Security (IJCSNS). 12. 74-83.
[ii] Borello, Jean-Marie & Mé, Ludovic. (2008). Code Obfuscation Techniques for Metamorphic Viruses. Journal in Computer Virology. 4. 211-220. 10.1007/s11416-008-0084-2.
[iii] Moser, Andreas & Kruegel, Christopher & Kirda, Engin. (2008). Limits of Static Analysis for Malware Detection. Proceedings – Annual Computer Security Applications Conference, ACSAC. 421-430. 10.1109/ACSAC.2007.21.