Continuing with the retrospective overview of malicious software, in this part Moti Yung focuses on the role of crypto in the execution workflow of polymorphic viruses and touches upon the basic principles of public-key cryptography.I want to point out one interesting design – actually, implementation in design from the 1980s, and this is the survivable Trojan horse of Ken Thompson created in 1984. He described it in his Turing Award lecture.
What he designed is, effectively, a survivable password snatching Trojan horse. Password snatcher is a program that tries to infect the Unix password program, and the design goal is to show how systems are getting complex and threats are getting serious. He designed a Trojan horse that survived recompilation and reinstallation of the password program on the host machine.
He demonstrated that it’s not just the application itself that needs to be protected, and I have the description here but I will skip it, but I’ll get to the bottom line. The bottom line was that, essentially, the Trojan was using the C compiler to increase its survivability; and then if you replace the C compiler or the password program, just the smart operation of this Trojan made it survive and replicate inside the system. So, the bottom line of this was that one cannot just scrutinize the trustworthiness of a program by analyzing the source code and by compiling it – in some sense, the entire machine must be scrutinized, because he used something that was hiding in the compiler, therefore the source code, the compiler, linker, assembler, operating system need to be analyzed. When the threat comes from one source, you cannot just go there; really, it spreads around.The academic studies by Cohen in his PhD thesis in the 1980s: he started to investigate viral countermeasures and bypassing these countermeasures. What he noted is the idea that a virus can change itself in the replication process. He produced some viruses that had no common sequence of over 3 bytes between each generation, and he called them evolutionary viruses, and nowadays we know about polymorphic viruses. Also in the 1980s it was shown that the problem of whether a program or a set of programs has a virus or doesn’t have a virus is undecidable; the meaning is there is no universal treatment, and at the same time the antivirus heuristics based on scanning and signatures started to be developed.
Another thing that happened in the 80s is worms were invented as a paradigm for distributed computing in Xerox PARC, among other innovations.And, as I said, polymorphic viruses are now known threats, and they employ cryptography, to begin with. I’m not getting into details, but there is a decryption header that decrypts the program (see left-hand image), the program can execute when there is a replication. The decryption can be done with another key. So, you can see how cryptography can generate many viewpoints of the same encrypted piece of software.
This was already something that existed as application of cryptography to malware when we started. So, you can do many changes, and it’s easy to exponentially explore the space of possible viruses, and this was noticed in the early 90s.This was the state of the art when we started; the other piece of the picture that we wanted to include in our technology is public-key cryptography (see right-hand image). For the sake of this presentation, I’m not going to dwell a lot on public-key cryptography.
But all you have to know is that there is a public key denoted Y there, and it has a corresponding secret key X; X is kept by the key owner. Everybody can encrypt E of Y in the message, get the ciphertext CI, everybody can decrypt it. But the knowledge of Y does not enable the decryption of the ciphertext back to the message. Only the one who knows X, this is called Trapdoor Information, can take the ciphertext and the secret key X, apply a transformation denoted D, which is the reverse operation to the encryption, and get the message back.
So, this is the idea from the late 70s, and the first example of it was the RSA which was based on factoring big numbers: there is a number N that is a multiple of P and Q, two big numbers, and you cannot factor them, you cannot take the root; therefore you can take the message and raise it to the E mod N; only if you can factor, you can recover them. Applications for this are encryption, key exchange, digital signatures, and various protocols that have been suggested in the last 30 years, like playing poker over the phone, etc.
Read previous: “Yes We Can’t!” – On Kleptography and Cryptovirology