Thursday, February 22, 2007

Obfuscated ELF Objects

I have blogged before about reverse engineering and binary analysis tools and how incredibly easy it is to break them. Prior work on de-obfuscating binaries has concentrated on producing an accurate dead listing of executable code. These methods mostly focus on detecting 'junk bytes' or data within code sections, mainly by determining which instructions would be executed at runtime without actually executing the object. I have instead researched ways to throw off these tools by manipulating the ELF object data that surrounds the code.

The search is limited to ELF object values that analysis tools use but that the OS linker and loader do not, so the object keeps its runtime functionality. Unfortunately I have found _many_ ways to accomplish this. Most of the techniques disable or otherwise subvert the majority of analysis tools out there. Here are a few off the top of my head.

elf_header e_ident[EI_CLASS]
The Linux kernel doesn't check this value. Make it whatever you want and your object will continue to function. Unfortunately most analysis tools will cease to work. Only IDA Pro keeps going, defaulting to 32 bits; if you set the value to ELFCLASS64, the demo version of IDA Pro will complain that it can't disassemble 64-bit objects. I sent a patch to the LKML for it.
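To make the trick concrete, here is a minimal sketch of the patch in Python. The function name set_ei_class is hypothetical, and it works on an in-memory copy of the file rather than on disk. The only facts it relies on are from the ELF layout: the magic is the first four bytes and EI_CLASS is byte index 4 of e_ident.

```python
EI_CLASS = 4  # index of the class byte inside e_ident
ELFCLASSNONE, ELFCLASS32, ELFCLASS64 = 0, 1, 2


def set_ei_class(image: bytes, value: int) -> bytes:
    """Return a copy of an ELF image with e_ident[EI_CLASS] overwritten.

    Hypothetical helper for illustration; nothing else in the image changes.
    """
    if image[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    if not 0 <= value <= 0xFF:
        raise ValueError("EI_CLASS is a single byte")
    patched = bytearray(image)
    patched[EI_CLASS] = value
    return bytes(patched)
```

Something like set_ei_class(data, ELFCLASS64) on a 32-bit object reproduces the ELFCLASS64 case described above; any other byte value can be tried the same way.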

elf_header e_phnum
Depending on the object you choose, you can safely increment this value a couple of times. Most of the time, analysis tools that use the program header instead of the section header (IDA Pro) will simply fill their analysis output with garbage from the fake program header entries. This won't work on all objects, as you can't always choose the data that sits just beyond the legitimate program header entries, and some values may cause the loader to throw your binary out upon execution.
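A rough sketch of the e_phnum bump, again in Python. The helper name bump_phnum is made up for this example, and it assumes the EI_CLASS and EI_DATA bytes are still honest so it can pick the right header layout (e_phnum sits at offset 44 in Elf32_Ehdr and offset 56 in Elf64_Ehdr). Whether the resulting object still loads depends entirely on the bytes that follow the real program header table.

```python
import struct

E_PHNUM_OFFSET_32 = 44  # offset of e_phnum in Elf32_Ehdr
E_PHNUM_OFFSET_64 = 56  # offset of e_phnum in Elf64_Ehdr


def bump_phnum(image: bytes, extra: int = 1) -> bytes:
    """Return a copy of an ELF image with e_phnum increased by `extra`.

    Hypothetical helper; assumes e_ident[EI_CLASS] and e_ident[EI_DATA]
    have not been tampered with.
    """
    if image[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    offset = E_PHNUM_OFFSET_64 if image[4] == 2 else E_PHNUM_OFFSET_32
    endian = "<" if image[5] == 1 else ">"  # EI_DATA: 1 = little-endian
    (phnum,) = struct.unpack_from(endian + "H", image, offset)
    new_phnum = phnum + extra
    if not 0 <= new_phnum <= 0xFFFF:
        raise ValueError("e_phnum must fit in 16 bits")
    patched = bytearray(image)
    struct.pack_into(endian + "H", patched, offset, new_phnum)
    return bytes(patched)
```

The fake entries are simply whatever data already follows the table, which is why the result varies from object to object.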

section header sh_size
When the section header is present, tools like IDA Pro use it to perform analysis (yes, you can force it to use the program header instead). Change the sh_size member of any section to any value you want, and the analysis will be incorrect.
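The sh_size change can be sketched the same way. The function set_sh_size below is a hypothetical helper, not part of any tool. It locates the section header table through e_shoff, e_shentsize and e_shnum, then overwrites sh_size, which sits 20 bytes into an Elf32_Shdr and 32 bytes into an Elf64_Shdr. It assumes the class and data-encoding bytes in e_ident are accurate.

```python
import struct


def set_sh_size(image: bytes, index: int, new_size: int) -> bytes:
    """Return a copy of an ELF image with section `index`'s sh_size replaced.

    Hypothetical helper; trusts e_ident[EI_CLASS] and e_ident[EI_DATA].
    """
    if image[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    endian = "<" if image[5] == 1 else ">"
    if image[4] == 2:  # ELFCLASS64
        (shoff,) = struct.unpack_from(endian + "Q", image, 40)
        shentsize, shnum = struct.unpack_from(endian + "HH", image, 58)
        field_offset, field_fmt = 32, "Q"
    else:  # treat anything else as the 32-bit layout
        (shoff,) = struct.unpack_from(endian + "I", image, 32)
        shentsize, shnum = struct.unpack_from(endian + "HH", image, 46)
        field_offset, field_fmt = 20, "I"
    if not 0 <= index < shnum:
        raise IndexError("no such section header")
    position = shoff + index * shentsize + field_offset
    patched = bytearray(image)
    # pack_into raises struct.error if new_size does not fit the field
    struct.pack_into(endian + field_fmt, patched, position, new_size)
    return bytes(patched)
```

Since the loader works from the program header, the section contents themselves are untouched; only the size a section-based tool believes in changes.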

Remember, when writing binary analysis tools you have to assume the object you're parsing is malformed. How would the real OS loader treat it? That's what analysis tools must do: emulate the real environment, not the standard.
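As one small illustration of parsing defensively, a tool could refuse to trust any offset and size pair until it has been checked against the actual file length. The helper clamp_extent below is a hypothetical sketch of that single check; it is not what any particular loader or disassembler does, and a real tool would need many more checks like it.

```python
def clamp_extent(offset, size, file_len):
    """Return the (offset, size) range that is really backed by file bytes.

    Hypothetical sanity check: a claimed range starting outside the file
    yields an empty range, and an oversized one is cut off at end of file.
    Sections that occupy no file space (SHT_NOBITS) would need separate
    handling and are not covered here.
    """
    if offset < 0 or size < 0 or offset >= file_len:
        return (0, 0)
    return (offset, min(size, file_len - offset))
```

With a check like this, a bogus sh_size degrades into a truncated section instead of a crash or pages of garbage.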

*I am in no way bashing IDA Pro; it's far more powerful than anything else out there right now. But I had to use an example :] In fact, to Ilfak's credit, most other tools refuse to even read, or fail completely (crash) when reading, most of the objects I was manipulating.
