Friday, July 03, 2009

Leaf - Hit Tracing

I just posted a new version of the Leaf framework, so I thought this might be a good time to blog about how to write and use a hit tracer with Leaf. Even though Leaf is mostly a static analysis tool, the data it collects during analysis is really useful to a debugger. I wanted a debug API and I wanted it fast, so a few versions ago I wrote a quick wrapper around ptrace and put it into Leaf. So far it has only been tested on x86 Linux, so there's work to be done to support BSD. I am always looking for ways to make the debugging API cleaner, more useful and easier to code with, so please send any suggestions. So let's look at the steps needed to write a plugin that implements a hit tracer.

Here is what my basic hit tracer, 'lhit' (included with Leaf) implements:

1. LEAF_init() - a mandatory function that must be present in all plugins. You can use it to initialize any private data structures your plugin may need, or you can leave its function body blank.

2. LEAF_interactive() - this is the plugin hook a debugger would want to implement. Ideally you only want *one* plugin implementing it; it doesn't make sense to have more than one. If your plugin implements this hook, it will be called after all other static analysis is finished. Consider it your debugger plugin's main().

3. LEAF_attach(pid_t) - takes a pid_t as its only argument and will attach the debugger to your target process.

4. LEAF_set_hittracer(pid_t, breakpoints_t, int) - this is where it gets slightly tricky. Your plugin must declare a structure of type breakpoints_t somewhere. Pass the target's pid, the breakpoints structure and a flag (ON/OFF) to this function, and Leaf will automatically take the vector of function addresses it collected during static analysis and set a breakpoint on each of them. There is no need for your plugin to manage any of this. There is also another function called LEAF_set_breakpoint, which takes a pid_t, a breakpoints_t structure, and the address you want to break on. You can use it for any other manual breakpoints you want to set.

5. LEAF_cont(pid_t) - this one is pretty self-explanatory: it takes a pid_t as its only argument and instructs the traced program to continue. At this point Leaf will handle calling wait() for you. All you have to do is inspect and handle the signal it returns. If you used LEAF_set_hittracer and you hit one of the breakpoints it set, you will want to call LEAF_reinstall_breakpoint, and Leaf will take care of putting the old instruction back, single stepping and reinstalling the breakpoint for you.

6. LEAF_get_regs(pid_t, user_regs_struct) - this will retrieve the process's registers for you.

7. LEAF_detach(pid_t) - will detach Leaf from your process.

8. LEAF_cleanup() - another mandatory plugin hook which you can use to free memory or close file descriptors, or you can leave it blank.
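
To make the breakpoint bookkeeping in steps 4 and 5 a bit more concrete, here is a minimal simulation of the idea in C++. This is not Leaf's code and none of these names are Leaf's API: FakeTarget, Breakpoints, set_breakpoint, set_hittracer and reinstall_breakpoint are all hypothetical stand-ins, and the "process" is just a byte buffer rather than something driven through ptrace. The only hard fact it relies on is that the x86 one-byte breakpoint instruction (int3) is 0xCC.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for a traced process: a flat buffer of "code" bytes.
// A real tracer would read and write these bytes through ptrace instead.
struct FakeTarget {
    std::vector<std::uint8_t> text;
};

// Hypothetical bookkeeping, loosely playing the role of breakpoints_t:
// maps a breakpoint address to the original byte it replaced.
struct Breakpoints {
    std::map<std::size_t, std::uint8_t> saved;
};

const std::uint8_t kInt3 = 0xCC;  // x86 one-byte breakpoint instruction

// Install one breakpoint: remember the original byte, then write int3 over it.
void set_breakpoint(FakeTarget& t, Breakpoints& bps, std::size_t addr) {
    assert(addr < t.text.size());
    if (bps.saved.count(addr)) return;  // already installed, keep the saved byte
    bps.saved[addr] = t.text[addr];
    t.text[addr] = kInt3;
}

// Set a breakpoint on every function entry found by (pretend) static analysis.
void set_hittracer(FakeTarget& t, Breakpoints& bps,
                   const std::vector<std::size_t>& function_addrs) {
    for (std::size_t addr : function_addrs) {
        set_breakpoint(t, bps, addr);
    }
}

// After a hit: put the old byte back, "single step" over it, then reinstall.
// Returns the original byte that the simulated step would have executed.
std::uint8_t reinstall_breakpoint(FakeTarget& t, Breakpoints& bps, std::size_t addr) {
    std::map<std::size_t, std::uint8_t>::iterator it = bps.saved.find(addr);
    assert(it != bps.saved.end());
    t.text[addr] = it->second;             // 1. restore the old instruction byte
    std::uint8_t executed = t.text[addr];  // 2. single step (simulated)
    t.text[addr] = kInt3;                  // 3. reinstall the breakpoint
    return executed;
}
```

Calling set_hittracer with a list of function entry offsets overwrites each entry byte with 0xCC, and reinstall_breakpoint shows why the original byte has to be saved: without it there would be nothing to put back before stepping.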

You will find an example hit tracer (lhit) which implements all of this in version 0.0.15 of Leaf here. It's not the best hit tracer in the world, but it does the job. The debugger internals will be getting an overhaul soon, but the API should stay the same.

This new version of Leaf also contains my experimental LeafRub plugin which embeds a Ruby interpreter for scripting capabilities. An example LeafRub.rb script is also included, but I'll blog more about that later.

Saturday, June 20, 2009

Fun with erase()

Over the last few months I've been knee-deep in C++. You can view this as a good thing or a bad thing; I, for one, enjoy it. I personally like finding bugs in C++ applications, as they are usually more complex than plain old C and require a bit more thought:
  • Keeping track of which variables your destructor will take care of, and which it won't
  • Iterators, and what methods invalidate them
  • (insert your favorite C++ gotcha here)
While debugging a crash one day it occurred to me that the security research community has paid very little attention to the STL and C++-isms in general. There are a few things out there, like TAOSSA's delete vs delete[], and of course there are also CERT's secure coding standards. But there is very little written on exploring STL-specific bugs. Maybe it's all private and I'm just not cool enough to see it :/

I decided to document some ways STL-specific bugs may be exploited. The first place I looked was containers: vectors, queues, lists and so on. Any use of these containers probably means lots of interesting data is being stored, and since they all have very easy-to-use methods, even novice C++ developers are (ab)using them somewhere.
Most of these methods take in iterators (don't let the name fool you, for a vector they're little more than pointers), and tainted iterators have been a known bad thing for a long time (read CERT's secure coding standards). But where were the exploits? Where were the how-tos on owning an attacker-influenced iterator? I decided to look into it myself.
I settled on vectors as my first topic of interest, as they are widely used for their efficiency and ease of use. I narrowed my efforts further by looking at any method that adds, moves or deletes multiple elements of a container at a time. The erase method seemed like a good candidate considering the number of memory copies that take place under the hood.
The erase method either takes a single position within the container and removes the element there, or it takes a range supplied by two iterators and removes the elements within that range. But I needed to see what it looked like under the hood. After navigating the tangled mess that is the GNU C++ templates (this is probably the real reason no one has done a lot of STL security research) I was able to isolate the relevant erase() code and find what I was looking for.
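
For reference, here is what the two well-behaved forms of erase() look like on a std::vector. This is only a small sketch of correct usage; the helper names erase_one and erase_range are mine and exist purely for illustration, and they take indices and check them so that the iterators handed to erase() are always valid.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Single-position form: remove the element at index `pos`.
// Works on a copy so the caller's vector is left untouched.
std::vector<int> erase_one(std::vector<int> v, std::size_t pos) {
    assert(pos < v.size());  // the position must refer to a real element
    v.erase(v.begin() + static_cast<std::ptrdiff_t>(pos));
    return v;
}

// Range form: remove the half-open range [first, last).
// Elements after `last` are shifted down and the vector shrinks.
std::vector<int> erase_range(std::vector<int> v, std::size_t first, std::size_t last) {
    assert(first <= last && last <= v.size());  // range must be ordered and in bounds
    v.erase(v.begin() + static_cast<std::ptrdiff_t>(first),
            v.begin() + static_cast<std::ptrdiff_t>(last));
    return v;
}
```

With the vector {1, 2, 3, 4, 5}, erase_one(v, 1) leaves {1, 3, 4, 5} and erase_range(v, 1, 3) leaves {1, 4, 5}. The two asserts are exactly the checks that erase() itself does not perform, which is where the trouble starts.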
This is where you are probably getting bored, so I'll skip ahead and just tell you why you care about any of this.
Tainted iterators are a known C++ gotcha that every code auditor should know about, but in certain situations they can lead to very interesting conditions for an exploit writer. The CERT secure coding standard begins to touch on the subject of invalid iterator ranges, but labels their 'undefined behavior' as equivalent to a buffer overflow. This is true; however, it can be more than that depending on the STL implementation. When an attacker can control the range iterators passed to erase(), he may be able to leak or directly overwrite memory contents, or, even better, trick the STL into resizing the container to encapsulate adjacent heap memory (think 'other containers'). This opens up all kinds of doors for creative exploitation.
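
To show the "resizing" effect without actually invoking undefined behavior, here is a toy model. Everything in it is an assumption made for illustration: ToyVector and toy_erase are hypothetical names, the "heap" is an ordinary std::vector<int> arena, and the container is just a pair of indices into it. The erase logic (move the tail down to first, then set the new end to first plus the tail length, with no check that first comes before last) is loosely modeled on how I understand the GNU implementation of that era to behave; treat it as a sketch, not as the library's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Toy container: owns the arena slots [start, finish). Everything past
// `finish` in the arena stands in for adjacent heap memory.
struct ToyVector {
    std::size_t start;
    std::size_t finish;
    std::size_t size() const { return finish - start; }
};

// Unchecked range erase on the toy container. Note that nothing here
// verifies first <= last, which is the whole point of the model.
void toy_erase(std::vector<int>& arena, ToyVector& v,
               std::size_t first, std::size_t last) {
    assert(last <= v.finish);
    std::size_t tail = v.finish - last;   // number of elements after the range
    assert(first + tail <= arena.size()); // keep the simulation itself in bounds
    if (last != v.finish) {
        // Move the tail [last, finish) down (or, if reversed, up) to `first`.
        std::memmove(arena.data() + first, arena.data() + last, tail * sizeof(int));
    }
    v.finish = first + tail;              // new logical end of the container
}
```

Take an arena of {10, 11, 12, 13, 90, 91, 92, 93} where the toy container owns the first four slots. A normal toy_erase(arena, v, 1, 3) shrinks it to two elements. A reversed range, toy_erase(arena, v, 3, 1), copies three elements starting at slot 3, overwriting slots 4 and 5 that the container never owned, and leaves the container claiming six elements. In this model that is both the overwrite and the container growing over its neighbor.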

I would love to post those details here, but Blogger mangled my write-up pretty badly, so I've uploaded it here. If you spot any inaccurate technical information please let me know.

Thursday, January 08, 2009


It's been a while since I have posted. This blog is up to almost 500 subscribers somehow.

I posted a new project on googlecode. Leaf is an ELF reversing framework written in C. It has a built-in API for developing your own analysis and output plugins. The current version (0.0.7) supports plugins written in C. The whole point of the project is flexibility in the analysis and output of the stuff you're interested in. It's not just another text-based disassembler, although a plugin that implements one can easily be written. In fact I released one with it, and it's available for download at the website. I am slowly releasing other plugins of varying quality. There are plenty of great tools for reversing on the Win32 platform, so there is no plan to support the PE format. If you want more information, check out the googlecode link and look at the wiki.
It's still beta quality and there are definitely a few bugs. I hope you find it useful.

Update: Posted Leaf-0.0.10.tar.gz. It now uses udis86. Lots of work still to do, but it's a start.