lighthouse_coverage - an execution tracer for PANDA
Update: This plugin has been merged and is now a part of PANDA.
It's very common to log program execution, for either coverage or full tracing, using a PIN tracer [1,2,3], qemu in user mode [1], or drcov [2,3,4]. However, these tools are fragile with respect to obfuscated or self-modifying code. In those cases, it is desirable to extract a program execution trace directly from a whole-system emulator such as PANDA.
PANDA already includes at least one execution tracer plugin, which writes csv files that can then be imported into IDA Pro using a Python script. This is often sufficient. However, Binary Ninja has some advanced capabilities for dealing with overlapping instructions, so it would be nice to have a way to import the trace information there. Lighthouse, in turn, has very powerful functionality for analyzing coverage data, and since it runs on both IDA Pro and Binary Ninja, it is a natural target for PANDA coverage output.
PANDA documents the structure of plugins here [5].
Let's make a new plugin for PANDA that traces execution and outputs coverage information.
A bare-bones, do-nothing PANDA plugin needs the following:
- a subdirectory in the plugins directory named name_of_plugin
- a source file name_of_plugin.c or name_of_plugin.cpp in that subdirectory
- an init_plugin() function
- an uninit_plugin() function
- a Makefile with the one-line content $(PLUGIN_TARGET_DIR)/panda_$(PLUGIN_NAME).so: $(PLUGIN_OBJ_DIR)/$(PLUGIN_NAME).o
- an entry for the plugin in the file config.panda in the plugins directory (just the name of the plugin)
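The checklist above can be sketched as a small scaffolding script. This is only an illustration: the helper name scaffold_plugin and the way it edits config.panda are assumptions of this sketch, not part of PANDA, and the plugins directory path is whatever your PANDA checkout uses.

```python
from pathlib import Path

# The one-line Makefile content described in the list above.
MAKEFILE_LINE = "$(PLUGIN_TARGET_DIR)/panda_$(PLUGIN_NAME).so: $(PLUGIN_OBJ_DIR)/$(PLUGIN_NAME).o\n"

# Minimal do-nothing plugin source with the two required functions.
SOURCE_TEMPLATE = """#include "panda/plugin.h"

bool init_plugin(void *self) { return true; }
void uninit_plugin(void *self) { }
"""


def scaffold_plugin(plugins_dir, name):
    """Create a do-nothing plugin skeleton (hypothetical helper, not a PANDA tool)."""
    plugins_dir = Path(plugins_dir)
    plugin_dir = plugins_dir / name
    plugin_dir.mkdir(parents=True, exist_ok=True)

    (plugin_dir / "Makefile").write_text(MAKEFILE_LINE)

    # Do not overwrite a source file that already exists.
    source = plugin_dir / (name + ".c")
    if not source.exists():
        source.write_text(SOURCE_TEMPLATE)

    # config.panda lists one plugin name per line; add ours only once.
    config = plugins_dir / "config.panda"
    entries = config.read_text().splitlines() if config.exists() else []
    if name not in entries:
        entries.append(name)
        config.write_text("\n".join(entries) + "\n")
    return plugin_dir
```

Calling scaffold_plugin with the plugins directory and the name lighthouse_coverage would produce the layout used in the rest of this post; running it a second time leaves config.panda unchanged.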
Thus, I created a subdirectory, lighthouse_coverage, in the plugins directory, added the one-line Makefile and the source file lighthouse_coverage.c:
#include "panda/plugin.h"
bool init_plugin(void *self) { return true;}
void uninit_plugin(void *self) {}
In order to be sure that it works, I added a printf() statement.
#include "panda/plugin.h"
#include <stdio.h>
bool init_plugin(void *self) {
    printf("loaded lighthouse plugin\n"); // confirm that the plugin was loaded
    return true;
}
void uninit_plugin(void *self) { }
And we have a working plugin, which can be invoked in the same way as any other PANDA plugin:
./panda-system-x86_64 -m 4096 -replay theRecording -panda lighthouse_coverage
PANDA[core]:initializing lighthouse_coverage
loaded lighthouse plugin
loading snapshot
[ ... ]
At this point the plugin does no useful work. PANDA plugins work by hooking callback functions into events as they occur during execution or playback. Let's register a callback function for such an event when our plugin loads.
#include "panda/plugin.h"
// this function gets called right before every basic block is executed
void before_block_exec(CPUState *cpuState, TranslationBlock *translationBlock)
{
    printf("%#018" PRIx64 "\n", (uint64_t)translationBlock->pc); // print out program counter of basic block
}
bool init_plugin(void *self) {
    panda_cb pcb = { .before_block_exec = before_block_exec };
    panda_register_callback(self, PANDA_CB_BEFORE_BLOCK_EXEC, pcb); // register the callback function above
    return true;
}
void uninit_plugin(void *self) { }
TranslationBlock has a member, pc, that is the program counter of the basic block about to be executed. This yields a list of all the basic block addresses executed:
loading snapshot
... done.
opening nondet log for read : theRecording-rr-nondet.log
0xffffffff81c01c40
0xffffffff81c00920
0xffffffff81c0096f
0xffffffff81c009c2
0xffffffff81c009d1
0xffffffff81c01c4a
[ ... ]
This is, of course, not sufficient, because the blocks of all processes are intermingled here. Operating System Introspection (OSI) comes to the rescue. OSI adds the capability to obtain the process name and thread ID for each basic block (and much more). So, let's add process names to the block addresses and print everything to an output file.
void before_block_exec(CPUState *cpuState, TranslationBlock *translationBlock)
{ // this function gets called right before every basic block is executed
    if (panda_in_kernel(first_cpu) == 0) // I'm not interested in kernel modules
    {
        OsiProc * process = get_current_process(cpuState); // get a reference to the process this TranslationBlock belongs to
        if (process)
        {
            fprintf(outputFile, "\n%s@%#018" PRIx64, process->name, (uint64_t)(translationBlock->pc));
            free_osiproc(process); // always free unused resources
        }
    }
}
And that is basically it. We just add some function prototypes and such to reduce compiler warnings and we have a finished plugin:
#include "panda/plugin.h"
// OSI
#include "osi/osi_types.h"
#include "osi/osi_ext.h"
// function prototypes
void before_block_exec(CPUState *cpuState, TranslationBlock *translationBlock);
void uninit_plugin(void *self);
bool init_plugin(void *self);
FILE * outputFile = NULL; // pointer to output file...
void before_block_exec(CPUState *cpuState, TranslationBlock *translationBlock)
{ // this function gets called right before every basic block is executed
    if (panda_in_kernel(first_cpu) == 0) // I'm not interested in kernel modules
    {
        OsiProc * process = get_current_process(cpuState); // get a reference to the process this TranslationBlock belongs to
        if (process)
        {
            fprintf(outputFile, "\n%s@%#018" PRIx64, process->name, (uint64_t)(translationBlock->pc));
            free_osiproc(process); // always free unused resources
        }
    }
}
bool init_plugin(void *self)
{
    panda_require("osi"); // ensure that OSI is loaded
    if (!init_osi_api()) return false; // bind the OSI API; not wrapped in assert() so the call survives NDEBUG builds
    outputFile = fopen("lighthouse.out", "w"); // open output file
    if (!outputFile) return false; // refuse to load if the output file cannot be created
    panda_cb pcb = { .before_block_exec = before_block_exec };
    panda_register_callback(self, PANDA_CB_BEFORE_BLOCK_EXEC, pcb); // register the callback function above
    return true;
}
void uninit_plugin(void *self)
{
    if (outputFile) fclose(outputFile); // close output file
}
And we can then call the plugin like any other:
./panda-system-x86_64 -m 4096 -replay '/media/jan/80669BBB669BB080/ch34_1char' -os linux-64-ubuntu -panda osi -panda osi_linux:kconf_group=ubuntu:5.3.0-28-generic:64 -panda lighthouse_coverage
or for a windows guest
and we get the following type of output:
$ more lighthouse.out
gmain@0x00007f299b429bf9
gmain@0x00007f299b429c01
gmain@0x00007f299b445740
gmain@0x00007f299b445748
gmain@0x00007f299b445764
gmain@0x00007f299b44576f
gmain@0x00007f299b429c0d
gmain@0x00007f299cd195c9
[ ... ]
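Since every line carries the process name, the trace can be split per process before it goes anywhere near a disassembler. The following is a minimal sketch of that step; the function name blocks_by_process is hypothetical and the sample lines are taken from the output above.

```python
import collections


def blocks_by_process(lines):
    """Map each process name to the set of block addresses logged for it."""
    blocks = collections.defaultdict(set)
    for line in lines:
        line = line.strip()
        # The plugin writes a leading newline, so the first line is empty.
        if not line or "@" not in line:
            continue
        # Split on the last '@' in case a process name contains one.
        name, address = line.rsplit("@", 1)
        blocks[name].add(int(address, 16))
    return blocks


sample = [
    "",
    "gmain@0x00007f299b429bf9",
    "gmain@0x00007f299b429c01",
    "gmain@0x00007f299b429bf9",
]
for name, addresses in blocks_by_process(sample).items():
    print(name, len(addresses), "unique blocks")
```

For a real trace, pass an open file object for lighthouse.out instead of the sample list.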
Next, we need to change the lighthouse parser for the 'mod+off' format so that it can take our new mod@address format (I bolded the relevant code changes). I call this modat.py:
import os
import collections
from ..coverage_file import CoverageFile
from lighthouse.util.disassembler import disassembler

class ModAtData(CoverageFile):
    """
    A module@address log parser.
    """

    def __init__(self, filepath):
        super(ModAtData, self).__init__(filepath)

    #--------------------------------------------------------------------------
    # Public
    #--------------------------------------------------------------------------

    def get_offsets(self, module_name):
        return self.modules.get(module_name, {}).keys()

    #--------------------------------------------------------------------------
    # Parsing Routines - Top Level
    #--------------------------------------------------------------------------

    def _parse(self):
        """
        Parse modat coverage from the given log file.
        """
        # base address of the binary loaded in Binary Ninja
        imagebase = disassembler._bv.start
        modules = collections.defaultdict(lambda: collections.defaultdict(int))
        with open(self.filepath) as f:
            for line in f:
                trimmed = line.strip()
                # skip empty lines
                if not len(trimmed): continue
                # comments can start with ';' or '#'
                if trimmed[0] in [';', '#']: continue
                # split 'module@address' on the last '@'
                module_name, bb_offset = trimmed.rsplit("@", 1)
                # store the hit as an offset relative to the image base
                modules[module_name][int(bb_offset, 16)-imagebase] += 1
        self.modules = modules
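The key step in the parser is the subtraction of the image base, which turns an absolute guest address into an offset relative to the loaded binary. Addresses that belong to other mappings (shared libraries, for example) end up as offsets outside the image. The sketch below shows the arithmetic in isolation and, as an addition that modat.py itself does not perform, drops such out-of-image offsets. The function name rebase, the image_size parameter and the example values are assumptions of this sketch; in Binary Ninja the size could presumably be derived from the view's start and end.

```python
def rebase(addresses, imagebase, image_size):
    """Convert absolute addresses to image-relative offsets.

    Addresses that do not fall inside [imagebase, imagebase + image_size)
    are dropped, since they cannot belong to the loaded binary.
    """
    offsets = []
    for address in addresses:
        offset = address - imagebase
        if 0 <= offset < image_size:
            offsets.append(offset)
    return offsets


# Hypothetical values: a binary loaded at 0x400000 that is 0x10000 bytes long.
trace = [0x401000, 0x7F299B429BF9, 0x400000]
print([hex(o) for o in rebase(trace, 0x400000, 0x10000)])
```

Whether filtering is desirable depends on the target: for a position-independent executable the recorded addresses only line up with the disassembler if the view is rebased to the address the guest actually loaded the binary at.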
Installation
PANDA:
- Create a folder, lighthouse_coverage, in the PANDA plugins directory
- Drop this project's files into that folder
- Modify the config.panda file in the plugins directory to include lighthouse_coverage
Binary Ninja:
- In the Binary Ninja plugin directory, there should be a file called lighthouse_plugin.py and a folder called lighthouse
- Place modat.py there in the relative path lighthouse/reader/parsers
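The Binary Ninja step can be sketched as a small helper that checks for the expected layout before copying. The function name install_parser is hypothetical, and the location of the Binary Ninja plugin directory is left to the caller because it differs between platforms and installations.

```python
import shutil
from pathlib import Path


def install_parser(parser_file, binja_plugin_dir):
    """Copy modat.py into lighthouse/reader/parsers (hypothetical helper)."""
    plugin_dir = Path(binja_plugin_dir)

    # Lighthouse is installed if lighthouse_plugin.py sits next to the package.
    if not (plugin_dir / "lighthouse_plugin.py").is_file():
        raise FileNotFoundError("lighthouse_plugin.py not found in " + str(plugin_dir))

    parsers_dir = plugin_dir / "lighthouse" / "reader" / "parsers"
    if not parsers_dir.is_dir():
        raise FileNotFoundError("parsers directory not found: " + str(parsers_dir))

    # shutil.copy returns the path of the newly created file.
    return Path(shutil.copy(parser_file, parsers_dir))
```

The helper refuses to copy anything when the layout does not match, which avoids silently dropping the parser into the wrong place.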
And now we get our payoff: coverage data collected from the binary within a full-system emulation.
References:
- [1] QEMU Interactive Runtime Analyser - https://github.com/geohot/qira
- [2] Code Coverage Explorer for IDA Pro & Binary Ninja - https://github.com/gaasedelen/lighthouse
- [3] Binary code coverage visualizer plugin for Ghidra - https://github.com/0ffffffffh/dragondance
- [4] Scriptable Binary Ninja plugin for coverage analysis and visualization - https://github.com/ForAllSecure/bncov
- [5] PANDA Plugins - https://github.com/moyix/panda/blob/master/docs/PANDA.md
- [6] Building a Feedback Driven Fuzzer - Dev Log 2 : Coverage - https://blog.fadyothman.com/building-a-feedback-driven-fuzzer-dev-log-2-coverage/