Sunday, May 22, 2011

Management (or the art of delegation)

Microsoft's .NET has a method called Marshal.GetFunctionPointerForDelegate. It lets you turn a managed delegate into a function pointer that can be passed to native code. A delegate is usually a method bound to a particular instance of a class.

This sort of thing can be useful outside of the managed world as well. Many APIs that take callbacks allow you to pass in some additional user-specified data, usually as a void* or similar. However, sometimes they don't provide this capability. There are a few ways around this involving some variety of globals or thread-local storage, but these become awkward if we need more than one instance of the callback alive at a time.
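To make the problem concrete, here is a minimal sketch of the global-variable workaround. All of the names in it (api_without_user_data, Accumulator, g_current, sum_via_global) are made up for illustration and stand in for whatever real API you are stuck with.

```cpp
#include <cassert>

// Hypothetical API: takes a bare function pointer and no user data.
typedef void (*callback)(int);

void api_without_user_data(callback cb)
{
    cb(1);
    cb(2);
    cb(3);
}

struct Accumulator
{
    int total = 0;
    void add(int x) { total += x; }
};

// The workaround: stash the target instance in a global so that a plain
// function can find it again.
static Accumulator* g_current = nullptr;

void global_trampoline(int x)
{
    assert(g_current != nullptr);
    g_current->add(x);
}

// Routes the callbacks to acc and returns its running total afterwards.
int sum_via_global(Accumulator& acc)
{
    g_current = &acc;   // only one instance can be "current" at a time
    api_without_user_data(&global_trampoline);
    g_current = nullptr;
    return acc.total;
}
```

This works as long as the API calls back before sum_via_global returns. If the API instead stored the pointer and called it later, two registrations would share the same global and both would end up hitting whichever instance was set last, which is the awkwardness described above.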

AsmJit is a cute little C++ library for doing runtime code-gen for x86 and x64. In the following example I'm going to use it to turn a member function into a callback. This code is for 32-bit x86 using the cdecl calling convention, on Linux.


#include "AsmJit.h"
#include <iostream>

typedef void (*callback)(int);

struct A
{
    int m_accum;
    void doIt(int x)
    {
        m_accum += x;
        std::cout << "A: " << x << " " << m_accum << std::endl;
    }
};

void A_trampoline(A* a, int x)
{
    a->doIt(x);
}

void takes_callback(callback cb)
{
    cb(1);
    cb(2);
    cb(3);
}

int main()
{
    using namespace AsmJit;
    A a = A();

    Assembler assemblr;

    assemblr.push(dword_ptr(esp, 4)); // push the original int arg
    assemblr.push(imm(reinterpret_cast<intptr_t>(&a))); // push a
    assemblr.call(imm(reinterpret_cast<intptr_t>(&A_trampoline))); // call A_trampoline(&a, [esp + 4])
    assemblr.add(esp, 8); // clean up the stack, cdecl is caller cleans up
    assemblr.ret(); // return

    callback cb = function_cast<callback>(assemblr.make());
    takes_callback(cb);
}


Here takes_callback stands in for one of those lame callback-taking APIs that don't let you pass along any user-specified data. Using a little bit of x86 assembly we create a thunk that adds an additional argument (the address of our instance of A) and forwards the call to A_trampoline, which in turn calls A's doIt method.

If we run this we get as output:

A: 1 1
A: 2 3
A: 3 6
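A_trampoline is written by hand for A::doIt. If you need this for more than one class, a template can stamp out the same forwarding function for you. The following is only a sketch: member_trampoline and Counter are names invented here, and it only covers member functions taking a single int, like the callback type above.

```cpp
// Generic stand-in for A_trampoline: the compiler generates one plain
// function per (class, member function) pair.
template <typename T, void (T::*Method)(int)>
void member_trampoline(T* self, int x)
{
    (self->*Method)(x);
}

// Hypothetical class used only to exercise the template.
struct Counter
{
    int total;
    void add(int x) { total += x; }
};

// The instantiation has an ordinary address, so it could be handed to the
// assembler in place of &A_trampoline.
typedef void (*counter_trampoline)(Counter*, int);
const counter_trampoline add_trampoline = &member_trampoline<Counter, &Counter::add>;
```

The generated thunk itself does not change. It still pushes the int argument and the instance pointer and then calls the trampoline.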


Now an interesting thing to consider is: what would this code look like with different calling conventions? In particular, Microsoft's __thiscall is interesting. __thiscall is like Microsoft's __stdcall, except for the quirk that the this pointer is passed in the ecx register. __stdcall is like cdecl, except that the callee is responsible for cleaning up the stack. Knowing this, and assuming our callback function is expected to be __stdcall, our thunk code could look like this (pseudo code):

mov ecx, &a
jmp A::doIt


Only two instructions! Furthermore, this two-instruction template works regardless of the arity of the member function we're forwarding to, and we don't have to copy any stack arguments or shuffle them around as we needed to for cdecl. Because the thunk uses a jmp rather than a call, the thunk itself doesn't appear on the call stack during a stack trace; it looks as though the member function had been called directly. This is arguably a nice feature from a debugging standpoint (is this a dynamically generated function on the stack, or did we go off into the woods?). It almost looks like __thiscall was at least partially designed with this use case in mind.
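The two-instruction thunk is small enough to encode by hand, without an assembler library. The sketch below only builds the ten bytes; it does not execute them. encode_thiscall_thunk and store_le32 are names invented here, and the sketch assumes 32-bit x86 and that you already know the address the bytes will live at, since jmp rel32 is relative to the end of the instruction. Getting a raw code address for A::doIt out of a pointer-to-member is compiler-specific and is left out.

```cpp
#include <array>
#include <cstdint>

// Writes v into p[0..3] in little-endian order, independent of the host.
inline void store_le32(std::uint8_t* p, std::uint32_t v)
{
    p[0] = static_cast<std::uint8_t>(v);
    p[1] = static_cast<std::uint8_t>(v >> 8);
    p[2] = static_cast<std::uint8_t>(v >> 16);
    p[3] = static_cast<std::uint8_t>(v >> 24);
}

// Encodes:  mov ecx, this_ptr
//           jmp target
// thunk_addr is the address the first byte of the thunk will be placed at.
std::array<std::uint8_t, 10> encode_thiscall_thunk(std::uint32_t thunk_addr,
                                                   std::uint32_t this_ptr,
                                                   std::uint32_t target)
{
    std::array<std::uint8_t, 10> code{};
    code[0] = 0xB9;                      // mov ecx, imm32
    store_le32(&code[1], this_ptr);
    code[5] = 0xE9;                      // jmp rel32
    // The displacement is measured from the byte after the jmp, which is
    // the end of the 10-byte thunk. Unsigned wraparound gives the right
    // two's-complement value for backward jumps.
    store_le32(&code[6], target - (thunk_addr + 10));
    return code;
}
```

To actually run the thunk, the bytes would have to be copied into memory that is mapped executable, at exactly the address passed as thunk_addr.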