Sunday, March 13, 2011

Memory leaks and other bugs

It was pointed out to me that my last post had a bug (one that I wasn't aware of). Specifically, pure_modify leaks the old value. It is a simple fix:

template<typename F>
T pure_modify(F f)
{
    std::auto_ptr<T> mem;
    T* old_val; // not an auto_ptr because if we throw we have not taken ownership.
                // We either threw from user code, or from new. And throwing from
                // user code would be wrong anyway (impure)
    AdapterF<F,T> adapter;
    adapter.f = f;
    adapter.mem = &mem;
    adapter.old_val = &old_val;
    // ... hand adapter to the underlying untyped modify from the last post ...
    delete old_val;
    return *mem.release();
}

We'll also need to adjust AdapterF for this:

template<typename F, typename V>
struct AdapterF
{
    F f;
    std::auto_ptr<V>* mem;
    V** old_val;
    void* operator()(void *in)
    {
        V r = f(*(*old_val = static_cast<V*>(in)));
        mem->reset(new V(r));
        return static_cast<void*>(mem->get());
    }
};

Once pure_modify returns, we know that the last pointer value in old_val belongs to us, because we've successfully replaced it with something else, and it becomes our responsibility to clean it up.
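To make the ownership argument concrete, here is a minimal sketch of the kind of untyped compare-and-swap loop the adapter plugs into. None of these names come from the real code: UntypedCell, IntAdapter, add_one and modify_once are hypothetical stand-ins, and I'm using std::atomic and std::unique_ptr in place of whatever the original used. The thing to notice is that the adapter may run several times, but old_val always ends up holding the pointer that the successful swap displaced.

```cpp
#include <atomic>
#include <cassert>
#include <memory>

// Hypothetical untyped cell; the real one lives in the previous post.
class UntypedCell {
public:
    explicit UntypedCell(void* initial) : slot_(initial) {}

    // Calls g(current) until swapping current for g's result succeeds.
    // g may run more than once, which is why it has to be pure.
    template <typename G>
    void pure_modify(G& g) {
        void* seen = slot_.load();
        for (;;) {
            void* proposed = g(seen);
            // On failure, compare_exchange_weak reloads `seen` for us.
            if (slot_.compare_exchange_weak(seen, proposed)) return;
        }
    }

    void* get() const { return slot_.load(); }

private:
    std::atomic<void*> slot_;
};

// Hypothetical int-only version of AdapterF.
struct IntAdapter {
    int (*f)(int);
    std::unique_ptr<int>* mem;
    int** old_val;

    void* operator()(void* in) {
        *old_val = static_cast<int*>(in);   // remember what we are replacing
        mem->reset(new int(f(**old_val)));  // frees the result of a failed try
        return mem->get();
    }
};

int add_one(int x) { return x + 1; }

int modify_once(UntypedCell& cell, int (*f)(int)) {
    std::unique_ptr<int> mem;
    int* old_val = nullptr;
    IntAdapter adapter{f, &mem, &old_val};
    cell.pure_modify(adapter);
    delete old_val;         // we swapped it out, so it is ours to free
    return *mem.release();  // the cell owns the new block now
}
```

Calling modify_once(cell, add_one) on a cell that holds a heap-allocated 41 should return 42 and leave the cell pointing at the new block, with the old one freed exactly once.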

So then, what was the other bug I alluded to last time? That was the so-called "ABA" problem. Basically, with this code I treat pointers as being unique identities for the life of the process. However, heap implementations reuse memory (that's sort of the point; if you didn't reuse it, you wouldn't need free), so two different objects can have the same address at different points during the lifetime of the program.

Two threads walk into an MVar. They both read the value and begin their operation. Thread A's operation is particularly CPU intensive. Thread B is done lickety-split and puts a new value in. Meanwhile Thread C comes along, and his operation is also fast. Thread B returned the old block of memory to the heap manager, so when Thread C finishes, it happens to grab that very same block for its result and puts it back into the MVar. Thread A finally finishes up, sees the pointer value is the same as last time, and happily shoves his modification in as the new value, completely unaware he has just wiped out the effects of two prior modifications.
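Here is that story replayed as a small single-threaded sketch, so the interleaving is deterministic. Everything in it is made up for illustration: replay_aba is a hypothetical name, the three-element array stands in for the heap so that the address reuse isn't left to chance, and the "threads" are just consecutive blocks of code. The slot itself is a plain std::atomic<int*> rather than the real MVar.

```cpp
#include <atomic>
#include <cassert>

// Replays the A/B/C interleaving and returns the value left in the slot.
int replay_aba() {
    // Hypothetical three-block "heap", so address reuse is guaranteed.
    static int heap[3];
    int* const block0 = &heap[0];
    int* const block1 = &heap[1];
    int* const block2 = &heap[2];

    *block0 = 10;
    std::atomic<int*> slot(block0);

    // Thread A reads the slot and starts a slow computation (+1).
    int* seen_by_a = slot.load();
    int result_a = *seen_by_a + 1;  // 11, based on the original 10

    // Thread B reads, computes +100 into block1, and swaps it in.
    int* seen_by_b = slot.load();
    *block1 = *seen_by_b + 100;     // 110
    bool b_ok = slot.compare_exchange_strong(seen_by_b, block1);
    assert(b_ok);
    // B now "frees" block0, so the allocator may hand it out again.

    // Thread C reads, computes +1000, and its "allocation" reuses block0.
    int* seen_by_c = slot.load();
    *block0 = *seen_by_c + 1000;    // 1110
    bool c_ok = slot.compare_exchange_strong(seen_by_c, block0);
    assert(c_ok);

    // Thread A finishes. The slot holds block0 again, so the stale
    // comparison succeeds even though the contents have changed twice.
    *block2 = result_a;
    bool a_ok = slot.compare_exchange_strong(seen_by_a, block2);
    assert(a_ok);

    return *slot.load();
}
```

If all three modifications had been applied one after another, the slot would hold 1111. Instead replay_aba() returns 11, because Thread A's compare-and-swap only looked at the address and never noticed that B and C had been there.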

And these sorts of things are the reasons why I'm probably not going to go off and implement this kind of abstraction at my day job. That and pure_modify's constraints aren't all that C++-friendly and it's hard to imagine good use cases. Some kind of atomic bignum library maybe?