Sunday, June 26, 2011

Lambda didn't (or won't) kill the expression template star

Now that C++0x is on it's way, and many compilers already implement lambdas, it may be thought that the venerable "functor" is on it's way out. Well aside from the obvious reasons of supporting legacy code, that's simply not true. operator() is here to stay, and I will tell you why.

Lambdas, though convenient, are opaque. You can't peek inside of a lambda or peel back it's inner workings, and the vast majority of the time, that's fine, you don't need to. The same is true for normal named functions and member functions. On occasion however, a slightly more transparent tool can be useful, and operator() can help you build such translucent functions.

Let me try to justify this with an example. LLVM is a library for building compiler backends. It is written in C++, and it happens to have a coding standards document. One of the style points is "Turn Predicate Loops into Predicate Functions".

Basically, instead of code like

bool pred = false;
for(unsigned i = 0; i < list.size(); ++i)
{
if( list[i].satisfies() )
{
pred = true;
break;
}
}

if( pred )
{
...
}


You should have code like

if( TestPred(list) )
{
...
}


They have their justifications, which you can read if you're interested. This really isn't about whether this is a good idea or not, but rather how to write TestPred. There is an obvious approach:
bool TestPred(const std::vector<Something>& list)
{
for(unsigned i = 0; i < list.size(); ++i)
{
if( list[i].satisfies() ) return true;
}
return false;
}


The problem with this, is that you can't reuse the predicate. Or rather, that you can, but only in a very specific way. If you need to test a std::list of Something instead you need to either copy the list or duplicate the code. Worse, if you don't know what the code is, you don't even have duplication as an option.

A brief aside, yes a solution to this mess would be to just use std::find_if in the first place. This is just however the motivating example that came to mind when I happened to be reading the LLVM coding guidelines, so bear with me.

So we can change TestPred to operate on iterators. Now we can use it with list, vector, etc.

template <typename FwdIter>
bool TestPred(FwdIter begin, FwdIter end)
{
for(FwdIter i = begin; i != end; ++i)
{
if( i->satisfies() ) return true;
}
return false;
// or return find_if(begin, end, [](const Something& s) { return s.satisfies(); }) != end
}


But before we had the nice feature of being able to call this on a std::vector directly. We could add overloads for that, but that gets tedious especially since we're writing one of these functions every time we have a new predicate we need to loop over. We can use templates to let us get that std::vector overload back (and more, std::list for example, and your plain old array) by writing the code for iteration once and plugging in the varying i->satisfies part.


template<typename F>
struct RangedAnyPredicate
{
typedef F Tester;
template<typename FwdIter>
bool operator()(FwdIter begin, FwdIter end) const
{
for(FwdIter i = begin; i != end; ++i)
{
if(F::test(*i))
{
return true;
}
}
return false;
}
};

struct satisfier
{
typedef const Something& testee_type;
static bool test(const Something& s)
{
return s.satifies();
}
};

static RangedAnyPredicate<satisfier> TestPred;


Here we've extracted out just the looping over part. TestPred will behave almost identically to our templated on a pair of iterators version of the function (exceptions being things like, we can't pass this TestPred around as a function pointer). This is more typing that previous version of course, even with multiple predicates the overhead of a struct for each one is high. Where it starts to pay dividends is when provide more overloads for free. We're going to need some traits to do that, and again while this is a bunch of code up front, some of these traits are independently useful and already exist.

We want to be able to call this on two iterators, on a container with begin and end methods, and on a standard array. We can distinguish between the two argument and single argument forms easily with the normal overloading mechanism. For distinguishing between containers with begin and end and standard arrays we'll assume anything not an array should have a begin and an end.

So to detect arrays will use an is_array trait:
template<typename T>
struct is_array
{ enum {val = false}; };

template<typename T, long N>
struct is_array<T [N]>
{ enum {val = true}; };


We can then use this information to decide to either call TestPred(arg.begin(), arg.end()) or TestPred(arg, arg + N).


template<bool C>
struct do_test;

template<>
struct do_test<true>
{
template <typename Self, typename T, int N>
static bool go(const Self& self, T (&a)[N])
{
return self(a, a + N);
}
};

template<>
struct do_test<false>
{
template<typename Self, typename C>
static bool go(const Self& self, const C& c)
{
return self(c.begin(), c.end());
}
};

...
// back inside RangedAnyPredicate
template<typename Container>
bool operator()(const Container& c) const
{
return do_test<is_array<Container>::val>::go(*this, c);
}
...


Now we've got overloads for vector, list, and any other container that might come along and behave like a standard sequence, as well as old fashioned arrays.

But there's more. You may have noticed we included a Tester typedef in RangedAnyPredicate. This is where the ability to make a functor less opaque than a regular function or a lambda starts to be useful. Let's say we have two predicates, PredA and PredB. And furthermore the coding guidelines were followed and we have TestPredA and TestPredB functions to work with. If we want to find out if a list has an eliminate that satisfies PredA or PredB we can do the obvious:

if( TestPredA(list) || TestPredB(list) )
{
...
}


But this is C++! The code above has the potential to needlessly traverse the list twice. We've got to be able to do something about that, right? Well not if TestPredA and TestPredB are regular functions. We can write a third function that does the or-ing, but it can't peek inside the functions. If however, we use RangedAnyPredicate to define TestPredA and TestPredB we can extend them as needed. First will need a predicate to represent the or-ing of the internal predicates of A and B:


template
struct OrCombiner
{
typedef typename Lhs::testee_type testee_type;
static bool test(testee_type testee)
{
return Lhs::test(testee) || Rhs::test(testee);
}
};


Next we use a little operator-overloading-fu in RangedAnyPredicate to make this easy to use:

// Back in RangedAnyPredicate
template<typename TestF>
RangedAnyPredicate< OrCombiner<F, TestF> > operator|(RangedAnyPredicate )
{
return RangedAnyPredicate< OrCombiner<F, TestF> >();
}


Now our code becomes


if( (TestPredA | TestPredB)(list) )
{
...
}


and only one pass is made over the list. If you want different semantics, of course you can just simply use the prior code. The same can be done for logical and as well of course.

This was of course a contrived example, but I hope of demonstrated how the functor still has its place, even in a world of lambdas.

Thursday, June 02, 2011

Taking Herb Sutter out of Context (or type system assisted type safety)

I was watching this interview with Herb Sutter and an off-hand comment was made about how you need GC (or automatic memory management) for "type safety". What is "type safety"? I'm going to use a very informal definition, that is, we say a program is "type safe" if a value of T* is not null it is correct to interpret the memory it refers to as a T. This is pretty hand-wavy, but good enough for this exercise.

Now you might say "Well duh, malloc returns void*", but let's instead imagine a C++ variant with no reinterpret_cast (or C casts) no void *, no malloc and no placement or global operator new.

Is this a "type safe" language? Or does it need GC to take it all the way?

Consider the following program, X and Y are classes unrelated by inheritance:


X* x = new X;
x->something();
delete x;
Y* y = new Y;
X x2;
*x = x2;


I just called X::operator= on a Y with no casts. I was able to do this because this hypothetical implementation happens to be super aggressive with regards to using recently freed memory. Now, this is undefined behavior in C++ and would presumably be so in our hypothetical "type safe(er)" subset.

But is that useful? If you're the kind of person who fervently believes in nasal demon style consequences for undefined behavior and that it is foolish to ask for more you can skip to the end, because any undefined behavior in the program can scramble it, you don't need to rely on "aggressive" heap managers and plausible explanations.

So then what we need to do is either a) define this behavior, or b) disallow it or c) refine"undefined behavior" to allow for nasal demons et. al. but not for breaking the type safety of our programs.

Option C is easily said, and since a language spec. is just things you say, might even be easily done (although saying it specifically and clearly would probably be difficult in practice).

We could try to define this behavior. What if delete also set the pointer to null (and we made dereferencing null defined behavior, it calls std::terminate let's say)? This obviously won't be sufficient:

X* a = new X ;
X* b = a;
delete a;


A delete like this could also only ever operate on lvalues.


delete i_only_love_for_your_sideeffects();


would become illegal.

The b = a program snippet is also a counter example to a delete that takes its argument out of scope.

Maybe we can fix this by messing with our "aggressive" allocator. We could specify that once an X, always an X (if not necessarily the same X). That is to say our allocator would be specified as reserving blocks of memory for specific types, once reserved a block could not be reused for allocations of different types.

This places some constraints on our allocator implementation, and is somewhat dissatisfying. But it does satisfy our hand-wavy definition of type-safety. Part of the problem is that our definition of type-safety doesn't seem to let us write safer programs. A pointer that unpredictably becomes an alias to another object without any assignment will probably result in a buggy program, even if we can confidently state the type it aliases.

Except this allocator doesn't even give us that. Remember we're talking about C++ (or a C++ variant anyway) here. Have you ever called a virtual method from a constructor? What about a pure virtual method? What about from a destructor? By the time delete is done our T might not be a T anymore. Assuming a less aggressive allocator (or the program just never news another T) we're left with a type-unsafe program again.

delete seems to be a serious problem. The way garbage collection handles it is, they don't let you delete anything. Can we do the same? We could just remove delete from the language. Then we're fine as long as we have infinite memory. As a practical matter though, we need some way to reclaim the unused memory, since we don't have infinite memory. Oops garbage collection again.

So getting rid of delete is tantamount to GC. Can we keep it somehow? Earlier we discussed the idea of having delete null out its argument. That didn't work because of aliasing. Sounds like we need to get rid of aliasing.

To simplify eliminating aliasing we're going to remove some features from the language. Don't worry: later we'll try and put them back.

We're going to get rid of the address of operator and references (basically any use of the ampersand that doesn't involve Boolean logic). By removing references we just throw away a whole slew of opportunities for aliasing. Ditching the address of operator means the only way of manufacturing a new pointer value is with, well, new. We'll allow aliasing NULL (or we can treat it as syntax if you like).

Ok first we'll do the obvious way of disallowing aliasing: no copy constructor or assignment operator for pointer types. This is clearly useless as you can't even pass pointers to functions then.

Let's get a little more complicated then: when a pointer typed variable appears on the right hand side of an assignment or as a parameter to a function, that variable is no longer considered to be in scope.

It's as though

A * x = new A;
A * y = x;

were transformed to

A * y = NULL;
{
A * x = new A;
y = x;
}


As you can see, this is a purely mechanical transform, and can be enforced at compile-time. Except for one little wrinkle: global pointers. So ban 'em, we don't need them. Or we can be a little nicer and just ban assigning from them, but that has its own wrinkle. Note that this restriction is infectious, no global structs with pointer members or members with pointer members.

As a convenience we can allow assigning to an assigned from pointer variable and treating it as though we've introduced a new variable with same type and name. This allows code like:

x = munge(x);


We have to be careful with conditionals: a simple rule is to require that the same variables are assigned-from in all branches of the conditional, and this can be checked at compile-time. This is some more pain for our programmers but such is life.

We're now type-safe but still manually managing memory. Once you say

delete x;


x goes out of scope, so we can't use the value x referred to after it's been deleted, and we know there are no aliases to it.


T* sneaky(T* x)
{
delete x;
return x; // compiler error here
}

T * x = ...;
T * y = x;
delete x; // compiler error here
y->stuff();


As you may have realized by now I didn't invent this idea, it's called linear types and there's been a fair amount of research into it which you can look up with your favorite search engine. In this case we've made all pointers linear types, in languages that support this for "real" they'll typically have non-linear versions of the types as well.

The usual example actually isn't memory (because GC makes using linear types for this redundant) but rather file handles. A linear file handle let's you make sure you've got an open handle before you try to read or write to it, and insures you don't close a handle twice, etc. I think you can imagine how that would work by this point.