Eric Niebler: stupid name lookup

Eric Niebler shows a name lookup failure caused by a recursive template function inferring its return type via decltype. Stupid problem with a rather nasty hacky workaround, but also a likely change in C++14 that will be most welcome, and makes the whole problem go away ūüôā

Link | Posted on by | Tagged , , , , | Leave a comment

Using the standard allocator for a different type [and get free functionality for your custom allocator]

Someone on comp.lang.c++.moderated recently asked “if I have a different allocator, can I use that allocator to allocate memory with vector or string…?”. Since there is not only the mechanism supported by C++98/03, but also a more general – and helpful – mechanism introduced in C++11, I thought I’d write a short post on both approaches.

The problem is this: we have a class template that has an allocator type parameter (defaulted to std::allocator), e.g:

template<class T, class Allocator = std::allocator<T> >
struct linked_list {
   typedef Allocator value_alloc;
   typedef /* ??? */ node_alloc;

We can now create new value type objects via the provided allocator, but how do we create node objects? You could use new, but if the user wants memory to be allocated in a particular way for the value type it’s likely they want the same policy to apply to all memory you allocate.

Fit the First

The solution for this is to use the ‘static rebind’ facility supported by the standard allocator, and hopefully any custom allocators:

template<class T, class Allocator = std::allocator<T>>
struct linked_list {
   typedef typename Allocator::template rebind<     T  >::other  value_alloc;
   typedef typename Allocator::template rebind<node<T> >::other  node_alloc;

You’ll notice that I rebound both allocator types; this is because nothing in the template parameter list specifies that Allocator must be an allocator of T objects, so is really just doing due diligence on an unknown that we’ve been given. (I know nothing even specifies that it’s an allocator or has an allocator-like interface, but hey, until we have concepts we’ll just go with it…)

This does exactly what is needed to solve the issue, and is supported by all standard-compliant standard library implementations, but let’s see if newer facilities can help us to improve on it.

A viable alternative?

An alternative that avoids the rebind is to take the allocator as a template-template parameter instead, employing a typical policy-based design pattern:

template<class T, template<class> class Allocator = std::allocator>
struct linked_list {
   typedef Allocator<     T  > value_alloc;
   typedef Allocator<node<T> > node_alloc;

Although this will work it has a major downside: it forces the allocator to be a class template taking only 1 type parameter, which is restrictive to the point of brokenness. You could improve on this in C++11 by using a template parameter pack – template<class...> class Allocator – but unless all template parameters are type parameters, and all but the first are defaulted, it is still broken.

C++11: a uniform rebind interface

With C++11’s added support for allocator traits, we now have a uniform interface for static rebinding. (Note that I make use of the new ‘alias declaration’ facility via the using keyword.)

template< class T, class Allocator = std::allocator<T> >
struct linked_list {
   using value_alloc = typename
      std::allocator_traits<Allocator>::template rebind_alloc<T>;
   using node_alloc  = typename
      std::allocator_traits<Allocator>::template rebind_alloc<node<T>>;

std::allocator_traits supplies a uniform interface to all allocator types without enforcing the entire default allocator interface on every allocator type. This is not all however; the default allocator_traits instantiation gives you the following for free:

  • default definitions of 9 member typedefs and 2 template alias declarations
  • default implementations for 4 member functions, including construct() and destroy(), which call placement new and the object destructor respectively.

If you are writing a custom allocator then this saves writing a fair amount of dull boilerplate, but note that if you wish to provide backward-compatibility with user code that does not use allocator_traits then you will still need to provide it all yourself.


If you don’t yet have support for the new allocator_traits then the first is the only viable option. For anyone able to take advantage of the new facilities afforded by allocator_traits then I recommend doing so.

If you are creating a custom allocator then it is up to you to determine whether the backward compatibility or minimisation of boilerplate is more important.

Posted in Uncategorized | Tagged , , | Leave a comment

Generalised type-deduction for class template instance construction

I’ll wager that most C++ developers with experience of templates have at some point discovered the limitation that a class template’s parameters cannot be deduced in construction. Since type deduction¬†can be performed in invocation of a function template however, I commonly create a helper function such as make_wrapper in the following sample to enable type deduction to take place when I want it:

#include <type_traits>

template< class T >
struct wrapper
   wrapper( const T& );

wrapper make_wrapper(const T& t)
   return wrapper(t);

static_assert( std::is_same< decltype(make_wrapper(42)), wrapper >::value, "incorrect type deduction" );

Common examples of this in the standard library are make_pair, make_shared and make_tuple (the latter 2 being available in C++11 only).

The annoyance of this however is that you must write a deduced construction helper for each class template you wish to be able to perform deduced construction of. Except that’s rather untrue: you could write a deduced construction helper for each class template parameter pattern, e.g. <class>, <class, class>, <class, class, class> etc, but with the aid of C++11 language & library additions we can do better than that.

Generalising the type-deduction helper

Thanks to perfect forwarding and variadic templates it is now possible to generalise this pattern to support any template class. If you are unfamiliar with these techniques I recommend reading Sutter’s description of a make_unique function (see the section entitled Enter make_unique) and Meyers’ post on what he calls ‘universal references‘.

#include <type_traits>

/** Changes std::reference_wrapper to the corresponding reference type. */
template< class T >
struct unwrap {
	typedef T type;

template< class T >
struct unwrap< std::reference_wrapper > {
	typedef T& type;

template< class T >
struct unwrap< std::reference_wrapper > {
	typedef const T& type;

/** Deduced construction of a template class with given parameters. */
template< template class T, class... Args >
auto make( Args&&... args ) -> T< typename unwrap< typename std::decay::type >::type... >
	return T<
		typename unwrap<
			typename std::decay::type
	>( std::forward(args)... );

Things to note about this implementation:

  • The unwrap template and specialisations enables the call site to force a reference type in the deduction in a similar way to std::bind.
  • std::decay enables this to model function parameter passing more closely by performing certain conversions, such as array-to-pointer.
  • a noexcept constructor does not result in make being noexcept, although it should be straightforward to add support to do so.

Putting it to use

The following code sample shows how this may be used:


/** Simple wrapper class. */
template< class T >
struct wrapper
	explicit wrapper( const T& t_ ): t(t_) {}

	template< class U >
	explicit wrapper( U&& u )
		: t( std::forward<span style="text-decoration: underline;">(u) )

	T t;

/** Simple helper to clarify the assertions below. */
template< class, class >
struct assert_wrapped_type_same;

template< class T, class U >
struct assert_wrapped_type_same< wrapper, U >
	static_assert( std::is_same< T, U >::value, "wrapped type mismatch" );

int main(int, char**)
	// basic test from literal; copy makes from rvalue
	auto w0 = make( 42 );
	assert_wrapped_type_same< decltype(w0), int >();
	assert( w0.t == 42 );

	// basic test from lvalue: copy makes and drops const qualification
	const std::string hello("world!");
	auto w1 = make( hello );
	assert_wrapped_type_same< decltype(w1), std::string >();
	assert( w1.t == "world!" );

	// test decay from array type.
	auto w2 = make( "xyzzy" );
	assert_wrapped_type_same< decltype(w2), const char* >();
	assert( std::string(w2.t, 5) == "xyzzy" );

	// test moving a non-copyable object into makeor
	std::unique_ptr upi( new long(120623) );
	auto w3 = make( std::move(upi) );
	assert_wrapped_type_same< decltype(w3), std::unique_ptr >();
	assert( !upi );
	assert( w3.t and *w3.t == 120623 );

	// test use of reference wrappers to qualify deduced type; std::cref -> const T&
	std::string s("must go faster");
	auto w4 = make( std::cref(s) );
	assert_wrapped_type_same< decltype(w4), const std::string& >();
	assert( &w4.t == &s );

	// and std::ref -> T&
	auto w5 = make( std::ref(s) );
	assert_wrapped_type_same< decltype(w5), std::string& >();
	assert( &w5.t == &s );

	// now a sample with a variadic template
	auto tup0 = make( 42, "hello", std::string("world!"), std::unique_ptr(new long(120623) ), std::cref(s), std::ref(s) );
		std::is_same< decltype(tup0), std::tuple<int, const char*, std::string, std::unique_ptr, const std::string&, std::string&> >::value,
		"tuple type mismatch" );
	assert( std::get(tup0) == 42 );
	assert( std::get(tup0) == std::string("hello") );
	assert( std::get(tup0) == std::string("world!") );
	assert( std::get(tup0) and *std::get(tup0) == 120623 );
	assert( &std::get(tup0) == &s );
	assert( &std::get(tup0) == &s );

You can download the complete code sample here; I have tested it successfully with gcc 4.7.2, but it won’t build on 4.6.3.

I’d love to hear what you have to say on this – whether it’s useful to you, if you have any suggestions/corrections etc.

Update, 2013-03-28

Mike Spertus and Daveed Vandevoorde have recenly submitted a standards proposal that would enable deduction of class template parameters in some constructor calls: n3602. The approach is fundamentally different from the make-based approach, which is good as that has some significant issues (not least that it only works where the template type parameter list matches the constructor parameter list).

In their approach for each of the class template’s constructors the compiler will add a function template with the same parameter list to an overload set. The template parameters can then be deduced by performing overload resolution on that set via a synthesised (i.e. pretend) call.

Posted in Uncategorized | Tagged , , , , | Leave a comment

Extending the empty base class optimisation (EBO) to members

The empty base class optimisation or EBO is a space-saving technique that is employed by many popular C++ compilers. Simply put by the standard (§1.8/5 in all versions):

…a most derived object shall have a non-zero size and shall occupy one or more¬†bytes of storage. Base class subobjects may have zero size.

This technique is employed in particular by some standard library implementation where a container template class’ constructor takes a (templated) allocator instance. Even if an allocator object is empty, to accept and store it as a distinct member sub-object requires that it occupy real space within the container class instance; this will likely be 1 byte (the minimum non-zero size) plus whatever padding is required for your platform’s alignment (likely a further 3 or 7 bytes for 32-bit or 64-bit respectively). By instead making the allocator type a¬†base class of the container, this space is no longer required.

I recently found an alternative approach described by Nathan Myers in 1997, in which he wraps some other member object in a special-purpose struct, and then makes the empty type a base of¬†that, thus still employing EBO. It’s a nifty little technique that, although rarely likely to be useful, will be of immense help should I ever have an issue with memory footprint of such a type.

All the detail can be read on Nathan’s website here:¬†The “Empty Member” C++ Optimization.

Posted in Uncategorized | Tagged | 3 Comments

Efficient argument passing in c++11 [via A Sense of Design]

Boris Kolpackov has written a series of 3 articles on the minefield of efficient argument passing in C++11.

In part 1 he discusses the addition to C++11 – rvalue references – that improves argument-passing efficiency in some cases, but that it adds complexity for the general case. There are some interesting points here, particularly that there is no 1-size-fits-all solution for the argument signature (const lvalue reference, rvalue reference, value).

I think for a while I’ve been vaguely aware that something didn’t feel quite right, since I’d been tailoring a few method signatures with the knowledge of whether or not the caller would be moving an object in, and whether I want to copy or just reference the argument. Clearly the loss of opacity, and fragility in the face of implementation change are undesirable in our code.

In part 2 Boris outlines a wrapper similar to std::reference_wrapper that can be constructed with either an rvalue reference or const lvalue reference. It’s rather a good wrapper, but impractical for use everywhere; more suitable for generic library code that needs to be as tight as possible.

Part 3 summarizes and appraises the argument reference wrapper approach, and makes some general conclusions about argument passing generally in C++11.

It’s a shame that this area is as un-clear cut as it seemed; whether a language change to mitigate this occurs, or is even possible, will remain to be seen. For now, following Boris’ suggestion to think about the conceptual requirements of each passed argument, and avoid premature optimisation, seems to be the best approach.

Update 2012/10/17

Sumant Tambe of¬†C++ Truths recently gave a talk entitled “C++11 idioms” at Silicon Valley Code Camp, in which he discussed the above approach to efficient argument passing, plus pros/cons and alternative approaches.

Scott Meyers has also posted a response to Sumant’s presentation, building on it and adding his own wisdom, in particular:

The fundamental problem is that perfect forwarding and overloading make¬†very¬†bad bedfellows, because perfect forwarding functions want to take¬†everything. They’re the greediest functions in C++.

There’s an interesting discussion with¬† plenty of good stuff in the comments under that post too.

Posted in Uncategorized | Tagged , | Leave a comment

Why processing a sorted array might be faster than unsorted

I’ve just read this week’s stackoverflow newsletter, and was intrigued by a recent top question “Why is processing a sorted array faster than an unsorted array?”

Anyone writing performance-critical processing code should be aware of this; the C++ comparison showed that processing the sorted array was nearly 6x faster than with the unsorted, which is certainly an eyebrow-raising result. A straight Java equivalent showed a similar, albeit less extreme, difference.

The reason for this potentially surprising behaviour is that in the code with the unsorted array, the CPU’s branch predictor fails 50% of the time and stalls the pipeline.¬†I’ll hand you over to GManNickG to show you the code and Mysticial to explain, rather well, what is going on: Why is processing a sorted array faster than an unsorted array?

This is something I was aware of but have never taken into account, so I’ll be trying to remember Mysticial’s general rule of thumb to “avoid data-dependent branching in critical loops“.

Posted in Uncategorized | Tagged , , , , | Leave a comment

Virtual move constructor

Virtual constructor: recap

Most experienced C++ developers should be familiar with the virtual constructor idiom, which allows us to duplicate a polymorphic object where we do not know its exact type, only the base.

It is usually called clone and can be implemented and used as shown in this rather contrived example:

struct fruit {
	virtual ~fruit() {}

	virtual std::unique_ptr<fruit> clone() const = 0;

struct apple
	: fruit
	virtual std::unique_ptr<fruit> clone() const {
		return std::unique_ptr<fruit>( new apple(*this) );

// orange, pear, roddenberry etc...

int main() {
	const fruit& f = apple();
	auto fc = f.clone();

Note that when using unique_ptr we lose the ability to have covariant return types, since unique_ptr< apple > is not derived from unique_ptr< fruit >. In my experience I’ve never needed a derived-type copy out of clone (outside of unit testing anyway), so the covariance offers me nothing, whereas the unique_ptr gives me that nice warm fuzzy feeling ūüôā

Virtual move constructor

Now, it may seem odd to want a virtual move constructor – why would I want to move an object’s content into a new object rather than continue using the old object? I’ll give the use case below, but for now just take as read that it might be helpful ūüôā

It’s rather easy to implement if you already know the virtual constructor idiom, as we only need to remove the constness and generate an rvalue reference to *this.

Building on the above example we have:

struct fruit {
	virtual ~fruit() {}

	virtual std::unique_ptr<fruit> clone() const = 0;
	virtual std::unique_ptr<fruit> move_clone() = 0; // clearly non-const

struct apple
	: fruit
	virtual std::unique_ptr<fruit> clone() const; /* as before */

	virtual std::unique_ptr<fruit> move_clone()  {
		return std::unique_ptr<fruit>( new apple( std::move(*this) ) );

If we’re wanting a virtual move constructor then we’re likely to want to create move constructors for the classes as well, but I’ve left that out from here.


So I know the concept of a virtual move constructor seems odd, but I’ve got a use case I can think of no better solution for than this:

  • We have a non-copyable polymorphic buffer object¬†that we wish to move ownership of into a functor, which will be run exactly once passing ownership elsewhere. Due to the constraints of the system (and following standard library¬†convention) this functor must be copyable.
  • To safely enable the functor to be copyable we convert the unique_ptr< buffer >¬†into a shared_ptr¬†inside the functor.
  • Once the functor has run we wish to release ownership of the buffer from¬†the shared_ptr…but shared_ptr doesn’t provide a release¬†method.
  • So we ‘move clone’ the shared buffer instance to release its internally-owned¬†resources to a new instance, allowing shared_ptr to destroy the now ‘zombified’¬†original instance.

I couldn’t find any mentions of ‘virtual move constructor’ about, so it’s likely I’m doing something so ridiculous that nobody else has thought of it. I can’t see any alternatives that wouldn’t require making buffer copyable and maintaining an internal reference count¬†however¬†(which would be comparable performance to this solution, but with more pain and maintenance), so comments and alternative solutions are welcome.

Posted in Uncategorized | Tagged , | 8 Comments