C++ Metaprogramming Adventure: Named Parameters

Back in January I embarked on a C++ Metaprogramming Adventure. I thought it would be a fun experiment to try to implement named parameters in C++. It was fun indeed, but also infuriating at times, especially when the compiler started throwing internal errors 🙂

The primary goal was not to achieve something that I or anyone else would actually use. I fully expected the overhead in terms of performance, and more importantly in terms of code readability, to be too much. If you are looking for an actual viable solution to this problem, it looks like Boost includes a rather good one. However, if you are looking to learn some interesting metaprogramming techniques, you’ll find plenty in this article.

What is metaprogramming anyway? Well, programming is about telling the computer what to do with input data and what to output. Metaprogramming is about telling the computer what to do with a program. I did not modify the compiler though; the adventure took place within the confines of C++ code. However, what I tried to achieve sounds more like a language feature than just a library function, and that’s why I call it metaprogramming.

Motivation for Named Parameters

The main point of a named parameter is that the caller has to write the name, which helps the reader and prevents bugs. Imagine a method:

void add( FooObject * foo, bool takeOwnership )

Usually the caller would just write true/false for the bool parameter.

add( foo, true )

For a casual reader it can be hard to guess what the bool parameter means. If there are more parameters like that, it can be hard to remember their order even for someone who uses the method regularly. And if there are too many, it can be hard to write a call correctly even after checking the declaration. However, if the method supported named parameters, the caller could (or would have to) supply a name for the parameter, for example:

add( foo, take_ownership : true )

Of course C++ doesn’t support such syntax, but I tried to achieve something similar.

Code

This metaprogramming adventure is split into several stages so that you are not overloaded by multiple new concepts at once. I’ll show simplified examples in this article. The actual source code is usually more complex in terms of number of parameters etc. The code depends on typestring.hh, which is not included in the archive.

Stage 1: typestring

When I scouted for possible approaches I found a gem called typestring. I immediately started thinking about the ways it could be utilized.

Typestring is a variadic template class where the template parameters are characters.

irqus::typestring<'a', 'b', 'c'> abc;

Unfortunately, C++ does not let you pass a string literal instead of a list of characters, yet. So George “irrequietus” Makrydakis, author of typestring, developed a macro that does that. The following declaration of abc is equivalent to the first one.

typestring_is("abc") abc;

Under the hood the macro evaluates only a fixed number of characters. It also uses the preprocessor rather aggressively. I ran into internal compiler errors with Visual Studio 2017 and had to reduce the number of evaluated characters to just four. The more code I wrote the worse it got, to the point that it didn’t compile at all at times. GCC, on the other hand, works flawlessly.
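To make the idea concrete without depending on typestring.hh, here is a minimal sketch of a character-pack type. CharPack and str() are hypothetical names of mine, not part of the typestring library, and the macro that turns a literal into such a type is left out entirely.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical stand-in for irqus::typestring: the characters themselves
// are the template arguments, so every distinct name is a distinct type.
template<char... Chars>
struct CharPack
{
	// Rebuild the run-time string from the compile-time characters.
	static std::string str()
	{
		return std::string{ Chars... };
	}
};

using Abc = CharPack<'a', 'b', 'c'>;
using Abd = CharPack<'a', 'b', 'd'>;

// Same length, one different character: still two unrelated types.
static_assert(!std::is_same<Abc, Abd>::value, "names must be distinct types");
```

The static_assert is the whole point: the compiler can tell two names apart because they are different types, which is what the later stages build on.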

Stage 2: NamedValue

If I can treat a name (string literal) as a type, then I can utilize C++ parameter rules to enforce a match between the passed parameter name and the expected parameter name.

NamedValue< typestring_is("abc"), int > namedValue( 0 );

Here, I pair the integer value zero with the name “abc”. NamedValue is a template class with two parameters: the first is a typestring and the second is the payload type. See NamedValue.h for details. At runtime the typestring parameter defines the behavior of the getName() method, but the important part happens at compile time. Typestrings with different names are different types, and that carries over to NamedValue, as seen below:


NamedValue< typestring_is("abc"), int > abc( 0 );
NamedValue< typestring_is("abd"), int > * abdPtr = nullptr;
abdPtr = &abc; // Error C2440 'initializing':
// cannot convert from 'NamedValue<irqus::typestring<97,98,99>,int> *'
// to                  'NamedValue<irqus::typestring<97,98,100>,int> *'
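For readers who do not have NamedValue.h at hand, the following is a rough sketch of what such a class might look like. It is an assumption on my part, built only from what the article mentions (a constructor from the payload, getName(), and dereferencing to read the value). CharPack is a hypothetical stand-in for typestring.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical stand-in for irqus::typestring.
template<char... Chars>
struct CharPack
{
	static std::string str() { return std::string{ Chars... }; }
};

// Sketch of a name/payload pair; the real NamedValue.h may differ.
template<class TName, class TPayload>
class NamedValue
{
public:
	// Deliberately not explicit, so a plain payload converts implicitly.
	NamedValue(TPayload value) : value_(value) {}

	// Dereference to read the payload.
	const TPayload & operator*() const { return value_; }

	// The name is recovered from the type, not stored in the object.
	static std::string getName() { return TName::str(); }

private:
	TPayload value_;
};

using AbcInt = NamedValue<CharPack<'a', 'b', 'c'>, int>;
using AbdInt = NamedValue<CharPack<'a', 'b', 'd'>, int>;

// Different names give unrelated types, so the pointers do not convert.
static_assert(!std::is_convertible<AbcInt *, AbdInt *>::value,
	"a value named abc must not pass for one named abd");
```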

Stage 3: First prototype

At this point I was ready to make the first actual function with named parameters. All I needed to do was to use NamedValue with the appropriate typestring name and payload type as a function parameter, both in the function declaration/definition and in the call.


static void foo( NamedValue<typestring_is("bar"), bool> bar ) { ... }

...

foo( NamedValue<typestring_is("bar"), bool>( false ) );

However, that code is rather wordy and I hate to repeat myself. So I added a couple of macros that let me write much terser code, see NamedValueMacros.h. The first issue was that I wanted to specify just the name and value when calling a function and not much else. Hence I made the NV macro that can be used like this:

NV( bar, false )

This creates a temporary NamedValue named “bar” with the value false. The payload type is inferred. To achieve this I made a template function MakeNamedValue:


template<class TName, class TPayload>
static NamedValue<TName, TPayload> MakeNamedValue(TPayload value)
{
	// TName is supplied explicitly by the caller, TPayload is deduced.
	return NamedValue<TName, TPayload>(value);
}

This allows the compiler to infer the TPayload type from the passed parameter. The macro is now easy:

#define NV( name, value ) MakeNamedValue<typestring_is(#name)>( value )

The number-sign or “stringizing” operator (#) converts macro parameters to string literals without expanding the parameter definition. That way I don’t have to enclose the name in quotation marks. So a named parameter can now be passed reasonably, but the function declaration/definition is still a pain. Therefore I also made the NamedParam macro.
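The two mechanisms used by NV can be tried in isolation. The sketch below is not the article’s code: BarName is a hypothetical tag type standing in for typestring_is("bar"), NamedValue is reduced to a bare struct, and NAME_OF and ANSWER are made-up macros that only demonstrate the stringizing operator.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical tag type standing in for typestring_is("bar").
struct BarName {};

// Reduced NamedValue: just enough to show the type deduction.
template<class TName, class TPayload>
struct NamedValue
{
	TPayload value;
};

// TName is given explicitly; TPayload is deduced from the argument.
template<class TName, class TPayload>
NamedValue<TName, TPayload> MakeNamedValue(TPayload value)
{
	return NamedValue<TName, TPayload>{ value };
}

// '#' turns the macro argument into a string literal exactly as written.
#define NAME_OF(identifier) #identifier

// ANSWER is itself a macro, yet NAME_OF(ANSWER) yields "ANSWER", not "42",
// because the operand of '#' is not macro-expanded.
#define ANSWER 42
```

Calling MakeNamedValue<BarName>(false) produces a NamedValue<BarName, bool> without the caller ever spelling out bool.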

#define NamedParam( type, name ) NamedValue<typestring_is(#name), type> name

This is another use of the “stringizing” operator. This time it also removes the need to repeat the name, which would otherwise appear once as a string literal and once as an identifier.

Using the macros, the original example is reduced to the following:


static void foo( NamedParam( bool, bar ) ) { ... }

...

foo( NV( bar, false ) );

At this point I had a working example, but I knew it had rather unfortunate limits. The next two stages outline what can and cannot be done with this setup.

Stage 4: Default parameters

One of the advantages of named parameters is that, at least conceptually, any parameter can be omitted provided a default value is defined. In contrast, traditional function calls can omit only trailing parameters. I was not there yet, but I noticed that at least the standard C++ default values still work.

static void foo( NamedParam(bool, bar) = false ) { ... }

Indeed, after the expansion of the NamedParam macro we get the following code.

static void foo(  NamedValue<typestring_is("bar"), bool> bar = false ) {}

The only mystery is how you can assign false to a value typed as NamedValue<typestring_is("bar"), bool>. Well, the C++ compiler happily uses the constructor from the payload value to perform an implicit conversion. This would not be possible if the constructor had the “explicit” specifier.
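The effect of the “explicit” specifier is easy to check in isolation. The following sketch uses two hypothetical wrapper types, ImplicitFlag and ExplicitFlag, instead of NamedValue; only the first one can be used with a plain bool as default value.

```cpp
#include <cassert>
#include <type_traits>

// Converting constructor: a bool converts to ImplicitFlag on its own.
struct ImplicitFlag
{
	ImplicitFlag(bool v) : value(v) {}
	bool value;
};

// Explicit constructor: the caller has to write ExplicitFlag(true).
struct ExplicitFlag
{
	explicit ExplicitFlag(bool v) : value(v) {}
	bool value;
};

static_assert(std::is_convertible<bool, ImplicitFlag>::value,
	"implicit conversion is available");
static_assert(!std::is_convertible<bool, ExplicitFlag>::value,
	"explicit blocks the implicit conversion");

// The default argument relies on the implicit conversion from bool.
inline bool readFlag(ImplicitFlag flag = false)
{
	return flag.value;
}
```

Writing the same default argument for an ExplicitFlag parameter would fail to compile as soon as the default is actually used.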

Stage 5: Unordered parameters via overloading

The other advantage of named parameters, again at least conceptually, is that the caller does not have to follow any specific order when listing them. Can I achieve that in C++ easily? Not really. The easy solution for simple cases (one or two parameters) is overloading. However, as I mentioned in the motivation section, named parameters are most useful when there are many parameters. Anyhow, the example below showcases overloading.


static void foo( NamedParam( bool, bar ) = false,
                 NamedParam( bool, baz ) = true )
{ ... }
static void foo( NamedParam( bool, baz ),
                 NamedParam( bool, bar ) = false )
{ foo( bar, baz ); }

Creating overloads quickly becomes laborious and error-prone as the number of parameters grows, because n parameters can be ordered in n! ways. Can we find a different approach that would make this manageable?

Stage 6: Unordered parameters via template pattern matching

This is the part where the proverbial excrement hits the fan. It took a lot of code to make this work (although it scales linearly with the parameter count) and due to the overall complexity I have not provided any syntactic sugar for the implementer. At least calling the final method is as easy as before.

The high level idea is not that hard, but before we get to that, let me present two data structures that play key roles.

Compile-time List

You might be familiar with this kind of list from Prolog. The compile-time list (CTL) is a template class defined in CompileTimeList.h. A non-empty CTL has two fields, THead head and TTail tail, where the head is the first element of the list and the tail is a CTL containing the rest of the elements. An empty CTL is represented by the class EmptyCompileTimeList. This way the type of each element is known at compile time.

The ConcatenateCtl() function makes it easy to combine CTLs.


template<class TLeftHead, class TLeftTail, class TRightHead, class TRightTail>
auto ConcatenateCtl(
	CompileTimeList<TLeftHead, TLeftTail> left,
	CompileTimeList<TRightHead, TRightTail> right )
{
	// Keep the left head, then append the right list to the left tail.
	return ConcatenateCtl( left.head, ConcatenateCtl(left.tail, right) );
}

As you can see, the new tail is the tail of the left argument concatenated with the right argument. There are several more overloads for various edge cases, such as EmptyCompileTimeList in either argument and an omitted argument. There are also overloads for the special type IgnoredCtlElement *, which is simply omitted from the resulting CTL.
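Since CompileTimeList.h is not reproduced here, the following self-contained sketch shows one way the recursion could be written. It is not the article’s implementation: PrependCtl and ConcatCtl are hypothetical names, and the edge cases are reduced to a single overload for an empty left list.

```cpp
#include <cassert>
#include <type_traits>

struct EmptyCompileTimeList {};

template<class THead, class TTail>
struct CompileTimeList
{
	THead head;
	TTail tail;
};

// Put one element in front of an existing list.
template<class TElem, class TList>
CompileTimeList<TElem, TList> PrependCtl(TElem element, TList list)
{
	return CompileTimeList<TElem, TList>{ element, list };
}

// Base case: an empty left list contributes nothing.
template<class TRight>
TRight ConcatCtl(EmptyCompileTimeList, TRight right)
{
	return right;
}

// Recursive case: keep the left head, concatenate the left tail with right.
template<class THead, class TTail, class TRight>
auto ConcatCtl(CompileTimeList<THead, TTail> left, TRight right)
{
	return PrependCtl(left.head, ConcatCtl(left.tail, right));
}
```

Each recursive call instantiates a different specialization, so the deduced return type encodes the element types of the whole joined list.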

MakeCtl() can help with creation of a list from elements.


template<
	class T1 = IgnoredCtlElement *,
	class T2 = IgnoredCtlElement *,
	...
>
auto MakeCtl(
	T1 e1 = nullptr,
	T2 e2 = nullptr,
...
)
{
	return (
		ConcatenateCtl(e1,
		 ConcatenateCtl(e2,
...
		              ConcatenateCtl(e15
		)))))))))))))))
	);
}

This showcases an interesting trick: the default value nullptr combined with the default type IgnoredCtlElement * means that omitted parameters become nullptr typed as IgnoredCtlElement *. ConcatenateCtl() skips elements typed as IgnoredCtlElement *. That way I avoided having to create an overload for each possible argument count.
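The trick can be reproduced on a smaller scale. The sketch below handles up to three elements and uses a class template partial specialization (ElementAdder, a hypothetical helper) to drop the placeholders, rather than the overloads of ConcatenateCtl() described above. MakeSmallCtl and AddElement are likewise names I made up for illustration.

```cpp
#include <cassert>
#include <type_traits>

struct IgnoredCtlElement {};
struct EmptyCompileTimeList {};

template<class THead, class TTail>
struct CompileTimeList
{
	THead head;
	TTail tail;
};

// A real element becomes the new head of the list.
template<class TElem, class TList>
struct ElementAdder
{
	typedef CompileTimeList<TElem, TList> Result;
	static Result run(TElem element, TList rest) { return Result{ element, rest }; }
};

// A placeholder is dropped: the list is returned unchanged.
template<class TList>
struct ElementAdder<IgnoredCtlElement *, TList>
{
	typedef TList Result;
	static Result run(IgnoredCtlElement *, TList rest) { return rest; }
};

template<class TElem, class TList>
typename ElementAdder<TElem, TList>::Result AddElement(TElem element, TList rest)
{
	return ElementAdder<TElem, TList>::run(element, rest);
}

// Omitted arguments default to a null IgnoredCtlElement * and vanish.
template<
	class T1 = IgnoredCtlElement *,
	class T2 = IgnoredCtlElement *,
	class T3 = IgnoredCtlElement *
>
auto MakeSmallCtl(T1 e1 = nullptr, T2 e2 = nullptr, T3 e3 = nullptr)
{
	return AddElement(e1, AddElement(e2, AddElement(e3, EmptyCompileTimeList{})));
}
```

With this, MakeSmallCtl(1, 'x') yields a two-element list and MakeSmallCtl() yields EmptyCompileTimeList, all from a single function template.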

Parameter Structure

This is just a structure that holds parameters instead of them being passed individually via function parameter list.


class FooParameters
{
public:

	unique_ptr<bool> bar;
	unique_ptr<bool> baz;
};

It uses unique_ptr so that an unset parameter can be represented by nullptr. Not the most efficient approach, but it works. If a parameter is omitted, the unique_ptr remains null and the business logic function at the end can address that by using default value instead.
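As a small illustration of that last point, the business logic could read a parameter through a helper like the one below. valueOr is a hypothetical function, not part of the article’s source.

```cpp
#include <cassert>
#include <memory>

struct FooParameters
{
	std::unique_ptr<bool> bar;
	std::unique_ptr<bool> baz;
};

// Hypothetical helper: use the supplied value, or the default when the
// parameter was omitted (the pointer is still null).
inline bool valueOr(const std::unique_ptr<bool> & param, bool fallback)
{
	return param ? *param : fallback;
}
```

If C++17 is available, std::optional would express the same “maybe set” state without a heap allocation.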

Listify

The first step is to collect all parameters passed to the function and put them into a CTL.

The function has to be a template so that parameter types are known at compile-time. That is a common theme of this template pattern matching approach.


template<class T1, class T2>
void foo_listify( T1 p1, T2 p2)
{
	auto list = MakeCtl(p1, p2);
	return foo_initStruct(list);
}

MakeCtl returns a properly typed CTL, so the types of p1 and p2 are preserved in the list type.

Init struct

The second step is to initialize an empty parameter structure.


template<class THead, class TTail>
void foo_initStruct( CompileTimeList< THead, TTail> list )
{
	FooParameters paramStruct;
	return foo_matchHead( list, paramStruct );
}

This is simple since each unique_ptr initializes itself to nullptr on its own, but it is an important step. It is also worth noting that the foo_matchHead functions modify the local paramStruct variable declared here.

Parameter matching

The third step is to loop through all CTL elements and match each of them to a parameter. The parameter value is assigned to the appropriate field in the parameter structure. The loop is actually a recursive call to the foo_matchHead() function, performed as long as the CTL is non-empty.

Of course there are several overloads of foo_matchHead(), otherwise there wouldn’t be much pattern matching to talk about. The main body of work is done by the following family of functions.


template<class TTail>
void foo_matchHead(
	CompileTimeList< NamedValue<typestring_is("bar"), bool>, TTail> list, 
	FooParameters & params )
{
	params.bar.reset( new bool(*(list.head)) );
	return foo_matchHead( list.tail, params);
}

As you can see, this function is called only if the list head happens to be NV( bar, ? ). If that is the case, the value of the head is assigned to bar in the parameter structure. Then it continues recursively with the tail.

When all elements are exhausted, the tail passed to the next foo_matchHead() is EmptyCompileTimeList, so the following function is called.


void foo_matchHead( EmptyCompileTimeList emptyList, 
                    FooParameters & params )
{
	return foo_final(params);
}

Not much to do here, just execute the business logic.

There is also a catch-all overload for the edge case when parameter is not recognized due to name and payload type not matching anything expected.
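The matching steps above can be condensed into one self-contained sketch. It makes several simplifying assumptions: CharPack stands in for typestring, NamedValue is a bare struct with a value field, the function names matchHead and collect are hypothetical, and neither the catch-all overload nor the business logic call is included.

```cpp
#include <cassert>
#include <memory>

template<char... Chars> struct CharPack {};
using BarName = CharPack<'b', 'a', 'r'>;
using BazName = CharPack<'b', 'a', 'z'>;

template<class TName, class TPayload>
struct NamedValue { TPayload value; };

struct EmptyCompileTimeList {};

template<class THead, class TTail>
struct CompileTimeList { THead head; TTail tail; };

struct FooParameters
{
	std::unique_ptr<bool> bar;
	std::unique_ptr<bool> baz;
};

// End of the list: nothing left to match.
inline void matchHead(EmptyCompileTimeList, FooParameters &) {}

// Declare both overloads up front so each can recurse into the other.
template<class TTail>
void matchHead(CompileTimeList<NamedValue<BarName, bool>, TTail> list, FooParameters & params);
template<class TTail>
void matchHead(CompileTimeList<NamedValue<BazName, bool>, TTail> list, FooParameters & params);

// Selected only when the head is a bool named "bar".
template<class TTail>
void matchHead(CompileTimeList<NamedValue<BarName, bool>, TTail> list, FooParameters & params)
{
	params.bar.reset(new bool(list.head.value));
	matchHead(list.tail, params);
}

// Selected only when the head is a bool named "baz".
template<class TTail>
void matchHead(CompileTimeList<NamedValue<BazName, bool>, TTail> list, FooParameters & params)
{
	params.baz.reset(new bool(list.head.value));
	matchHead(list.tail, params);
}

// Start with an empty structure and let the overloads fill it in.
template<class TList>
FooParameters collect(TList list)
{
	FooParameters params;
	matchHead(list, params);
	return params;
}
```

The order of the elements in the list does not matter; overload resolution routes each head to the function that knows its name.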

Business logic

The final step is to execute the business logic of the function using the final values in the parameter structure.

It is a simple step but let’s show it in case there are any doubts on how to access the parameters.


static void foo_final( FooParameters & params )
{
	wcout
		<< "foo(" << endl
		<< "    bar: " << toString(params.bar) << endl
		<< "    baz: " << toString(params.baz) << endl
		<< ")" << endl;
}

The toString() function returns “true”/“false”/“null” depending on the value of the passed unique_ptr &.

Stage 7: Unordered parameters and optional/required parameter checking

The Stage 6 approach does support default values for any parameter and order of parameters does not matter. However it is a bit too lenient – if a required parameter is omitted the code still compiles. The business logic function may complain at run-time, but that’s a bit too late. It also allows the same parameter to be set multiple times.

I didn’t like either of those, so I decided to keep compile-time information about parameter availability.


template<bool barSet, bool bazSet>
class FooParameterAvailability
{
public:
	// Same availability, with one more parameter marked as set.
	typedef FooParameterAvailability<true, bazSet> WithBarSet;
	typedef FooParameterAvailability<barSet, true> WithBazSet;

	static void AssertBarNotSet()
	{
		static_assert(!barSet, "Multiple values for bar.");
	}

	static void AssertBazNotSet()
	{
		static_assert(!bazSet, "Multiple values for baz.");
	}
};

Interestingly this class is never instantiated, since it holds only compile-time information.

Arguably the typedef part scales quadratically with named parameter count. The usage is as follows:

	TParameterAvailability::AssertBarNotSet();
	params.bar.reset(new bool(*(list.head)));
	foo_matchHead<
		typename TParameterAvailability::WithBarSet,
		typename TList::TTail
	>::run(list.tail, params);

The assert method usage is quite straightforward, just call it after matching a parameter, before assignment. Afterwards flip on the appropriate template argument by passing the right WithXSet type instead of the original.
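How the flags travel through the types can be checked with a reduced sketch. Availability and NothingSet are hypothetical names, and hasBar/hasBaz are members I added only so that the state can be inspected.

```cpp
#include <type_traits>

// Two compile-time flags; "setting" one means naming a different type.
template<bool barSet, bool bazSet>
struct Availability
{
	typedef Availability<true, bazSet> WithBarSet;
	typedef Availability<barSet, true> WithBazSet;

	static const bool hasBar = barSet;
	static const bool hasBaz = bazSet;
};

typedef Availability<false, false> NothingSet;

// Each step yields a new type; no object is ever created.
static_assert(std::is_same<NothingSet::WithBarSet, Availability<true, false>>::value,
	"bar flipped on, baz untouched");
static_assert(std::is_same<NothingSet::WithBazSet::WithBarSet, Availability<true, true>>::value,
	"order of setting does not matter");
static_assert(std::is_same<NothingSet::WithBarSet::WithBarSet, NothingSet::WithBarSet>::value,
	"setting twice gives the same type, hence the separate assert");
```

The last check shows why the AssertXNotSet() call is needed: the typedef alone cannot tell that a parameter was supplied twice.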

Now obviously I had to add TParameterAvailability to the foo_matchHead() functions. Something unfortunate happened, though. If I recall correctly, the compiler had an issue with mixing template functions, template function specializations and overloads. The solution was to stick with one template function and its specializations. But then I ran into a limitation of template function specializations (function templates cannot be partially specialized), so I had to switch to a template class and its specializations. Therefore foo_matchHead() became a class with a run() method. The full code of one of the classes is as follows:

template<class TParameterAvailability, class TTail>
class foo_matchHead<
	TParameterAvailability, 
	CompileTimeList< NamedValue<typestring_is("bar"), bool>, TTail>
>
{
public:
	typedef CompileTimeList<
		NamedValue<typestring_is("bar"), bool>,
		TTail
	> TList;

	static void run(TList list, FooParameters & params)
	{
		TParameterAvailability::AssertBarNotSet();
		params.bar.reset(new bool(*(list.head)));
		foo_matchHead<
			typename TParameterAvailability::WithBarSet,
			typename TList::TTail
		>::run(list.tail, params);
	}
};

Required parameters can be enforced by changing the EmptyCompileTimeList specialization to fail a static_assert and at the same time providing yet another specialization for the case where the required parameters are known to be available.

template<class TParamAvailability>
class foo_matchHead<TParamAvailability, EmptyCompileTimeList>
{
public:
	static void run(EmptyCompileTimeList emptyList, FooParameters & params)
	{
		static_assert(
			!std::is_same<TParamAvailability, TParamAvailability>::value,
			"Missing parameter."
		);
	}
};

template<bool bazSet>
class foo_matchHead<
	FooParameterAvailability<true, bazSet>,
	EmptyCompileTimeList
>
{
public:
	typedef EmptyCompileTimeList TList;

	static void run(TList list, FooParameters & params)
	{
	}
};

!std::is_same<X, X>::value is a fancy way to say false. I did this because some compilers report a failing static_assert even if the method is never called, but only when the condition does not depend on template arguments.

In this example bar is a required parameter; that is specified by FooParameterAvailability<true, bazSet>.
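The dependent-false trick works outside of this project too. Here is a minimal sketch with hypothetical names (AlwaysFalse, Handler): the primary template would fail to compile if its run() were ever instantiated, while the specialization for int works normally.

```cpp
#include <cassert>
#include <type_traits>

// Always false, but the compiler cannot know that before T is given.
template<class T>
struct AlwaysFalse : std::false_type {};

// Primary template: using run() with an unsupported type is an error.
template<class T>
struct Handler
{
	static int run()
	{
		static_assert(AlwaysFalse<T>::value, "Unsupported type.");
		return 0;
	}
};

// Supported case: this specialization replaces the failing one.
template<>
struct Handler<int>
{
	static int run() { return 42; }
};
```

Calling Handler<int>::run() compiles and returns 42, whereas Handler<double>::run() would stop the build with the “Unsupported type.” message.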

You might have noticed run() doesn’t call business logic here. That’s a small improvement in separation of concerns, the job of foo_matchHead is to set up parameter structure. The business logic method is called afterwards.

Stage 8: Make named parameter functions easy to write

At this point I had the syntax and features I wanted for the caller, but writing a named parameter function became a colossal ordeal. Macro sorcery and template wizardry to the rescue!

NamedParameterFunction

The main idea here was to create a generic class NamedParameterFunction that handles the conversion from the parameter CTL to the parameter structure. In order to achieve that I had to encode the behavior of individual named parameters into template arguments. The attached implementation handles up to 15 named parameters.

In order to declare each named parameter I made the following template class that can be utilized via pattern matching.

template<class TName, class TPayload, bool TIsRequired = false>
class NamedParameterDeclaration
{
public:
	typedef NamedValue<TName, TPayload> NamedValue_type;
	typedef TName Name_type;
	typedef TPayload Payload_type;
	static const bool isRequired = TIsRequired;
};

Another case of a class that is never instantiated. The only reason for it having any members at all (the typedefs and the bool) is to make them easily accessible. Pattern matching can be done on template arguments but passing a template argument without having a convenient typedef is awkward.

NamedParameterDeclarations are used as template arguments of NamedParameterFunction, which accepts up to 15 of them:

template<
	class TParameter1Declaration = UnusedParameter,
	class TParameter2Declaration = UnusedParameter,
	...
>
class NamedParameterFunction
{
    ...
};

Each parameter declaration defaults to UnusedParameter if an actual declaration is not supplied. That way I can support up to 15 named parameters rather than exactly 15.

class UnusedParameter
{
public:
	typedef UnusedParameter NamedValue_type;
	typedef UnusedParameter Payload_type;
	static const bool isRequired = false;
};

It defines the same members, albeit with special values.

The TParameterXDeclarations can be used inside the NamedParameterFunction template class to craft code that behaves accordingly. The simplest example is probably the ParameterStructure class.

class ParameterStructure
{
public:
	unique_ptr<typename TParameter1Declaration::Payload_type> p1;
	unique_ptr<typename TParameter2Declaration::Payload_type> p2;
	...
};

The class ParamCtlToStruct (known as foo_matchHead in the previous stage) also utilizes the TParameterXDeclarations.

template<class TParameterAvailability, class TTail>
class ParamCtlToStruct<
	TParameterAvailability,
	CompileTimeList< typename TParameter13Declaration::NamedValue_type, TTail>
>
{
public:
	typedef CompileTimeList<
		typename TParameter13Declaration::NamedValue_type,
		TTail
	> TList;
	typedef typename TParameter13Declaration::Payload_type Payload_type;

	static void run(TList list, ParameterStructure & params)
	{
		TParameterAvailability::p13SingleAssignmentCheck();
		params.p13.reset(new Payload_type(*(list.head)));
		ParamCtlToStruct<
			typename TParameterAvailability::WithP13Set,
			typename TList::TTail
		>::run(list.tail, params);
	}
};

Interesting observation: All unused parameters should lead to exactly the same ParamCtlToStruct specialization, except the run() body would refer to a different parameter number. Not a problem though and I’m not entirely sure why, but it could be SFINAE filtering out the duplicates.

Compared with the previous stage not much has changed here, just instead of hard-coded values there are references to TParameterXDeclaration members.

I had to change one thing – how required parameters are handled. That part got quite rough.

The first change was to add member classes to ParameterAvailability that can verify that a particular parameter is available.

template< bool p1Set = false, bool p2Set = false, ... >
class ParameterAvailability
{
public:
	...

	class P1RequiredParameterChecker {
	public:
		static void required()
		{
			static_assert(p1Set, "Required Parameter 1 not provided.");
		}
	};

	...
};

I used a class here because then I can pass it as a type parameter further. That way I don’t have to copy paste the following RequiredParameterChecker class for each of the 15 named parameters.

template<bool isRequired, typename TSpecificRequiredParameterChecker>
class RequiredParameterChecker
{
public:
	static void run()
	{
	}
};

template<typename TSpecificRequiredParameterChecker>
class RequiredParameterChecker<true, TSpecificRequiredParameterChecker>
{
public:
	static void run()
	{
		TSpecificRequiredParameterChecker::required();
	}
};

As you can see, RequiredParameterChecker is quite simple: if the named parameter is required, it uses the specialization that actually performs the check. If not, it uses the basic definition that does nothing. Thinking about this, I bet Boost has facilities for this kind of template-conditional call. That’s what I did here: template-conditionally call required(). I say template-conditionally because if the parameter is not required, not only is required() not called, the generated code does not even contain dead code for the call.

The final step is easy, just run the check for each named parameter.

template<class TParameterAvailability>
class ParamCtlToStruct<TParameterAvailability, EmptyCompileTimeList >
{
public:
	typedef EmptyCompileTimeList TList;

	static void run(TList list, ParameterStructure & params)
	{
		RequiredParameterChecker<
			TParameter1Declaration::isRequired,
			typename TParameterAvailability::P1RequiredParameterChecker
		>::run();
		RequiredParameterChecker<
			TParameter2Declaration::isRequired,
			typename TParameterAvailability::P2RequiredParameterChecker
		>::run();

		...
	}
};

NV_ macros

NamedParameterFunction handles the hardest part, but it is still quite complicated to set up a named parameter function. I have addressed that with a set of three macros. For reference, let’s see what a “manual” named parameter function definition would look like.

template<
	class T1 = IgnoredCtlElement *,
	class T2 = IgnoredCtlElement *,
	...
>
bool foo(
	T1 p1 = nullptr,
	T2 p2 = nullptr,
	...
)
{
	unique_ptr<bool> bar;
	unique_ptr<int> baz;
	...
	{
		auto paramStruct = NamedParameterFunction<
			NamedParameterDeclaration<typestring_is("bar"), bool, true>,
			NamedParameterDeclaration<typestring_is("baz"), int, false>,
			...
		>::ParamsToStructure( p1, p2, ... );
		bar = move( paramStruct.p1 );
		baz = move( paramStruct.p2 );
		...
	}

	wcout
		<< "foo(" << endl
		<< "    bar: " << toString(bar) << endl
		<< "    baz:  " << toString(baz) << endl
		<< ")" << endl;
	return true;
}

This does not look that bad mainly thanks to the power of ellipsis, but ellipsis doesn’t work in actual code. Even then it is quite technical and not user friendly.

First, I took care of template and actual arguments, since that is easy and always the same.

#define NV_SIGNATURE_TEMPLATE_ARGS \
class T1 = IgnoredCtlElement *,\
class T2 = IgnoredCtlElement *,\
...

#define NV_SIGNATURE_PARAMETERS \
T1 p1 = nullptr,\
T2 p2 = nullptr,\
...

This way the user can still include their own template and actual arguments. Now the hard part: a macro that declares local variables for the extracted parameters and does the extraction.

#define NV_EXTRACT_PARAMETERS( \
	p1Name, p1Type, p1IsRequired, \
	p2Name, p2Type, p2IsRequired, \
	...
) \
	unique_ptr<p1Type> p1Name; \
	unique_ptr<p2Type> p2Name; \
	...
	{ \
		auto paramStruct = NamedParameterFunction< \
			NamedParameterDeclaration< \
				typestring_is(#p1Name), p1Type, p1IsRequired>, \
			NamedParameterDeclaration< \
				typestring_is(#p2Name), p2Type, p2IsRequired>, \
			...
		>::ParamsToStructure( p1, p2, ... ); \
		p1Name = move( paramStruct.p1 ); \
		p2Name = move( paramStruct.p2 ); \
		...
	}

The only trouble is that the macro expects 15 parameter triplets. Passing fewer is an error. I could not figure out a seamless way to handle this, so I had to settle for the following.

#define NV_EXTRACT_PARAMETERS2( \
	p1Name, p1Type, p1IsRequired, \
	p2Name, p2Type, p2IsRequired \
) \
NV_EXTRACT_PARAMETERS( \
	p1Name, p1Type, p1IsRequired, \
	p2Name, p2Type, p2IsRequired, \
	unusedNameParameter3, void *, false, \
	unusedNameParameter4, void *, false, \
	...
)

There would have to be NV_EXTRACT_PARAMETERSX for all X in range [1,15]. Sadly, this pollutes the scope of the named parameter function with unusedNameParameterX identifiers. I don’t think it is a big deal.

The three macros make the job of defining a named parameter function much easier:

template<NV_SIGNATURE_TEMPLATE_ARGS>
bool foo(NV_SIGNATURE_PARAMETERS)
{
	NV_EXTRACT_PARAMETERS2(
		bar, bool, true,
		baz, int, false
	)

	wcout
		<< "foo(" << endl
		<< "    bar: " << toString(bar) << endl
		<< "    baz:  " << toString(baz) << endl
		<< ")" << endl;
	return true;
}

I think that is quite DRY. Still, as I said in the beginning, Boost probably does it better.