A Tour of Rare C++ Features Part 5: The Function Declaration from Hell

In the previous installment we discovered the ancient Digraphs and Trigraphs, as well as the more modern lambda expressions. This post focuses on the type from hell, which is overly complicated but still full of fun.

The guiding example of this series is still the same and should now start to make some sense if you have read the previous posts in this series:

cpp
/* This is valid C++ */
auto main() -> decltype('O.o') try
<%[O_O = 0b0]<%
https://daniel.schemmel.net/post/2015/a-tour-of-rare-c++-features-part-1
typedef struct o O;
o*(*((&&o(o*o))<:'o':>))(o*o);
if(O*O = decltype(0'0[o(0)](0))(0)) 1,000.00;
else return 0==O==0 ? throw O_O : O_O;
%>();%>
catch(...) { throw; }

It all begins with the innocent typedef in line 5, which creates the type alias O for the type struct o. In turn, this actually creates the type o, which is now a class type (remember that class and struct do almost the same thing in C++) that is incomplete, as there is no known definition.

This typedef is reminiscent of an old C idiom, which is used to make a type available as both struct T and directly T:

cpp
typedef struct T {
    // maybe add some members
} T;
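To make the consequences of the typedef from line 5 more concrete, here is a minimal sketch of what can and cannot be done with an incomplete type. The helper name null_o is hypothetical and exists only for this illustration:

```cpp
#include <type_traits>

typedef struct o O;  // declares the incomplete class type o and the alias O

// Both spellings name the same type.
static_assert(std::is_same<O, struct o>::value, "O aliases struct o");

// Pointers to an incomplete type are fine (hypothetical helper).
O* null_o() { return nullptr; }

// Anything that needs the size or the members is not:
// O instance;   // error: o is incomplete
// sizeof(O);    // error: o is incomplete
```

The commented-out lines are left in to show where a compiler would be expected to reject the code.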

Welcome to Hell

cpp
/* This is valid C++ */
auto main() -> decltype('O.o') try
<%[O_O = 0b0]<%
https://daniel.schemmel.net/post/2015/a-tour-of-rare-c++-features-part-1
typedef struct o O;
o*(*((&&o(o*o))<:'o':>))(o*o);
if(O*O = decltype(0'0[o(0)](0))(0)) 1,000.00;
else return 0==O==0 ? throw O_O : O_O;
%>();%>
catch(...) { throw; }

I really did not know how to properly introduce this snippet of code. If this was smell-net, I might have started with a good dose of brimstone… After looking at it for a while during the writing of this article, I decided that letting it stand for itself would be disturbing enough.

Let me start by saying that anyone who ever writes a line of code like this for real software should be subject to the most insidious torture one could possibly imagine. Maybe by making them explain exactly what it does in public (cough).

Starting at the very highest level, this declares a function. Yes, it is actually possible to declare a function at block level: one may not define it there, but similar to the incomplete type struct o, it can be declared. The declaration splits into three parts: the name is the o directly following the &&, the arguments are the (o*o) right after that name, and everything else makes up the return type:

o*(*((&&o(o*o))<:'o':>))(o*o);
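Declaring a function at block level may be unfamiliar, so here is a small sketch of the mechanism on its own. The names answer and call_through_block_scope_declaration are made up for this illustration:

```cpp
// Calls answer(), which at this point is only known through the
// declaration inside the function body.
int call_through_block_scope_declaration() {
    int answer();     // block-level declaration; a definition here would be ill-formed
    return answer();
}

// The definition has to live outside of any function body.
int answer() { return 42; }
```

The declaration inside the body is all the compiler needs to accept the call. The definition is found later at link time.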

Starting with the easy part: the arguments consist of a single parameter of type o* with the name o. The name of the function is o as well and therefore clashes with the type struct o, but it only becomes active after the declaration, that is, in the next line. Unintuitively, this is not actually a problem: the function name simply shadows the type o (which remains available through its alias O or by writing struct o explicitly).
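The shadowing rule can be tried in isolation. The following sketch uses the hypothetical names widget and use_both and a complete type, so that the function can actually be called:

```cpp
struct widget { int value; };

// A function with the same name as the class, declared in the same scope.
// From here on, the plain name widget refers to the function.
int widget(struct widget* w) { return w->value + 1; }

int use_both() {
    struct widget w = {41};  // writing struct widget still reaches the hidden class
    return widget(&w);       // the plain name now means the function
}
```

This is the same trick that lets C headers offer both a struct stat and a function stat.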

The return type is a bit more complex. Well, "a bit" may be a bit of an understatement. But now that we have divined what this statement is at its core, we can start to work from the inside out. The next part that needs to be understood is the &&, which means "r-value reference to". R-value references are similar to normal l-value references (written as &), and differ primarily in what they say about the object they reference: r-value references indicate that the object is at the end of its lifetime, and thus may be modified without repercussions (as nobody will be using it anymore). While r-value references are very interesting and loads of fun (useful, too), they are a broad topic that will have to wait for another time, as they are of no further relevance to the guiding example of this series of articles.
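Without going into the broader topic, the difference between the two kinds of references can at least be observed through overload resolution. This is only a sketch, and the function name binds_to_rvalue is invented for it:

```cpp
#include <utility>

// Chosen for named objects (l-values).
bool binds_to_rvalue(int&) { return false; }

// Chosen for temporaries and for objects passed through std::move.
bool binds_to_rvalue(int&&) { return true; }
```

Calling binds_to_rvalue with a named int variable picks the first overload. Calling it with a literal, or with std::move applied to the variable, picks the second.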

The next fragment is the <:'o':>, which uses the Digraphs introduced in the previous installment of this series. By resolving them, we get ['o'], which would look like an array bound if only 'o' were a number instead of a letter… which it is, as char is a normal integral type. While the C++ standard does not say which number 'o' is (for most environments it will be 111), this means that we really are talking about an array here.
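Both ingredients, the Digraphs and the character literal used as an array bound, also work outside of the hellish declaration. A minimal sketch, with the hypothetical names digraph_array and digraph_array_length:

```cpp
#include <cstddef>

// <: and :> are alternative spellings of [ and ],
// so this is the same as: int digraph_array['o'];
int digraph_array<:'o':>;

// Number of elements, derived from the array itself.
std::size_t digraph_array_length() {
    return sizeof(digraph_array) / sizeof(digraph_array<:0:>);
}
```

The length equals the numeric value of 'o' in the execution character set, whatever that value happens to be on the platform at hand.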

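To illustrate the final piece, consider the hitherto undiscussed parts of our function declaration from hell, namely the leading o*(* and the trailing )(o*o):

o*(*((&&o(o*o))<:'o':>))(o*o);

With everything discussed so far stripped away, what remains is o*(*)(o*o).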

This may look familiar to some, as it is a pointer to a function that also takes a single parameter of type o* named o and returns an o*.
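For readers to whom it does not look familiar, here is the same shape with a complete type and a name for the pointer. The names node, identity and callback are stand-ins invented for this sketch:

```cpp
struct node { int value; };  // hypothetical stand-in for the incomplete type o

// A function of the shape discussed above: takes a node*, returns a node*.
node* identity(node* n) { return n; }

// Same declarator structure as o*(*)(o*o), with a name added for the pointer.
node* (*callback)(node* n) = &identity;
```

Calling callback with the address of a node goes through the pointer to identity and hands the same address back.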

Putting it together

After analyzing the type of the function declaration piece by piece, we can join all parts together to come to the following conclusion:

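Line 6 of the example that has been the pride and joy of this series of posts declares a function named o that takes a single parameter of type o* named o and returns an r-value reference to an array of 'o' (typically 111) pointers to functions, each of which takes a single parameter of type o* named o and returns an o*.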

This type is one example of the kind of grammar humans are really bad at. Of course, programming languages should be human and machine readable…

Could this have been written in a way that can also be read by humans?

Well, sort of. After all, o has a pretty humongous type - no amount of prettification is going to change that. However, the impact can be lessened by using a few helper types that tame the syntax of this declaration:

cpp
template<typename return_type, typename... argument_types>
using function_pointer = return_type(*)(argument_types...);

template<decltype(sizeof(char)) N, typename T>
using array = T[N];

template<typename T>
using rvalue_reference = T&&;

int main() {
    using O = struct o;
    rvalue_reference<array<'o', function_pointer<o*, o*>>> o(o*);
}

The names and types are still the same as in the original example; I have merely added a few helper templates, used more readable syntax and removed everything not relevant. The return type of o can now be read from left to right without any jumping around and without having to remember the syntax for function pointers or how to return a reference to an array from a function.

For anyone who thinks that type alias templates are at least dark-grey magic, it is easy to remove them from the example by simply making the types explicit:

cpp
struct o; // forward declaration, so that the aliases below can name the incomplete type o
using function_pointer = o*(*)(o*);
using array = function_pointer['o'];
using rvalue_reference = array&&;

int main() {
    using O = struct o;
    rvalue_reference o(o*);
}
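Whether the readable spelling really denotes the same type as the original can be left to the compiler. The following sketch declares the original under the hypothetical name hell (so that o keeps naming the type afterwards) and compares it with a signature built from plain aliases. The names readable_signature and same_signature are invented for this check as well:

```cpp
#include <type_traits>

struct o;  // incomplete type, as in the guiding example

// The original declarator, renamed to hell for this sketch.
o*(*((&&hell(o*o))<:'o':>))(o*o);

// The readable spelling, built from plain aliases.
using function_pointer = o*(*)(o*);
using array = function_pointer['o'];
using rvalue_reference = array&&;
using readable_signature = rvalue_reference(o*);

// True if both spellings denote the same function type.
constexpr bool same_signature =
    std::is_same<decltype(hell), readable_signature>::value;
static_assert(same_signature, "both spellings should name the same function type");
```

Since parameter names are not part of a function type, the second o in each (o*o) plays no role in the comparison.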

Of course, it would make the most sense to simply require that the return type of o be a properly encapsulated class, or at most an (r-value) reference to that class. After all, if you need a function to return it, you should be able to come up with a reasonable name for what it really is.

Gimme more!

This post is part 5 of a series on rarely used C++ features. Continue to the next part, Part 6.