Daniel Schemmel
In the previous installment, we looked at if statements, function style casts and grouping digits of numbers. This time, our focus will lie on the two remaining statements of our guiding example.
By now, most of the guiding example of this series should be pretty well known, with the only parts that may still be somewhat enigmatic being highlighted below:
/* This is valid C++ */
auto main() -> decltype('O.o') try
<%[O_O = 0b0]<%
https://daniel.schemmel.net/post/2015/a-tour-of-rare-c++-features-part-1
typedef struct o O;
o*(*((&&o(o*o))<:'o':>))(o*o);
if(O*O = decltype(0'0[o(0)](0))(0)) 1,000.00;
else return 0==O==0 ? throw O_O : O_O;
%>();%>
catch(...) { throw; }
Having already analyzed the if condition previously, we know that execution always takes the else branch, meaning that the innocent looking 1,000.00; could wreak all kinds of havoc without consequence, as it is never executed anyway.
It is however almost as innocent as it looks. In fact, if it were to be executed, nobody would ever know, as it has no effect at all. This should not stop us from trying to understand what it really is, though. As anyone reading this post can be expected to be somewhat fluent in English, the deception that this innocent little expression tries to pull off might even work on some of you. It is not just a number, even though it looks just like $1,000.00.
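Looking more closely at the definitions for integer and floating-point literals, neither of them permits a comma. Written with a space, as in a function call (f(1, 000.00)), the comma might make more sense.
Googling "C++ comma" [1] turns up the comma operator: what we have here is an int with value 1 and a double with value 0.0 that are operands to the comma operator.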
To clarify: The commas in a function call are not comma operators: f(1, 000.00) will call f with two arguments, while f((1, 000.00)) will call f with the result of the comma operator applied to 1 and 0.0. Similarly, the commas in a variable declaration like int x, y; have special meaning, and are not some kind of weird invocation of the comma operator.
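The difference can be made visible with a small sketch. The overloaded function count_args is hypothetical and exists only for this illustration; which overload is selected reveals how many arguments each call really passes. Compilers will typically warn that the left operand of the comma operator has no effect, which is exactly the point.

```cpp
#include <cassert>

// Hypothetical overloads, introduced only for this illustration.
int count_args(int, double) { return 2; }  // two separate arguments
int count_args(double)      { return 1; }  // a single argument

int plain_call() {
    // The comma separates two arguments: the int 1 and the double 0.0.
    return count_args(1, 000.00);
}

int parenthesized_call() {
    // The inner parentheses form one expression, so this comma is the
    // comma operator: 1 is discarded and only the double 0.0 is passed.
    return count_args((1, 000.00));
}

void check_calls() {
    assert(plain_call() == 2);
    assert(parenthesized_call() == 1);
}
```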
As it turns out, the comma operator has a far more interesting syntax than semantics: It simply evaluates the left hand side, discards it and then evaluates and returns the right hand side [2].
Concluding the analysis of this branch, it is never executed, and if it were, it would first ignore an unused value, 1, followed by ignoring the other unused value, 0.0.
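As a quick check of that reading, the following sketch asks the compiler for the type of the expression and evaluates it once. The function names thousand and check_thousand are made up for this example.

```cpp
#include <cassert>
#include <type_traits>

// The comma expression has the type of its right operand, here double.
static_assert(std::is_same<decltype((1, 000.00)), double>::value,
              "comma expression has the type of its right operand");

double thousand() {
    // Evaluates 1, discards it, then yields the double 0.0.
    return (1, 000.00);
}

void check_thousand() {
    assert(thousand() == 0.0);
}
```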
The else branch consists of a single statement as well, namely return 0==O==0 ? throw O_O : O_O; and the first thing that we have to notice is that this is where the return type of the surrounding lambda function is determined: since the lambda has no trailing return type, the type of the operand of the return statement becomes the return type of the whole lambda function.
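A minimal sketch of this deduction, using a much simpler lambda than the one in the guiding example (the name returns_binary_zero is an assumption of this sketch, as is the use of a C++14 compiler for the binary literal):

```cpp
#include <cassert>
#include <type_traits>

// No trailing return type is given, so the return statement decides:
// 0b0 is an int literal, hence the lambda returns int.
auto returns_binary_zero = [] { return 0b0; };

static_assert(std::is_same<decltype(returns_binary_zero()), int>::value,
              "return type is deduced from the return statement");

void check_deduction() {
    assert(returns_binary_zero() == 0);
}
```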
The ternary operator or conditional operator a ? b : c is fairly simple: It works like an if for expressions: if a is true, it returns b, otherwise it returns c. Most of the time, its type is determined as a general type that fits both b and c by attempting to convert b to the type of c and vice versa [3].
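For the common arithmetic case this can be sketched as follows: with an int and a double as the two alternatives, the conditional expression as a whole has type double. The function choose is a hypothetical helper for this illustration, and the sketch covers only this one simple case of the rules.

```cpp
#include <cassert>
#include <type_traits>

// int and double are brought to a common type, which is double.
static_assert(std::is_same<decltype(true ? 1 : 2.0), double>::value,
              "int and double alternatives yield a double");

double choose(bool a) {
    // Even when the int alternative is selected, the result is a double.
    return a ? 1 : 2.0;
}

void check_choose() {
    assert(choose(true) == 1.0);
    assert(choose(false) == 2.0);
}
```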
Alright, that means we need to figure out the types of the middle and right operands if we want to know the type of the whole expression. The right operand is easy, since we already know that O_O has type int (and value 0). The middle operand however is a throw expression, which has type void.
When first learning that a throw actually has a type at all, I was pretty surprised - after all its whole purpose is to not act as a nice little expression that has a tidy little result, but rather to escape the constraints of ordinary control flow. However, seeing it in this light, it does make sense that its type is void, even if it is just to allow it to be classified as an expression, not a statement [4].
As it turns out, there is a special rule regarding the use of throw as the middle or right operand of the conditional operator that recognizes the fact that its type does not really matter: the conditional expression simply takes the type of the other operand. For our example, this means that (takes a deep breath): The return type of the lambda expression is the type of the operand of the return statement, which is the type of the conditional expression, which is the type of O_O, which is the type of 0b0, which is int. Phew.
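The special rule for throw can be checked in isolation. The sketch below is not the lambda from the guiding example; conditional_or_throw is a hypothetical function that reuses only the shape of its return statement.

```cpp
#include <cassert>
#include <type_traits>

// One operand is a throw expression (type void), so the conditional
// expression takes the type of the other operand: int.
static_assert(std::is_same<decltype(false ? throw 0 : 0), int>::value,
              "throw does not influence the type of the conditional");

int conditional_or_throw(bool fail) {
    int O_O = 0b0;
    // Throws O_O if fail is true, otherwise yields O_O.
    return fail ? throw O_O : O_O;
}

void check_conditional() {
    assert(conditional_or_throw(false) == 0);
}
```

Calling conditional_or_throw(true) would throw an int with value 0 instead of returning.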
Until now I have evaded the question of what the condition does, which is rather simple, but may be counter-intuitive to someone used to mathematical notation. Recalling the condition (0==O==0), we see two equality comparisons. Unlike in mathematical notation, this does not mean that the whole expression is true if all three operands are equal. Rather, it is evaluated left-associatively, so it could be written as (0==O) == 0 without changing the meaning.
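With plain integers the effect of this grouping is easy to observe. The helper chained_equal is an assumption of this sketch; note that three equal operands make the chain false, while some unequal ones make it true.

```cpp
#include <cassert>

constexpr bool chained_equal(int a, int b, int c) {
    return a == b == c;  // parsed as (a == b) == c
}

// (0 == 0) is true, which converts to 1, and 1 == 0 is false.
static_assert(!chained_equal(0, 0, 0), "three equal operands, yet false");
// (0 == 0) is true, which converts to 1, and 1 == 1 is true.
static_assert(chained_equal(0, 0, 1), "unequal operands, yet true");

void check_chain() {
    // (1 == 2) is false, which converts to 0, and 0 == 0 is true.
    assert(chained_equal(1, 2, 0));
}
```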
Since we are still in the if statement (although in its else branch), O still refers to the variable that was declared in line 7 and analyzed in the previous installment. It has type struct o* and was evaluated to false (we are in the else branch after all). Since it is a pointer type, this means that it must be a null pointer. Comparing 0 to a value of pointer type causes the 0 to be treated as the null pointer constant [5], and the inner equality test to be true.
The outer comparison is therefore between true and 0, which means the outer equality test evaluates to false in turn, as true is converted to 1, which is obviously not zero.
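The two steps can be replayed in a stripped-down sketch. It declares the pointer directly as o* instead of going through the typedef and the if of the guiding example, and else_condition is a hypothetical name.

```cpp
#include <cassert>

struct o;  // an incomplete type is enough to form a pointer to it

bool else_condition() {
    o* O = 0;  // a null pointer, as in the else branch
    // Inner test: 0 acts as the null pointer constant, so 0 == O is true.
    // Outer test: true converts to 1, and 1 == 0 is false.
    return 0==O==0;
}

void check_condition() {
    assert(!else_condition());
}
```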
/* This is valid C++ */
auto main() -> decltype('O.o') try
<%[O_O = 0b0]<%
https://daniel.schemmel.net/post/2015/a-tour-of-rare-c++-features-part-1
typedef struct o O;
o*(*((&&o(o*o))<:'o':>))(o*o);
if(O*O = decltype(0'0[o(0)](0))(0)) 1,000.00;
else return 0==O==0 ? throw O_O : O_O;
%>();%>
catch(...) { throw; }
The if chooses the else branch, which returns the right hand operand of the conditional, which is an int with value 0. The result of the lambda is then discarded in the main function [6].
For me, this compiles with a warning that there are paths through this lambda that exit without returning a value. This is not quite true, as it always does the above, without depending on any kind of input.
While it has been a wild ride, the example has now been fully analyzed and should hold no further surprises for anyone who actually bothered to read the full series. As a final comment: Yes, although it may look intimidating, this program does absolutely nothing.
This post is part 7 of a series on rarely used C++ features. This is the last part of this series.
[1] At the time of writing, with my browser preferences, IP address, Google's mood and so forth.
[2] Note that this means that all effects of the left hand side are sequenced before those of the right hand side. Therefore, the comma operator is really similar to a semicolon, only that it does not separate two statements, but two expressions. To get an inkling of what this means, consider the following snippet trying to do an xor swap: x^=y^=x^=y; Before C++17, this snippet leads to undefined behavior, basically because it modifies a variable twice without sequencing one modification before the other. The built-in (non-overloaded) comma operator allows us to easily write a correct version of this snippet: x^=y, y^=x, x^=y; as it creates a sequence to which the evaluation must adhere. Note that, before C++17, this does not apply to the case in which a user-provided operator, is used, which is just treated as an ordinary function call.
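The sequenced variant can be wrapped into a small function to try it out. The name xor_swap and the choice of unsigned operands are assumptions of this sketch. As with any xor swap, it must not be called with both references bound to the same object, as that would zero the value.

```cpp
#include <cassert>

void xor_swap(unsigned& x, unsigned& y) {
    // The built-in comma operator sequences each assignment
    // before the next one, so every read sees the previous write.
    x ^= y, y ^= x, x ^= y;
}

void check_swap() {
    unsigned a = 3, b = 5;
    xor_swap(a, b);
    assert(a == 5 && b == 3);
}
```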
[3] While that is mostly correct and easy to reason about, it is not the full truth, even considering the exception for throw expressions that the example abuses. The full rules can be found in the standard or at cppreference.com.
[4] This allows a variety of syntactic contortions - most of them about as pretty as an Elder God abusing an extra-fluffy kitten:
#include <iostream>

void stage_death(bool die) {
    // look ma, no curlies:
    if(die) ::std::cout << "[groans]\n", throw "blood";
}

void stage_death() {
    // my teacher told me to always use a return statement
    // if I want to leave a function:
    return throw "blood";
}

void is_this_function_pure() {
    if(throw "no", "something that can be ctxt converted to bool") {
        // unreachable
    }
}
[5] Yes, there is a difference between the null pointer constant and an integer with value zero. Which meaning of 0 is chosen greatly depends on the context.
[6] A final kink: The main function is the only non-void function in C++ that may legally not return a value when executed. Its result then implicitly indicates success (0 on all relevant platforms).