Is this a bug? +goto
Michelle Long
HappyDance321 at gmail.com
Wed Nov 7 20:03:47 UTC 2018
On Tuesday, 6 November 2018 at 13:53:41 UTC, MatheusBN wrote:
> On Tuesday, 6 November 2018 at 05:46:40 UTC, Jonathan M Davis
> wrote:
>> On Monday, November 5, 2018 7:55:46 PM MST MatheusBN via
>> Digitalmars-d-learn wrote:
>>> On Tuesday, 6 November 2018 at 01:55:04 UTC, Jonathan M Davis
>>>
>>> wrote:
>>> >> And I found a bit strange that in such code, since "x" is
>>> >> never used, why it isn't skipped.
>>> >
>>> > It's skipped right over. The goto jumps out of the scope,
>>> > and the line with
>>> >
>>> > int x;
>>> >
>>> > is never run. In fact, if you compile with -w or -wi, the
>>> > compiler will give you a warning about unreachable code.
>>>
>>> That is exactly my point.
>>>
>>> Since "x" is skipped and never used, shouldn't it just be a
>>> warning (unreachable code) instead of an error?
>>>
>>> I'm trying to understand why/when such code could give any
>>> problem.
>>>
>>> On the other hand if the code were:
>>>
>>> {
>>> goto Q;
>>> int x;
>>>
>>> Q:
>>> x = 10; // <- Now you are accessing an uninitialized variable.
>>> }
>>>
>>> Then I think an error would be ok.
>>
>> D tries to do _very_ little with code flow analysis, because once
>> you start having to do much with it, odds are that the
>> compiler implementation is going to get it wrong. As such, any
>> feature that involves code flow analysis in D tends to be
>> _very_ simple. So, D avoids the issue here by saying that you
>> cannot skip the initialization of a variable with goto. The
>> compiler is not going to do the complicated logic of keeping
>> track of where you access the variable in relation to the
>> goto. That's exactly the sort of thing that might be obvious
>> in the simple case but is highly likely to be buggy in more
>> complex code. Code such as
>>
>> {
>> goto Q;
>> int x;
>> }
>> Q:
>>
>> or
>>
>> {
>> if(foo)
>> goto Q;
>> int x;
>> }
>> Q:
>>
>>
>> is fine, because the compiler can trivially see that it is
>> impossible for x to be used after it's been skipped, whereas
>> with something like
>>
>> goto Q;
>> int x;
>> Q:
>>
>> the compiler has to do much more complicated analysis of what
>> the code is doing in order to determine that, and when the
>> code isn't trivial, that can get _really_ complicated.
>>
>> You could argue that it would be nicer if the language
>> required that the compiler be smarter about it, but by having
>> the compiler be stupid, it reduces the risk of compiler bugs,
>> and most people would consider code doing much with gotos like
>> this to be poor code anyway. Most of the cases where goto is
>> reasonable tend to be using goto from inside braces already,
>> because it tends to be used as a way to more efficiently exit
>> deeply nested code. And with D's labeled break and continue,
>> the need for using goto outside of switch statements also
>> tends to be lower than it is in C/C++.
>>
>> - Jonathan M Davis
>
> This decision is clear to me now, and by the way thanks for
> answering all my questions.
>
> MatheusBN.
Don't let their psychobabble fool you. They are wrong and you
were right from the start.
There is no initialization of the variable, or, if there
is (because it's "on the stack", which is "initialized" at the start
of the function), the variable is still never used, and that is
the whole problem.
What you will find with some of these guys is they start with the
assumption that everything D does is correct then they try to
disprove anything that goes against it by coming up with reasons
that explain why D does it the way it does. It is circular
reasoning and invalid. Each step they come up with some new
explanation when you pick holes in their previous ones.
Eventually it's either "It's because D is not designed to do
that" or "write an enhancement yourself" type of answer.
The fact is simple: Whoever implemented the goto statement did
not create code to handle this case and chose the easiest route,
which is to error out. This was either oversight or "laziness".
It's really as simple as that. Not once has anyone proven that the
semantics are illogical, which is what it would require for the
compiler to be absolutely correct in its error.
In this case, they are simply wrong because it requires no flow
analysis or any complex logic to determine. It's not because C is
stupid and is unsafe, it's unreachable, etc...
The compiler simply knows what line and scope a variable is
initialized on (since it can determine if a variable is used before
initialization, which is a logic error) and it simply has to
determine if the goto escapes the scope before using any
initialized variable.
It can do this easily but the logic was not added.
Case A:
{
    if (true) goto X;
    int x;
}
X:
Case B:
{
    if (true) goto X;
    {
        int x;
    }
}
X:
These two cases are EXACTLY the same semantically. It's like
writing A + B and (A + B).
What the extra scope does though is create a new scope in the
compiler AST and this separates the goto logic, which is properly
implemented to handle that case.
The fact that one produces an error and the other is valid
proves that the compiler is incomplete. Adding scopes changes
the semantics no more than adding parentheses (which are just
scopes) does. ((((((3)))))) is the same as 3. (Obviously not all
scopes can be eliminated in all cases, but this isn't one of
those cases.)
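One way to settle what a particular compiler does with the two
cases is to compile them. What follows is only a minimal sketch
under stated assumptions: the names caseA, caseB and result, and
the skip parameter (used in place of `if (true)` so the declaration
is not flagged as unreachable), are made up for illustration. Going
by the reply quoted above, a goto that leaves the enclosing braces
is expected to be accepted in both forms, while the flat form shown
in the comment is the one that gets rejected for skipping the
declaration of x. Check this against your own compiler version.

```d
import std.stdio : writeln;

// Case A: the goto leaves the block that declares x.
int caseA(bool skip)
{
    int result = 0;
    {
        if (skip) goto X;
        int x = 1;
        result = x;
    }
X:
    return result;
}

// Case B: the same code with x wrapped in one more nested block.
int caseB(bool skip)
{
    int result = 0;
    {
        if (skip) goto X;
        {
            int x = 1;
            result = x;
        }
    }
X:
    return result;
}

// The flat form, with the label in the same scope as x, is the one
// expected to be rejected (left commented out on purpose):
//
//     goto X;
//     int x;
// X:

void main()
{
    // Both cases should behave identically for either input.
    writeln(caseA(true), " ", caseB(true));
    writeln(caseA(false), " ", caseB(false));
}
```

If one of the two functions compiles and the other does not, that is
the inconsistency being described here.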
And, so, the real answer is simply the compiler does not test
this case. My point with the previous post was to point it out...
but as you see, a lot of the fanboys come in and simply defend
what D does as if it is the most valid way from the get go. This
is their mindset. They reason from their conclusions. I've seen
them do it quite often. I'm not sure what the motivations are:
whether they don't understand the problem (sometimes simple is very
confusing for some), or whether they want to obfuscate, or what.
The idea for any sane person would be to check and see if the
code has a semantically logical meaning first. In this case it
does. Goto is a common control flow feature and sometimes
necessary to greatly simplify certain problems (since D does not
have a way to escape several nested scopes at once, such as a
hypothetical return3 that returns from 3 nested scopes).
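For the narrower case of nested loops, the labeled break mentioned
in the quoted reply does let you leave several levels at once; goto
remains the tool for other nested scopes. A minimal sketch, where
findPair, the label name outer and the search bounds are all made
up for illustration:

```d
import std.stdio : writeln;

// Hypothetical helper: find the first pair (i, j) with 1 <= i, j <= 9
// whose product equals target; returns [-1, -1] if there is none.
int[2] findPair(int target)
{
    int[2] found = [-1, -1];
outer:
    foreach (i; 1 .. 10)
    {
        foreach (j; 1 .. 10)
        {
            if (i * j == target)
            {
                found = [i, j];
                break outer; // leaves both loops at once
            }
        }
    }
    return found;
}

void main()
{
    writeln(findPair(12));
    writeln(findPair(100));
}
```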
If one can logically transform the "offending" code into a
semantically equivalent piece of code (this is known as
mathematical transformation, such as rewriting a mathematical
expression using logically valid rules) that involves no real
changes (such as adding scopes), and one fails and the other
doesn't, it means the compiler has a bug.
It's like when people drop parentheses: (3 + 4)*2 =?= 3 + 4*2.
It's illogical. If the compiler did this transformation it would
produce invalid results and it would be impossible to reason
about code.
If the compiler gives errors for one of two identical
mathematical trees (remember, programs are just mathematical
formulas, just really complex, but as ASTs abstractly the same)
then the compiler has a problem.
It's like saying that (3 + 4)*2 is invalid but 3*2 + 4*2 is valid.
It means the compiler did not implement the distributive property.
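The arithmetic in these analogies can be checked at compile time.
This is only an illustration of the analogy, not of the goto rule
itself:

```d
// Redundant parentheses do not change the value.
static assert((((((3))))) == 3);

// Distributive property: both sides evaluate to 14.
static assert((3 + 4) * 2 == 3 * 2 + 4 * 2);

// Dropping the parentheses is not a valid transformation: 14 vs 11.
static assert((3 + 4) * 2 != 3 + 4 * 2);

void main() {}
```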
People that don't know what they are talking about will then try
to justify why one works and the other doesn't using some
circular or invalid logic rather than actually understanding what
is going on. It is damn near impossible to reason with these
people because they always start with their conclusion and try to
make all the pieces fit that conclusion. Sometimes they
eventually come around to a logical conclusion, but only after
they've created a rat's nest of reasons and cannot proceed any
further but to say, basically, "it is what it is".
The problem is they still never understand what the actual
problem is... (because with the rat's nest they have just made
themselves even more confused)
The problem with the goto is clearly stated and to counter it as
being illogical one must simply prove one example where it would
result in invalid logic(not crapping out the compiler... the
compiler is not perfect and so will have bugs and errors in it.
The goal is not to justify those bugs and errors but to fix them
so the compiler does a better job and is more logically
expressive).
e.g., two cases (the `Case` term is not part of a switch in D,
just used to denote the two possible scenarios)
Case A:
{
    if (true) goto X;
    int x;
}
X:
Case B:
{
    if (true) goto X;
    {
        int x;
    }
}
X:
Why is case A any different than case B(in general, the above is
an example, the compiler might optimize things, which we don't
want to do since optimizations are secondary effects that are not
as important as logical consistency)? We are simply talking about
the pure semantics of programming. It doesn't really matter what
language we use to express it. This is not a problem in D but a
problem in programming languages. The question is simply: Are the
two cases semantically equivalent? (e.g., does (3) = 3? (5) = 5,
(x) = x, (((((x+y*3))))) = x+y*3, etc.)
Since we are not thinking of any specific compiler(although we
have to use the syntax and language grammar of D since ultimately
it has to do with D and it has to be expressed in some language,
so D is the obvious choice) we can't use circular reasoning(e.g.,
D does it this way and D is right so...).
Now, the fact is, these are identical statements semantically...
trivially so. It really can't get any simpler. Doesn't matter
what D does. If D can't see that then D is incomplete.
Now, since we ultimately have to translate into D and compilers
do strange things, it is possible that *in D* they are not
identical. E.g., if D inserted initialization of locals at the
start of scope and de-initializers at the end of scope, they
would not be the same,
which one could express as:
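Case A:
int x;          // initialization hoisted to the start of the scope
{
    if (true) goto X;
    //int x;
}
~x;             // pseudocode: de-initialize x at the end of the scope
X:
Case B:
{
    if (true) goto X;
    int x;      // initialization hoisted to the start of the inner scope
    {
        //int x;
    }
    ~x;         // pseudocode: de-initialize x at the end of the inner scope
}
X: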
Here it is clear that x is initialized before the goto in case
A and after it in case B. This could cause problems (chances are
if D did something like this then it would result in invalid
programs and compiler bugs at some point).
Sometimes though, because compilers are very complex, it is
necessary to prevent certain cases from occurring so certain
other semantics can be used. Sometimes compilers simply crap out
precisely because that is the easiest thing to do. Of course, if
this is done, someone should know about it and be able to explain
why the compiler chose to do this rather than the most logical
thing.
Don't let people bludgeon you into submission. Truth and logic
are not dictatorial but absolute.