Sutter's ISO C++ Trip Report - The best compliment is when someone else steals your ideas....

Tue Jul 10 01:50:39 UTC 2018

On Monday, 9 July 2018 at 22:50:07 UTC, Mr.Bingo wrote:
> On Tuesday, 3 July 2018 at 04:54:46 UTC, Walter Bright wrote:
>> On 7/2/2018 7:53 PM, John Carter wrote:
>>>> Step 2 is to (gradually) migrate std:: standard library 
>>>> precondition violations in particular from exceptions (or 
>>>> error codes) to contracts. The programming world now broadly 
>>>> recognizes that programming bugs (e.g., out-of-bounds 
>>>> access, null dereference, and in general all 
>>>> pre/post/assert-condition violations) cause a corrupted 
>>>> state that cannot be recovered from programmatically, and so 
>>>> they should never be reported to the calling code as 
>>>> exceptions or error codes that code could somehow handle.
>>> 
>>> Ah, that's a really nice statement.
>>
>> So, I have finally convinced the C++ world about that! Now if 
>> I can only convince the D world :-)
>>
>> (I'm referring to the repeated and endless threads here where 
>> people argue that yes, they can recover from programming bugs!)
>
> If this is the case then why do we need a reboot switch? Never 
> say never!
>
> If you really believe this then why do you print out minimal 
> debug information when an error occurs? If programming bugs 
> were essentially fatal, then wouldn't it be important to give as 
> much information when they occur so they can easily be fixed so 
> they do not happen again?
>
> Having too much information is a good thing!

I have learnt some very hard and painful lessons over the last 
few years of working on an embedded device without an MMU.

The chief one is that corrupted services, which are in an 
undefined state, are a startlingly Bad Thing to rely on when you 
need to extract and record information.

It's a toss-up as to whether the information extraction routine 
will crash or loop or produce garbage, and whether the routine 
that records the crash information crashes, or loops or records 
garbage.

The solution is to extract and stash only the information you can 
get using services you can verify line by line.

i.e. if it is possible that something is corrupted (e.g. heap, 
RTOS services), don't use it.

Then reboot to put the system into a defined state, and only then 
persist the information.
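
A minimal sketch of that stash / reboot / persist sequence in C, 
to make the idea concrete. Everything named here (crash_record_t, 
crash_stash, crash_take, CRASH_MAGIC, the choice of fields) is 
hypothetical and not from any particular RTOS or from my actual 
code. On a real device the record would sit in RAM that survives a 
reset (for example a no-init linker section); here a plain static 
stands in for it so the sketch builds on a host.

```c
#include <stdint.h>

#define CRASH_MAGIC 0xC0DEDEADu  /* hypothetical marker for a valid record */

typedef struct {
    uint32_t magic;     /* CRASH_MAGIC when the record is populated */
    uint32_t pc;        /* program counter at the fault */
    uint32_t sp;        /* stack pointer at the fault */
    uint32_t reason;    /* application-defined fault code */
    uint32_t checksum;  /* XOR of the fields above */
} crash_record_t;

/* Assumption: on hardware this lives in reset-surviving RAM.
 * On a host build it is just zero-initialised static storage. */
static crash_record_t g_crash_record;

static uint32_t crash_checksum(const crash_record_t *r)
{
    return r->magic ^ r->pc ^ r->sp ^ r->reason;
}

/* Called from the fault path. Only plain stores: no heap, no RTOS
 * calls, no formatting, nothing that might itself be corrupted. */
void crash_stash(uint32_t pc, uint32_t sp, uint32_t reason)
{
    g_crash_record.magic = CRASH_MAGIC;
    g_crash_record.pc = pc;
    g_crash_record.sp = sp;
    g_crash_record.reason = reason;
    g_crash_record.checksum = crash_checksum(&g_crash_record);
    /* A real device would trigger a reset at this point. */
}

/* Called early after reboot, once the system is in a defined state.
 * Returns 1 and fills *out if a valid record was waiting, else 0.
 * The record is consumed so it is persisted only once. */
int crash_take(crash_record_t *out)
{
    if (g_crash_record.magic != CRASH_MAGIC ||
        g_crash_record.checksum != crash_checksum(&g_crash_record)) {
        return 0;
    }
    *out = g_crash_record;
    g_crash_record.magic = 0;
    return 1;
}
```

The point of the split is that crash_stash is short enough to 
verify line by line, while the code that actually writes the 
record to flash or a log only runs after crash_take succeeds on 
the far side of the reboot, where the heap and RTOS can be trusted 
again.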

With an MMU life is easier... you can rely on the kernel to take 
a coredump and persist that for you. But again, that is "outside" 
the run time of the program.

> Having too much information is a good thing!

Not if it is garbage, or crashes, or freezes the system because 
the services it uses are corrupt. Then it's a Very Very Bad Thing.

The best approach I have found is to "crash early and often".

Seriously.

The earlier in the execution path you find the defect and fix it, 
the more robust your system will be.

Nothing creates flaky and unreliable systems more than allowing 
them to wobble on past the first point where you already know 
that things are wrong.
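
As a small illustration of "crash early", here is a sketch in C of 
a function that stops at the first point where it knows its caller 
has a bug, instead of returning an error code for the caller to 
"handle". REQUIRE and average_u16 are hypothetical names made up 
for this example. It assumes a hosted environment where abort() is 
available; on a device without an MMU you would substitute 
whatever minimal, verifiable stop-and-reboot path the platform has.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical macro: halt immediately when a precondition fails. */
#define REQUIRE(cond) do { if (!(cond)) abort(); } while (0)

/* Hypothetical helper: integer average of n samples.
 * A NULL pointer or n == 0 is a programming bug in the caller,
 * so it is not reported back as an error code. */
uint32_t average_u16(const uint16_t *samples, size_t n)
{
    REQUIRE(samples != NULL);
    REQUIRE(n > 0);

    uint64_t sum = 0;  /* wide enough that the sum cannot overflow in practice */
    for (size_t i = 0; i < n; ++i) {
        sum += samples[i];
    }
    return (uint32_t)(sum / n);
}
```

With the checks at the top, a bad call dies right at the call site 
rather than dividing by zero or reading through a null pointer 
somewhere further down the line.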