try/catch idiom in std.datetime

Mon Nov 18 15:42:13 PST 2013

On Monday, November 18, 2013 14:40:51 Walter Bright wrote:
> I'm glad we're discussing this. There's an important misunderstanding.
> 
> On 11/18/2013 2:16 PM, Jonathan M Davis wrote:
> > But that would just duplicate the validation. You validate by parsing the
> > string, and you extract the necessary data from it by parsing it.
> > Validating the data first would just double the work - on top of the fact
> > that strings are most likely to have come from outside the program rather
> > than having been generated internally and then parsed internally. This is
> > exactly the sort of case where I think that separate validation makes no
> > sense. Separate validation only makes sense when the result is _less_
> > overhead, not more.
> The point of asserts is that they are supposed to be redundant. (If they
> were not redundant, then they would trip and you'd have a program bug.)
> Asserts are there to detect program bugs, not to validate input data. This
> is also why the optimizer can remove asserts without affecting the meaning
> of the code.

I understand this. The problem is that in some cases, in order to do the 
check, you have to do all of the work that the function you're trying to 
protect from bad input has to do anyway - even without it asserting anything. 
So, having a separate function do the checking would just be extra overhead. 
For instance, if you had to parse a string in order to get data out of it, the 
function checking the string's validity would have to parse the string and 
get all of the data out of it, validating the string's format and the data's 
validity in the process, whereas the function that does the actual parsing to 
give you the result (as opposed to checking the input) has to do all of that 
same parsing and data extraction. Maybe, if another function had already 
validated the string, it could avoid a few of the checks, but many of them 
have to be done just to parse the string (e.g. if its format is wrong, you 
can't even get at the data properly, regardless of whether the data is valid 
or not). So, you don't save much in the way of checking if you have a 
validation function, and you add overhead, because the data has to be 
processed twice.
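
To make that concrete, here is a rough sketch for a made-up "HH:MM" format. 
The names parseHourMinute, isValidHourMinute, and HourMinute are hypothetical 
and are not taken from std.datetime; the point is only that the validator 
ends up repeating nearly everything that the parser does.

```d
import std.ascii : isDigit;
import std.exception : enforce;

// Hypothetical result type for a parsed "HH:MM" string.
struct HourMinute { int hour; int minute; }

// Defensive version: validation happens as a side effect of parsing.
HourMinute parseHourMinute(string s)
{
    enforce(s.length == 5 && s[2] == ':', "expected HH:MM");
    enforce(isDigit(s[0]) && isDigit(s[1]) && isDigit(s[3]) && isDigit(s[4]),
            "expected digits");
    immutable hour = (s[0] - '0') * 10 + (s[1] - '0');
    immutable minute = (s[3] - '0') * 10 + (s[4] - '0');
    enforce(hour < 24 && minute < 60, "time out of range");
    return HourMinute(hour, minute);
}

// A separate validator cannot skip the format checks or the digit
// extraction, so calling it first means processing the string twice.
bool isValidHourMinute(string s)
{
    if (s.length != 5 || s[2] != ':')
        return false;
    if (!isDigit(s[0]) || !isDigit(s[1]) || !isDigit(s[3]) || !isDigit(s[4]))
        return false;
    immutable hour = (s[0] - '0') * 10 + (s[1] - '0');
    immutable minute = (s[3] - '0') * 10 + (s[4] - '0');
    return hour < 24 && minute < 60;
}
```

Even if parseHourMinute could assume that isValidHourMinute had already run, 
it would still have to index into the string and convert the digits, so very 
little would be saved.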

In other cases, validation is as simple as asserting something about the 
input, in which case, it's simple enough to assert within the function (which 
would then go away in -release) and to have a validation function do the 
checking and get no extra overhead, but that's not always the case, and when 
it's not, it makes no sense to me to use DbC. In such cases, defensive 
programming makes far more sense.
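
A minimal sketch of the cheap case, assuming a hypothetical isValidMonth 
validator and a hypothetical daysInMonthNonLeap function (neither is meant to 
be the actual std.datetime API):

```d
// Hypothetical validator: cheap, so callers can run it only when needed.
bool isValidMonth(int month) pure nothrow
{
    return month >= 1 && month <= 12;
}

// DbC version: the check is an assertion, so it is compiled out with
// -release, and the function does no other validation of its own.
int daysInMonthNonLeap(int month) pure nothrow
{
    assert(isValidMonth(month), "month must be in 1 .. 12");
    static immutable int[12] days =
        [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31];
    return days[month - 1];
}
```

Here the validator is a single comparison, so a caller who isn't sure about 
the input can call isValidMonth first at almost no cost, and a caller who 
already knows that the month is valid pays nothing in -release.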

Also, if the data _always_ has to be checked (which isn't always the case in 
std.datetime), then it makes no sense to separate the validation from the 
function doing the work.

I think that whether DbC or defensive programming is more appropriate comes 
down primarily to two things:

1. Does the validation need to be part of the function for it to do its job, 
or does doing the validation require doing what the function is going to do 
anyway? If so, then defensive programming makes more sense. If not, then DbC 
makes more sense.

2. Is this function treating its caller as part of the program or as a user? 
If the caller is being treated as part of the program, then DbC tends to make 
sense, as it's reasonable to require that the caller knows what the function 
requires and is effectively part of the same code as the function. If the 
caller is being treated as a user (as is often going to be the case with 
libraries), then it's generally better to use defensive programming, because 
it ensures that the function gets and operates on valid input rather than 
resulting in undefined behavior when the caller gives bad input (and unless the 
library is compiled without -release or the function is templated, assertions 
won't do anything to help in a library).
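
The difference between the two ways of treating the caller can be sketched 
as follows. ClockTime and its checked and unchecked functions are 
hypothetical names used purely for illustration:

```d
import std.exception : enforce;

// Hypothetical library type; the names are illustrative only.
struct ClockTime
{
    int hour;
    int minute;

    // Defensive: treats the caller as a user. The checks stay in release
    // builds, so bad input results in an exception.
    static ClockTime checked(int hour, int minute)
    {
        enforce(hour >= 0 && hour < 24, "hour out of range");
        enforce(minute >= 0 && minute < 60, "minute out of range");
        return ClockTime(hour, minute);
    }

    // DbC: treats the caller as part of the program. The assertions go
    // away with -release, so correctness depends entirely on the caller.
    static ClockTime unchecked(int hour, int minute)
    {
        assert(hour >= 0 && hour < 24);
        assert(minute >= 0 && minute < 60);
        return ClockTime(hour, minute);
    }
}
```

A caller handing over data from outside the program would wrap 
ClockTime.checked in a try/catch, whereas a caller that has already 
established that the values are in range could use ClockTime.unchecked and 
skip the checks in a release build.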

Efficiency leans toward using DbC, whereas user-friendliness leans 
toward defensive programming. In general, I would use DbC internally to a 
program and defensive programming in a library.

Having validator functions definitely helps with DbC, as it gives the caller a 
way to validate the input when necessary and avoid the validation when it 
isn't. But it puts all the onus on the caller and makes it much, much more 
likely that functions will be misused, and if the function is in a library, 
then the odds are that if validation is done incorrectly by the caller, it'll 
never get checked by the callee, and you'll end up with buggy code with 
undefined behavior.

I think that you bring up good points, but I also don't think that the 
situation is anywhere near as clear-cut as you do.

- Jonathan M Davis