Flag proposal

Sat Jun 11 08:40:17 PDT 2011

On 2011-06-11 09:56:28 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail at erdani.org> said:

> On 6/11/11 8:16 AM, Michel Fortin wrote:
>> On 2011-06-11 07:54:58 -0400, Andrei Alexandrescu
>> <SeeWebsiteForEmail at erdani.org> said:
>> 
>>> Consider two statements:
>>> 
>>> 1. "I dislike Flag. It looks ugly to me."
>>> 
>>> 2. "I dislike Flag. Instead I want named arguments."
>>> 
>>> There is little retort to (1) - it simply counts as a vote against.
>>> For (2) the course of action is to point out the liabilities of
>>> changing the language.
>> 
>> I'm actually not sure whether I want named arguments or not, but I'm
>> quite sure I don't want to use Flag!"" in my code. I'd actually prefer a
>> simple bool parameter to Flag!"".
>> 
>> Currently, it looks like we have these possibilities:
>> 
>> // definition // call with a constant
>> 
>> void func(bool abc); -> func(true);
> 
> The call entails simple data coupling as documented by Steve McConnell: 
> you can pass any unstructured Boolean for any meaning of abc.

Which is often useful if the value depends on a boolean 
expression. The only thing lacking is the parameter name, which would 
make things clear to the reader.

Structured data is useful only if you pass it around; if you use it 
only once as a function parameter and nowhere else, then it just gets 
in the way. If your argument was that structured data is always 
preferred to unstructured data, I disagree.

>> enum Abc { no, yes }
>> void func(Abc abc); -> func(Abc.yes);
> 
> To add the documentation effort:
> 
> /**
> This is an argument for func. Refer to func below.
> */
> enum Abc {
>    no, /// you don't want func to do Abc
>    yes /// you do want func to do Abc
> }
> 
> /**
> This is func. Mind Abc defined above.
> */
> void func(Abc abc);
> 
> I think we agree this is rather awkward (I know because I wrote a fair 
> amount of such code).
> 
> So we have the advantage of a nice call syntax and the disadvantage of 
> verbose definition and documentation.

Yes, and I think most of the time this should be a bool. Or to be 
precise: if it's not worth documenting separately, especially if it's 
used just once as a flag to a specific function, and if you don't 
expect it to extend to more than yes/no, then it should be a bool.

>> void func(Flag!"Abc" abc); -> func(Flag!"Abc".yes);
>> -> func(yes!"Abc");
>> -> func(Yes.Abc);
>> 
>> which then becomes this if you're using a boolean expression instead of
>> a constant:
> 
> Aha! This reasoning is flawed as I'll explain below.
> 
>> // definition // call with an expression
>> 
>> void func(bool abc); -> func(expression);
>> 
>> enum Abc { no, yes }
>> void func(Abc abc); -> func(expression ? Abc.yes : Abc.no);
>> -> func(cast(Abc)expression);
>> 
>> void func(Flag!"Abc" abc); -> func(expression ? Flag!"Abc".yes : Flag!"Abc".no);
>> -> func(expression ? yes!"Abc" : no!"Abc");
>> -> func(expression ? Yes.Abc : No.Abc);
>> -> func(cast(Flag!"Abc")expression);
>> 
>> My take on this is that we shouldn't try to reinvent the boolean in the
>> standard library.
> 
> I think this characterization is wrong. Let me replace the meaningless 
> Abc with an actual example, e.g. OpenRight in std.algorithm.
> 
> OpenRight is not a Boolean. Its *representation* is Boolean. It is 
> categorical data with two categories. You can represent it with an 
> unstructured Boolean the same way you can represent an automaton state 
> with an unstructured integer or temperature with an unstructured 
> double, but then you'd have the disadvantages that dimensional analysis 
> libraries are solving.
> 
> For representing categorical data with small sets, programming 
> languages use enumerated types. This is because in a small set you can 
> actually name each element. That way you have a separate type for 
> the categorical data so you can enjoy good type checking. The mistake I 
> believe you are making is the conflation of categorical data that has 
> two categories with an unstructured Boolean. By making that conflation 
> you lose the advantages of good type checking in one fell swoop.

I think you're misinterpreting. I don't like yes/no enums because I 
don't find the value names meaningful, but I'm perfectly fine with 
two-element enums if they are properly named.

> (But not all categorical data is a small set, and consequently 
> enumerated types are insufficient. Consider e.g. the notion of a user 
> id. People routinely use integers for that, and suffer endless 
> consequences because of bugs caused by unstructured integers posing as 
> user IDs. I have seen instances of such bugs in several codebases in 
> different languages.)

I totally agree with making specific types to avoid mixing unrelated 
things, as long as it's reasonable. You wouldn't argue for a UserId 
type if values of this type weren't passed around.
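
To be clear about what I'm agreeing with, here is a minimal sketch of 
such a type. UserId and describe are names I made up for illustration; 
they are not from any library:

```d
import std.stdio;

// Hypothetical wrapper: a distinct type, so a bare int cannot pose as a user ID.
struct UserId { int value; }

// Hypothetical function that only accepts the wrapped type.
string describe(UserId id)
{
    import std.conv : to;
    return "user #" ~ id.value.to!string;
}

void main()
{
    auto id = UserId(42);
    writeln(describe(id));

    // A raw integer is rejected at compile time.
    static assert(!__traits(compiles, describe(42)));
}
```

The wrapper pays for itself when a UserId travels through several 
functions; for a value used once at a single call site it is mostly 
ceremony.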

> As a direct consequence, it is *wrong* to desire to pass an 
> unstructured Boolean expression in lieu of OpenRight. So it is *good* 
> that you can't. What you *should* be doing is to define an OpenRight 
> value in the first place and use it, or construct it in place with 
> "expr ? OpenRight.yes : OpenRight.no", with the advantage that the 
> conversion intent is explicit and visible.

But boundaries can be open or closed on the left as well as on the 
right. Unfortunately, because you chose to call the enum OpenRight, it 
can only be used for the right side, and nowhere else.

What you're doing with OpenRight, and more generally with Flag!"", is 
narrowing the category so much that it can be used in 
one place and one place only: as a specific parameter to a specific 
function. If you had another parameter for the left side, you'd create 
an OpenLeft enum with exactly the same choices. I doubt this kind of 
categorization has any advantage.
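
As a rough sketch of the alternative I have in mind, here is a single 
properly named two-element enum serving both sides. Boundary and 
inInterval are hypothetical names for illustration only, not anything 
in Phobos:

```d
import std.stdio;

// Hypothetical enum: one category with meaningful value names, reusable
// for any boundary rather than tied to a single parameter.
enum Boundary { closed, open }

// Hypothetical function: true if x lies between lo and hi, with each
// end included or excluded according to its Boundary.
bool inInterval(int x, int lo, int hi, Boundary left, Boundary right)
{
    immutable aboveLo = (left == Boundary.open) ? x > lo : x >= lo;
    immutable belowHi = (right == Boundary.open) ? x < hi : x <= hi;
    return aboveLo && belowHi;
}

void main()
{
    // [0, 10): closed on the left, open on the right
    writeln(inInterval(10, 0, 10, Boundary.closed, Boundary.open)); // false
    writeln(inInterval(0, 0, 10, Boundary.closed, Boundary.open));  // true
}
```

The type still cannot be confused with an arbitrary bool, but the same 
two values work for the left and the right parameter, and the value 
names say something that yes/no do not.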

Actually, I think the advantage you seek has nothing to do with 
categorization and much more to do with a desire to see those parameter 
names appear at the call site. You're actually using 
over-categorization to achieve that, and with Flag!"" you're going to 
make this systematic. Sorry, I can't approve.

>> If you want to replace a bool with a two-option enum
>> at some places for clarity, that's fine. But I wouldn't elevate that to
>> a pattern meant to be used everywhere. And personally, I don't like the
>> proliferation of yes/no enums: if you use an enum, value names should be
>> more meaningful than a simple yes/no.
> 
> I think you'd be entirely wrong to make this distinction. There's zero, 
> one, and many. Not zero, one, two, and many.

No idea what you mean there.

-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/