Algorithms should be free from rich types

Tue Jul 4 02:14:38 UTC 2023

On 7/3/23 3:27 PM, H. S. Teoh wrote:
> On Mon, Jul 03, 2023 at 02:30:14PM -0400, Steven Schveighoffer via Digitalmars-d wrote:
>> On 7/3/23 2:05 PM, H. S. Teoh wrote:
> [...]
>>> I think we all agree that the mechanics of this won't (and
>>> shouldn't) change. But I think the OP was arguing at a higher level
>>> of abstraction.  It isn't so much about whether private should be
>>> overridable or not, or even whether some piece of data in an object
>>> should be private or not; the question IMO is whether the library
>>> could have been designed in such a way that there's no *need* for
>>> private data in the first place. Or at least, the need for such is
>>> minimized.
>>>
>>> A library with tons of private state and only a rudimentary public
>>> API is generally more likely to have situations where the user will
>>> be left wishing that there were a couple more knobs to turn that can
>>> be used to customize the library's behaviour.
>>
>> But that's the thing, there are parts that *simply must be private*.
>> No matter how you cut it, it has to have some level of privacy,
>> because otherwise, you can't enforce semantic invariants with the
>> type.
>>
>> Should array length (not the property, but the actual data field) be
>> public?  What about the pointer? Of course not. Yet, you still might
>> want to access those things for some reason. That doesn't mean it's
>> worth a change to public just for that one reason.
> 
> We're actually agreeing with each other, y'know. :-D
> 

Yeah, kind of. It's just that there are two kinds of privacy labeling:
careless and designed.

> As I said, the *ideal* is that you wouldn't have private state, or that
> the private state would be minimal.  In practice, of course, certain
> things *should* be private, and that's not a problem. The problems the
> OP described arise when either private is used carelessly, causing
> things to be private that really need not be, or the API is poorly
> designed, so that parts of the library that ought to be reusable aren't
> just because of some arbitrary decision made by the author.

If you carelessly label your fields as public, then realizing later they 
should have been private is costly, maybe impossible.

If you carelessly label your fields as private, while it might upset 
some people, making them public later is easy.

So if you are going to "not care" about public/private, technically the
less risky choice is to make everything private, and worry about it
later if it becomes an issue. So in that sense I disagree with the OP's
point.
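
To make the asymmetry concrete, here is a minimal sketch. `Buffer` and
its members are hypothetical names, not taken from Phobos or any other
library. The fields start out private so the type can enforce its
invariant. Exposing a read-only accessor later breaks nobody, whereas
taking away a public field that callers already write to would.

```d
struct Buffer
{
    private int[] _data;   // backing storage, hidden from other modules
    private size_t _used;  // number of valid elements in _data

    // The semantic invariant the private fields protect.
    invariant { assert(_used <= _data.length); }

    this(size_t capacity) { _data = new int[](capacity); }

    /// Read-only view of the element count. This can be added after the
    /// fact without breaking existing callers.
    size_t length() const { return _used; }

    /// Appends a value; returns false instead of overflowing.
    bool put(int value)
    {
        if (_used == _data.length)
            return false;
        _data[_used++] = value;
        return true;
    }

    /// Read-only slice of the valid elements.
    const(int)[] contents() const { return _data[0 .. _used]; }
}
```

Note that D enforces `private` per module, so code in the same module
can still reach `_data` and `_used`. The protection applies to users
importing the module.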

That being said, I've done a lot of libs where I just don't care and
leave everything public. It's mostly because I don't expect widespread
usage, and I also don't mind breaking people's code (I don't think any
of my projects that I started are past 1.0 yet). But something like
Phobos shouldn't be so careless. There, anything that hasn't been
deliberately designed should stay private by default, unless there is a
good reason to make it public.

> 
> I've never heard people complaining about how the array length data
> field is private, for example.  That's because it being private does not
> hinder the user from doing whatever he wants to do with the array (short
> of breaking the implementation and doing something involving UB, of
> course).  That's an example of proper usage of private.

It's an obvious example that we all can agree on. If we agree there are
clearly cases where private is important, then we can start working our
way back to where the line should be drawn.

> An example of where private hinders what a user might wish to do is an
> algorithm used internally by the library, that for whatever reason is
> private and unusable outside of the library code, even though the
> algorithm itself is general and can be applied outside of the scope of
> the library.  Often in such cases there are immediate pragmatic reasons
> for it -- the implementation of the algorithm is bound to internal
> implementation details of other library code, for example. So you can't
> actually make it public without also making lots of things public that
> probably shouldn't be.  But at a higher level, one asks the question,
> why is that algorithm implemented in that way in the first place?  It
> could have been implemented generically, and the library could have used
> just a specialized instance of it to solve whatever it is it needs to
> solve, but the algorithm itself should be available for user code to
> use.  *That's* the proper design.

I agree that some things shouldn't be private. But what's the answer? 
When it should be public, just change it to public!

An actual example of this in Phobos is the absence of a standalone
binary search algorithm. The implementation is there, in SortedRange,
but it is private for basically no good reason (it can be trivially
extracted into its own function). And SortedRange in itself is a
schizophrenic meld of overbearing restrictions and puzzling allowances.

The only reason I haven't made a PR for it is I just made a copy in my 
own code and have moved on. But it would probably be pretty trivial to 
expose.
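
For illustration, a standalone version might look roughly like the
sketch below. This is not the Phobos implementation. The name
`lowerBoundIndex` and its signature are assumptions made up for this
example. It works on any random-access range with a length, and it does
not require the caller to wrap the data in a special type first.

```d
import std.functional : binaryFun;
import std.range.primitives : hasLength, isRandomAccessRange;

/// Returns the index of the first element `e` of `r` for which
/// `less(e, value)` is false, or `r.length` if there is none.
/// Assumes `r` is already sorted according to `less`.
size_t lowerBoundIndex(alias less = "a < b", R, T)(R r, T value)
if (isRandomAccessRange!R && hasLength!R)
{
    size_t lo = 0;
    size_t hi = r.length;
    while (lo < hi)
    {
        // Midpoint computed this way cannot overflow.
        immutable mid = lo + (hi - lo) / 2;
        if (binaryFun!less(r[mid], value))
            lo = mid + 1; // r[mid] sorts before value: search the right half
        else
            hi = mid;     // r[mid] does not sort before value: search the left half
    }
    return lo;
}
```

Today the closest public route is to go through SortedRange, e.g.
`assumeSorted(arr).lowerBound(5).length` from std.range, which gives
the same index for a sorted array.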

-Steve