Generality creep

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Thu Mar 28 13:04:11 UTC 2019


On 3/28/19 8:37 AM, Andrei Alexandrescu wrote:
> On 3/28/19 7:59 AM, ag0aep6g wrote:
>> On 28.03.19 12:51, Andrei Alexandrescu wrote:
>>> The problem is RefRange.
>>
>> too terse
> 
> Reasoning from first principles would go as follows. Context: algorithms 
> are expressed naturally in terms of ranges, and composite (higher-order) 
> ranges are built and work with ranges. Event: RefRange comes along, 
> exposes an oddity ("save(), copying, and assignment could do slightly 
> different things"), breaks a bunch of these structures, and requires 
> arcane changes.
> 
> The conclusion is not to make such changes everywhere (i.e. reason by 
> analogy). The right conclusion is that save() is unnecessarily general 
> and underspecified.

Similar issues:

"People may attempt to use Phobos algorithms with ranges of enums based 
on character types, or other subtypes of character types." Reasoning by 
analogy: those should work, too, which leads to generality creep. 
Reasoning from first principles: UTF8 is predicated on one byte type, so 
there is no real gain to be made.

"There are several ways of iterating strings, e.g. by code point, by 
code unit, or by grapheme." Reasoning by analogy: strings should be 
ranges, hence we need to choose one default mode of iteration. Reasoning 
from first principles: strings are wholes, not ranges, and iteration is 
chosen as needed.
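
To make "iteration is chosen as needed" concrete, here is a small sketch 
of one piece of text viewed three ways. It is in Python purely for 
illustration, not D; the helper names are made up, and the grapheme 
helper is a crude approximation (it only glues combining marks onto the 
preceding character), not real UAX #29 segmentation.

```python
import unicodedata


def code_units(text: str) -> list:
    """UTF-8 code units (bytes) of the text."""
    return list(text.encode("utf-8"))


def code_points(text: str) -> list:
    """Unicode code points; this is how Python's str iterates."""
    return list(text)


def approx_graphemes(text: str) -> list:
    """Crude stand-in for grapheme clusters (assumption: combining marks
    simply attach to the preceding character)."""
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch  # extend the current cluster
        else:
            clusters.append(ch)  # start a new cluster
    return clusters


sample = "cafe\u0301"  # "café" written with a combining acute accent

print(len(code_units(sample)))        # 6
print(len(code_points(sample)))       # 5
print(len(approx_graphemes(sample)))  # 4
```

Same text, three different lengths, and none of them is the obviously 
right default.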

"There are UTF8, UTF16, and UTF32 encodings." Reasoning by analogy: they 
should all work the same. Reasoning from first principles: what's really 
needed here? Realistically nobody uses UTF32, UTF16 is an evolutionary 
dead-end still alive mostly as a mild annoyance on Windows systems, and 
then everybody and their cat is using UTF8. It follows we should 
consolidate on one encapsulated string type that is NOT an array 
(because arrays are ranges) and that offers iteration modes on demand. 
All that gobbledygook with e.g. file and path functions supporting lazy 
ranges of characters should go. It's an unnecessary and 
expensive-to-maintain distraction.
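
A rough sketch of what such an encapsulated type might look like, again 
in Python for illustration only. The class name Text and its method 
names are hypothetical, not a proposed Phobos API. The points it tries 
to show: one internal encoding (UTF8), no default iteration, and the 
caller picks the iteration mode explicitly.

```python
from typing import Iterator


class Text:
    """Hypothetical encapsulated string: UTF-8 storage, not a range/array."""

    def __init__(self, value: str) -> None:
        # Single internal encoding; other encodings would be converted
        # at the boundary.
        self._utf8 = value.encode("utf-8")

    # Deliberately no __iter__, __getitem__, or __len__: the type is a
    # whole, so there is no default element type to argue about.

    def by_code_unit(self) -> Iterator[int]:
        """Iterate the UTF-8 code units (bytes)."""
        return iter(self._utf8)

    def by_code_point(self) -> Iterator[str]:
        """Iterate the decoded Unicode code points."""
        return iter(self._utf8.decode("utf-8"))

    def __str__(self) -> str:
        return self._utf8.decode("utf-8")


name = Text("żółw")
print(len(list(name.by_code_unit())))   # 7
print(len(list(name.by_code_point())))  # 4
```

Writing "for c in name" fails with a TypeError here, which is the 
point: the caller has to say which iteration they mean.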


