We need better documentation for functions with ranges and templates

rumbu via Digitalmars-d digitalmars-d at puremagic.com
Tue Dec 15 06:03:50 PST 2015


On Tuesday, 15 December 2015 at 12:28:02 UTC, ZombineDev wrote:
> On Tuesday, 15 December 2015 at 11:26:04 UTC, rumbu wrote:
>>
>> Looking at the .net source code, the Count extension method is 
>> also doing a best effort "count" by querying the ICollection 
>> interface.
>
> Yes, I have looked at the source code, before writing this, so 
> I knew exactly how it worked. In short : terrible, because it 
> relies only on OOP. But that's not the point. Why should anyone 
> need to look at the source code, to see what this function 
> does? I thought this is what the docs were supposed to tell.
>
>>
>> public static int Count<TSource>(this IEnumerable<TSource> 
>> [...]
>>
>> The Remarks section clearly states the same thing:
>>
>> "If the type of source implements ICollection<T>, that 
>> implementation is used to obtain the count of elements. 
>> Otherwise, this method determines the count."
>>
>>
>> And personally, I found the MS remark more compact and more 
>> user friendly than:
>> [...]
>
> If you look at table at the beginning of page 
> (https://dlang.org/phobos/std_range_primitives.html) you can 
> clearly see a nice concise description of the function. Even if 
> you don't know complexity theory there's the word "Compute" 
> which should give you an idea that the function performs some 
> non-trivial amount of work. Unlike:
>
>> Returns the number of elements in a sequence.
>
> Which implies that it only returns a number - almost like an 
> ordinary getter property. I am scared to think that if back 
> then C# got extension properties, it might have been 
> implemented as such.
>
>> Not everybody is licensed in computational complexity theory 
>> to understand what O(n) means.
>
> LOL. Personally, I would never want to use any software written 
> by a programmer, who can't tell the difference.
>
> Well ok, let's consider a novice programmer who hasn't studied 
> yet complexity theory.
>
> Option A: They look at the documentation and see there's some 
> strange O(n) thing that they don't know. They look it up in 
> google and find the wonderful world of complexity theory. They 
> become more educated and are grateful the people who wrote the 
> documentation for describing more accurately the requirements 
> of the function. That way they can easily decide how using such 
> function would impact the performance of their system.
>
> Option B: They look at the documentation and see that there's 
> some strange O(n) thing that they don't know. They decide that 
> it's extremely inhumane for the docs to expect such significant 
> knowledge from the reader and they decide to quit. Such novices 
> that do not want to learn are better off choosing a different 
> profession, than inflicting their poor written software on the 
> world.

We are talking about better documentation, not about C# vs D 
performance; we already know the winner there. Since C# is an 
OOP-only language, there is only one way to do reflection: using 
OOP (deliberately ignoring the fact that NGen will reduce this 
call to a simple memory read in the case of arrays).

Your assertion:

> the docs don't even bother to mention that it is almost always 
> O(n), because none of the Enumerable extension methods 
> preserve the underlying ICollection interface

was false, and you don't need to look at the source code to find 
that out; the Remarks section is self-explanatory:

"If the type of source implements ICollection<T>, that 
implementation is used to obtain the count of elements. 
Otherwise, this method determines the count."

This is *good* documentation:
- "Count" is a better name than "walkLength"; most other 
programming languages use something similar: count, cnt, 
length, len.
- You don't need to understand computer science terms to find out 
what the function does.
- If you are interested in more than just the number of 
elements, there is a performance hint in the Remarks section.
- Links are provided for concepts: even the return type (int) has 
a link.
- It clearly states what happens if the range is not defined.
- It clearly states what happens if the range contains more 
than int.max elements.

By contrast, the D documentation introduces a bunch of 
non-linked concepts, although it does tell me that O(n) 
evaluations are possible:
- isInputRange
- isInfinite
- hasLength
- empty
- popFront
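
To show how these concepts fit together, here is a rough sketch of what such a function does. This is not the Phobos source; the name countElements is made up for illustration, and only the range primitives listed above are assumed.

```d
import std.algorithm.iteration : filter;
import std.range.primitives;

// Hypothetical stand-in for walkLength, for illustration only.
size_t countElements(R)(R range)
    if (isInputRange!R && !isInfinite!R)
{
    static if (hasLength!R)
    {
        // The range already knows its length: O(1).
        return range.length;
    }
    else
    {
        // Otherwise walk the range element by element: O(n).
        size_t n = 0;
        for (; !range.empty; range.popFront())
            ++n;
        return n;
    }
}

void main()
{
    int[] arr = [1, 2, 3, 4];
    // Arrays have a length, so no walking is needed.
    assert(arr.countElements == 4);
    // A filtered range has no length, so it must be walked.
    assert(arr.filter!(a => a % 2 == 0).countElements == 2);
}
```

None of this is hard to follow once it is written down, which is the point: the docs could link each of these names to its definition.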

The D docs give no indication of what happens if the range is 
undefined. In fact, the behavior is inconsistent:
- it will return 0 for null arrays;
- it will throw an AccessViolation for null ranges (or probably 
segfault on Linux).
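
A small sketch of the two cases, assuming a class-based range declared through the InputRange interface from std.range.interfaces. The crashing call is left commented out so that the example runs.

```d
import std.range.interfaces : InputRange;
import std.range.primitives : walkLength;

void main()
{
    int[] arr = null;              // a null array is just an empty slice
    assert(arr.walkLength == 0);   // fine: returns 0

    InputRange!int r = null;       // a null class-based range
    // r.walkLength would call r.empty through a null reference,
    // which is expected to crash, so the call is left commented out:
    // auto n = r.walkLength;
}
```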

There is also no indication of what happens if the range contains 
more than size_t.max elements:
- integer overflow.

As someone said: D has genius programmers, but the worst marketers.







