Ready for review: new std.uni

Walter Bright newshound2 at digitalmars.com
Fri Jan 11 13:36:21 PST 2013


On 1/11/2013 1:21 PM, Dmitry Olshansky wrote:
> 12-Jan-2013 00:50, Walter Bright пишет:
>> On 1/11/2013 11:31 AM, Dmitry Olshansky wrote:
>>> Anyway it's polished and ready for the good old collective destruction
>>> called
>>> peer review. I'm looking for a review manager.
>>
>> Thank you for doing this. Some notes:
>>
>
> Great bashing, thanks :)

Anytime!


>> struct InversionList:
>>
>>     1. It implements length and empty, but otherwise does not seem to be
>> a range, though its name implies it is a container.
>>
>
> Containers also have length and can be empty. Ranges on top of it are .byChar,
> .byInterval.

Some more text about accessing it as a range would be helpful. Being a set of 
"codepoints" yet accessing it "by char" makes no sense.

In unicode, the terms "char", "codepoint", "octet", "grapheme", etc., are 
bandied about and are often conflated and confused with each other. I suggest a 
brief section defining those terms, and then scrupulously, consistently, anally, 
and nitpickingly adhering to them in the documentation and names of things.


>>     2. The description "InversionList is a packed interval-based data
>> structure that represents a set of codepoints." completely baffles me.
>>
>
> OK, probably it was a bad idea to combine implementation detail and general
> description in one statement.
>
> How about: InversionList is a set of codepoints, optimized for size.
> It represents a set as an array of intervals using 6 bytes per interval of
> codepoints.
>
> Better?

The name says "list" and yet it is described as a "set". Being a "list" makes an 
implication that it is a linked list. BTW, what is an "interval"?


>>     3. The table for opBinary appears twice.
>>
>>     4. toSourceCode says "if the codepoint". If what codepoint? And what
>> funcName is ""?
>>
>> mapTrieIndex:
>>
>>     I have no idea what this does from the description. Is it a map like
>> in std.algorithms? I.e. does it work with ranges? Need an example.
>>
>> TrieBuilder:
>>
>>     1. No description. I'm completely baffled.
>>
>>     2. putRange? What does that do?
>>
>> General lack of examples.
>>
>> Trie:
>>
>>      Should have a wikipedia link to what a trie is.
>>
>> buildTrie:
>>
>>      Contains a number of functions that return auto, but no mention of
>> what is returned. While I like auto, in these cases it is not helpful,
>> because the user needs to know what type is returned.
>>
>
> Trust me you won't like it when you see it :) Part of the reason it's hidden.
> But you are correct that documentation on these artifacts is *ehm* ... sketchy.
> Will fix/update.

If the compiler type is hard to grok, ok, but at least the documentation should 
make it clear what is returned.


>> Grapheme:
>>
>>      Should provide an InputRange so a user can iterate over its
>> codepoints. It already appears to provide part of the range interface.
>
> Container != range. Just slice it and RA range is in your hands. I need to put
> this info somewhere prominent I guess.
>
> BTW foreach(v; Grapheme("abc")) { ... }
> should work because of automatic slicing.

Yes, but is the resulting v an octet, char, codepoint ????


> All to the point, thanks.

Welcs!


More information about the Digitalmars-d mailing list