New std.uni: ready for more beating

Dmitry Olshansky dmitry.olsh at gmail.com
Mon Feb 25 12:39:58 PST 2013


23-Feb-2013 21:14, H. S. Teoh wrote:
>> P.S. Time to go for the formal review?
> [...]
>
> Alright, I decided to just jump in and re-review std.uni. I *really*
> want to see this in Phobos, the sooner the better.
>

Great. Sorry, I had to put your comments on the back burner, and then 
I found out that std.uni no longer compiles. New release, darn it...

It turned out that some screws were tightened in the compiler, and some 
meaningless qualifiers are no longer accepted. That, plus some vague shit 
with an apparently obligatory @property on save() for the isForwardRange 
trait (wtf?).
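
For anyone who hits the same thing, here is a minimal sketch of a range 
that satisfies the trait. Countdown is a made-up type for illustration, 
not anything from std.uni; the only point is that save() is declared 
with @property.

```d
import std.range : isForwardRange;

// Hypothetical forward range, for illustration only.
struct Countdown
{
    int n;

    @property bool empty() const { return n <= 0; }
    @property int front() const { return n; }
    void popFront() { --n; }

    // Declared as a @property, which is what the trait appeared to insist on.
    @property Countdown save() { return this; }
}

static assert(isForwardRange!Countdown);

unittest
{
    auto r = Countdown(3);
    auto copy = r.save;   // independent copy of the iteration state
    r.popFront();
    assert(r.front == 2);
    assert(copy.front == 3);
}
```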

Phh... OK, now let's go over these.

> Here are some comments:
>
> - In the first part of the docs, Terminology section, under "Code unit":
>    I think you mistyped a ddoc macro, it should be ($(D char)) instead of
>    (($D char)).

Fixed.

> - lineSep, paraSep: are these fixed values? It would be nice to indicate
>    what their values are.


Yup, a carry-over from the old std.uni. Documented.
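
For reference, a small check of the values, assuming the constants keep 
what they had in the old std.uni (the Unicode LINE SEPARATOR and 
PARAGRAPH SEPARATOR code points):

```d
import std.uni : lineSep, paraSep;

static assert(lineSep == '\u2028'); // LINE SEPARATOR
static assert(paraSep == '\u2029'); // PARAGRAPH SEPARATOR
```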

> - UnicodeDecomposition: it would be nice to document the values in this
>    enum.
>
> - normalize(): I think your code example has a duplicated line (NFKC
>    example appears twice).
>

Fixed.
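
A sketch of what a non-duplicated example could look like, one line per 
form. It assumes the NFC/NFD/NFKC/NFKD names are visible at module scope 
and uses U+03D3, a character whose four normalization forms all differ.

```d
import std.uni;

unittest
{
    // U+03D3: GREEK UPSILON WITH ACUTE AND HOOK SYMBOL
    assert(normalize!NFC("\u03D3") == "\u03D3");        // already composed
    assert(normalize!NFD("\u03D3") == "\u03D2\u0301");  // canonical decomposition
    assert(normalize!NFKC("\u03D3") == "\u038E");       // compatibility, recomposed
    assert(normalize!NFKD("\u03D3") == "\u03A5\u0301"); // compatibility decomposition
}
```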

> - allowedIn(): How about an example where a character is *not* allowed
>    in a normalization form?
>

These are typically hard to recognize visually but I'll try :)
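
One candidate, as a sketch: a precomposed letter such as U+00C4 (A with 
diaeresis) has a canonical decomposition, so it should be rejected for 
NFD while still being fine in NFC. This assumes allowedIn follows the 
Unicode quick-check properties; treat the expected results as my reading 
of those tables rather than a tested guarantee.

```d
import std.uni;

unittest
{
    // Plain Cyrillic and ASCII letters have no decomposition at all.
    assert(allowedIn!NFC('я'));
    assert(allowedIn!NFD('я'));
    assert(allowedIn!NFC('Z'));

    // U+00C4 decomposes to 'A' + U+0308, so it cannot occur in NFD text.
    assert(allowedIn!NFC('\u00C4'));
    assert(!allowedIn!NFD('\u00C4'));
}
```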

> - InversionList.opBinary: I still prefer ^ instead of ~ for symmetric
>    difference. In D, ~ means "append", and it's very confusing when x~y
>    means symmetric difference instead of append.

I need more than a single opinion on this matter. Yes, I don't quite 
like '~' but it's the symbol used in std.regex patterns and to make 
matters worse '^' means something completely different there.
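
To make the discussion concrete, here is a sketch of how '~' reads on 
sets today, assuming the operator stays as it is and that the property 
names unicode.LowerCase and unicode.ASCII are available:

```d
import std.uni;

unittest
{
    auto lower = unicode.LowerCase;
    auto ascii = unicode.ASCII;

    // '~' is symmetric difference: members of exactly one of the two sets.
    auto onlyOneOf = lower ~ ascii;
    assert(onlyOneOf['$']);   // ASCII, not lowercase
    assert(!onlyOneOf['a']);  // ASCII and lowercase
    assert(onlyOneOf['я']);   // lowercase, not ASCII
    assert(!onlyOneOf['Δ']);  // neither ASCII nor lowercase
}
```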

> - unicode.opDispatch: it would be nice to provide links to official
>    Unicode documentation that lists all blocks/scripts, as a reference.

I provided one, but it's probably not listed where it should be; see 
the first paragraphs that outline the stuff in this module.

I'm thinking I'd better compose a small table of _guaranteed_ 
properties. This would allow me to safely clean up the ridiculous sets 
later on. Since the stuff was extracted from Unicode data files, even 
I'm not sure which sets are there exactly :)

The guaranteed ones are Scripts, Blocks, General Category and a few nice 
sets like ASCII (plus HangulSyllableType, of course!).
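
A sketch of look-ups from each guaranteed group, assuming the names 
below (Cyrillic as both script and block, Nd as a general category, 
ASCII as a named set) are among the ones that stay:

```d
import std.uni;

unittest
{
    auto script = unicode.script.Cyrillic; // script, explicit form
    auto block  = unicode.block.Cyrillic;  // block, explicit form
    auto digits = unicode.Nd;              // general category: decimal number
    auto ascii  = unicode.ASCII;           // one of the "nice" sets

    assert(script['я'] && !script['z']);
    assert(block['я'] && !block['z']);
    assert(digits['7'] && !digits['x']);
    assert(ascii['~'] && !ascii['я']);
}
```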

> - combiningClass: maybe provide a link to official Unicode docs that
>    list combining class values?

If there were one good link... I might make an enum with symbolic names 
for the ones that are in use and have meaning.
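
Roughly what I have in mind, as a sketch only. The CombiningClass enum 
below is hypothetical (it is not in the module); the three values are 
the standard canonical combining class numbers for "not reordered", 
"below" and "above".

```d
import std.uni : combiningClass;

// Hypothetical enum with symbolic names; not part of std.uni.
enum CombiningClass : ubyte
{
    notReordered = 0,
    below = 220,
    above = 230,
}

unittest
{
    assert(combiningClass('a') == CombiningClass.notReordered);
    assert(combiningClass('\u0325') == CombiningClass.below); // combining ring below
    assert(combiningClass('\u0303') == CombiningClass.above); // combining tilde
}
```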

> OK, a lot of this is just nitpicks... but overall, this new std.uni
> looks very good. Looking forward to it being merged into Phobos!
>

And for that to happen somebody has to put on the review manager's robe 
and do the ceremony (hint) :)

In the meantime I also need to present it as a pull request to help 
with reviewing the code.

-- 
Dmitry Olshansky

