Combining decoding and matching
Dmitry Olshansky
dmitry.olsh at gmail.com
Sat Nov 16 05:48:49 PST 2013
16-Nov-2013 17:02, bearophile wrote:
> Dmitry Olshansky:
>
>> Pull & peek at preliminary results
>> https://github.com/D-Programming-Language/phobos/pull/1685
>>
>> Docs
>> http://blackwhale.github.io/phobos/std_uni.html#MatcherConcept
>
> Good. Are those ideas usable for other Phobos functions, like group?
>
> http://forum.dlang.org/thread/snnmkdmhxouqjqaneshu@forum.dlang.org?page=3#post-crnqodahnxjtuoqzisxw:40forum.dlang.org
>
Directly? No. It was all about preparing a matcher for a set of
codepoints in advance, using 4 distinct tables (for UTF-8), one per
encoded length.
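
To make the "one table per encoded length" idea concrete, here is a rough sketch in Python. It is not the std.uni code from the pull request: the real tables are more compact, and plain sets of encoded byte sequences stand in for them here. The names utf8_stride, build_matcher and match_at are made up for illustration, and the input is assumed to be valid UTF-8.

```python
def utf8_stride(lead):
    """Byte length of the UTF-8 sequence that starts with this lead byte."""
    if lead < 0x80:
        return 1
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    if lead >> 3 == 0b11110:
        return 4
    raise ValueError("not a UTF-8 lead byte")


def build_matcher(codepoints):
    """Prepare one lookup table per encoded length (1 to 4 bytes).

    Assumption: a set of encoded sequences stands in for a real table.
    Surrogate code points are not handled.
    """
    tables = {1: set(), 2: set(), 3: set(), 4: set()}
    for cp in codepoints:
        encoded = chr(cp).encode("utf-8")
        tables[len(encoded)].add(encoded)
    return tables


def match_at(tables, data, i):
    """Return the matched length in bytes at data[i], or 0 if no match.

    The lead byte picks the table; the raw bytes are looked up as-is,
    so no code point is ever decoded.
    """
    n = utf8_stride(data[i])
    return n if data[i:i + n] in tables[n] else 0
```

The work is split so that build_matcher runs once, in advance, and match_at only inspects the lead byte and does one lookup per position.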
As for group, it has to find runs of identical items. It can be sped up
for Unicode if you take into account 2 simple tricks:
- you don't need to decode: just identify the size of the current dchar
(its stride) and see how many repetitions of that sequence follow it;
- special-case the current (w)char being ASCII (or BMP for UTF-16), so as
to speed up counting (comparing 1 char vs a variable-length slice of 1-4
chars; ditto with wchar).
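
The two tricks above can be sketched roughly as follows, again in Python over raw UTF-8 bytes and not as Phobos code. group_utf8 and utf8_stride are hypothetical names, and the input is assumed to be valid UTF-8. Each run is reported as (encoded sequence, repetition count).

```python
def utf8_stride(lead):
    """Byte length of the UTF-8 sequence that starts with this lead byte."""
    if lead < 0x80:
        return 1
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    if lead >> 3 == 0b11110:
        return 4
    raise ValueError("not a UTF-8 lead byte")


def group_utf8(data):
    """Find runs of identical code points in UTF-8 bytes without decoding."""
    runs = []
    i, n = 0, len(data)
    while i < n:
        if data[i] < 0x80:
            # Trick 2: ASCII fast path, compare single bytes.
            byte = data[i]
            j = i + 1
            while j < n and data[j] == byte:
                j += 1
            runs.append((data[i:i + 1], j - i))
        else:
            # Trick 1: take the stride from the lead byte, then compare
            # whole slices of that width; nothing is decoded.
            width = utf8_stride(data[i])
            unit = data[i:i + width]
            j = i + width
            while data[j:j + width] == unit:
                j += width
            runs.append((unit, (j - i) // width))
        i = j
    return runs
```

Comparing encoded slices is enough because in valid UTF-8 two equal byte sequences that both start at a sequence boundary always encode the same code point. The same shape should carry over to UTF-16, with the BMP case as the fast path and surrogate pairs as the 2-unit slice.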
>
> Bye,
> bearophile
--
Dmitry Olshansky