More fun with autodecoding

Steven Schveighoffer schveiguy at gmail.com
Sat Sep 8 15:36:25 UTC 2018


On 8/9/18 2:44 AM, Walter Bright wrote:
> On 8/8/2018 2:01 PM, Steven Schveighoffer wrote:
>> Here's where I'm struggling -- because a string provides indexing, 
>> slicing, length, etc. but Phobos ignores that. I can't make a new type 
>> that does the same thing. Not only that, but I'm finding the 
>> specializations of algorithms only work on the type "string", and 
>> nothing else.
> 
> One of the worst things about autodecoding is it is special, it *only* 
> steps in for strings. Fortunately, however, that specialness enabled us 
> to save things with byCodePoint and byCodeUnit.

So it turns out that the problem here, even though it looked like an 
autodecoding problem, is technically a problem with splitter.

splitter doesn't deal with encodings of character ranges at all.

For instance, when you have this:

"abc 123".byCodeUnit.splitter;

splitter has only one overload that takes a single parameter, and that 
overload requires a character *array*, not a range.

So the byCodeUnit result converts back to its original string through 
alias this, and surprise! The elements from that splitter are strings.
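
To make that fallback concrete, here is a minimal sketch that does not 
use Phobos at all. CodeUnits and wholeSlice are hypothetical names made 
up for illustration: CodeUnits stands in for any wrapper that has an 
alias this to its string, and wholeSlice stands in for a function that 
only accepts C[]. It assumes the compiler tries alias this when 
deducing C, which is the behavior described above.

```d
// Hypothetical wrapper, NOT the Phobos byCodeUnit type.
struct CodeUnits
{
    string source;
    alias source this; // lets the wrapper convert implicitly to string
}

// Hypothetical array-only function, shaped like an overload taking C[].
C[] wholeSlice(C)(C[] s)
{
    return s;
}

void main()
{
    auto wrapped = CodeUnits("abc 123");

    // The call matches through alias this, so C is deduced from string.
    auto result = wholeSlice(wrapped);

    // The wrapper type is lost: what comes back is a plain string.
    static assert(is(typeof(result) == string));
    assert(result == "abc 123");
}
```

Nothing warns about the conversion, which is why the string elements 
come as a surprise.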

Next, I tried to use a parameter:

"abc 123".byCodeUnit.splitter(" ");

Nope, still devolves to string. It turns out it can't figure out how to 
split a character range when the separator is a character array.

The only thing that does seem to work is this:

"abc 123".byCodeUnit.splitter(" ".byCodeUnit);

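A small self-contained check of that last form, as a sketch only (the 
variable name words is mine). It asserts that the elements are not 
string and that they compare equal to the expected words:

```d
import std.algorithm.comparison : equal;
import std.algorithm.iteration : splitter;
import std.utf : byCodeUnit;

void main()
{
    // Both the input and the separator are code unit ranges.
    auto words = "abc 123".byCodeUnit.splitter(" ".byCodeUnit);

    // Each element is a slice of the code unit range, not a string.
    static assert(!is(typeof(words.front) == string));

    assert(words.front.equal("abc"));
    words.popFront();
    assert(words.front.equal("123"));
    words.popFront();
    assert(words.empty);
}
```
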
But this is at odds with how most algorithms in Phobos deal with 
character ranges -- generally you can use any width character range, 
and it just works. Having a drop-in replacement for string would 
require splitter to handle these transcodings (and I think in general, 
algorithms should be able to handle them as well). Not only that, but 
the specialized splitter that takes no separator treats a run of 
whitespace as a single separator, a feature I want to have for my 
drop-in replacement.
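
For reference, this is the difference between the two forms on a plain 
string (a sketch; the three-space input is just an example I picked):

```d
import std.algorithm.comparison : equal;
import std.algorithm.iteration : splitter;

void main()
{
    auto text = "abc   123"; // three spaces between the words

    // No separator: a run of whitespace counts as one delimiter.
    assert(text.splitter.equal(["abc", "123"]));

    // Explicit separator: every space delimits, so empty elements
    // show up between consecutive spaces.
    assert(text.splitter(' ').equal(["abc", "", "", "123"]));
}
```

The first behavior is the one a drop-in replacement would need to keep.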

I'll work on adding some issues to the tracker, and potentially doing 
some PRs so they can be fixed.

-Steve

