More fun with autodecoding

Steven Schveighoffer schveiguy at gmail.com
Wed Aug 8 21:01:18 UTC 2018


On 8/8/18 4:13 PM, Walter Bright wrote:
> On 8/6/2018 6:57 AM, Steven Schveighoffer wrote:
>> But I'm not sure if the performance is going to be the same, since now 
>> it will likely FORCE autodecoding on the algorithms that have 
>> specialized versions to AVOID autodecoding (I think).
> 
> Autodecoding is expensive which is why the algorithms defeat it. Nearly 
> none actually need it.
> 
> You can get decoding if needed by using .byDchar or .by!dchar (forgot 
> which it was).

There are byCodePoint and byCodeUnit; byCodePoint forces 
auto-decoding.
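
To make the difference concrete, here is a minimal sketch of what the two 
views look like to Phobos. It uses std.utf.byCodeUnit and std.utf.byDchar 
(the std.utf adaptor that decodes to code points); the sample string and 
the variable names are only illustrative assumptions.

```d
import std.range.primitives : ElementType, hasLength, isRandomAccessRange,
    walkLength;
import std.traits : Unqual;
import std.utf : byCodeUnit, byDchar;

void main()
{
    // "héllo": 5 code points, 6 UTF-8 code units (é takes two).
    string s = "h\u00E9llo";

    // Code-unit view: elements are char, and the range keeps length
    // and random access, so no decoding is needed.
    auto units = s.byCodeUnit;
    static assert(is(Unqual!(ElementType!(typeof(units))) == char));
    static assert(hasLength!(typeof(units)));
    static assert(isRandomAccessRange!(typeof(units)));
    assert(units.length == 6);
    assert(units[0] == 'h');

    // Decoded view: elements are dchar, so counting them means walking
    // the whole range and decoding as it goes.
    auto points = s.byDchar;
    static assert(is(ElementType!(typeof(points)) == dchar));
    assert(points.walkLength == 5);
}
```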

The problem is, I want to use this wrapper just like it was a string in 
all respects (including the performance gains had by ignoring 
auto-decoding).

Not trying to give too much away about the library I'm writing, but the 
problem I'm trying to solve is parsing out tokens from a buffer. I want 
to delineate the whole, as well as the parts, but it's difficult to get 
back to the original buffer once you split and slice up the buffer using 
phobos functions.

Consider that you are searching for something in a buffer. Phobos 
provides all you need to narrow down your range to the thing you are 
looking for. But it doesn't give you a way to figure out where you are 
in the whole buffer.

Up till now, I've done it by weird length math, but it gets tiring (see 
for instance: 
https://github.com/schveiguy/fastaq/blob/master/source/fasta/fasta.d#L125). 
I just want to know where the darned thing I've narrowed down is in the 
original range!
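
For illustration, the kind of length math I mean looks roughly like the 
sketch below. This is not the fastaq code; tailOffset is a hypothetical 
helper, the buffer contents are made up, and it only works when the 
narrowed piece is a tail of the original, which is what find returns.

```d
import std.algorithm.searching : find;

/// Hypothetical helper: offset of `tail` within `whole`, assuming
/// `tail` really is the trailing part of `whole`.
size_t tailOffset(string whole, string tail)
{
    return whole.length - tail.length;
}

void main()
{
    string buf = ">seq1\nACGT\n>seq2\nTTGA\n";

    // find returns the remainder of buf starting at the match.
    auto rest = buf.find("ACGT");
    assert(rest.length != 0);

    // Recover the position by subtracting lengths.
    size_t off = tailOffset(buf, rest);
    assert(off == 6);
    assert(buf[off .. off + 4] == "ACGT");

    // For arrays the same answer falls out of pointer arithmetic,
    // but that is not available for arbitrary ranges.
    assert(cast(size_t)(rest.ptr - buf.ptr) == off);
}
```

Every extra split or slice adds another subtraction like this, which is 
where it gets tiring.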

So I thought this wrapper would let you use things the way you always 
do, but at any point you could extract a piece of information (a buffer 
reference) that shows where the current piece is in the original buffer. 
That part is quite easy; the problem is getting the wrapper to be a 
drop-in replacement for the original type.
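
A rough sketch of the easy part, under my own assumptions: BufRef and its 
members are hypothetical names for illustration, not the library's actual 
API. The wrapper keeps the original buffer plus the bounds of the current 
window, so slicing never loses the position.

```d
/// Hypothetical wrapper: a window into `original` that remembers
/// where it sits in the whole buffer.
struct BufRef
{
    string original; // the whole buffer
    size_t start;    // window start, as an index into original
    size_t end;      // window end (exclusive), as an index into original

    /// The text currently covered by the window.
    string data() { return original[start .. end]; }

    size_t length() { return end - start; }
    alias opDollar = length;

    char opIndex(size_t i) { return original[start + i]; }

    /// Slicing yields another window over the same original buffer.
    BufRef opSlice(size_t a, size_t b)
    {
        assert(a <= b && b <= length);
        return BufRef(original, start + a, start + b);
    }
}

void main()
{
    auto whole = BufRef("key=value", 0, 9);
    auto value = whole[4 .. $];

    assert(value.data == "value");
    assert(value[0] == 'v');
    // The position in the original buffer is still available.
    assert(value.start == 4 && value.end == 9);
}
```

None of this makes Phobos treat BufRef like a string, which is the hard 
part.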

Here's where I'm struggling: a string provides indexing, slicing, 
length, etc., but Phobos ignores that. I can't make a new type that 
gets the same treatment. Not only that, but I'm finding the 
specializations of algorithms only work on the type "string", and 
nothing else.
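
As a quick check of what "Phobos ignores that" means, the sketch below 
compares what the language gives a string with what the range traits in 
std.range.primitives report for it. The sample string is an assumption 
for illustration.

```d
import std.range.primitives : ElementType, hasLength, hasSlicing,
    isBidirectionalRange, isRandomAccessRange;
import std.traits : isNarrowString;

void main()
{
    string s = "h\u00E9llo"; // "héllo"

    // The language itself offers length, indexing and slicing,
    // all in terms of code units.
    assert(s.length == 6);
    assert(s[0] == 'h');
    assert(s[3 .. $] == "llo");

    // Phobos range traits treat narrow strings as a special case:
    // a bidirectional range of dchar without length, slicing or
    // random access.
    static assert(isNarrowString!string);
    static assert(is(ElementType!string == dchar));
    static assert(isBidirectionalRange!string);
    static assert(!hasLength!string);
    static assert(!hasSlicing!string);
    static assert(!isRandomAccessRange!string);
}
```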

I'll try using byCodeUnit and see how it fares.

-Steve

