More fun with autodecoding
Steven Schveighoffer
schveiguy at gmail.com
Wed Aug 8 21:01:18 UTC 2018
On 8/8/18 4:13 PM, Walter Bright wrote:
> On 8/6/2018 6:57 AM, Steven Schveighoffer wrote:
>> But I'm not sure if the performance is going to be the same, since now
>> it will likely FORCE autodecoding on the algorithms that have
>> specialized versions to AVOID autodecoding (I think).
>
> Autodecoding is expensive which is why the algorithms defeat it. Nearly
> none actually need it.
>
> You can get decoding if needed by using .byDchar or .by!dchar (forgot
> which it was).
There are byCodePoint and byCodeUnit; byCodePoint forces
auto-decoding.
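
A minimal sketch of the difference, assuming current Phobos behaviour.
The helper names codeUnits and codePoints are hypothetical, and
std.utf.byDchar is used here as the explicit decoding adaptor:

```d
import std.range.primitives : ElementType, walkLength;
import std.traits : Unqual;
import std.utf : byCodeUnit, byDchar;

// Number of UTF-8 code units; no decoding happens here.
size_t codeUnits(string s)
{
    return s.byCodeUnit.length;
}

// Number of code points; every element is decoded on the way.
size_t codePoints(string s)
{
    return s.byDchar.walkLength;
}

void main()
{
    // Iterated as a range, a string yields decoded dchars (auto-decoding).
    static assert(is(ElementType!string == dchar));
    // byCodeUnit yields the raw chars instead.
    static assert(is(Unqual!(ElementType!(typeof("".byCodeUnit))) == char));
    // byDchar asks for decoding explicitly.
    static assert(is(Unqual!(ElementType!(typeof("".byDchar))) == dchar));

    string s = "h\u00E9llo"; // U+00E9 takes two code units in UTF-8
    assert(codeUnits(s) == 6);
    assert(codePoints(s) == 5);
}
```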
The problem is, I want to use this wrapper just like it was a string in
all respects (including the performance gains had by ignoring
auto-decoding).
Not trying to give too much away about the library I'm writing, but the
problem I'm trying to solve is parsing out tokens from a buffer. I want
to delineate the whole, as well as the parts, but it's difficult to get
back to the original buffer once you split and slice up the buffer using
phobos functions.
Consider that you are searching for something in a buffer. Phobos
provides all you need to narrow down your range to the thing you are
looking for. But it doesn't give you a way to figure out where you are
in the whole buffer.
Up till now, I've done it by weird length math, but it gets tiring (see
for instance:
https://github.com/schveiguy/fastaq/blob/master/source/fasta/fasta.d#L125).
I just want to know where the darned thing I've narrowed down is in the
original range!
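
Roughly, the length math looks like the sketch below. This is an
illustration only, not the code from the linked file; the buffer
contents and the helper name offsetIn are made up for the example:

```d
import std.algorithm.searching : find;

// Hypothetical helper: offset of `part` inside `whole`, assuming `part`
// really is a slice of `whole` (checked by the assert).
size_t offsetIn(const(char)[] whole, const(char)[] part)
{
    assert(part.ptr >= whole.ptr
        && part.ptr + part.length <= whole.ptr + whole.length);
    return cast(size_t)(part.ptr - whole.ptr);
}

void main()
{
    string buf = ">seq1\nACGT\n>seq2\nTTGA\n";

    // find narrows the range down, but only returns the remainder.
    auto rest = buf.find("\n>");
    assert(rest.length > 0);

    // The "weird length math": whatever was consumed is the offset.
    size_t byLength = buf.length - rest.length;

    // Same answer via pointer arithmetic on the two slices.
    size_t byPointer = offsetIn(buf, rest);

    assert(byLength == byPointer);
    assert(byLength == 10);
}
```

Both variants only work while the narrowed-down thing is still a plain
slice of the original array, which stops being true as soon as it is
wrapped in another range type.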
So I thought this wrapper would let you use things the way you always
do, but at any point extract a piece of information (a buffer
reference) that shows where you are in the original buffer. That part
is quite easy to do; the problem is getting it to be a drop-in
replacement for the original type.
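
A rough sketch of the idea, under the assumption that the wrapper
iterates code units. Tracked, BufRef and bufRef are hypothetical names
for illustration, not the types from the library being written:

```d
import std.algorithm.searching : find;

// Hypothetical "buffer reference": where a piece sits in the original.
struct BufRef
{
    size_t pos;
    size_t length;
}

// Hypothetical wrapper: a window into a buffer that remembers its origin.
struct Tracked
{
    string window;
    size_t origin; // offset of window[0] in the original buffer

    // Range primitives over code units (no decoding).
    @property bool empty() const { return window.length == 0; }
    @property char front() const { return window[0]; }
    void popFront() { window = window[1 .. $]; ++origin; }
    @property Tracked save() { return this; }

    // The string-like parts: length, indexing, slicing.
    @property size_t length() const { return window.length; }
    size_t opDollar() const { return window.length; }
    char opIndex(size_t i) const { return window[i]; }
    Tracked opSlice(size_t a, size_t b)
    {
        return Tracked(window[a .. b], origin + a);
    }

    // Extract the position information at any point.
    BufRef bufRef() const { return BufRef(origin, window.length); }
}

void main()
{
    string buf = "name=value";
    auto hit = Tracked(buf, 0).find('=');

    auto r = hit.bufRef;
    assert(r == BufRef(4, 6));
    assert(buf[r.pos .. r.pos + r.length] == "=value");
}
```

Tracking the position is indeed the easy part. In this sketch find
goes through the generic element-by-element path rather than any
string-specific one, which is the drop-in problem.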
Here's where I'm struggling -- a string provides indexing, slicing,
length, etc., but Phobos ignores all of that when it treats a string
as a range. I can't make a new type that gets the same treatment. Not
only that, but I'm finding the specializations of algorithms only work
on the type "string", and nothing else.
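
The range traits show both halves of this. The checks below are a
sketch against current Phobos; Wrapped is a hypothetical stand-in for
any user-defined string-like type:

```d
import std.range.primitives : ElementType, hasLength, hasSlicing,
    isRandomAccessRange;
import std.traits : isNarrowString, isSomeString;

// Hypothetical wrapper that forwards everything to a string.
struct Wrapped
{
    string data;
    alias data this;
}

// A string has length, indexing and slicing as an array...
static assert(__traits(compiles, (string s) => s.length));
static assert(__traits(compiles, (string s) => s[0]));
static assert(__traits(compiles, (string s) => s[1 .. $]));

// ...but as a range, Phobos reports none of them.
static assert(!hasLength!string);
static assert(!hasSlicing!string);
static assert(!isRandomAccessRange!string);
static assert(is(ElementType!string == dchar));

// The traits that string specializations are gated on reject the
// wrapper, even though it converts implicitly to string.
static assert(isNarrowString!string);
static assert(!isNarrowString!Wrapped);
static assert(!isSomeString!Wrapped);

void main() {}
```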
I'll try using byCodeUnit and see how it fares.
-Steve