More fun with autodecoding

Steven Schveighoffer schveiguy at gmail.com
Mon Sep 10 12:21:36 UTC 2018


On 9/10/18 1:45 AM, Chris wrote:

> After a while your code will be cluttered with absurd stuff like this. 
> `.byCodeUnit`, `.byGrapheme`, `.array` etc. Due to my experience with 
> `splitter` et al. I tried to create my own parser to have better 
> control over every step.

I considered that, but I'm still trying to make this buffer reference 
thing work. Phobos just needs to be fixed. This is actually not as 
hopeless as I once thought. But what needs to happen is that all of 
Phobos's algorithms need to be tested with byCodeUnit et al.
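
To illustrate what such testing has to cover, here is a minimal sketch, 
assuming a current Phobos, of how the same string looks to range 
algorithms with and without autodecoding. The sample string is made up 
for illustration; whether any particular Phobos algorithm accepts the 
byCodeUnit range is exactly what would need checking case by case.

```d
import std.range.primitives : ElementType, walkLength;
import std.traits : Unqual;
import std.utf : byCodeUnit;

void main()
{
    // U+00E9 is one code point, but two UTF-8 code units.
    string s = "\u00E9,b";

    // Autodecoding: as a range, a string yields decoded dchar elements.
    static assert(is(ElementType!string == dchar));

    // byCodeUnit: the wrapper yields the raw char code units instead.
    static assert(is(Unqual!(ElementType!(typeof(s.byCodeUnit))) == char));

    assert(s.walkLength == 3);            // counted in code points
    assert(s.byCodeUnit.walkLength == 4); // counted in code units
    assert(s.length == 4);                // array length is code units
}
```

An algorithm that is only ever exercised with plain strings sees the 
dchar view, so a code path that mishandles the char view can go 
unnoticed until someone passes in a byCodeUnit range.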

> After a few *minutes* of testing things I ran 
> into this bug [1] that didn't get fixed till early 2018. I never started 
> to write my own step-by-step parser. I'm glad I didn't.

It actually was fixed accidentally in 2017 in this PR: 
https://github.com/dlang/druntime/pull/1952. The bug was closed in 2018 
when someone noticed the code no longer failed.

Essentially, the whole string switch algorithm was replaced with a 
completely rewritten, better approach. This is a great example of why we 
should be moving more of the compiler magic into the library -- it's 
just easier to write and understand there.

> I wish people began to realize that string handling is a basic necessity 
> and that the correct handling of strings is of utmost importance. Please 
> keep us updated on how things work out (or not) for you.

Absolutely, D needs to have great support for string parsing and 
manipulation. The potential is awesome.

I will keep at it. What I'm trying to fix is the fact that using 
std.algorithm to extract pieces from a buffer, and then using the 
positions of those pieces in the buffer to determine things (i.e. 
parsing), is really difficult without some stupid requirements like 
pointer math.
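
For context, a minimal sketch of the kind of pointer math meant here, 
assuming the pieces are plain slices of the original buffer. The helper 
name `offsetIn` and the sample input are hypothetical, made up for this 
illustration; they are not Phobos APIs.

```d
import std.algorithm.iteration : splitter;

// Hypothetical helper: recover where a slice starts inside the buffer
// it was cut from. Only valid if `piece` really is a slice of `buf`.
size_t offsetIn(const(char)[] piece, const(char)[] buf) @trusted
{
    assert(piece.ptr >= buf.ptr
        && piece.ptr + piece.length <= buf.ptr + buf.length);
    return piece.ptr - buf.ptr;
}

void main()
{
    string buf = "key=value;next=1";
    size_t[] offsets;

    // splitter on a string yields slices of buf, but not their positions.
    foreach (piece; buf.splitter(';'))
        offsets ~= piece.offsetIn(buf);

    assert(offsets == [0, 10]);
}
```

This only works because splitter happens to hand back slices of the 
original array. Once the pieces come from a wrapped or lazily 
transformed range, there is no pointer to subtract, and the position 
information is gone.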

> [Please, nobody answer my post pointing out that a) we don't understand 
> Unicode and b) that it's an insult to the Universe to draw attention to 
> flaws that keep pestering us on an almost daily basis - without trying 
> to fix them ourselves stante pede. As is clear from Steve's efforts, the 
> Universe doesn't seem to care.]

I don't characterize it as the universe not caring. Phobos has a legacy 
problem with string handling, and it needs to somehow be addressed -- 
either by painfully extracting the problem, or painfully working around 
it. I don't think anyone here thinks there isn't a problem or that it's 
insulting to bring it up. But anything that needs to be done is painful 
either way, which is why it's not happening very fast.

-Steve


More information about the Digitalmars-d mailing list