Of possible interest: fast UTF8 validation

Jack Stouffer jack at jackstouffer.com
Wed May 16 18:04:54 UTC 2018


On Wednesday, 16 May 2018 at 17:18:06 UTC, Joakim wrote:
> I think you know what I'm referring to, which is that UTF-8 is 
> a badly designed format, not that input validation shouldn't be 
> done.

UTF-8 seems like the best option available given the problem 
space.

Junk data is going to be a problem with any possible string 
format, given that encoding translations and programmer error 
will always be prevalent.

