dmd foreach loops throw exceptions on invalid UTF sequences, use replacementDchar instead
Guillaume Piolat
first.last at gmail.com
Wed Nov 10 11:47:16 UTC 2021
On Wednesday, 10 November 2021 at 10:23:31 UTC, Ola Fosheim
Grøstad wrote:
> On Friday, 5 November 2021 at 10:13:13 UTC, Guillaume Piolat
> wrote:
>> Well, you only know that it is meant to be UTF-8 in the context
>> of the auto-decoding foreach (which must still exist). A string
>> in an actual program may contain binary file data, or text in
>> other codepage encodings.
>
> I had a look at the [documentation](
> https://dlang.org/spec/arrays.html#strings ) today, and it said:
>
> «char[] strings are in UTF-8 format.»
>
> I would assume that this is normative? Maybe change the
> documentation to use more forceful specification language so
> that it says: «char[] strings MUST be in UTF-8 format.»

I'm not sure what is intended.
import("file.stuff") yields a string.
So there is at least one gap, as it is often used with binary
files that are not UTF-8.

Also look at this signature:
https://dlang.org/phobos/std_utf.html#validate
validate returns void and throws a UTFException on invalid input,
so if the spec were normative it could never throw.

It seems that in practice a string doesn't have to be UTF-8 until
you use something that assumes it is. Which is OK for me.
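
To make the point above concrete, here is a minimal sketch (not taken from the spec or from Phobos documentation examples). The helper name isWellFormed is hypothetical; validate, UTFException, byDchar and replacementDchar are the std.utf symbols. It assumes a reasonably recent Phobos, where byDchar substitutes replacementDchar for invalid sequences instead of throwing.

```d
import std.algorithm.searching : canFind;
import std.utf : byDchar, replacementDchar, UTFException, validate;

/// Hypothetical helper: true if `s` is well-formed UTF-8.
bool isWellFormed(const(char)[] s)
{
    try
    {
        validate(s); // returns void; throws UTFException on invalid input
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

void main()
{
    // 0xFF can never occur in well-formed UTF-8, yet the cast compiles
    // and the result has type string.
    immutable(ubyte)[] raw = [0x61, 0xFF, 0x62];
    string s = cast(string) raw;

    // Length and slicing work on code units and never decode.
    assert(s.length == 3);
    assert(s[0 .. 1] == "a");

    // Only code that assumes UTF-8 notices the problem.
    assert(!isWellFormed(s));
    assert(isWellFormed("Grøstad"));

    // byDchar decodes lazily and yields U+FFFD for the bad byte.
    assert(s.byDchar.canFind(replacementDchar));
}
```

Nothing here fails until validate or a decoding range looks at the bytes, which is the "until you use something that assumes it is" behaviour described above.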