dmd foreach loops throw exceptions on invalid UTF sequences, use replacementDchar instead
jfondren
julian.fondren at gmail.com
Sun Nov 7 18:44:45 UTC 2021
On Sunday, 7 November 2021 at 02:12:36 UTC, zjh wrote:
> On Sunday, 7 November 2021 at 01:59:47 UTC, jfondren wrote:
>> On Sunday, 7 November 2021 at 01:12:19 UTC, zjh wrote:
>
> Rust has more than ten `kinds` of strings. Maybe we can add
> `2/3` one.
Meanwhile, in Rust:
```rust
#[cfg(test)]
mod tests {
    fn type_of<T>(_: T) -> &'static str {
        core::any::type_name::<T>()
    }

    const INVALID: &'static str = unsafe {
        std::str::from_utf8_unchecked(&[
            0x68, 0x65, 0x6c, 0x6c, 0x6f, 0xa7, 0x85, 0xaf, 0x74,
            0x68, 0x65, 0x72, 0x65,
        ])
    };

    #[test]
    fn iter_invalid() {
        for c in INVALID.chars() {
            println!("{} {}, {}", type_of(c), c as u32, c);
        }
    }
}
```
If you smuggle invalid UTF into a type that Rust expects to be
valid UTF (the same case as `string` in D, allegedly), then
Rust's equivalent of `foreach (dchar c; str) { }` just emits
invalid chars -- two of 'em, somehow.
104, 101, 108, 108, 111 - "hello"
453, 1012 - ???
104, 101, 114, 101 - "here" (the 't' is lost)
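The two odd values are what you'd expect from a decoder that trusts its input. Here is a minimal sketch of that arithmetic, assuming any byte of 0x80 or above is taken as a lead byte and whatever follows is taken as a continuation byte. `decode_trusting` is a hypothetical helper written for illustration, not the standard library's code. Feeding it `[0xa7, 0x85, 0xaf, 0x74]` pairs 0xa7 with 0x85 and 0xaf with 0x74, which is where the 't' goes.

```rust
/// Hypothetical helper: decode bytes the way a decoder that assumes valid
/// UTF-8 might, with no validation of lead or continuation bytes.
fn decode_trusting(bytes: &[u8]) -> Vec<u32> {
    let mut out = Vec::new();
    let mut it = bytes.iter().copied();
    while let Some(x) = it.next() {
        if x < 0x80 {
            // ASCII: the byte is the code point.
            out.push(x as u32);
            continue;
        }
        // Assume at least a 2-byte sequence: low 5 bits of the lead byte,
        // low 6 bits of whatever byte comes next.
        let y = it.next().unwrap_or(0);
        let mut ch = ((x & 0x1f) as u32) << 6 | (y & 0x3f) as u32;
        if x >= 0xe0 {
            // Lead byte claims 3 bytes: consume one more.
            let z = it.next().unwrap_or(0);
            let y_z = ((y & 0x3f) as u32) << 6 | (z & 0x3f) as u32;
            ch = ((x & 0x0f) as u32) << 12 | y_z;
            if x >= 0xf0 {
                // Lead byte claims 4 bytes: consume one more.
                let w = it.next().unwrap_or(0);
                ch = ((x & 0x07) as u32) << 18 | y_z << 6 | (w & 0x3f) as u32;
            }
        }
        out.push(ch);
    }
    out
}
```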
This is similar to `foreach (dchar c;
std.encoding.codePoints(str)) { }` which emits three dchars
between "hello" and "there", but which also has an assert failure
in non-release builds.
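For comparison, the replacement-character behavior from the subject line is available in Rust if you start from bytes instead of smuggling them into a `&str`. A small sketch using `String::from_utf8_lossy`, which substitutes U+FFFD for invalid sequences; `lossy_code_points` is a hypothetical wrapper name used only here. On the same bytes it should give three U+FFFD between "hello" and "there", and the 't' survives.

```rust
/// Hypothetical helper: decode possibly-invalid UTF-8 bytes into code
/// points, with U+FFFD standing in for each invalid sequence.
fn lossy_code_points(bytes: &[u8]) -> Vec<u32> {
    String::from_utf8_lossy(bytes)
        .chars()
        .map(|c| c as u32)
        .collect()
}
```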