[Issue 18462] New: std.regex.matchFirst doesn't work well with characters from extended ASCII
d-bugmail at puremagic.com
Mon Feb 19 01:38:27 UTC 2018
https://issues.dlang.org/show_bug.cgi?id=18462
Issue ID: 18462
Summary: std.regex.matchFirst doesn't work well with characters from extended ASCII
Product: D
Version: D2
Hardware: All
OS: All
Status: NEW
Severity: enhancement
Priority: P1
Component: phobos
Assignee: nobody at puremagic.com
Reporter: greensunny12 at gmail.com
---
void main(string[] args)
{
    import std.string, std.stdio, std.regex;
    static ctr = regex(`^`);
    // Unicode (valid UTF-8) works
    string line = "µ";
    line.representation.writeln; // [194, 181]
    // but a single extended-ASCII byte does not
    line = "\xB5"; // [181]
    line.writeln; // works
    auto m = line.matchFirst(ctr); // throws std.utf.UTFException
}
---
The error message is:
```
std.utf.UTFException@/usr/include/dlang/dmd/std/utf.d(1380): Invalid UTF-8 sequence (at index 1)
----------------
??:? pure dchar std.utf.decodeImpl!(true, 0, const(char)[]).decodeImpl(ref const(char)[], ref ulong) [0x8884beda]
??:? pure @trusted dchar std.utf.decode!(0, const(char)[]).decode(ref const(char)[], ref ulong) [0x8884be5d]
??:? pure @safe bool std.regex.internal.ir.Input!(char).Input.nextChar(ref dchar, ref ulong) [0x8885e318]
```
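
The trace shows the exception coming from std.utf.decode, which std.regex calls on each input character, so a lone 0xB5 byte is rejected as malformed UTF-8 before any matching happens. A possible workaround, sketched here under the assumption that the "extended ASCII" input is really Latin-1 (ISO-8859-1), is to validate the string first and re-encode it as UTF-8 when validation fails. The helper name `latin1ToUtf8` is hypothetical and not part of Phobos; this is only an illustration, not a proposed fix for the issue itself.

```d
import std.regex : matchFirst, regex;
import std.string : representation;
import std.utf : UTFException, validate;

/// Hypothetical helper (not in Phobos): treats every byte as a Latin-1
/// code point and re-encodes the sequence as UTF-8.
string latin1ToUtf8(const(ubyte)[] bytes)
{
    string result;
    foreach (b; bytes)
        result ~= cast(dchar) b; // appending a dchar to a string encodes it as UTF-8
    return result;
}

void main()
{
    auto re = regex(`^`);
    string line = "\xB5"; // lone 0xB5 byte, not valid UTF-8

    try
    {
        validate(line); // throws UTFException on malformed UTF-8
    }
    catch (UTFException e)
    {
        // Assumption: the malformed input is Latin-1 text.
        line = latin1ToUtf8(line.representation);
    }

    assert(line == "\u00B5"); // now the two-byte UTF-8 encoding of 'µ'
    auto m = line.matchFirst(re); // input is valid UTF-8, so decoding succeeds
}
```

Whether the bytes are actually Latin-1 cannot be determined from the string alone, so the caller has to know the source encoding; other 8-bit code pages would need a different mapping.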