Regex and utf8

Walter Bright newshound1 at digitalmars.com
Sun Jul 20 12:45:34 PDT 2008


Roman Balitskiy wrote:
> When I try to parse Cyrillic text I get "Error: 4invalid UTF-8 sequence". I use dmd 1.030 on Ubuntu 8.04 with a UTF-8 locale. I have tried the upcoming gdc 0.25 with the same results.
> 
> 	if (auto m = std.regexp.search(`abжdef`, `[ж]`))   // ж is the Cyrillic letter 'zhe'
> 		writefln("%s[%s]%s", m.pre, m.match(0), m.post);
> 


Back quotes delimit wysiwyg strings, and the UTF translation doesn't
happen for them. Try using double-quoted ("") strings instead.
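
As a rough illustration of that suggestion, here is the snippet from the quoted message with both literals switched to double-quoted strings. This is only a sketch: it assumes the D1 Phobos std.regexp module the original poster was using, the imports and the main wrapper are added here for completeness, and it has not been checked against dmd 1.030 or gdc 0.25.

```d
import std.regexp;
import std.stdio;

void main()
{
    // Same search as in the quoted message, but with double-quoted ("")
    // literals in place of the back-quoted wysiwyg strings.
    if (auto m = std.regexp.search("abжdef", "[ж]"))
        writefln("%s[%s]%s", m.pre, m.match(0), m.post);
}
```

Note that escape sequences are processed in double-quoted strings, so a pattern containing a backslash would need it doubled (for example "\\d" rather than `\d`).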


