Regex and UTF-8
Andrea Fontana
advmail at katamail.com
Fri Nov 18 05:58:31 PST 2011
I built a data access layer in C++. This layer works with MongoDB, where
strings are always encoded as UTF-8. I've ported this layer to D using
SWIG. Strings are written correctly to the console, but when I use std.regex
it sometimes throws an exception:
core.exception.UnicodeException@src/rt/util/utf.d(290): invalid UTF-8 sequence
The byte sequence (for better understanding) is:
[83, 195, 179, 32]
And the string was "Sò " (with an accented o and a space).
I'm not a UTF expert, so is this an invalid UTF-8 encoding, or is it a bug in
utf.d?
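
As a quick sanity check that does not depend on D, the byte sequence can be run through a strict UTF-8 decoder. The sketch below uses Python rather than D purely for illustration. The helper name is_valid_utf8 is hypothetical, and the truncated sequence at the end is an invented contrast case, not something taken from the failing program.

```python
# Byte sequence quoted in the message above.
raw = bytes([83, 195, 179, 32])


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False


# Decode the bytes and list the resulting code points.
decoded = raw.decode("utf-8")
code_points = [f"U+{ord(ch):04X}" for ch in decoded]

print(is_valid_utf8(raw))   # True
print(code_points)          # ['U+0053', 'U+00F3', 'U+0020']

# Contrast case (an assumption for illustration only): the same data cut
# off in the middle of the two-byte character is no longer valid UTF-8.
truncated = raw[:2]         # bytes 83, 195
print(is_valid_utf8(truncated))  # False
```

Under these assumptions the four bytes form a well-formed UTF-8 sequence: "S", then the two-byte encoding of U+00F3 (o with acute accent), then a space. A sequence cut in the middle of a multi-byte character, as in the contrast case, is one way a decoder could end up reporting an invalid sequence.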