[phobos] UTF-8 string slicing

Benjamin Shropshire benjamin at precisionsoftware.us
Fri Aug 19 19:58:34 PDT 2011


On 08/18/2011 02:21 AM, unDEFER wrote:
> Hello!
>
> The D language specification says that it supports UTF-8 strings, but I
> can't find how to slice a UTF-8 string by character index rather than by
> byte offset. Why is there no simple slice function in std.utf like the
> attached code?

BTW: your code is flawed. Feed it some of the stuff near the end of this 
post and it will fail:

http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454

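tl;dr: your code doesn't slice on characters but on what are called (IIRC)
code points. A combining diacritic is its own code point, so one
user-visible character can span several of them. If you start worrying
about diacritics (and many end users will want you to), you need to do a
lot more processing.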

http://en.wikipedia.org/wiki/Diacritic
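
To make the code point vs. character distinction concrete, here is a
minimal sketch. It assumes a Phobos release that provides
std.uni.byGrapheme and std.utf.toUTFindex; the helper names
sliceCodePoints and sliceGraphemes are made up for illustration and are
not part of std.utf or of the attached code.

```d
import std.array : array;
import std.conv : to;
import std.range : drop, take, walkLength;
import std.uni : byCodePoint, byGrapheme;
import std.utf : toUTFindex;

/// Hypothetical helper: slice by code point index, half-open [begin, end).
/// Assumes begin <= end <= number of code points in s.
string sliceCodePoints(string s, size_t begin, size_t end)
{
    // toUTFindex maps a code point index to a byte (code unit) index.
    return s[s.toUTFindex(begin) .. s.toUTFindex(end)];
}

/// Hypothetical helper: slice by grapheme cluster index, [begin, end).
/// Assumes begin <= end.
string sliceGraphemes(string s, size_t begin, size_t end)
{
    return s.byGrapheme          // group code points into graphemes
            .drop(begin)
            .take(end - begin)
            .byCodePoint         // flatten back to a range of dchar
            .array               // dchar[]
            .to!string;          // re-encode as UTF-8
}

void main()
{
    // "noël" spelled with 'e' followed by U+0308 COMBINING DIAERESIS.
    string s = "noe\u0308l";

    assert(s.length == 6);                // UTF-8 code units (bytes)
    assert(s.walkLength == 5);            // code points
    assert(s.byGrapheme.walkLength == 4); // user-visible characters

    // Slicing by code points cuts the diaeresis off its base letter.
    assert(sliceCodePoints(s, 0, 3) == "noe");

    // Slicing by graphemes keeps the base letter and the mark together.
    assert(sliceGraphemes(s, 0, 3) == "noe\u0308");
}
```

Note that the grapheme version allocates and walks the string from the
start, so it is O(n) per slice; that cost is part of the "more
processing" mentioned above.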

> Thank you in advance.
