Unicode handling comparison
Simen Kjærås
simen.kjaras at gmail.com
Wed Nov 27 12:13:07 PST 2013
On 27.11.2013 19:07, Andrei Alexandrescu wrote:
> On 11/27/13 7:43 AM, Jakob Ovrum wrote:
>> On that note, I tried to use std.uni to write a simple example of how to
>> correctly handle this in D, but it became apparent that std.uni should
>> expose something like `byGrapheme` which lazily transforms a range of
>> code points to a range of graphemes (probably needs a `byCodePoint` to
>> do the converse too). The two extant grapheme functions,
>> `decodeGrapheme` and `graphemeStride`, are *awful* for string
>> manipulation (granted, they are probably perfect for text rendering).
>
> Yah, byGrapheme would be a great addition.
It shouldn't be hard to make, either:

import std.uni : Grapheme, decodeGrapheme;
import std.traits : isSomeString;
import std.array : empty;

struct ByGrapheme(T) if (isSomeString!T) {
    Grapheme _front;
    bool _empty;
    T _range;

    this(T value) {
        _range = value;
        popFront();
    }

    @property
    Grapheme front() {
        assert(!empty);
        return _front;
    }

    void popFront() {
        assert(!empty);
        _empty = _range.empty;
        if (!_empty) {
            _front = decodeGrapheme(_range);
        }
    }

    @property
    bool empty() {
        return _empty;
    }
}

auto byGrapheme(T)(T value) if (isSomeString!T) {
    return ByGrapheme!T(value);
}

void main() {
    import std.stdio;
    string s = "তঃঅ৩৵பஂஅபூ௩ᐁᑦᕵᙧᚠᚳᛦᛰ¥¼Ññ";
    writeln(s.byGrapheme);
}
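
To show why iterating by grapheme matters, here is a minimal sketch that counts the same string three ways: UTF-8 code units, code points, and grapheme clusters. It uses only `decodeGrapheme` from std.uni and `walkLength` from std.range; `countGraphemes` is a hypothetical helper written for this illustration, not something std.uni provides.

```d
import std.range : walkLength;
import std.uni : decodeGrapheme;

// Hypothetical helper (not part of std.uni): counts grapheme clusters
// by decoding one cluster at a time from the front of the string.
size_t countGraphemes(string s) {
    size_t n = 0;
    while (s.length != 0) {
        decodeGrapheme(s); // consumes one grapheme cluster from s
        ++n;
    }
    return n;
}

void main() {
    // 'e' followed by U+0301 COMBINING ACUTE ACCENT:
    // two code points that form a single grapheme cluster.
    string s = "e\u0301";

    assert(s.length == 3);          // UTF-8 code units
    assert(s.walkLength == 2);      // code points (strings iterate as dchar)
    assert(countGraphemes(s) == 1); // grapheme clusters
}
```

The three numbers disagree for any text with combining marks, which is the case a lazy byGrapheme range is meant to handle.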
--
Simen