How to correctly deal with Unicode strings?

Gary Willoughby dev at nomad.so
Wed Nov 27 06:34:13 PST 2013


I've just been reading this article: 
http://mortoray.com/2013/11/27/the-string-type-is-broken/ and 
wanted to test whether D behaves in the way he describes, 
i.e. Unicode strings being 'broken' because they are just arrays.

Although I understand the difference between code units and code 
points, it's not entirely clear what I need to do in D to avoid 
the situations he describes. For example:

import std.algorithm;
import std.stdio;

void main(string[] args)
{
	char[] x = "noël".dup;

	assert(x.length == 6); // Actual: .length counts UTF-8 code units.
	// assert(x.length == 4); // Expected.

	assert(x[0 .. 3] == "noe".dup); // Actual.
	// assert(x[0 .. 3] == "noë".dup); // Expected.

	x.reverse;

	assert(x == "l̈eon".dup); // Actual
	// assert(x == "lëon".dup); // Expected.
}

Here I understand what is happening, but how could I improve this 
example to make the expected asserts true?
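
For reference, here is a minimal sketch of one direction that might make 
the expected results hold: work on grapheme clusters rather than code 
units or code points. It assumes a Phobos version that provides 
std.uni.byGrapheme and std.uni.byCodePoint. The helper names 
graphemeCount, firstGraphemes and reverseGraphemes are hypothetical, 
made up for illustration, and the string is written with an explicit 
\u0308 escape on the assumption that the "ë" in the example is 
'e' followed by a combining diaeresis.

```d
import std.array : array;
import std.conv : text;
import std.range : retro, take, walkLength;
import std.uni : byCodePoint, byGrapheme;

// Hypothetical helper: number of grapheme clusters
// (user-perceived characters) in s.
size_t graphemeCount(string s)
{
	return s.byGrapheme.walkLength;
}

// Hypothetical helper: the first n grapheme clusters of s,
// re-encoded as a UTF-8 string.
string firstGraphemes(string s, size_t n)
{
	return s.byGrapheme.take(n).byCodePoint.text;
}

// Hypothetical helper: reverse s one grapheme cluster at a time,
// so a combining mark stays attached to its base character.
string reverseGraphemes(string s)
{
	return s.byGrapheme.array.retro.byCodePoint.text;
}

void main()
{
	// Assumption: 'e' + U+0308 (combining diaeresis), spelled
	// with an escape so the encoding is unambiguous.
	string s = "noe\u0308l";

	assert(s.length == 6);         // UTF-8 code units
	assert(s.walkLength == 5);     // code points
	assert(graphemeCount(s) == 4); // grapheme clusters

	assert(firstGraphemes(s, 3) == "noe\u0308");
	assert(reverseGraphemes(s) == "le\u0308on");
}
```

Note that .length and slicing on the array itself still operate on code 
units; in this sketch the grapheme-level view exists only while 
iterating through byGrapheme.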


More information about the Digitalmars-d-learn mailing list