Something goes wrong with range and sort?

Aleksandar Ruzicic aleksandar at ruzicic.info
Fri Jun 14 03:20:47 PDT 2013


On Friday, 14 June 2013 at 10:07:13 UTC, Andrea Fontana wrote:
>
> 	string[] test_filter(string[] words)
> 	{
> 		static blackList =
> 		[
> 			"d", "c", "e", "a", "è", "é", "e"
> 		].sort();
> 		
> 		return words.filter!((a) => 
> !blackList.assumeSorted.contains(a)).array;
> 	}
>
> 	test_filter(["a", "b", "test", "hello"]).writeln;
>
>
> This code crashes! It's just a useless, trimmed-down version of 
> more complex code, just to show you the bug.
>
> If you remove the accented letters from the "blackList" array, it 
> works fine. Why?

sort() requires UTF-32 input (a dstring/dchar[]); otherwise it will 
fail.

UTF-8 (i.e. the string type in D) is a variable-length encoding, so 
if such input is given to sort() it refuses to compile, since it 
cannot sort it in place (a guarantee sort() makes).
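
A minimal sketch of this point, assuming std.algorithm.sort from 
Phobos (sortedChars is a hypothetical helper name, not something from 
the quoted code): a string is rejected at compile time, while a 
dchar[] copy of the same text can be sorted in place.

```d
import std.algorithm : sort;
import std.array : array;

// Hypothetical helper: returns the code points of `s` in sorted order.
dchar[] sortedChars(string s)
{
    // A narrow (UTF-8) string is not a random-access range,
    // so passing it to sort() directly does not compile.
    static assert(!__traits(compiles, sort(s)));

    // std.array.array decodes a narrow string into a dchar[] (UTF-32),
    // where every element has the same width and can be swapped in place.
    dchar[] chars = s.array;
    sort(chars);
    return chars;
}

void main()
{
    // Code points are ordered numerically, so 'è' (U+00E8) sorts after 'd'.
    assert(sortedChars("dcèa") == "acdè"d);
}
```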

My guess is that when UTF-8 input containing only 1-byte characters 
is given to sort(), it can detect that there are no variable-length 
characters, which is probably why your code works when you remove 
the non-ASCII characters.
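
Note that the quoted blackList is a string[] rather than a single 
string, so sorting it swaps whole elements. As a rough sketch only, 
assuming std.algorithm.sort is called at run time in place of the 
static initializer from the quoted code (testFilter is an 
illustrative name), filtering with accented entries might look like 
this:

```d
import std.algorithm : filter, sort;
import std.array : array;

// Illustrative variant of the quoted test_filter; not the original code.
string[] testFilter(string[] words)
{
    auto blackList = ["d", "c", "e", "a", "è", "é", "e"];

    // sort() reorders the string elements in place and returns a
    // SortedRange, which offers a binary-search based contains().
    auto sorted = sort(blackList);

    // Keep only the words that are not on the blacklist.
    return words.filter!(a => !sorted.contains(a)).array;
}

void main()
{
    assert(testFilter(["a", "b", "test", "hello"]) == ["b", "test", "hello"]);
    assert(testFilter(["è", "x"]) == ["x"]);
}
```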


More information about the Digitalmars-d mailing list