Table of strings sorting problem

S. Chancellor dnewsgr at mephit.kicks-ass.org
Fri Mar 10 18:30:12 PST 2006


On 2006-03-10 17:20:35 -0800, Aarti <aarti at interia.pl> said:

> Hello all D-Fans!
> 
> I encountered a problem with string sorting according to Polish 
> language rules. Here is a simple test program:
> 
> // ----------------------------------
> import std.stdio;
> void main() {
> 	char[][] table;
> 	table.length=15;
> 	
> 	table[0]="ą";
> 	table[1]="a";
> 	table[2]="ć";
> 	table[3]="c";
> 	table[4]="ę";
> 	table[5]="e";
> 	table[6]="ł";
> 	table[7]="l";
> 	table[8]="ó";
> 	table[9]="o";
> 	table[10]="ś";
> 	table[11]="s";
> 	table[12]="ź";
> 	table[13]="ż";
> 	table[14]="z";
> 
> 	table.sort;
> 
> 	foreach(char[] s; table) {
> 		writef(s);
> 	}
> 	writefln();
> }
> // ----------------------------------
> 
> Output of this test is:
> aceloszóąćęłśźż
> 
> when it should be:
> aącćeęlłoósśzźż
> 
> It looks like sort doesn't sort properly according to language rules.
> 
> Is it a known issue? How to sort strings in D according to language rules?
> 
> PS. Possibility of using Polish characters in class identifiers is for 
> me really cool. In C++ books in examples you can see all the time 
> Trojkat instead of Trójkąt (triangle) and it looks awful.
> 
> Regards
> Marcin Kuszczak

The built-in sort compares strings by the binary values of their
characters, so every accented letter ends up after plain a-z.  To sort
by Polish rules you would have to implement the ordering yourself:
define a map from each character to its position in the Polish
alphabet and sort on that.  I'm not sure whether the sort property
takes a delegate; that was proposed before.  It is more or less a
coincidence that the unaccented Latin letters fall in alphabetical
order numerically.  (Whoever assigned the ASCII character values
probably did that on purpose, though.)
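
A minimal sketch of the mapping approach described above. It assumes a
present-day D compiler with Phobos, where std.algorithm.sort accepts a
custom predicate; it does not use the built-in .sort property from the
test program. The names polishAlphabet, rank and polishLess are made up
for illustration, and the table covers lowercase letters only.

```d
import std.algorithm : cmp, map, sort;
import std.array : join;
import std.stdio : writeln;
import std.utf : byDchar;

// Assumption: the 32 lowercase letters of the Polish alphabet, in order.
immutable dstring polishAlphabet = "aąbcćdeęfghijklłmnńoóprsśtuwyzźż"d;

// Map a character to its sort position. Characters missing from the
// table are placed after the whole alphabet, ordered by code point.
size_t rank(dchar c)
{
    foreach (i, dchar letter; polishAlphabet)
        if (letter == c)
            return i;
    return polishAlphabet.length + c;
}

// "Less than" for two UTF-8 strings: decode both to code points and
// compare the rank sequences lexicographically.
bool polishLess(const(char)[] a, const(char)[] b)
{
    return cmp(a.byDchar.map!rank, b.byDchar.map!rank) < 0;
}

void main()
{
    string[] table = ["ą", "a", "ć", "c", "ę", "e", "ł", "l",
                      "ó", "o", "ś", "s", "ź", "ż", "z"];
    sort!polishLess(table);
    writeln(table.join); // aącćeęlłoósśzźż
}
```

The linear scan in rank is fine for a 32-letter table; a lookup array
or associative array would be the obvious swap for larger tables. Full
language-aware collation (uppercase, digraphs, other locales) is beyond
this sketch.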

-S.




More information about the Digitalmars-d mailing list