How to reverse char[]?
Jonathan M Davis
jmdavisProg at gmx.com
Wed Feb 8 10:10:21 PST 2012
On Wednesday, February 08, 2012 17:52:17 Manfred Nowak wrote:
> Jonathan M Davis wrote:
> > thanks to how unicode works
>
> This does not mean, that the data structure representing a sequence of
> "letters" has to follow exactly the "working" you cited above. That
> data structure must only enable it efficiently. If a requirement for
> sequences of letters is, that a sequence `s' of letters indexed by some
> natural number `n' gives the letter `s[n]' and that is not efficiently
> possible, then Unicode and its "workings" are as maldesigned as the
> alphabet Gutenberg has to take to produce books:
>
> take randomly an ancient book `b' and randomly a letter `c'. Then try
> to verify that `b[ 314.159] == c'. Of course you are allowed to read
> only one letter.
It is impossible to have a random-access range of characters with Unicode
unless you have a range of graphemes. That would require a grapheme to be a
struct of some kind which represents a character, or else the range would
have to be an array of arrays. So, you could have
char[][]
where each char[] is a grapheme. But as long as you're dealing with an array
of code units or code points like we do now, it's impossible to have efficient
random access to characters. Phobos currently takes the tack of treating a
code point as a character, which _mostly_ works, but it's not correct.
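To make the three levels concrete, here is a minimal sketch in D. It assumes a
Phobos recent enough to provide std.uni.byGrapheme and std.uni.byCodePoint,
which are newer than this post. The helper names graphemeArray and
reverseGraphemes are made up for illustration and are not part of Phobos.

```d
import std.array : array;
import std.conv : text;
import std.range : retro, walkLength;
import std.uni : byCodePoint, byGrapheme, Grapheme;

// Hypothetical helper: segment the string once into an array of graphemes.
// Building the array is O(n); indexing it afterwards is O(1).
Grapheme[] graphemeArray(string s)
{
    return s.byGrapheme.array;
}

// Hypothetical helper: reverse grapheme by grapheme, so that combining
// code points stay attached to the code point they modify.
string reverseGraphemes(string s)
{
    return s.byGrapheme.array.retro.byCodePoint.text;
}

void main()
{
    string s = "noe\u0308l"; // "noël" written as 'e' + combining diaeresis

    assert(s.length == 6);                // UTF-8 code units
    assert(s.walkLength == 5);            // code points
    assert(s.byGrapheme.walkLength == 4); // graphemes

    auto g = graphemeArray(s);
    assert(g[2].length == 2);             // 'e' plus U+0308, two code points

    assert(reverseGraphemes(s) == "le\u0308on");
}
```

Reversing by code unit or by code point would move the diaeresis onto a
different letter; the array of graphemes buys correct random access at the
cost of an up-front O(n) segmentation pass and extra memory.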
And while Unicode could definitely have been designed better IMHO (e.g. by
forcing a fixed order for modifying code points and by _not_ having multiple
ways to generate the same character), the core problem is that you're forced
to have variable-length encodings. It wouldn't be feasible to have an integral
value which represented _every_ single character, because of the combinatorial
explosion caused by code points which modify other code points (e.g.
subscript, superscript, cedilla, etc.). So, there are problems which are just
inherent to designing Unicode and which cannot be avoided no matter how good a
job you do. And, of course, there are issues with the design on top of that.
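As an illustration of the "multiple ways to generate the same character"
problem, here is a minimal sketch. It again assumes a Phobos version that
provides std.uni.normalize, which is newer than this post, and the helper
sameCharacters is a made-up name used only for this example.

```d
import std.uni : normalize, NFC, NFD;

// Hypothetical helper: compare two strings after bringing both to the
// same normalization form (NFC here; NFD would work equally well).
bool sameCharacters(string a, string b)
{
    return normalize!NFC(a) == normalize!NFC(b);
}

void main()
{
    string precomposed = "\u00E9";  // 'é' as a single code point
    string combined    = "e\u0301"; // 'e' followed by a combining acute accent

    // Same character on screen, different code point sequences.
    assert(precomposed != combined);

    // Normalization maps one spelling onto the other.
    assert(normalize!NFC(combined) == precomposed);
    assert(normalize!NFD(precomposed) == combined);

    assert(sameCharacters(precomposed, combined));
}
```

Normalization only helps with comparison. It does not remove the need for
variable-length sequences, since many combinations have no precomposed form.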
- Jonathan M Davis
More information about the Digitalmars-d-learn
mailing list