Sorting char arrays
Magnus Lie Hetland
magnus at hetland.org
Tue Mar 13 02:13:23 PDT 2012
On 2012-03-12 17:24:56 +0000, Jonathan M Davis said:
> The problem is that sort requires a random access range, and narrow string
> (string and wstring) aren't, because they're variable length encoded. I'm not
> sure that strings _can_ be efficiently sorted, because of that, but maybe
> there's a sorting algorthm that could do it reasonably efficiently,
I'd certainly think that'd be posible. (Might, in fact, be a nice
problem for the students in my algorithms class ;)
I guess I'm just tripped up by the fact that char[] is treated as an
"almost-string", and hence is "infected" by the variable-length
encoding of string (i.e., const char[]).
This all makes sense, for sure -- at least as a practical tradeoff or
the like. (After all, UTF-8 is a super-elegant solution to a very
difficult problem.) It's just so easy to assume that T[] is a
random-access range. It seems so *obvious*. And it is true for any type
T except the variable-length chars... :)
In my case, I was just using char literals (and strings) as an easy way
of encoding test cases for a template class, and the sorting (+ uniq)
was just a way of removing duplicates. (Could've used a hash, of
course.) So I was really just treating it as a ubyte[]. Slightly
Evil[tm], I guess.
> and we could
> special case sort for narrow strings to use that one, but it's a while since I
> messed with sorting algorithms, so I don't remember all of their
> characteristics off of the top of my head. Certainly, with how sort is currenly
> implemented, it can't handle any range which isn't a random access range.
No, I get that. I was just assuming that any T[] could be treated as a
random access range if I really wanted to ;)
--
Magnus Lie Hetland
http://hetland.org
More information about the Digitalmars-d-learn
mailing list