Sort characters in string

Jonathan M Davis newsgroup.d at jmdavisprog.com
Wed Dec 6 09:44:30 UTC 2017


On Wednesday, December 06, 2017 09:34:48 Ola Fosheim Grøstad via 
Digitalmars-d-learn wrote:
> On Wednesday, 6 December 2017 at 09:24:33 UTC, Jonathan M Davis
> wrote:
> > UTF-32 on the other hand is guaranteed to have a code unit be a
> > full code point.
>
> I don't think the standard says that? Isn't this only because the
> current set is small enough to fit? So this may change as Unicode
> grows?

It's most definitely the case right now, and given how Unicode decoding
works, I don't see how it could ever be the case that a UTF-32 code unit
would not be a code point - not without breaking all of the Unicode handling
in existence. And per Wikipedia's short article on code points:

----------------
The Unicode code space is divided into seventeen planes (the basic
multilingual plane, and 16 supplementary planes), each with 65,536 (= 2^16)
code points. Thus the total size of the Unicode code space is 17 × 65,536 =
1,114,112.
----------------

And uint.max is 4,294,967,295, so a 32-bit code unit can hold roughly 3855
times as many values as the current code space. That leaves plenty of room to
grow into even if they kept adding more code point values by adding more
planes or however that works.
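
For anyone who wants to check that arithmetic, here is a minimal sketch in D.
It assumes the plane counts from the quoted passage, and the names (planes,
pointsPerPlane, codeSpace, uintValues) are made up for illustration only:

```d
import std.stdio : writeln;

// Figures from the quoted passage: 17 planes of 65,536 code points each.
enum ulong planes = 17;
enum ulong pointsPerPlane = 1UL << 16;             // 65,536
enum ulong codeSpace = planes * pointsPerPlane;    // 1,114,112

// Number of distinct values a 32-bit code unit can hold (uint.max + 1).
enum ulong uintValues = cast(ulong) uint.max + 1;  // 4,294,967,296

// Checked at compile time, so a wrong figure fails the build.
static assert(codeSpace == 1_114_112);
static assert(dchar.max == 0x10FFFF);       // highest code point a dchar allows
static assert(codeSpace == dchar.max + 1);  // code space is 0 .. 0x10FFFF
static assert(uintValues / codeSpace == 3855);

void main()
{
    writeln(codeSpace);               // prints 1114112
    writeln(uintValues / codeSpace);  // prints 3855
}
```

The division is integer division, so 3855 is rounded down (the exact ratio is
65,536 / 17). None of this says what the standard guarantees; it only shows
that the current code space fits in 32 bits with a lot of room to spare.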

I'd have to go digging through the actual standard to know for sure what it
actually guarantees though.

- Jonathan M Davis



