fixedstring: a @safe, @nogc string type
Moth
postmaster at gmail.com
Wed Jan 12 19:50:51 UTC 2022
On Tuesday, 11 January 2022 at 12:22:36 UTC, WebFreak001 wrote:
> [snip]
>
> you can relatively easily find out how many bytes a string
> takes up with `std.utf`. You can also iterate by code points or
> graphemes there if you want to translate some kind of character
> index to byte position.
>
> HOWEVER it's not clear what a character is. Sure for the posted
> cases here it's no problem but when it comes to languages based
> on combining glyphs together to form new glyphs it's no longer
> clear what is a character. There are Graphemes (grapheme
> clusters) which are probably the closest to what everybody
> would think a character is, but IIRC there are edge cases with
> that a programmer wouldn't expect, like adding a character not
> increasing the count of characters of the string because it
> merges with the last Grapheme. Additionally there is a
> performance impact on using Graphemes over simpler things like
> codepoints which fit 98% of use-cases with strings. Codepoints
> in D are mapped 1:1 using dchar, take up to 2 wchars or up to 4
> chars. You can use `std.utf` to compute byte lengths for a
> codepoint given a string.
aha, i think i might have miscommunicated here - i was talking
about an error i thought i was having where a fixedstring of
`"áéíóú"` wasn't equal to a string literal of the same, but as it
turned out i was misreading the error message [i had been trying
to assign a literal larger than the fixedstring could take]. to
tell the truth, unicode awareness is... not something i really
want to mess with right now, lol. it would be nice to have the
option at some point in the future though.
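
for anyone following along, here is a minimal sketch of the `std.utf` point from the quote: the same text has different lengths depending on whether you count code units or code points. it assumes the accented letters are precomposed (hence the `\u` escapes), and it assumes a fixed-size string counts its capacity in code units, which is a guess about fixedstring rather than something stated in this thread.

```d
import std.utf : codeLength, count;

void main()
{
    // escapes keep the example independent of the source file's normalization
    string s = "\u00E1\u00E9\u00ED\u00F3\u00FA"; // "áéíóú", precomposed

    assert(s.length == 10); // char (UTF-8) code units, i.e. bytes
    assert(s.count == 5);   // code points

    // code units needed to encode a single code point
    assert(codeLength!char('\u00E1') == 2);
    assert(codeLength!wchar('\u00E1') == 1);

    // the same text as UTF-16 needs only 5 code units
    wstring w = "\u00E1\u00E9\u00ED\u00F3\u00FA"w;
    assert(w.length == 5);
}
```

so under that assumption, a five-letter literal like this one would need room for ten `char`s, not five.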
> I would rather suggest you support FixedString with types other
> than `char`. (wchar, dchar, heck users could even use any
> arbitrary type and use this as array class) For languages that
> commonly use more than 1 byte per codepoint or for interop with
> Win32 unicode APIs, JavaScript strings, C# strings, UTF16 files
> in general, etc. programmers might opt to use FixedString with
> wchar then.
>
> With D's templates that should be quite easy to do (add a
> template parameter to the struct like `struct
> FixedString(size_t maxSize, CharT = char)` and replace all
> usage of char in your code with `CharT` in this case)
[i've pushed an update to the repo for
this!](https://github.com/Moth-Tolias/fixedstring/releases/tag/v1.1.0) =] it was a bit more complicated than a simple replace all, but not too hard.
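
a rough sketch of what the suggested `CharT` parameter could look like, for illustration only. this is not the actual fixedstring source: the member names (`data`, `len`), the constructor, and `opSlice` are all hypothetical, and the real v1.1.0 code handles more than this.

```d
// hypothetical sketch, not the real fixedstring implementation
struct FixedString(size_t maxSize, CharT = char)
{
    private CharT[maxSize] data; // fixed storage, no GC allocation
    private size_t len;          // code units currently in use

    this(const(CharT)[] s)
    {
        assert(s.length <= maxSize, "input does not fit");
        data[0 .. s.length] = s[];
        len = s.length;
    }

    size_t length() const { return len; }

    // view of the used part of the buffer
    const(CharT)[] opSlice() const { return data[0 .. len]; }
}

// attributes on the members are inferred because the struct is a template,
// so the usage below compiles as @safe @nogc
void main() @safe @nogc
{
    auto a = FixedString!16("hello");           // CharT defaults to char
    auto w = FixedString!(16, wchar)("hello"w); // UTF-16 code units

    assert(a[] == "hello");
    assert(w[] == "hello"w);
    assert(w.length == 5);
}
```

here `maxSize` counts code units of `CharT`, so `FixedString!(16, wchar)` holds 16 `wchar`s rather than 16 bytes.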