[review] new string type
Steven Schveighoffer
schveiguy at yahoo.com
Fri Dec 3 12:29:12 PST 2010
On Fri, 03 Dec 2010 14:40:30 -0500, Jerry Quinn <jlquinn at optonline.net>
wrote:
> I tend to do a lot of transforming strings, but I need to track offsets
> back to the original text to maintain alignment between the results and
> the input. For that, indexes are necessary and we use them a lot.
In my daily usage of strings, I generally use a string as a whole, not
individual characters. But I do occasionally index into one.
Let's also be clear that indexing is still present; what is disabled
is the ability to index to arbitrary code units. It sounds to me like
this new type would not affect your ability to store offsets (you can
store an index, use it later when referring to the string, etc., just like
you can now).
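
To make the distinction concrete, here is a rough sketch in Python rather than D. It is only an analogy, not the proposed type: the helper name is_code_point_boundary is made up for illustration. It shows that an offset obtained from a search stays usable later, while an offset that lands inside a multi-byte sequence is the kind of index that would be rejected.

```python
def is_code_point_boundary(data: bytes, i: int) -> bool:
    """Return True if offset i does not fall inside a multi-byte UTF-8 sequence."""
    if i < 0 or i > len(data):
        return False
    if i == 0 or i == len(data):
        return True
    # UTF-8 continuation bytes always have the bit pattern 10xxxxxx.
    return (data[i] & 0xC0) != 0x80


# "ï" and "é" each take two code units (bytes) in UTF-8.
text = "naïve café".encode("utf-8")

# An offset found by searching the data is always on a boundary,
# so it can be stored and reused later.
stored_offset = text.find("café".encode("utf-8"))
tail = text[stored_offset:].decode("utf-8")

# An arbitrary offset may land in the middle of a code point.
middle_of_i_diaeresis = 3
```

Offsets produced by searching or iterating over the string are valid by construction, which is why storing them for later alignment keeps working.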
My string type does not allow for writable strings. My plan was to allow
you access to the underlying char[] and let you edit that way. Letting
someone write a dchar into the middle of a UTF-8 string could cause lots of
problems, so I just disabled it by default.
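
As a rough illustration of those problems, sketched in Python with a bytearray standing in for a mutable char[] (an assumption for the sake of the example, not the D implementation): replacing a one-unit character with a two-unit one either clobbers the neighbouring character, leaves an invalid sequence, or forces everything after it to shift.

```python
buf = bytearray("cat", "utf-8")
wide = "é".encode("utf-8")  # two code units: 0xC3 0xA9

# Overwriting in place, unit for unit, destroys the neighbouring 't'.
clobbered = bytearray(buf)
clobbered[1:1 + len(wide)] = wide

# Writing only the first unit leaves a truncated, invalid sequence.
truncated = bytearray(buf)
truncated[1] = wide[0]
try:
    truncated.decode("utf-8")
    truncated_is_valid = True
except UnicodeDecodeError:
    truncated_is_valid = False

# A correct edit has to splice, which grows the buffer and moves
# every later offset (the 't' goes from offset 2 to offset 3).
spliced = bytearray(buf)
spliced[1:2] = wide
```

The splice is the only variant that keeps the text valid, and it changes the length of the buffer, so it cannot be expressed as a simple element assignment.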
I'm not sure how that affects your 'transforming' work. Are you actually
changing the data or just lazily transforming? I'm interested to hear
whether you think my string type would be a viable alternative.
> Probably the right thing to do in this case is just pay for the cost of
> using dchar everywhere, but if you're working with large enough
> quantities of data, storage efficiency matters.
The huge advantage of using UTF-8 is backwards compatibility with ASCII
when calling C functions.
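
A small sketch of both points, again in Python as a stand-in (UTF-32 here plays the role of a dchar array, which is an assumption made for the comparison): pure ASCII text has the same bytes in UTF-8 as in ASCII, and it takes a quarter of the space of the four-bytes-per-character form.

```python
ascii_text = "hello"

# For pure ASCII input, the UTF-8 bytes are identical to the ASCII bytes,
# so byte-oriented C functions see exactly what they expect.
same_bytes = ascii_text.encode("utf-8") == ascii_text.encode("ascii")

# Storage cost: one byte per character in UTF-8 versus four in UTF-32.
utf8_size = len(ascii_text.encode("utf-8"))
utf32_size = len(ascii_text.encode("utf-32-le"))

# Mostly-ASCII text with one accented letter still stays compact.
mixed = "naïve"
mixed_utf8_size = len(mixed.encode("utf-8"))
mixed_utf32_size = len(mixed.encode("utf-32-le"))
```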
-Steve