Fixing std.string
Andrei Alexandrescu
SeeWebsiteForEmail at erdani.org
Thu Aug 19 20:39:35 PDT 2010
On 08/19/2010 09:22 PM, dsimcha wrote:
> As I mentioned buried deep in another thread, std.string is in serious need of
> fixing, for two reasons:
>
> 1. Most of it doesn't work with UTF-16/UTF-32 strings.
>
> 2. Much of it requires the input to be immutable even when there's no good
> reason for this constraint.
Absolutely. Thanks for looking into this!
> I'm trying to understand a few things before I dive into fixing it:
>
> 1. How did it get to be this way? Why did it seem like a good idea at the
> time to only support UTF-8 and only immutable strings?
I don't know - my guess is that UTF-8 is widespread in English-speaking
countries and this is one.
> 2. Is there any "deep" design/technical issue that makes these hard to fix,
> or is it basically just lack of manpower and other priorities?
The latter. I wanted to get to this for the longest time, and I think
it's awesome that you're looking into it.
> 3. Is there any good reason to avoid just templating everything to work with
> all 9 string types (mutable/const/immutable char/wchar/dchar[]) or whatever
> subset is reasonable for the given function?
There's no reason. But I hope we'd go a step further:
a) Aggressively make everything string-specific more general and move it
into std.algorithm.
b) After (a) ideally std.string should contain only a modicum of
string-specific stuff such as case and whitespace information. I believe
the functionality of the following functions could easily be generalized
and moved to std.algorithm or std.range, perhaps consolidated with
existing functionality and under a different name: cmp, indexOf,
lastIndexOf, repeat, join, split, stripl, stripr, strip, chomp,
chompPrefix, replace, replaceSlice, insert, count, maketrans, translate,
squeeze, munch, succ, tr.
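
To illustrate the kind of generalization meant in (a), here is a minimal sketch of an indexOf-like search written against any input range instead of strings only. The name indexOfElement is hypothetical and is not an existing Phobos function; the sketch also assumes that, for narrow strings, a position counted in code points (rather than code units) is acceptable.

```d
import std.range; // empty, front, popFront for arrays; isInputRange

// Hypothetical generalization of std.string.indexOf (name is an assumption).
// Returns the number of elements before the first match, or -1 if none.
// For char[] and wchar[] inputs the range primitives decode, so the
// result counts code points, not code units.
ptrdiff_t indexOfElement(R, E)(R haystack, E needle)
if (isInputRange!R && is(typeof(haystack.front == needle) : bool))
{
    ptrdiff_t i = 0;
    while (!haystack.empty)
    {
        if (haystack.front == needle)
            return i;
        haystack.popFront();
        ++i;
    }
    return -1;
}

unittest
{
    assert(indexOfElement([10, 20, 30], 30) == 2);   // plain int array
    assert(indexOfElement("h\u00e9llo", 'l') == 2);  // UTF-8 string
    assert(indexOfElement("abc"w, 'z') == -1);       // UTF-16 string
}
```

The same body serves arrays of integers and all three string widths, which is the argument for hosting it in std.algorithm rather than std.string.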
The other functions (or certain overloads of the above) stay put in
std.string and should be indeed templated by input with the constraint
if (isSomeString!Str)
or better yet allow any input, forward, or bidirectional range (as the
algorithm needs) constrained by
if (isXxxRange!R && is(ElementType!R : dchar)).
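
As a rough sketch of the two constraint styles just mentioned, the functions below are hypothetical examples (codePointCount and countChar are not std.string functions). The first accepts any of the nine string types through isSomeString; the second accepts any input range whose elements convert to dchar, with isInputRange standing in for isXxxRange.

```d
import std.range;                 // ElementType, isInputRange, retro
import std.traits : isSomeString;

// Hypothetical: works for mutable/const/immutable char/wchar/dchar arrays.
size_t codePointCount(Str)(Str s)
if (isSomeString!Str)
{
    size_t n = 0;
    foreach (dchar c; s) // foreach decodes narrow strings to code points
        ++n;
    return n;
}

// Hypothetical: works for strings and for any other range of characters.
size_t countChar(R)(R input, dchar c)
if (isInputRange!R && is(ElementType!R : dchar))
{
    size_t n = 0;
    foreach (dchar ch; input)
        if (ch == c)
            ++n;
    return n;
}

unittest
{
    assert(codePointCount("h\u00e9llo") == 5);       // immutable(char)[]
    assert(codePointCount("h\u00e9llo"w.dup) == 5);  // mutable wchar[]
    assert(countChar("a,b,c"d, ',') == 2);           // UTF-32 string
    assert(countChar("a,b,c".retro, ',') == 2);      // non-array range
}
```

The range-based constraint is the more permissive of the two: every string type satisfies it, and so do lazily computed ranges such as the result of retro.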
Thanks again for looking into this, it's important and rewarding work.
Andrei