Fixing std.string

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Thu Aug 19 20:39:35 PDT 2010


On 08/19/2010 09:22 PM, dsimcha wrote:
> As I mentioned buried deep in another thread, std.string is in serious need of
> fixing, for two reasons:
>
> 1.  Most of it doesn't work with UTF-16/UTF-32 strings.
>
> 2.  Much of it requires the input to be immutable even when there's no good
> reason for this constraint.

Absolutely. Thanks for looking into this!

> I'm trying to understand a few things before I dive into fixing it:
>
> 1.  How did it get to be this way?  Why did it seem like a good idea at the
> time to only support UTF-8 and only immutable strings?

I don't know - my guess is that UTF-8 is widespread in English-speaking 
countries and this is one.

> 2.  Is there any "deep" design/technical issue that makes these hard to fix,
> or is it basically just lack of manpower and other priorities?

The latter. I wanted to get to this for the longest time, and I think 
it's awesome that you're looking into it.

> 3.  Is there any good reason to avoid just templating everything to work with
> all 9 string types (mutable/const/immutable char/wchar/dchar[]) or whatever
> subset is reasonable for the given function?

There's no reason. But I hope we'd go a step further:

a) Aggressively make everything string-specific more general and move it 
into std.algorithm.

b) After (a) ideally std.string should contain only a modicum of 
string-specific stuff such as case and whitespace information. I believe 
the functionality of the following functions could easily be generalized 
and moved to std.algorithm or std.range, perhaps consolidated with 
existing functionality and under a different name: cmp, indexOf, 
lastIndexOf, repeat, join, split, stripl, stripr, strip, chomp, 
chompPrefix, replace, replaceSlice, insert, count, maketrans, translate, 
squeeze, munch, succ, tr.
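
To illustrate why generic algorithms are attractive here, the following is a minimal sketch, assuming a current Phobos in which std.algorithm.searching provides count and canFind. It only shows that one generic implementation serves every character width and mutability; it is not a proposal for the final API.

```d
import std.algorithm.searching : canFind, count;

unittest
{
    // One generic algorithm, three character widths.
    assert(count("hello world", 'o') == 2);   // immutable(char)[]
    assert(count("hello world"w, 'o') == 2);  // immutable(wchar)[]
    assert(count("hello world"d, 'o') == 2);  // immutable(dchar)[]

    // Mutable input is accepted as well; no immutability requirement.
    char[] buf = "hello".dup;
    assert(canFind(buf, 'l'));
}
```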

The other functions (or certain overloads of the above) stay put in 
std.string and should be indeed templated by input with the constraint

if (isSomeString!Str)

or better yet allow any input, forward, or bidirectional range (as the 
algorithm needs) constrained by

if (isXxxRange!R && is(ElementType!R : dchar)).
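
As a rough sketch of the two constraint styles above: countWhite and countWhiteRange are hypothetical names invented for this illustration (they are not Phobos functions), and the imports assume a current Phobos layout (std.range.primitives, std.traits, std.uni).

```d
import std.range.primitives : ElementType, empty, front, isInputRange, popFront;
import std.traits : isSomeString;
import std.uni : isWhite;

// Hypothetical helper: counts whitespace code points in any of the nine
// string types (mutable/const/immutable char/wchar/dchar arrays).
size_t countWhite(Str)(Str s)
if (isSomeString!Str)
{
    size_t n = 0;
    foreach (dchar c; s) // foreach decodes code units into code points
        if (isWhite(c)) ++n;
    return n;
}

// Hypothetical range-based variant: accepts any input range whose
// elements implicitly convert to dchar.
size_t countWhiteRange(R)(R r)
if (isInputRange!R && is(ElementType!R : dchar))
{
    size_t n = 0;
    for (; !r.empty; r.popFront())
        if (isWhite(r.front)) ++n;
    return n;
}

unittest
{
    assert(countWhite("a b c") == 2);      // immutable UTF-8
    assert(countWhite("a b c"w.dup) == 2); // mutable UTF-16
    assert(countWhite("a b c"d) == 2);     // immutable UTF-32
    assert(countWhiteRange("a b c"w) == 2);
}
```

The two functions carry different names because a string satisfies both constraints, so giving them the same name would make calls with string arguments ambiguous.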

Thanks again for looking into this, it's important and rewarding work.


Andrei

