dchar unicode phobos

Wed Jun 7 10:04:43 PDT 2006

Sean Kelly wrote:
> Johan Granberg wrote:
>>
>> Yes I have needed support for dchar[] with functions like split,
>> splitline and strip in std.string.
>> Yes your ways of doing the support look OK, I would choose 3 though
>> instead of 1. It may be because I'm not 100% sure about how 1 would
>> work. (care to give an example)
>> No I have not looked closely at mango yet. (Will do)
> 
> Oskar has an array template library that can do much of this, and I have 
> the beginnings of one in Ares as well.  The source is here:
> 
> http://svn.dsource.org/projects/ares/trunk/src/ares/std/array.d
> 
> As you can see however, half the functions are commented out because 
> template function overloading basically just doesn't work yet.

I agree that it would be really nice if those types of templates worked
today, but all of those functions can be rewritten in a way that works
with current D. Considering how long it took us to get the current
(most basic) implicit function template instantiation (IFTI) support,
I would rather use a solution that works today than wait indefinitely
for something that may never happen. :) I fully appreciate your
standpoint though, and would love to hear something from Walter
regarding future IFTI support.

Some things that I would like to see improved (in descending order of
importance) are IFTI support for:
1. template member functions
2. mixed explicit/implicit arguments: f!(int)('x') => f!(int,char)('x')
3. template specializations
4. better template function overloading
5. generic matching: template t(X) { void t(X[] a, X b) {}}
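
To make items 2 and 5 concrete, here is a minimal sketch of the two call
shapes. It is written in present-day D syntax and assumes a recent
compiler; the compiler discussed in this thread may well reject it. The
names convert and contains are made up for illustration and do not come
from Phobos or Ares.

```d
// Item 2: T is supplied explicitly, U is deduced from the argument,
// so convert!(int)('x') behaves like convert!(int, char)('x').
T convert(T, U)(U value)
{
    return cast(T) value;
}

// Item 5: X is deduced once and has to match both the array element
// type and the second parameter.
bool contains(X)(X[] haystack, X needle)
{
    foreach (x; haystack)
    {
        if (x == needle)
            return true;
    }
    return false;
}

unittest
{
    assert(convert!(int)('x') == 120);  // 'x' is code point 120
    assert(contains([1, 2, 3], 2));
    assert(!contains([1, 2, 3], 7));
}
```

Compiling with -unittest -main runs the checks.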

> Eventually however, I plan to add split, join, etc.  These will probably 
> all assume fixed-width elements, with improved support for char and 
> wchar strings in a std.string module, as supporting variable width 
> encoding will slow down the algorithms.

It sounds reasonable to avoid any variable-length awareness in
std.array, but I don't really see how supporting it would make split or
join any slower. For instance,

(char[]).split(char)
(char[]).split(char[])
(char[]).split(bool delegate(char))

aren't affected by variable-length encodings. Only

(char[]).split(dchar)
(char[]).split(bool delegate(dchar))

are (by using a dchar foreach over a char[]), but here the user is
explicit about wanting a multi-byte implementation. Putting the
implementation of the last two versions in std.string gives a neat
std.string/std.array separation, but risks confusing the user:

- Why would "abc".split('a') be in std.array while "abc".split('å') 
requires std.string?
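
As a rough illustration of the difference, here is a sketch of the
(char[]).split(char) and (char[]).split(dchar) cases. It uses
present-day D syntax, and the names splitUnits and splitPoints are
hypothetical (the two are named differently only to keep the example
free of overload questions). This is not the std.string or Ares
implementation, just one way the two variants could look.

```d
import std.utf : codeLength;

// Fixed-width variant: compares code units one by one, no decoding.
const(char)[][] splitUnits(const(char)[] s, char delim)
{
    const(char)[][] parts;
    size_t start = 0;
    foreach (i, char c; s)
    {
        if (c == delim)
        {
            parts ~= s[start .. i];
            start = i + 1;
        }
    }
    parts ~= s[start .. $];
    return parts;
}

// Variable-width variant: a dchar foreach over char[] decodes UTF-8,
// and i is the index of the first code unit of each code point.
const(char)[][] splitPoints(const(char)[] s, dchar delim)
{
    const(char)[][] parts;
    size_t start = 0;
    foreach (i, dchar c; s)
    {
        if (c == delim)
        {
            parts ~= s[start .. i];
            // Skip every code unit of the delimiter.
            start = i + codeLength!char(c);
        }
    }
    parts ~= s[start .. $];
    return parts;
}

unittest
{
    auto a = splitUnits("abc", 'a');
    assert(a.length == 2 && a[0] == "" && a[1] == "bc");

    auto b = splitPoints("xåy", 'å');
    assert(b.length == 2 && b[0] == "x" && b[1] == "y");
}
```

Only the second function pays for decoding, and only because the caller
passed a dchar.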

Regards,

Oskar