dchar unicode phobos

Oskar Linde oskar.lindeREM at OVEgmail.com
Wed Jun 7 09:15:16 PDT 2006


pragma wrote:
> In article <e66qqr$1er5$1 at digitaldaemon.com>, Johan Granberg says...
>> pragma wrote:
>>> In article <e66hf1$otn$1 at digitaldaemon.com>, Johan Granberg says...
>>>> That D supports UTF is great, and by using dchar[] all Unicode code 
>>>> points can be used. But phobos does not support dchar[]s adequately 
>>>> (or wchar[]s, for that matter). Wouldn't the language's standard 
>>>> library be expected to support all of the language's string encodings?
>>>>
>>>> Proposal: add wchar[] and dchar[] versions of the string functions in phobos
>>>>
>>>> (should this be filed as a bug?)
>>> Ya know, I never really thought about this, but you're right: D has three
>>> character types yet only has full library support for one of them.
>>>
>>> If you ask me, there's only so many ways to go about this:
>>>
>>> 1. Refactor std.string to use implicit templates
>>> 2. Branch std.string into three modules, one for each char type
>>> 3. Support all three char types via overloads within std.string
>>>
>>> Personally, I like #1 since it would be seamless to implement, and would require
>>> almost exactly as much code as is in use now.  The only drawback here centers
>>> around problems with distributing template code in libraries.
>>>
>>> Also, do you personally need this kind of support in your project?  Have you
>>> looked at Mango?
>>>
>>> - EricAnderton at yahoo
>> Yes, I have needed support for dchar[] with functions like split, 
>> splitlines and strip in std.string.
>> Yes, your ways of adding the support look OK, though I would choose 3 
>> instead of 1. That may be because I'm not 100% sure how 1 would 
>> work. (Care to give an example?)
> 
> Sure.  D will now try to implicitly instantiate templates where it finds them.
> So you can do this:
> 
> /**/ template trim(TChar){
> /**/   TChar[] trim(TChar[] src){ /* ... */ }
> /**/ }

Unfortunately, D is not really that smart yet. Implicit instantiation 
only works when the function's argument types are exactly the template 
arguments, in the same order.

template trim(MyString) {
	MyString trim(MyString src) { /* */ }
}

works.

> 
> ..and the call to trim will still be as simple as the non-templated version:
> 
> /**/ dchar[] foo,bar;
> /**/ foo = trim(bar);

and even bar.trim() will work.
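
To make the calls above concrete, here is a minimal sketch with the body filled in. The whitespace handling (ASCII space and tab only) is just an assumption for illustration, not what std.string actually does, and the code is written against a current D compiler:

```d
// Eponymous template whose function parameter type is exactly the
// template parameter, so the compiler can deduce MyString from the call.
template trim(MyString) {
    MyString trim(MyString src) {
        size_t start = 0;
        size_t end = src.length;
        // Illustrative only: strip leading and trailing spaces and tabs.
        while (start < end && (src[start] == ' ' || src[start] == '\t'))
            start++;
        while (end > start && (src[end - 1] == ' ' || src[end - 1] == '\t'))
            end--;
        return src[start .. end];
    }
}

unittest {
    dchar[] foo, bar;
    bar = "  hello  "d.dup;
    foo = trim(bar);                 // plain call, MyString = dchar[]
    assert(foo == "hello"d);
    assert(bar.trim() == "hello"d);  // member-style call
}
```

Both call forms resolve to the same instantiation; nothing at the call site names the character type.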

> The astute observer will notice that any array type can be used as a parameter
> in the above example.  Proper use of static if() and the 'is' operator can
> easily ensure that only char, wchar and dchar are being used.  Template
> overloads, while verbose, are another way to go.
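
For reference, a rough sketch of the restriction pragma describes. It assumes a current D compiler (the `T[] f(T)(T[])` shorthand and `std.traits.Unqual` are assumptions about the toolchain), and `trimChars` is a made-up name:

```d
import std.traits : Unqual;

// Accepts arrays of char, wchar or dchar (any constness) and nothing else.
T[] trimChars(T)(T[] src) {
    static if (is(Unqual!T == char) || is(Unqual!T == wchar)
            || is(Unqual!T == dchar)) {
        size_t start = 0;
        size_t end = src.length;
        // Illustrative only: strip leading and trailing ASCII spaces.
        while (start < end && src[start] == ' ')
            start++;
        while (end > start && src[end - 1] == ' ')
            end--;
        return src[start .. end];
    } else {
        static assert(0, "trimChars accepts only char, wchar or dchar arrays");
    }
}

unittest {
    wchar[] w = "  hi "w.dup;
    assert(trimChars(w) == "hi"w);
    assert(trimChars("  hi ") == "hi");  // T = immutable(char)
    // ubyte[] raw = [32, 104, 105];
    // trimChars(raw);  // would be rejected at compile time by the static assert
}
```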

I don't really see any reason to limit string functions to char, wchar 
and dchar. Strings in other encodings (for instance Latin-1, i.e. 
ISO 8859-1) are readily stored as ubyte[] or as a typedef'ed type. It 
would be useful to be able to work with such strings too. I have myself 
several times cast Latin-1 strings to char[] just to be able to use one 
of the std.string functions on them, before casting the result back to 
ubyte[].
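
A small sketch of what I mean, assuming a current D compiler. `findElem` is a hypothetical helper invented for this example, not a std.string function:

```d
// Index of the first element equal to c, or -1 if there is none.
// Works for any array type, so no casts are needed for ubyte[] data.
template findElem(Str) {
    ptrdiff_t findElem(Str s, uint c) {
        foreach (i, e; s) {
            if (e == c)
                return i;
        }
        return -1;
    }
}

unittest {
    // "café" in ISO 8859-1: the 'é' is the single byte 0xE9.
    ubyte[] latin1 = [0x63, 0x61, 0x66, 0xE9];
    assert(findElem(latin1, 0xE9) == 3);
    assert(latin1.findElem('a') == 1);

    // The same template also serves the built-in string types.
    dchar[] utf32 = "caf\u00E9"d.dup;
    assert(findElem(utf32, 0xE9) == 3);
    assert(findElem(utf32, 'z') == -1);
}
```

The Latin-1 data stays a ubyte[] throughout, so it is never mislabelled as UTF-8.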

/Oskar


