toUTFz again

Jonathan M Davis jmdavisProg at gmx.com
Tue Mar 13 11:13:42 PDT 2012


On Tuesday, March 13, 2012 19:00:29 Andrej Mitrovic wrote:
> I've completely lost track of what happened with the whole toUTF16z
> story, but anyway since it's in std.utf why doesn't it just forward to
> toUTFz?

That change is part of a pull request which also makes other changes to 
std.utf. It has been sitting around for some time and may not make it in as 
it stands. Andrei wanted a generic toUTFx function rather than toUTF8, toUTF16, 
and toUTF32, so I wrote one. It definitely improves generic code, but it makes 
non-generic code uglier, so it's not entirely clear what's going to happen 
with it. The changes to toUTF16z will definitely make it in eventually, but the 
situation with that pull request needs to be sorted out first.

https://github.com/D-Programming-Language/phobos/pull/279

> const(wchar)* toUTF16z(T)(T input)
>     if (isSomeString!T)
> {
>     return toUTFz!(const(wchar)*)(input);
> }
> 
> That way it can take any string argument and not just a UTF8 string.
> Currently it only accepts UTF8 which is an unnecessary restriction.
> 
> Secondly, it's difficult to make an alias to 'toUTFz'. For example, if
> you want a short version of 'toUTFz!(char*)' in your code, you would
> typically write an alias like this:
> 
> alias toUTFz!(char*) toCharPtr;
> 
> However that won't work because toUTFz requires a second type
> argument. If toUTFz was a template that forwarded to other
> implementation templates it would make the above alias possible.
> Here's what I mean:
> 
> // equivalent to one of toUTFz templates in std.utf
> auto toUTFzImpl(P, S)(S s) { return null; }
> 
> template toUTFz(P)
> {
>     P toUTFz(S)(S str)
>     {
>         return toUTFzImpl!(P)(str);
>     }
> }
> 
> void main()
> {
>     alias toUTFz!(char*) toUTF8z;
>     toUTF8z("foo");
>     toUTF8z("foo"w);
>     toUTF8z("foo"d);
> }
> 
> That's much simpler than having to declare every possible combination
> just to use a simple alias:
> alias toUTFz!(const(char*), char[]) toUTF8z;
> alias toUTFz!(const(char*), wchar[]) toUTF8z;
> alias toUTFz!(const(char*), dchar[]) toUTF8z;
> alias toUTFz!(const(char*), string) toUTF8z;
> alias toUTFz!(const(char*), wstring) toUTF8z;
> alias toUTFz!(const(char*), dstring) toUTF8z;

That change can probably be made, though it might break existing code which 
aliases toUTFz with two template arguments, so I'll have to play around with 
it to see if that can be avoided (simply having two versions of the template 
- one with one argument and one with two - may take care of that though).
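For illustration, here is a minimal sketch of how the single-argument alias would be used. It assumes a Phobos release in which toUTFz takes only the pointer type as its template argument and deduces the string type at the call site (the shape proposed above). With the two-argument toUTFz described in this thread, the alias would not compile. The alias name toUTF8z is just an example name.

```d
import std.string : fromStringz;
import std.utf : toUTFz;

// Assumption: toUTFz is a one-parameter template, so only the target
// pointer type is fixed here. The source string type is deduced per call.
alias toUTF8z = toUTFz!(const(char)*);

void main()
{
    // One alias serves UTF-8, UTF-16 and UTF-32 input alike.
    assert(fromStringz(toUTF8z("foo"))  == "foo");
    assert(fromStringz(toUTF8z("foo"w)) == "foo");
    assert(fromStringz(toUTF8z("foo"d)) == "foo");
}
```

fromStringz is only used to turn the zero-terminated result back into a slice so that it can be compared.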

I should probably just break up the existing pull request. One pull request 
would add toUTF as a generic function alongside toUTF8, toUTF16, and toUTF32 
(instead of replacing them), so that both generic and non-generic code are 
served, and a second pull request would contain the toUTFz changes.
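As a rough sketch of why a generic function helps generic code while the named functions stay nicer for non-generic code: the toUTF below is a hypothetical stand-in built on std.conv.to, not the function from the pull request, and its signature is an assumption. codeUnits is likewise a made-up helper for the example.

```d
import std.conv : to;
import std.traits : isSomeChar, isSomeString;
import std.utf : toUTF16;

// Hypothetical generic helper (assumed signature, not part of std.utf):
// the target character type is a template parameter.
immutable(C)[] toUTF(C, S)(S s)
    if (isSomeChar!C && isSomeString!S)
{
    return to!(immutable(C)[])(s);
}

// Generic code can pass the character type along as a parameter,
// which is not possible with separately named functions.
size_t codeUnits(C, S)(S s)
{
    return toUTF!C(s).length;
}

void main()
{
    // Non-generic code reads more naturally with the named function.
    assert(toUTF16("héllo") == "héllo"w);
    assert(toUTF!wchar("héllo") == "héllo"w);

    // 'é' takes two UTF-8 code units but one UTF-16 or UTF-32 code unit.
    assert(codeUnits!char("héllo"w) == 6);
    assert(codeUnits!wchar("héllo") == 5);
    assert(codeUnits!dchar("héllo") == 5);
}
```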

- Jonathan M Davis
