Issues with constants, and inout (was Real World usage of D, Today)

kris foo at bar.com
Fri Jan 26 20:15:15 PST 2007


Andrei Alexandrescu (See Website For Email) wrote:
> kris wrote:
> [about implicit conversion rules]
> 
>> extern (C) int printf (char*, ...);
>>
>> class Foo
>> {
>>         void write (int x) {printf("int\n");}
>>         void write (uint x) {printf("uint\n");}
>>         void write (long x) {printf("long\n");}
>>         void write (char x) {printf("char\n");}
>>         void write (wchar x) {printf("wchar\n");}
>>         void write (double x) {printf("double\n");}
>>
>>         void write (char[] x) {printf("char[]\n");}
>>         void write (wchar[] x) {printf("wchar[]\n");}
>> }
>>
>> void main()
>> {
>>         auto foo = new Foo;
>>
>>         foo.write ('c');
>>         foo.write (1);
>>         foo.write (1u);
>>         foo.write (3.14);
>>         //foo.write ("asa");
>> }
>>
>> prints:
>>
>> char
>> int
>> uint
>> double
>>
>>
>> DMD has actually become smarter than the last time I tried something 
>> like this: it manages to select the correct overload for 'c' whereas 
>> before it couldn't decide whether int or uint was a better match for 
>> the char instead. This is good.
>> It seems clear from the above that D /defaults/ the type of character, 
>> and undecorated integers, to something appropriate? In the above case 
>> 'c' is defaulted to char, rather than wchar, for example. The 
>> undecorated int constant is defaulted to int, rather than uint or 
>> long. This is good.
> 
> 
> Yup. So far so good.
> 
>> Now for the broken part. When you uncomment the string constant, the 
>> compiler gets all confused about whether it's a char[] or wchar[]. 
>> There is no defaulting to one type, as there is for other constants 
>> (such as char). It /is/ possible to decorate the string constant in a 
>> similar manner to decorating integer constants:
>>
>> foo.write ("qwe"c);
>>
>> And this, of course, compiles. It's a PITA though, and differs from 
>> the rules for other constants.
> 
> 
> I talked to Walter about this and it's not a bug, it's a feature :o). 
> Basically it's hard to decide what to do with an unadorned string when 
> both wchar[] and char[] would want to "attract" it. I understand you're 
> leaning towards defaulting to char[]? Then probably others will be unhappy.
> 


You'll have noticed that the constant 'c' defaults to /char/, and that 
there's no compile-time conflict between the write(char) & write(wchar)? 
Are people unhappy about that too? Perhaps defaulting of char 
constants and int constants should be abolished also?

I just want the compiler to be consistent, and it's quite unlikely that 
I'm alone in that regard -- consistency is a very powerful tool. 
Besides, aren't so-called 'features' actually bugs, coyly renamed by the 
marketing department? :)

BTW: if you remove the write(char) overload, the compiler says it 
doesn't know which of int/uint overloads to select for 'c', and 
completely ignores write(wchar) as a viable option. That seems 
reasonable, but it clearly shows that a char constant is being defaulted 
to type char; and it's vaguely amusing in a twisted manner <g>
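
For reference, the defaulting described above can be checked directly. This is 
only a minimal sketch, and it assumes a present-day D (D2) compiler rather than 
the 2007 DMD under discussion; the name which() is made up for illustration. It 
relies only on the types of literals and on the c/w suffixes, not on how an 
unadorned string constant is resolved.

```d
// Sketch only: assumes a present-day D (D2) compiler; which() is a hypothetical name.
string which(char x)   { return "char"; }
string which(wchar x)  { return "wchar"; }
string which(int x)    { return "int"; }
string which(uint x)   { return "uint"; }
string which(long x)   { return "long"; }
string which(double x) { return "double"; }

// The default type of each kind of literal, checked at compile time.
static assert(is(typeof('c')  == char));
static assert(is(typeof(1)    == int));
static assert(is(typeof(1u)   == uint));
static assert(is(typeof(3.14) == double));

// The c and w suffixes pin the code-unit width of a string literal.
static assert(("qwe"c[0]).sizeof == 1);
static assert(("qwe"w[0]).sizeof == 2);

void main()
{
    // Each literal is an exact match for one overload, so there is no ambiguity.
    assert(which('c')  == "char");
    assert(which(1)    == "int");
    assert(which(1u)   == "uint");
    assert(which(3.14) == "double");
}
```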


>> Things start to go south when using templates with string constants. 
>> For example, take this template sig:
>>
>> uint locate(T) (T[] source, T match, uint start=0)
>>
>> This is intended to handle types of char[], wchar[] and dchar[]. 
>> There's a uint on the end, as opposed to an int. Suppose I call it 
>> like this:
>>
>> locate ("abc", "ab", 1);
>>
>> we get a compile error, since the int-constant does not match a uint 
>> in the sig (IFTI currently needs exact sig matches). In order to get 
>> around this, we wrap the template with a few functions:
>>
>> uint locate (char[] source, char[] match, uint start=0)
>> {
>>     return locateT!(char) (source, match, start);
>> }
>>
>> uint locate (wchar[] source, wchar[] match, uint start=0)
>> {
>>     return locateT!(wchar) (source, match, start);
>> }
>>
>> and dchar too.
>>
>> Now we call it:
>>
>> locate ("abc", "ab", 1);
>>
>> Well, the int/uint error goes away (since function matching operates 
>> differently than IFTI matching), but we've now got our old friend back 
>> again -- the constant char[], wchar[], dchar[] mismatch problem.
> 
> 
> I think a sound solution to this should be found. It's kind of hard, 
> because char[] is the worst match but also probably the most used one. 
> The most generous match is dchar[] but wastes much time and space for 
> the minority of cases in which it's useful.

In the FWIW department, after writing several truckloads of 
text-oriented library code and wrappers for external text-processing 
libs, I've reached a simple conclusion: utf8 is where it's at for ~80% 
of code written, IMO. There's probably a ~15% need to go to utf32 for 
serious text handling (Word Processor, etc), and utf16 is the half-way 
house that the remaining percentage resort to when compromising (such as 
when stuffing things into ROM).

Before the flames rise up and engulf that claim, let's consider one 
major exclusion: certain GUI APIs use utf16 throughout. What the heck 
does one do in that situation if the compiler defaults string constants 
to char[] instead of wchar[]?

Well, it's actually no issue at all since those APIs typically don't 
have method overloads for char/dchar also. They have only utf16 
signatures instead, for any given method name, because they only deal in 
utf16. Thus, the compiler can happily morph a string constant to a 
wchar[] instead of the default -- *just as it happily does today*.

Let's also keep in mind we're talking string constants only, rather than 
all strings. I'll just try not to harp on about the consistency mismatch 
between char & char[] constants any more than I have done already.
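
A small sketch of the single-signature case, assuming a present-day D (D2) 
compiler; codeUnits() is a hypothetical stand-in for a utf16-only API, not a 
real library call. With only a wchar signature available, the unadorned 
constant is accepted without any suffix.

```d
// Sketch only: assumes a present-day D (D2) compiler.
// codeUnits() is a hypothetical stand-in for an API that deals only in utf16.
size_t codeUnits(const(wchar)[] text)
{
    return text.length;
}

void main()
{
    // No suffix needed: the only candidate takes utf16, so the literal is typed to fit.
    assert(codeUnits("hello") == 5);

    // Spelling out the w suffix gives the same result.
    assert(codeUnits("hello"w) == 5);
}
```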


> 
>> How about another type of template? Here's one that does some simple 
>> text processing:
>>
>> T[] layout(T) (T[] output, T[] format, T[][] subs...)
>>
>> This has an output buffer, a format string, and a set of optional 
>> args; all of the same type. If I call it like so:
>>
>> char[128] tmp;
>> char[]    world = "world";
>> layout (tmp, "hello %1", world);
>>
>> that compiles ok. If I use wchar[] instead, it doesn't compile:
>>
>> wchar[128] tmp;
>> wchar[]    world = "world";
>> layout (tmp, "hello %1", world);
>>
>> In this case, the constant string used for formatting remains as a 
>> char[], so the template match fails (args: wchar[], char[], wchar[])
>>
>> However, if I change the template signature to this instead:
>>
>> T[] layout(T) (T[] output, T[][] subs...)
>>
>> then everything works the way I want it to, but the design is actually 
>> wrong (the format string is now not required). String constants can be 
>> a royal PITA.
> 
> 
> Color me convinced. :o) I have no bright ideas on solving it though.

Perhaps a change to a default type might take care of it? After all, 
this particular issue is specific to string-constants only; not for 
other types (such as char/int/long/float).

Certainly worth a try, one would think?


> 
>> inout
>> -----
>>
>> Since you're working on inout also, I'd like to ask what the plan is 
>> relating to a couple of examples. Tango uses this style of call quite 
>> regularly:
>>
>> get (inout int x);
>> get (inout char[] x);
>> etc.
>>
>> This is a clean way to pass-by-reference, instead of dealing with all 
>> the pointer syntax. I sure hope D will retain reference semantics like 
>> this, in some form?
> 
> 
> It will, and in the same form.

Grand!

> 
>> One current problem with inout, which you might not be aware of, is 
>> with regard to const structs. I need to pass structs by reference, 
>> because I don't want to pass them by value. Applying inout is the 
>> mechanism for describing this:
>>
>> struct Bar {int a, b;}
>>
>> Bar b = {1, 2};
>>
>> void parf (inout Bar x) {}
>>
>> void main()
>> {
>>    parf (b);
>> }
>>
>> That all works fine. However, when I want to make those structs 
>> /const/ instead, I cannot use inout since it has mutating semantics: I 
>> get a compile error to that effect:
>>
>> const Bar b = {1, 2};
>>
>>  >> Error: cannot modify const variable 'b'
>>
>> That is, there's no supported way to pass a const struct by reference. 
>> The response from Walter in the past has been "just use a pointer 
>> instead" ... well, yes I could do that. But it appears to be 
>> indicative of a problem with the language design?
> 
> 
> This case is on the list. You will definitely have a sane and simple way 
> to pass const structs by reference, while having a guarantee that they 
> can't be changed by the callee.

Praise the lord !
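
For illustration, here is roughly what such a read-only reference looks like in 
later D (D2) compilers, where a parameter can be declared "ref const". This is a 
sketch under that assumption, not the design being discussed in this thread, and 
total() is a made-up helper name.

```d
// Sketch only: assumes a present-day D (D2) compiler, where "ref const"
// declares a by-reference parameter that the callee cannot modify.
struct Bar { int a, b; }

// Hypothetical helper: reads the struct through a reference.
int total(ref const Bar x)
{
    // x.a = 0;   // would be rejected by the compiler: x is const
    return x.a + x.b;
}

void main()
{
    const Bar b = Bar(1, 2);
    assert(total(b) == 3);   // const struct passed by reference, no copy

    Bar m = Bar(3, 4);
    assert(total(m) == 7);   // a mutable struct is accepted as well
}
```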


>> Why do I want to use const? Well, the data held therein is for 
>> reference only, and (via some future D vendor) I want that reference 
>> data placed into a ROM segment. I do a lot of work with MCUs, and this 
>> sort of thing is a common requirement.
> 
> 
> I agree.
> 
> 
> Andrei


Cheers; this kind of detailed reply (above) is very much appreciated.

- Kris
