[Issue 8660] New: Unclear semantics of array literals of char type, vs string literals

monarch_dodra monarchdodra at gmail.com
Fri Sep 14 05:50:57 PDT 2012


On Friday, 14 September 2012 at 11:28:04 UTC, Don wrote:
> --- Comment #0 from Don <clugdbug at yahoo.com.au> 2012-09-14 04:28:17 PDT ---
> Array literals of char type, have completely different semantics from string
> literals. In module scope:
>
> char[] x = ['a'];  // OK -- array literals can have an implicit .dup
> char[] y = "b";    // illegal
>
> A second difference is that string literals have a trailing \0. It's important
> for compatibility with C, but is barely mentioned in the spec. The spec does
> not state if the trailing \0 is still present after operations like
> concatenation.

I think this is actually the normal behavior. When you write
"char[] x = ['a'];", you are not actually "newing" (or "dup"-ing)
any data. You are just letting x point to a stack-allocated array
of chars. So the assignment is legal (though somewhat unsafe if
you ever leak x).

On the other hand, you can't bind y to an array of immutable 
chars, as that would subvert the type system.

This, however, is legal:
char[] y = "b".dup;
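
As a minimal sketch of the .dup route (the function name mutableCopy is hypothetical and only there for illustration), the copy can be modified freely because it no longer refers to the literal's immutable data:

```d
// Hypothetical helper, not part of any library.
char[] mutableCopy(string s)
{
    // .dup allocates a new array and copies the characters into it,
    // so the result is a mutable char[] that does not alias the literal.
    return s.dup;
}

unittest
{
    // char[] y = "b";          // rejected: string does not convert to char[]
    char[] y = mutableCopy("b");
    y[0] = 'c';                 // fine, y owns its own storage
    assert(y == "c");
}
```

The price is an allocation and a copy each time this runs.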

I do not know how to initialize a char[] on the stack, though
(apart from writing ['h', 'e', 'l', ... ]). If UTF-8 also gets
involved, then I don't know of any workaround.
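
One possibility that may cover this case is a fixed-size char array initialized from a string literal and then sliced. The sketch below assumes a reasonably current compiler, and the function name demoStackBuffer is hypothetical. The declared length is written to match the literal's length in UTF-8 code units, which is the detail to watch once non-ASCII characters are involved:

```d
// Hypothetical function, for illustration only.
char[] demoStackBuffer()
{
    // Fixed-size arrays declared in a function live in its stack frame.
    // The length here equals the literal's length in UTF-8 code units.
    char[5] buf = "hello";
    char[2] accented = "\u00E9"; // U+00E9 takes two UTF-8 code units

    char[] x = buf[];            // mutable slice over the stack buffer
    x[0] = 'j';                  // writes through to buf

    assert(buf[] == "jello");
    assert(accented.length == 2);

    // Copy before returning: the slice itself must not outlive buf.
    return x.dup;
}
```

The slice x is only valid while buf is in scope, which is the same leak hazard mentioned earlier.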

I think a good solution would be to request the "m" prefix for 
literals, which would initialize them as "mutable":
x = m"some mutable string";

> CTFE can use either, but it has to choose one. This leads to odd effects:
>
> string foo(bool b) {
>     string c = ['a'];
>     string d = "a";
>     if (b)
>         return c ~ c;
>     else
>         return c ~ d;
> }
>
> char[] x = foo(true);   // ok
> char[] y = foo(false);  // rejected!
>
> This is really bizarre because at run time, there is no difference between
> foo(true) and foo(false). They both return a slice of something allocated on
> the heap. I think x = foo(true) should be rejected as well, it has an implicit
> cast from immutable to mutable.

Good point. For anybody reading, though, the actual code example
should be:
enum char[] x = foo(true);   // ok
enum char[] y = foo(false);  // rejected!

> I think the best way to clean up this mess would be to convert char[] array
> literals into string literals whenever possible. This would mean that string
> literals may occasionally be of *mutable* type! This would mean that whenever
> they are assigned to a mutable variable, an implicit .dup gets added (just as
> happens now with array literals). The trailing zero would not be duped.
> ie:
> A string literal of mutable type should behave the way a char[] array literal
> behaves now.
> A char[] array literal of immutable type should behave the way a string literal
> does now.

I think this would work with my "m" suggestion.

