[Issue 8660] New: Unclear semantics of array literals of char type, vs string literals
monarch_dodra
monarchdodra at gmail.com
Fri Sep 14 05:50:57 PDT 2012
On Friday, 14 September 2012 at 11:28:04 UTC, Don wrote:
> --- Comment #0 from Don <clugdbug at yahoo.com.au> 2012-09-14
> 04:28:17 PDT ---
> Array literals of char type have completely different semantics from
> string literals. In module scope:
>
> char[] x = ['a']; // OK -- array literals can have an implicit .dup
> char[] y = "b"; // illegal
>
> A second difference is that string literals have a trailing \0. It's
> important for compatibility with C, but is barely mentioned in the spec.
> The spec does not state if the trailing \0 is still present after
> operations like concatenation.
I think this is actually the normal behavior. When you write
"char[] x = ['a'];", you are not actually "newing" (or "dup"-ing)
any data. You are just letting x point to a stack-allocated array
of chars. So the assignment is legal (though somewhat unsafe
if you ever leak x).
You can't bind y to an array of immutable chars, however, as that
would subvert the type system.
This, on the other hand, is legal:
char[] y = "b".dup;
I do not know how to initialize a char[] on the stack, though
(apart from writing ['h', 'e', 'l', ... ]). If UTF-8 also gets
involved, then I don't know of any workaround.
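For readers following along, here is a minimal sketch of two ways to get mutable chars from a literal. It is not from the original post: the helper name mutableCopy is hypothetical, and the fixed-size-array initialization is assumed to behave this way with a current compiler. The required array length counts UTF-8 code units rather than characters, so the same approach presumably extends to non-ASCII text.

```d
// Hypothetical helper for illustration only (not from the original post).
char[] mutableCopy(string s)
{
    return s.dup; // allocates a fresh heap copy typed char[]
}

void main()
{
    // A string literal can initialize a fixed-size char array when the
    // array length matches the literal's length in UTF-8 code units.
    char[5] buf = "hello";
    char[] x = buf[];      // mutable slice of the local array's storage
    x[0] = 'j';
    assert(buf[] == "jello");

    // Non-ASCII text takes more than one code unit per character.
    static assert("é".length == 2);

    // The .dup route: the copy is mutable, the literal stays untouched.
    char[] y = mutableCopy("b");
    y[0] = 'c';
    assert(y == "c");
}
```

The slice x must not outlive buf, which is the same leak hazard mentioned above.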
I think a good solution would be to request the "m" prefix for
literals, which would initialize them as "mutable":
x = m"some mutable string";
> CTFE can use either, but it has to choose one. This leads to
> odd effects:
>
> string foo(bool b) {
>     string c = ['a'];
>     string d = "a";
>     if (b)
>         return c ~ c;
>     else
>         return c ~ d;
> }
>
> char[] x = foo(true); // ok
> char[] y = foo(false); // rejected!
>
> This is really bizarre because at run time, there is no difference
> between foo(true) and foo(false). They both return a slice of something
> allocated on the heap. I think x = foo(true) should be rejected as well,
> since it has an implicit cast from immutable to mutable.
Good point. For anybody reading, though, the actual code example
should be:
enum char[] x = foo(true); // ok
enum char[] y = foo(false); // rejected!
> I think the best way to clean up this mess would be to convert char[]
> array literals into string literals whenever possible. This would mean
> that string literals may occasionally be of *mutable* type! This would
> mean that whenever they are assigned to a mutable variable, an implicit
> .dup gets added (just as happens now with array literals). The trailing
> zero would not be duped.
> ie:
> A string literal of mutable type should behave the way a char[] array
> literal behaves now.
> A char[] array literal of immutable type should behave the way a string
> literal does now.
I think this would work with my "m" suggestion.