Is implicit string literal concatenation a good thing?

Denis Koroskin 2korden at gmail.com
Thu Feb 26 10:12:09 PST 2009


On Thu, 26 Feb 2009 20:59:34 +0300, Sergey Gromov <snake.scaly at gmail.com> wrote:

> Mon, 23 Feb 2009 03:48:17 +0000 (UTC), BCS wrote:
>
>> Hello bearophile,
>>
>>> If there are guarantees that "abc" "def" are folded at compile time,
>>> then the same guarantees can be specified for "abc" ~ "def". I can't
>>> see a problem.
>>
>> While it is not part of the spec, I do see a problem. If it were added....
>>
>>>
>>> I have also compiled this code with DMD:
>>>
>>> void main() {
>>> string foo = "foo";
>>> string bar = foo ~ "bar" ~ "baz";
>>> }
>>> Resulting asm, no optimizations:
>>>
>>> L0:		push	EBP
>>> mov	EBP,ESP
>>> mov	EDX,FLAT:_DATA[0Ch]
>>> mov	EAX,FLAT:_DATA[08h]
>>> push	dword ptr FLAT:_DATA[01Ch]
>>> push	dword ptr FLAT:_DATA[018h]
>>> push	dword ptr FLAT:_DATA[02Ch]
>>> push	dword ptr FLAT:_DATA[028h]
>>
>> note 6 things
>>
>>> push	EDX
>>> push	EAX
>>> push	3
>>> mov	ECX,offset FLAT:_D11TypeInfo_Aa6__initZ
>>> push	ECX
>>> call	near ptr __d_arraycatnT
>>> xor	EAX,EAX
>>> add	ESP,020h
>>> pop	EBP
>>> ret
>>> Resulting asm, with optimizations:
>>>
>>> L0:		sub	ESP,0Ch
>>> mov	EAX,offset FLAT:_D11TypeInfo_Aa6__initZ
>>> push	dword ptr FLAT:_DATA[01Ch]
>>> push	dword ptr FLAT:_DATA[018h]
>>> push	dword ptr FLAT:_DATA[02Ch]
>>> push	dword ptr FLAT:_DATA[028h]
>>> push	dword ptr FLAT:_DATA[0Ch]
>>> push	dword ptr FLAT:_DATA[08h]
>>
>> again 6 things
>>
>>> push	3
>>
>> I think that is a varargs call
>>
>>> push	EAX
>>> call	near ptr __d_arraycatnT
>>> add	ESP,020h
>>> add	ESP,0Ch
>>> xor	EAX,EAX
>>> ret
>>> I can see just one arraycatn, so the two string literals are folded at
>>> compile time, I think.
>>>
>>> Bye,
>>> bearophile
>>
>> I think that DMD does some optimization for a~b~c etc. so that there is
>> only one call for any number of chained ~ (array cat n). In this case I
>> think it is doing that.
>
> Sure enough, if you look into the compiled .obj you won't find
> "barbaz" there.  All sub-strings are separate, regardless of the
> optimization options.

Here is a test:

import std.stdio;

void main()
{
    string t1 = "bar1" ~ "baz1";
    string t2 = t1 ~ "bar2" ~ "baz2";
    string t3 = t1 ~ ("bar3" ~ "baz3");
    
    writefln(t1);
    writefln(t2);
    writefln(t3);
}

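The compiled test executable contains the strings bar1baz1 and bar3baz3, so the literal pairs in t1 and t3 were folded at compile time. Since ~ is left-associative, t1 ~ "bar2" ~ "baz2" parses as (t1 ~ "bar2") ~ "baz2", which leaves no pair of adjacent literals to fold.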

It is worth noting that declaring t1, t2 and t3 as const (i.e. "const string t1" etc.) makes all of the concatenations happen entirely at compile time.
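
A minimal sketch of how the compile-time case can be checked from within the source itself, assuming a D2 compiler. Note that it swaps the const declarations mentioned above for enum manifest constants and adds static assert checks; both are my assumptions for illustration, not something tested in this thread.

```d
import std.stdio;

// Assumption: enum (manifest constant) is used here instead of const.
// An enum initializer must be evaluable at compile time.
enum string t1 = "bar1" ~ "baz1";
enum string t2 = t1 ~ "bar2" ~ "baz2";   // parsed as (t1 ~ "bar2") ~ "baz2"
enum string t3 = t1 ~ ("bar3" ~ "baz3");

// static assert is evaluated by the compiler, so compilation fails
// unless every concatenation above was resolved at compile time.
static assert(t1 == "bar1baz1");
static assert(t2 == "bar1baz1bar2baz2");
static assert(t3 == "bar1baz1bar3baz3");

void main()
{
    writeln(t1);
    writeln(t2);
    writeln(t3);
}
```

If this compiles, the values were known to the compiler; it says nothing about what DMD does for plain (non-const) string variables, which is what the test above looks at.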



