Need a Faster Compressor

Rainer Schuetze via Digitalmars-d digitalmars-d at puremagic.com
Sun May 22 10:44:22 PDT 2016



On 22.05.2016 00:58, Walter Bright wrote:
> On 5/21/2016 3:41 PM, Guillaume Boucher wrote:
>> Sorry if I didn't memorize everything in this forum from the last 20
>> years, can
>> you give a link to some reasoning?
>
> DMC++ matches the Microsoft name mangling scheme, which includes such
> compression. It proved hopelessly inadequate, which is why I implemented
> compress.c in the first place (it was for the DMC++ compiler).
>
> (And frankly, I don't see how an ad-hoc scheme like that could hope to
> compare with a real compression algorithm.)

You are right about the symbols using the VC mangling. The test case 
"1.s.s.s.s.s" in the bug report, translated to C++, yields

?foo@Result@?1???$s@UResult@?1???$s@UResult@?1???$s@UResult@?1???$s@UResult@?1???$s@H@testexpansion@@YA@H@Z@@testexpansion@@YA@U1?1???$s@H@2@YA@H@Z@@Z@@testexpansion@@YA@U1?1???$s@UResult@?1???$s@H@testexpansion@@YA@H@Z@@2@YA@U1?1???$s@H@2@YA@H@Z@@Z@@Z@@testexpansion@@YA@U1?1???$s@UResult@?1???$s@UResult@?1???$s@H@testexpansion@@YA@H@Z@@testexpansion@@YA@U1?1???$s@H@2@YA@H@Z@@Z@@2@YA@U1?1???$s@UResult@?1???$s@H@testexpansion@@YA@H@Z@@2@YA@U1?1???$s@H@2@YA@H@Z@@Z@@Z@@Z@@testexpansion@@YA@U1?1???$s@UResult@?1???$s@UResult@?1???$s@UResult@?1???$s@H@testexpansion@@YA@H@Z@@testexpansion@@YA@U1?1???$s@H@2@YA@H@Z@@Z@@testexpansion@@YA@U1?1???$s@UResult@?1???$s@H@testexpansion@@YA@H@Z@@2@YA@U1?1???$s@H@2@YA@H@Z@@Z@@Z@@2@YA@U1?1???$s@UResult@?1???$s@UResult@?1???$s@H@testexpansion@@YA@H@Z@@testexpansion@@YA@U1?1???$s@H@2@YA@H@Z@@Z@@2@YA@U1?1???$s@UResult@?1???$s@H@testexpansion@@YA@H@Z@@2@YA@U1?1???$s@H@2@YA@H@Z@@Z@@Z@@Z@@Z@QAEXXZ

i.e. 936 characters. I think this is due to the severe limitation that 
only the first 10 types can be back referenced.

The gcc mangling allows arbitrary back references and is closer to what 
some have proposed. It yields

_ZZN13testexpansion1sIZNS0_IZNS0_IZNS0_IZNS0_IiEEDaT_E6ResultEES1_S2_E6ResultEES1_S2_E6ResultEES1_S2_E6ResultEES1_S2_EN6Result3fooEv

which is only 39 characters longer than the compressed version of the D 
symbol. It uses a much smaller character set, though.
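
To illustrate why the size of the back reference table matters, here is a minimal sketch of a toy mangler. It is neither the VC nor the gcc scheme: the encoding ('R' ... 'E' around two copies of the inner type, "S<n>_" for a back reference, loosely inspired by the gcc substitution syntax) and the names emit and mangled_length are made up for this example. The only assumption is that each nesting level mentions the previous level twice, as the nested Result types above do.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Toy model only: level 0 is encoded as "i"; level n is encoded as
// 'R' + level(n-1) + level(n-1) + 'E'.
// A level that is already in the table is emitted as the back
// reference "S<index>_" instead of being spelled out again.
// `cap` is the maximum number of table entries (hypothetical knob).
void emit(int level, std::vector<int>& table, std::size_t cap,
          std::string& out)
{
    auto it = std::find(table.begin(), table.end(), level);
    if (it != table.end()) {
        // Already seen and remembered: emit a short back reference.
        out += "S" + std::to_string(it - table.begin()) + "_";
        return;
    }
    if (level == 0) {
        // The builtin type is a single character and never registered.
        out += "i";
        return;
    }
    out += "R";
    emit(level - 1, table, cap, out);  // first mention, spelled out
    emit(level - 1, table, cap, out);  // second mention, maybe a reference
    out += "E";
    // Register this level only while the table still has room.
    if (table.size() < cap)
        table.push_back(level);
}

// Length of the toy mangled name for `depth` nesting levels.
std::size_t mangled_length(int depth, std::size_t cap)
{
    std::vector<int> table;
    std::string out;
    emit(depth, table, cap, out);
    return out.size();
}
```

In this toy model, mangled_length(5, 0) is 94 (no back references, the length roughly doubles per level), mangled_length(5, 2) is 44 (the table fills up after two entries), and mangled_length(5, 100) is 24 (every level can be referenced, so the length grows linearly).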

This is my translation of the test case to C++14:

namespace testexpansion {

template<class T>
struct S
{
     void foo(){}
};

template<class T>
auto s(T t)
{
#ifdef bad
     struct Result
     {
          void foo(){}
     };
     return Result();
#else
     return S<T>();
#endif
}

void xmain()
{
     auto x = s(s(s(s(s(1)))));
     x.foo();
}

}

