Need a way to get compressed mangling of a symbol.

Adam D. Ruppe destructionator at gmail.com
Tue Jul 16 06:58:08 PDT 2013


I'm looking at the dmd source now...

The compression is done in the backend, file cgobj.c

The conditions are:


#define LIBIDMAX 128
     if (len > LIBIDMAX)
     {
         // Attempt to compress the name
         name2 = id_compress(name, len);
         // snip
         if (len2 > LIBIDMAX)            // still too long
         {
             /* Form md5 digest of the name and store it in the
              * last 32 bytes of the name.
              */

// snip impl, open the source to see specific details
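
To make the three cases above concrete, here is a rough sketch of the control flow only. It is not dmd's code. shorten_name, squeeze_runs and placeholder_digest_hex are names I made up; the digest is a stand-in and NOT MD5; and keeping a prefix of the compressed name while hashing the original name are both guesses that would need checking against dmd's output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define LIBIDMAX     128  /* limit quoted in the post */
#define DIGEST_CHARS 32   /* "last 32 bytes", per the quoted comment */

/* Hypothetical compressor type: writes at most len bytes, returns the count. */
typedef size_t (*compress_fn)(const char *in, size_t len, char *out);

/* Lossy stand-in compressor (collapses runs of one char). Only here so the
 * flow can be exercised; it has nothing to do with id_compress. */
static size_t squeeze_runs(const char *in, size_t len, char *out)
{
    size_t o = 0;
    for (size_t i = 0; i < len; i++)
        if (i == 0 || in[i] != in[i - 1])
            out[o++] = in[i];
    return o;
}

/* Stand-in digest: NOT MD5. Fills DIGEST_CHARS hex characters. */
static void placeholder_digest_hex(const char *s, size_t n, char *out)
{
    unsigned long h = 2166136261UL;
    for (size_t i = 0; i < DIGEST_CHARS; i++) {
        for (size_t j = 0; j < n; j++)
            h = ((h ^ (unsigned char)s[j]) * 16777619UL) & 0xFFFFFFFFUL;
        out[i] = "0123456789abcdef"[h & 0xF];
    }
}

/* out must hold LIBIDMAX bytes; returns the length written. */
static size_t shorten_name(const char *name, size_t len,
                           compress_fn compress, char *out)
{
    if (len <= LIBIDMAX) {              /* short enough: use as-is */
        memcpy(out, name, len);
        return len;
    }
    char *tmp = malloc(len);
    assert(tmp != NULL);
    size_t len2 = compress(name, len, tmp);   /* attempt to compress */
    assert(len2 <= len);
    if (len2 > LIBIDMAX) {              /* still too long */
        /* Assumption: keep a prefix, put the digest in the last 32 bytes. */
        placeholder_digest_hex(name, len, tmp + LIBIDMAX - DIGEST_CHARS);
        len2 = LIBIDMAX;
    }
    memcpy(out, tmp, len2);
    free(tmp);
    return len2;
}
```

Whatever the real details are, the result never exceeds LIBIDMAX bytes, which seems to be the point of the whole exercise.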




/******************************************
  * Compress an identifier.
  * Format: if ASCII, then it's just the char
  *      if high bit set, then it's a length/offset pair
  * Returns:
  *      malloc'd compressed identifier
  */

char *id_compress(char *id, int idlen)
{



The implementation, in the same source file, looks like it 
compresses by finding the longest substring that already appeared 
earlier in the identifier and replacing the repeat with a 
length/offset pair that points back at the first occurrence.
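
To illustrate the general idea (not dmd's actual byte format, which I have not reproduced), here is a toy back-reference compressor. The encoding is invented for this example: a byte with the high bit set carries a length in its low 7 bits and is followed by one offset byte. toy_compress, toy_decompress and the three limits are all hypothetical names and values.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MIN_MATCH  3    /* shorter repeats are not worth a 2-byte pair */
#define MAX_MATCH  127  /* length must fit in 7 bits */
#define MAX_OFFSET 255  /* offset must fit in one byte */

/* Compress an ASCII identifier. out needs len bytes (output never grows).
 * Returns the compressed length. */
static size_t toy_compress(const char *in, size_t len, unsigned char *out)
{
    size_t i = 0, o = 0;
    while (i < len) {
        size_t best_len = 0, best_off = 0;
        size_t max_off = i < MAX_OFFSET ? i : MAX_OFFSET;
        /* Try every earlier start position, keep the longest match. */
        for (size_t off = 1; off <= max_off; off++) {
            size_t n = 0;
            while (i + n < len && n < MAX_MATCH &&
                   in[i + n - off] == in[i + n])
                n++;
            if (n > best_len) { best_len = n; best_off = off; }
        }
        if (best_len >= MIN_MATCH) {
            out[o++] = (unsigned char)(0x80 | best_len); /* high bit = pair */
            out[o++] = (unsigned char)best_off;
            i += best_len;
        } else {
            assert((unsigned char)in[i] < 0x80);         /* plain ASCII */
            out[o++] = (unsigned char)in[i++];
        }
    }
    return o;
}

/* Inverse of toy_compress. out must be large enough for the original. */
static size_t toy_decompress(const unsigned char *in, size_t len, char *out)
{
    size_t i = 0, o = 0;
    while (i < len) {
        if (in[i] & 0x80) {
            size_t n = in[i] & 0x7F, off = in[i + 1];
            /* Byte-by-byte copy so overlapping references work. */
            for (size_t k = 0; k < n; k++, o++)
                out[o] = out[o - off];
            i += 2;
        } else {
            out[o++] = (char)in[i++];
        }
    }
    return o;
}
```

Mangled D names repeat package and module prefixes a lot, so something like "_D3foo3bar3foo3bar3foo3barZ" shrinks well under a scheme of this kind. A real clean-room version would still have to match dmd's encoding byte for byte.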



The reason I snipped the implementations here is that the backend 
is under a more restrictive license, so I don't want to get into 
copying it. But what I've said here, combined with guess-and-check 
against dmd's output, might be enough to do a clean-room 
implementation.



Or, if Walter can give us permission to copy/paste this into a D 
file, we could use it directly.


More information about the Digitalmars-d-learn mailing list