Any library with string encoding/decoding support?

Adam D. Ruppe destructionator at gmail.com
Mon Jan 20 05:46:15 PST 2014


On Monday, 20 January 2014 at 08:33:09 UTC, ilya-stromberg wrote:
> Do you know any library with string encoding/decoding support?
> I need more encodings than `std.encoding` provides.

I wrote one that does a bit more decoding, but it has no encoding 
support at all. (I wrote it for my web scraper and email reader, 
so all I cared about was getting everything to UTF-8.)

https://github.com/adamdruppe/misc-stuff-including-D-programming-language-web-stuff/blob/master/characterencodings.d

auto s = convertToUtf8(your_raw_data, "current_encoding");


If you want something full-featured, GNU iconv isn't hard to use 
from D:


import core.stdc.errno;
import std.string : toStringz;

// drop this if iconv lives in your libc (e.g. glibc on Linux)
pragma(lib, "iconv");

extern(C) {
    alias void* iconv_t;
    iconv_t iconv_open(const char* tocode, const char* fromcode);
    int iconv_close(iconv_t cd);

    size_t iconv(iconv_t cd,
                 char** inbuf, size_t* inbytesleft,
                 char** outbuf, size_t* outbytesleft);
}

    auto i = iconv_open("UTF-8", toStringz("CP1252"));
    if(i == cast(void*) -1) throw new Exception("iconv open failed");
    scope(exit) iconv_close(i);

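    // get the input pointer and length ready
    // (assumes `content` is an array holding the raw input bytes)
    char* input = cast(char*) content.ptr;
    size_t inputLength = content.length;

    // allocate an output buffer with 4x the size of the input buffer;
    // keep it around as a slice and get a pointer to it for the lib
    auto startingOutputBuffer = new char[content.length * 4];
    char* outputBuffer = startingOutputBuffer.ptr;
    size_t outputLength = startingOutputBuffer.length;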

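    while(inputLength) {
        auto ret = iconv(i, &input, &inputLength,
                         &outputBuffer, &outputLength);
        if(ret == cast(size_t) -1) {
            // check errno: EILSEQ (84 on Linux) means the input has
            // a byte sequence that is invalid in the source charset.
            // bail out either way, or this loop would never end
            throw new Exception("iconv conversion failed");
        }
    }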

    // outputLength now holds the number of bytes remaining in the
    // output buffer, so the converted size is the original buffer
    // size minus the remaining buffer size
    outputLength = (content.length * 4) - outputLength;

    // then slice it to get the result (.idup because a char[]
    // does not implicitly convert to string)
    string convertedContent =
        startingOutputBuffer[0 .. outputLength].idup;
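
Putting the pieces together, here is a minimal sketch of the same 
conversion wrapped in a function. The name iconvToUtf8 and the 
sample bytes are made up for illustration, and it assumes the iconv 
symbols are exported under their plain names (as glibc does); with 
a separate libiconv you may need to add -L-liconv when linking.

```d
import core.stdc.errno : errno, EILSEQ;
import std.stdio : writeln;
import std.string : toStringz;

extern(C) {
    alias iconv_t = void*;
    iconv_t iconv_open(const char* tocode, const char* fromcode);
    size_t iconv(iconv_t cd,
                 char** inbuf, size_t* inbytesleft,
                 char** outbuf, size_t* outbytesleft);
    int iconv_close(iconv_t cd);
}

// hypothetical helper, not part of any library
string iconvToUtf8(const(ubyte)[] content, string fromEncoding) {
    auto cd = iconv_open("UTF-8", toStringz(fromEncoding));
    if(cd == cast(iconv_t) -1)
        throw new Exception("iconv open failed");
    scope(exit) iconv_close(cd);

    // iconv takes non-const pointers but does not write to the input
    char* input = cast(char*) content.ptr;
    size_t inputLength = content.length;

    // 4x the input size, same sizing as in the snippet above
    auto buffer = new char[content.length * 4];
    char* output = buffer.ptr;
    size_t outputLength = buffer.length;

    while(inputLength) {
        auto ret = iconv(cd, &input, &inputLength,
                         &output, &outputLength);
        if(ret == cast(size_t) -1)
            throw new Exception(errno == EILSEQ
                ? "input is not valid in " ~ fromEncoding
                : "iconv conversion failed");
    }

    // bytes written = buffer size minus bytes still free
    return buffer[0 .. buffer.length - outputLength].idup;
}

void main() {
    // "caf" followed by 0xE9, which is 'é' in CP1252
    immutable(ubyte)[] raw = [0x63, 0x61, 0x66, 0xE9];
    writeln(iconvToUtf8(raw, "CP1252"));
}
```

Throwing on any error keeps the loop from spinning forever on bad 
input; a real program might want to skip or replace the offending 
bytes instead.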




Note that the GNU libiconv library is, I think, LGPL licensed (the 
iconv command-line program is GPL).

