Improving D's support of code-pages

Regan Heath regan at netmail.co.nz
Mon Aug 20 07:37:57 PDT 2007


Kirk McDonald wrote:
> Leandro Lucarella wrote:
>> Kirk McDonald, on August 18 at 14:33 you wrote to me:
>>
>>> char[] decode(ubyte[] str, string encoding, string error="strict");
>>> wchar[] wdecode(ubyte[] str, string encoding, string error="strict");
>>> dchar[] ddecode(ubyte[] str, string encoding, string error="strict");
>>
>>
>> Why isn't error an enum instead of a string?
>>
> 
> Perhaps it would be useful to allow the user to define new 
> error-handlers somehow, and provide a callback for them. (Python allows 
> something like this.) This would allow you to, for instance, provide a 
> different replacement character than the one provided by "replace".

Not a bad idea.

I would like to suggest alternate function signatures:

//The error code for the callback
enum DecodeMode { ..no idea what goes here.. }

//The callback types
typedef char function(DecodeMode,char) DecodeCHandler;
typedef wchar function(DecodeMode,wchar) DecodeWHandler;
typedef dchar function(DecodeMode,dchar) DecodeDHandler;

//The decode functions
uint decode(byte[] str, char[] dst, string encoding, DecodeCHandler handler);
uint decode(byte[] str, wchar[] dst, string encoding, DecodeWHandler handler);
uint decode(byte[] str, dchar[] dst, string encoding, DecodeDHandler handler);

Technically the signedness of plain 'char' in C is implementation-defined, 
but it is signed on many common platforms, so byte[] is arguably a closer 
match than ubyte[].

I think you still want to use an enum to represent the cases the 
callback needs to handle (assuming there is more than one). The same 
handler function could then be used for both encode and decode.

I think you want to pass the destination buffers, allowing 
re-use/preallocation for efficiency.
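
To make the buffer-passing idea concrete, here is a minimal sketch in present-day D syntax (alias instead of typedef, size_t instead of uint). Everything beyond the names in the signatures above is an assumption for illustration: the enum member invalidByte, the function decodeAscii, and the handler questionMark are hypothetical, and the sketch handles 7-bit ASCII only, so the "encoding" parameter is left out. It returns the number of characters written to the caller's buffer.

```d
import std.stdio : writeln;

// Hypothetical member: the post leaves the contents of this enum open.
enum DecodeMode { invalidByte }

// Callback: receives the case and the offending value, returns a replacement.
alias DecodeCHandler = char function(DecodeMode, char);

// Decodes 7-bit ASCII from src into the caller-supplied dst.
// Returns the number of chars written; never allocates or resizes dst.
size_t decodeAscii(const(byte)[] src, char[] dst, DecodeCHandler handler)
{
    size_t n = 0;
    foreach (b; src)
    {
        if (n == dst.length)
            break; // destination is full; stop rather than reallocate
        // A negative byte has its high bit set, so it is not ASCII.
        dst[n++] = (b < 0) ? handler(DecodeMode.invalidByte, cast(char) b)
                           : cast(char) b;
    }
    return n;
}

// Example handler: substitutes '?' for anything that cannot be decoded.
char questionMark(DecodeMode, char) { return '?'; }

void main()
{
    byte[] input = [72, 105, -1, 33]; // "Hi", one invalid byte, "!"
    char[16] buffer;                  // preallocated, reusable across calls
    auto len = decodeAscii(input, buffer[], &questionMark);
    writeln(buffer[0 .. len]);        // prints "Hi?!"
}
```

Because the caller owns the buffer, the same storage can be handed to decodeAscii repeatedly, and only the slice buffer[0 .. len] is meaningful after each call.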

I think you either return the resulting length of the destination data, 
or perhaps pass "dst" as 'ref' and change the length internally*.  Not 
sure what you would return if you did that.

(* changing length should never cause deallocation of buffer)
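
The 'ref' variant could look roughly like the sketch below, again in present-day D and again ASCII-only. The names decodeAsciiRef, invalidByte, and questionMark are hypothetical. Returning the number of source bytes consumed is only one possible answer to the open question of what to return, not something settled in this thread. The slice is shrunk by re-slicing, which changes the view but leaves the caller's storage alone.

```d
import std.stdio : writeln;

// Hypothetical member: the post leaves the contents of this enum open.
enum DecodeMode { invalidByte }

alias DecodeCHandler = char function(DecodeMode, char);

// Decodes 7-bit ASCII into dst and shrinks dst to the decoded length.
// Returns the number of source bytes consumed (an assumption, see above).
size_t decodeAsciiRef(const(byte)[] src, ref char[] dst, DecodeCHandler handler)
{
    size_t n = 0;
    while (n < src.length && n < dst.length)
    {
        auto b = src[n];
        dst[n] = (b < 0) ? handler(DecodeMode.invalidByte, cast(char) b)
                         : cast(char) b;
        ++n;
    }
    dst = dst[0 .. n]; // shrink the view; the underlying storage is untouched
    return n;
}

// Example handler: substitutes '?' for anything that cannot be decoded.
char questionMark(DecodeMode, char) { return '?'; }

void main()
{
    auto storage = new char[16]; // owned by the caller, reusable
    char[] view = storage;       // the slice the decoder is allowed to shrink
    byte[] input = [79, 75, -128];
    auto consumed = decodeAsciiRef(input, view, &questionMark);
    writeln(view, " ", consumed, " ", view.ptr is storage.ptr);
}
```

Keeping a separate reference to the full storage is what lets the caller reset the view to the whole buffer before the next call.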

Regan


