Parsing D files with non-unicode characters

Arun Chandrasekaran aruncxy at gmail.com
Wed Nov 7 05:33:20 UTC 2018


On Tuesday, 6 November 2018 at 17:21:13 UTC, Jonathan Marler 
wrote:
> On Monday, 5 November 2018 at 23:50:46 UTC, Arun Chandrasekaran 
> wrote:
>> I'm converting a large amount of header files from C to D 
>> using DStep and I'm stuck at 
>> https://github.com/jacob-carlborg/dstep/issues/215
>>
>> https://dlang.org/spec/intro.html shows that ASCII and UTF 
>> char formats are accepted. How do I go about converting a 
>> large code base like this?
>>
>> Is this a bug in D to reject non-unicode chars in comments?
>>
>> Arun
>
> So you have code that has characters that are neither ascii nor 
> unicode?  What encoding is it using?  And what characters does 
> it contain that can't be represented with unicode?

I was not able to identify the character encoding at first: 
file -i reported unknown-8bit. Ultimately 
https://github.com/BYVoid/uchardet helped me determine the 
charset; it was Shift-JIS.
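
For anyone hitting the same problem, here is a minimal sketch of the 
conversion step in Python, assuming the headers really are Shift-JIS 
and should be rewritten as UTF-8 before running DStep. The function 
names, the "*.h" pattern and the in-place overwrite are my own 
assumptions for illustration, not something DStep or uchardet 
provides.

```python
from pathlib import Path


def reencode_to_utf8(raw: bytes, source_encoding: str = "shift_jis") -> bytes:
    """Decode bytes from source_encoding and re-encode them as UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


def convert_tree(root: str, pattern: str = "*.h",
                 source_encoding: str = "shift_jis") -> int:
    """Rewrite matching files under root in place as UTF-8.

    Files that already decode as UTF-8 (plain ASCII included) are skipped.
    Returns the number of files that were rewritten.
    """
    converted = 0
    for path in Path(root).rglob(pattern):
        raw = path.read_bytes()
        try:
            raw.decode("utf-8")
            continue  # already valid UTF-8, leave it untouched
        except UnicodeDecodeError:
            pass
        # Raises UnicodeDecodeError if the file is not valid source_encoding.
        path.write_bytes(reencode_to_utf8(raw, source_encoding))
        converted += 1
    return converted
```

Calling convert_tree("path/to/headers") on a copy of the tree would be 
the cautious way to try it, since the files are overwritten. The 
"already valid UTF-8" check is only a heuristic, so it is worth 
diffing the result before committing it.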

