Parsing D files with non-unicode characters

Jonathan M Davis newsgroup.d at jmdavisprog.com
Tue Nov 6 01:41:26 UTC 2018


On Monday, November 5, 2018 6:19:17 PM MST Roland Hadinger via Digitalmars-d 
wrote:
> On Tuesday, 6 November 2018 at 00:48:34 UTC, Arun Chandrasekaran wrote:
> > Thanks! Can't we preserve the comments? Comments are
> > invaluable, especially on the header files. We generate
> > documentation using doxygen.
>
> If by 'preserve' you mean 'keep the non-UTF-8 encoding as-is',
> then no, what I suggested wouldn't work.

If I understand correctly, D source files have to be in a Unicode encoding
(the spec allows ASCII, UTF-8, UTF-16, and UTF-32), so keeping a non-UTF
encoding as-is is just plain not possible, period. The text has to be
converted to Unicode to be in a D source file at all, even if it only appears
in comments. If the bytes are valid characters in some other encoding, then
that encoding needs to be correctly detected and the text converted to
Unicode somehow. If the bytes are simply invalid, then arguably there isn't
anything to preserve anyway.
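
As a rough sketch of that detect-and-convert step (in Python, since it is
just a preprocessing pass over the files before any D tooling sees them):
try strict UTF-8 first, and only if that fails decode with a legacy
encoding. The function names and the Latin-1 fallback are assumptions made
for illustration, not anything from this thread. Real detection means
knowing, or guessing, which encoding the files were actually written in.

```python
from pathlib import Path


def to_utf8_text(raw: bytes, fallback: str = "latin-1") -> str:
    """Decode raw bytes to text.

    Bytes that are already valid UTF-8 are decoded as UTF-8. Anything else
    is decoded with `fallback`, which is an ASSUMED legacy encoding; pick
    the one your files really use.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8, so treat it as the assumed legacy encoding.
        return raw.decode(fallback)


def convert_file(src: str, dst: str, fallback: str = "latin-1") -> None:
    """Read `src` as bytes and write it to `dst` as UTF-8 (hypothetical helper)."""
    text = to_utf8_text(Path(src).read_bytes(), fallback)
    Path(dst).write_bytes(text.encode("utf-8"))
```

Latin-1 maps every byte to some character, so the fallback never fails, but
it will silently produce the wrong characters if the file was really in a
different encoding. Comments survive the conversion; only their byte
representation changes.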

- Jonathan M Davis