Parsing a UTF-16LE file line by line, BUG?
    Nestor via Digitalmars-d-learn 
    digitalmars-d-learn at puremagic.com
       
    Sat Jan 28 07:40:24 PST 2017
    
    
  
On Friday, 27 January 2017 at 04:26:31 UTC, Era Scarecrow wrote:
>  Skipping the BOM is just a matter of skipping the first two 
> bytes identifying it...
AFAIK the BOM is not always two bytes: it is 3 bytes for UTF-8 and 
4 bytes for UTF-32, so when the input encoding is unknown one must 
perform some kind of detection before skipping it and applying the 
correct transcoding. I thought by now dmd had this functionality 
built in and exposed, since the compiler itself seems to do it for 
source code units.
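
To illustrate the kind of detection I mean, here is a minimal sketch. 
The names Encoding, BomInfo and detectBom are made up for this 
example, they are not an existing Phobos API. It only looks at the 
leading bytes and reports which BOM (if any) is present and how many 
bytes to skip. It assumes the usual Unicode BOM byte sequences and 
makes no attempt to guess the encoding when there is no BOM.

```d
// Hypothetical names for illustration only, not part of Phobos.
enum Encoding { unknown, utf8, utf16le, utf16be, utf32le, utf32be }

struct BomInfo
{
    Encoding encoding; // encoding indicated by the BOM, if any
    size_t length;     // number of BOM bytes to skip
}

BomInfo detectBom(const(ubyte)[] data)
{
    // True if `data` begins with the byte sequence `sig`.
    bool has(const(ubyte)[] sig)
    {
        return data.length >= sig.length && data[0 .. sig.length] == sig;
    }

    // The 4-byte UTF-32LE mark (FF FE 00 00) begins with the 2-byte
    // UTF-16LE mark (FF FE), so the longer signatures are tested first.
    if (has([0xFF, 0xFE, 0x00, 0x00])) return BomInfo(Encoding.utf32le, 4);
    if (has([0x00, 0x00, 0xFE, 0xFF])) return BomInfo(Encoding.utf32be, 4);
    if (has([0xEF, 0xBB, 0xBF]))       return BomInfo(Encoding.utf8, 3);
    if (has([0xFF, 0xFE]))             return BomInfo(Encoding.utf16le, 2);
    if (has([0xFE, 0xFF]))             return BomInfo(Encoding.utf16be, 2);

    // No BOM found: the caller has to fall back to a default or a heuristic.
    return BomInfo(Encoding.unknown, 0);
}
```

One would read the first four bytes of the file, call detectBom on 
them, skip the reported number of bytes and then pick the matching 
transcoding. Note the ambiguity: a UTF-16LE file whose first 
character after the BOM is U+0000 also starts with FF FE 00 00 and 
would be reported as UTF-32LE here. std.encoding may already offer 
something along these lines (I believe there is a getBOM function), 
so it is worth checking the documentation before rolling your own.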
    
    