Prevent opening binary/other garbage files

helxi brucewayneshit at gmail.com
Sat Sep 29 23:46:26 UTC 2018


On Saturday, 29 September 2018 at 16:01:18 UTC, Adam D. Ruppe 
wrote:
> On Saturday, 29 September 2018 at 15:52:30 UTC, helxi wrote:
>> I'm writing a utility that checks for specific keyword(s) 
>> found in the files in a given directory recursively. What's 
>> the best strategy to avoid opening a bin file or some sort of 
>> garbage dump? Check encoding of the given file?
>
> Simplest might be to read the first few bytes (a couple 
> hundred, probably) and if any of them are < 32 && != '\t' && != 
> '\r' && != '\n' && != 0, there's a good chance it is a binary 
> file.
>
> Text files are frequently going to have tabs and newlines, but 
> not so frequently other low bytes.
>
> If you do find a bunch of 0s, but not the other values, you 
> might have a UTF-16 file.
>
Thanks. Would you say 
https://dlang.org/library/std/encoding/get_bom.html is useful in 
this context?
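
For reference, here is a rough sketch of how the byte check quoted above 
could sit next to a BOM check. The function names (looksBinary, hasBom, 
readSample) and the 256-byte sample size are made up for illustration; 
only File.rawRead and std.encoding.getBOM come from Phobos. Treat it as 
a starting point rather than a tested solution.

```d
import std.encoding : BOM, getBOM;
import std.stdio : File;

/// Heuristic from the quoted reply: true if any byte is a control
/// character other than tab, CR, LF or NUL. NUL is tolerated on
/// purpose so that UTF-16 text is not flagged by this check alone.
bool looksBinary(const(ubyte)[] sample)
{
    foreach (b; sample)
    {
        if (b < 32 && b != '\t' && b != '\r' && b != '\n' && b != 0)
            return true;
    }
    return false;
}

/// True if the sample starts with a byte-order mark known to std.encoding.
bool hasBom(const(ubyte)[] sample)
{
    return getBOM(sample).schema != BOM.none;
}

/// Reads at most sampleSize bytes from the start of the file.
/// Assumes sampleSize > 0; the returned slice is shorter for small files.
const(ubyte)[] readSample(string path, size_t sampleSize = 256)
{
    auto buffer = new ubyte[sampleSize];
    return File(path, "rb").rawRead(buffer);
}
```

A caller could skip a file when looksBinary(readSample(path)) is true. 
A BOM is optional, and many UTF-8 files have none, so hasBom can confirm 
that a file is probably Unicode text but its absence says nothing either 
way. It seems most useful for the case above where the sample contains 
0s and you want to know whether it is UTF-16.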


More information about the Digitalmars-d-learn mailing list