Splitting up large dirty file

Jon Degenhardt jond at noreply.com
Thu May 17 23:21:25 UTC 2018


On Thursday, 17 May 2018 at 20:08:09 UTC, Dennis wrote:
> On Wednesday, 16 May 2018 at 15:47:29 UTC, Jon Degenhardt wrote:
>> If you write it in the style of my earlier example and use 
>> counters and if-tests it will work. byLine by itself won't try 
>> to interpret the characters (won't auto-decode them), so it 
>> won't trigger an exception if there are invalid utf-8 
>> characters.
>
> When printing to stdout it seems to skip any validation, but 
> writing to a file does give an exception:
>
> ```
>     auto inputStream = (args.length < 2 || args[1] == "-") ? stdin : args[1].File;
>     auto outputFile = new File("output.txt");
>     foreach (line; inputStream.byLine(KeepTerminator.yes))
>         outputFile.write(line);
> ```
> std.exception.ErrnoException@C:\D\dmd2\windows\bin\..\..\src\phobos\std\stdio.d(2877):  (No error)
>
> According to the documentation, byLine can throw a 
> UTFException, so relying on the fact that it doesn't in some 
> cases doesn't seem like a good idea.

Instead of:

      auto outputFile = new File("output.txt");

try:

     auto outputFile = File("output.txt", "w");

That works for me. The second argument ("w") opens the file for 
writing. When I omit it, I also get an exception, because the 
default open mode is read-only ("rb"):

  * If the file does not exist: Cannot open file `output.txt' in mode `rb' (No such file or directory)
  * If the file does exist: (Bad file descriptor)

The second error presumably occurs on the first write, since the 
open itself succeeds in that case.
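
For reference, here is a minimal sketch of the whole copy loop with the output file opened for writing. The helper name `copyLines` is my own, introduced only for illustration; the rest follows the quoted snippet. It assumes that byLine passes the raw bytes through without decoding, as described earlier in this thread.

```d
import std.stdio;

// Hypothetical helper: copy every line of `input` to `output` unchanged.
// byLine yields char[] slices and, by itself, does not decode them.
void copyLines(File input, File output)
{
    foreach (line; input.byLine(KeepTerminator.yes))
        output.write(line);
}

void main(string[] args)
{
    // Read from stdin when no path (or "-") is given, as in the quoted code.
    auto inputStream = (args.length < 2 || args[1] == "-") ? stdin : File(args[1]);

    // "w" opens the file for writing; without it the mode defaults to "rb".
    auto outputFile = File("output.txt", "w");

    copyLines(inputStream, outputFile);
}
```

Note that File is a struct, so it is constructed with `File(...)` rather than `new File(...)`.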

As an aside - I agree with one of your bigger picture 
observations: It would be preferable to have more control over 
utf-8 error handling behavior at the application level.
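
One way to get some of that control today is to validate each line explicitly and decide what to do with the bad ones. The following is only a sketch under my own assumptions: the helper `isValidUtf8`, the output names `clean.txt` and `dirty.txt`, and the policy of diverting invalid lines to a separate file are all hypothetical. It relies on std.utf.validate, which throws a UTFException on malformed input.

```d
import std.stdio;
import std.utf : validate, UTFException;

// Hypothetical helper: true when `line` is well-formed UTF-8.
// std.utf.validate throws UTFException on the first invalid sequence.
bool isValidUtf8(const(char)[] line)
{
    try
    {
        validate(line);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

void main(string[] args)
{
    auto inputStream = (args.length < 2 || args[1] == "-") ? stdin : File(args[1]);

    // Hypothetical output names, chosen only for this example.
    auto cleanFile = File("clean.txt", "w");
    auto dirtyFile = File("dirty.txt", "w");

    // Route each line by validity instead of letting an exception escape.
    foreach (line; inputStream.byLine(KeepTerminator.yes))
    {
        if (isValidUtf8(line))
            cleanFile.write(line);
        else
            dirtyFile.write(line);
    }
}
```

The application, not the library, then chooses whether invalid lines are dropped, logged, or written elsewhere.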

