Splitting up large dirty file
Jon Degenhardt
jond at noreply.com
Thu May 17 23:21:25 UTC 2018
On Thursday, 17 May 2018 at 20:08:09 UTC, Dennis wrote:
> On Wednesday, 16 May 2018 at 15:47:29 UTC, Jon Degenhardt wrote:
>> If you write it in the style of my earlier example and use
>> counters and if-tests it will work. byLine by itself won't try
>> to interpret the characters (won't auto-decode them), so it
>> won't trigger an exception if there are invalid utf-8
>> characters.
>
> When printing to stdout it seems to skip any validation, but
> writing to a file does give an exception:
>
> ```
> auto inputStream = (args.length < 2 || args[1] == "-") ?
> stdin : args[1].File;
> auto outputFile = new File("output.txt");
> foreach (line; inputStream.byLine(KeepTerminator.yes))
> outputFile.write(line);
> ```
> std.exception.ErrnoException at C:\D\dmd2\windows\bin\..\..\src\phobos\std\stdio.d(2877): (No error)
>
> According to the documentation, byLine can throw an
> UTFException so relying on the fact that it doesn't in some
> cases doesn't seem like a good idea.

Instead of:

    auto outputFile = new File("output.txt");

try:

    auto outputFile = File("output.txt", "w");

That works for me. The second argument ("w") opens the file for
writing. When I omit it, I also get an exception, because the
default open mode is read ("rb"):
* If the file does not exist: Cannot open file `output.txt' in mode
`rb' (No such file or directory)
* If the file does exist: (Bad file descriptor)
The second error presumably occurs at the write call, since the
open itself succeeds in read mode.
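
For reference, here is a minimal sketch of your snippet with only the open mode changed. The helper name `copyLines` is mine, purely for illustration, and I have only tried this shape of code informally, so treat it as a starting point rather than a verified fix:

```d
import std.stdio : File, KeepTerminator, stdin;

/// Hypothetical helper: copies `input` to the file at `outputPath`, line by line.
void copyLines(File input, string outputPath)
{
    // "w" opens the file for writing; without it the default mode is "rb".
    auto outputFile = File(outputPath, "w");
    foreach (line; input.byLine(KeepTerminator.yes))
        outputFile.write(line);
}

void main(string[] args)
{
    // Read from stdin when no file name (or "-") is given, as in the original snippet.
    auto inputStream = (args.length < 2 || args[1] == "-") ? stdin : File(args[1]);
    copyLines(inputStream, "output.txt");
}
```

Note that `File` is a struct, so it is constructed directly rather than with `new`.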

As an aside, I agree with one of your bigger-picture
observations: it would be preferable to have more control over
utf-8 error handling behavior at the application level.
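
One way to get some of that control today, as a rough sketch: validate each line explicitly with `std.utf.validate` and decide in your own code what to do with the bad ones. The helper name `copyValidLines` and the skip-and-count policy are my assumptions for illustration, not something from your program:

```d
import std.stdio : File, KeepTerminator, stderr, stdin;
import std.utf : UTFException, validate;

/// Hypothetical helper: writes only valid UTF-8 lines to `output`
/// and returns the number of lines that were skipped.
size_t copyValidLines(File input, File output)
{
    size_t skipped = 0;
    foreach (line; input.byLine(KeepTerminator.yes))
    {
        try
        {
            validate(line);      // throws UTFException on malformed UTF-8
            output.write(line);
        }
        catch (UTFException e)
        {
            ++skipped;           // policy choice: drop the line and count it
        }
    }
    return skipped;
}

void main(string[] args)
{
    auto input = (args.length < 2 || args[1] == "-") ? stdin : File(args[1]);
    auto skipped = copyValidLines(input, File("output.txt", "w"));
    stderr.writefln("skipped %s invalid line(s)", skipped);
}
```

Swapping the catch block for a log-and-continue or a hard stop is then an application-level choice.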