What's the simplest way to read a file token by token?

Sun Aug 11 06:48:16 PDT 2013

Thank you so much, that's exactly the kind of reply I was seeking!

On Sunday, 11 August 2013 at 00:13:10 UTC, Jonathan M Davis wrote:
> On Saturday, August 10, 2013 19:34:20 Carl Sturtivant wrote:
>> On Saturday, 10 August 2013 at 17:09:29 UTC, Carl Sturtivant wrote:
>> > What's the simplest way in D to read a file token by token,
>> > where the read tokens are D strings, and they are separated in
>> > the file by arbitrary non-zero amounts of white space
>> > (including spaces, tabs and newlines at a minimum)?
>> 
>> I couldn't find a function that did just this, and various ways I
>> implemented it seemed too complex. Is there such a function in a
>> D library?
>
> If you have a string (or any range of dchar) already, you can use
> std.algorithm.splitter:
>
> import std.algorithm;
>
> void main()
> {
>     auto str = "hello world    goodbye charlie.";
>     assert(equal(splitter(str),
>            ["hello", "world", "goodbye", "charlie."]));
> }
>
> However, reading from a file is quite a bit more problematic, as we
> don't have proper stream stuff yet (we're still waiting on std.io to be
> finished so that we can have that). And that means that what we have
> for reading files is a lot less flexible. In general, you're probably
> going to be reading it in line by line with std.stdio.File.byLine, in
> chunks of bytes via std.stdio.File.byChunk, or all at once with
> std.file.readText.
>
> Something that does what you want could certainly be built on top of
> either byLine or byChunk without a lot of effort, but it obviously
> doesn't work right out of the box. readText will work great (since you
> can just use splitter on its result), but it does mean reading the
> entire file in at once. Still, in most cases, that's what I'd do. It's
> only going to be a problem if the file is going to be particularly
> large, and since splitter is just slicing the string that you give it
> (rather than copying it), you shouldn't end up with the file in memory
> more than once.
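
For reference, a minimal sketch of the readText-plus-splitter approach described in the paragraph above. The helper name readTokens is hypothetical (it is not a Phobos function), and the sketch assumes the file is valid UTF-8 and small enough to hold in memory:

```d
import std.algorithm : splitter;
import std.array : array;
import std.file : readText;

// Hypothetical helper, not part of Phobos: returns every
// whitespace-separated token in the file at `path`.
string[] readTokens(string path)
{
    // readText loads the whole file and validates it as UTF-8.
    string text = readText(path);

    // splitter with no separator argument splits on runs of
    // whitespace, so no empty tokens are produced. Each token is a
    // slice of `text`, not a copy.
    return text.splitter.array;
}
```

Calling .array only builds the array of slices; the character data itself stays in the single buffer returned by readText.
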
>
> At some point, we will have full, range-compatible stream support in
> Phobos, and the situation will definitely improve, but for now, those
> are probably your best options.
>
> - Jonathan M Davis
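
For files too large to read in one go, the byLine route mentioned in the quoted reply might look roughly like the sketch below. The name byToken is hypothetical (it is not a Phobos function), and this is only one way to build such a range, not necessarily what was meant above. It assumes tokens never span a line break, which holds here because newlines count as white space:

```d
import std.algorithm : joiner, map, splitter;
import std.stdio : File;

// Hypothetical helper, not part of Phobos: a lazy range of tokens
// that holds only one line of the file in memory at a time.
auto byToken(File file)
{
    return file.byLine()                    // reuses one line buffer
               .map!(line => line.splitter) // split each line on whitespace
               .joiner                      // flatten into a single token range
               .map!(tok => tok.idup);      // copy each token into a D string
}
```

The final idup matters because byLine overwrites its buffer on every new line; without the copy, a token would only be valid until the next line is read. Something like foreach (tok; File("input.txt").byToken) would then visit the tokens one by one, where input.txt is a placeholder file name.
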