Another new io library

Thu Feb 18 15:52:56 PST 2016

On Wednesday, 17 February 2016 at 06:45:41 UTC, Steven 
Schveighoffer wrote:
> It's no secret that I've been looking to create an updated io 
> library for phobos. In fact, I've been working on one on and 
> off since 2011 (ouch).
>
> ...

Hi everyone, it's been a while.

I wanted to chime in on the streams-as-ranges thing, since I've 
thought about this quite a bit in the past and discussed it with 
Wyatt outside of the forum.

Steve: My apologies in advance if I have misunderstood any of the 
functionality of your IO library.  I haven't read any of the 
documentation, just this thread, and my time is over-committed 
as usual.

Anyhow...

I believe that when I am dealing with streams, >90% of the time I 
am dealing with data that is *structured* and *heterogeneous*.  
Here are some use-cases:
1. Parsing/writing configuration files (ex: XML, TOML, etc)
2. Parsing/writing messages from some protocol, possibly over a 
network socket (or sockets).  Example: I am writing a PostgreSQL 
client and need to deserialize messages: 
http://www.postgresql.org/docs/9.2/static/protocol-message-formats.html
3. Serializing/deserializing some data structures to/from disk.  
Example: I am writing a game and I need to implement save/load 
functionality.
4. Serializing/deserializing tabular data to/from disk (ex: .CSV 
files).
5. Reading/writing binary data, such as images or video, from/to 
disk.  This will probably involve doing a bunch of (3), which is 
kind of like (2), but followed by large homogeneous arrays of some 
data (ex: pixels).
6. Receiving unstructured user input.  This is my <10%.

Note that (6) is likely to happen eventually but also likely to 
be minuscule: why are we receiving user input?  Maybe it's just 
to store it for retrieval later.  BUT, maybe we actually want it 
to DO something.  If we want it to do something, then we need to 
structure it before code will be able to operate on it.

(5) is a mix of structured heterogeneous data and structured 
homogeneous data.  In aggregate, this is structured heterogeneous 
data, because you need to do parsing to figure out where the 
arrays of homogeneous data start and end (and what they *mean*).
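
To make (5) concrete, here is a rough sketch.  The image layout is 
invented purely for illustration (it is not any real file format), 
and Image and parseImage are hypothetical names.  The only library 
call is std.bitmanip.read, which decodes big-endian by default.

```d
import std.bitmanip : read;

/// Hypothetical image layout, invented for illustration: a big-endian
/// ushort width, a big-endian ushort height, then width * height
/// one-byte grayscale pixels.
struct Image
{
    ushort width;
    ushort height;
    const(ubyte)[] pixels;
}

Image parseImage(ubyte[] data)
{
    Image img;
    // Heterogeneous part: each field has its own type and meaning.
    img.width = data.read!ushort();
    img.height = data.read!ushort();
    // Homogeneous part: only after parsing the header do we know
    // how many pixels follow and where they end.
    immutable count = cast(size_t) img.width * img.height;
    img.pixels = data[0 .. count];
    return img;
}
```

The pixel slice could be handed to range algorithms as-is, but the 
header had to be parsed field by field before that slice even had 
boundaries.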

This is why I think it will be much more important for at least 
these two interfaces to take front-and-center:
A.  The presence of a .popAs!(...) operation (mentioned by Wyatt 
in this thread, IIRC) for simple deserialization, and maybe for 
other miscellaneous things like structured user interaction.
B.  The ability to attach parsers to streams easily.  This might 
be as easy as coercing the input stream into the basic encoding 
that the parser expects (ex: char/wchar/dchar Ranges for 
compilers, or maybe ubyte Ranges for our PostgreSQL client's 
network layer), though it might need (A) to help a bit first if 
the encoding isn't known in advance (text files can be 
represented in sooo many ways!  isn't it fabulous!).
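
As a rough sketch of (A), and not a claim about how Steve's library 
works: assuming the bytes are already buffered in memory, popAs can 
be a thin layer over std.bitmanip.read.  ByteStream, popBytes, 
Message and popMessage are names I made up for illustration.  The 
framing (a type byte, then a big-endian int32 length that counts 
itself but not the type byte) follows the PostgreSQL message layout 
from use-case (2).  There is no validation here; a real client would 
check the length before slicing.

```d
import std.bitmanip : read;

/// Hypothetical wrapper over bytes that are already buffered; a real
/// stream would refill `buf` from a file or socket as needed.
struct ByteStream
{
    ubyte[] buf;

    /// Consume T.sizeof bytes and decode them as a big-endian T
    /// (std.bitmanip.read defaults to Endian.bigEndian).
    T popAs(T)()
    {
        return buf.read!T();
    }

    /// Consume the next n raw bytes without interpreting them.
    ubyte[] popBytes(size_t n)
    {
        auto chunk = buf[0 .. n];
        buf = buf[n .. $];
        return chunk;
    }
}

/// One PostgreSQL-style message: a type tag plus its payload.
struct Message
{
    char tag;
    ubyte[] payload;
}

Message popMessage(ref ByteStream s)
{
    immutable tag = cast(char) s.popAs!ubyte();
    immutable len = s.popAs!int(); // counts itself, but not the tag byte
    return Message(tag, s.popBytes(len - 4));
}
```

Each call pops a differently typed value off the same stream, which 
is exactly what a single-element-type InputRange can't express 
directly.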

I understand that most unsuspecting programmers will arrive at a 
stream library expecting to immediately see an InputRange 
interface.  This /probably/ is not what they really want at the 
end of the day.  So, I think it will be very important for any 
such library to concisely and convincingly explain the design 
methodology and rationale early and aggressively.  Neglect to do 
this, and the library and its documentation will become a 
frustration and a violation of expectations (an "astonishment").  
Do it right, and the library's documentation will become a 
teaching tool that leaves visitors feeling enlightened and 
empowered.

Of course, I have to wonder if someone else has contrasting 
experiences with stream use-cases.  Maybe they really would be 
frustrated with a range-agnostic design.  I don't want to 
alienate this hypothetical individual either, so if this is you, 
then please share your experiences.

I hope this helps and is worth making a bunch of you read a wall 
of text ;)

- Chad