[RFC] I/O and Buffer Range

Steven Schveighoffer schveiguy at yahoo.com
Thu Jan 16 07:55:07 PST 2014


On Tue, 07 Jan 2014 05:04:07 -0500, Dmitry Olshansky  
<dmitry.olsh at gmail.com> wrote:

> 07-Jan-2014 11:59, Jason White wrote:
>> On Monday, 6 January 2014 at 10:26:27 UTC, Dmitry Olshansky wrote:
>>> Ok, now I see. In my eye though serialization completely hides raw
>>> stream write.
>>>
>>> So:
>>> struct SomeStream{
>>>     void write(const(ubyte)[] data...);
>>> }
>>>
>>> struct Serializer(Stream){
>>>     void write(T)(T value); //calls stream.write inside of it
>>> private:
>>>     Stream stream;
>>> }
>>
>> I was thinking it should also have "alias stream this;", but maybe
>> that's not the best thing to do for a serializer.
>>
>> I concede, I've s/(read|write)Data/\1/g on
>>
>>      https://github.com/jasonwhite/io/blob/master/src/io/file.d
>>
>> and it now works on Windows with useful exception messages.
>
> Cool, got to steal sysErrorString too! :)
>
>>> Actually these objects do just fine, since OS does the locking (or
>>> makes sure of something equivalent). If your stream is TLS there is no
>>> need for extra locking at all (no interleaving of I/O calls is
>>> possible) regardless of its kind.
>>>
>>> Shared instances would need locking as 2 threads may request some
>>> operation, and as OS locks only on per sys-call basis something cruel
>>> may happen in the code that deals with buffering etc.
>>
>> Oh yeah, you're right.
>>
>> As a side note: I would love to get a kick-ass I/O stream package into
>> Phobos. It could replace std.stream as well as std.stdio.
>
> Then our goals are aligned. Be sure to take a peek at (if you haven't  
> already):
> https://github.com/schveiguy/phobos/blob/new-io/std/io.d

Yes, I'm gearing up to revisit that after a long D hiatus, and I came  
across this thread.

At this point, I really really like the ideas that you have in this. It  
solves an issue that I struggled with, and my solution was quite clunky.

I am thinking of this layout for streams/buffers:

1. Unbuffered stream used for raw i/o, based on a class hierarchy (which I  
have pretty much written)
2. Buffer like you have, based on a struct, with specific primitives. Its  
job is to collect data from the underlying stream, and present it to  
consumers as a random-access buffer.
3. Filter that has access to transform the buffer data/copy it.
4. Ranges that use the buffer/filter to process/present the data.
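
A minimal sketch of how these four layers could stack, to make the layout
concrete. Every name here (RawStream, MemoryStream, Buffer, xorFilter,
ByteRange) is hypothetical and is not taken from std/io.d or from Dmitry's
buffer code; the XOR filter is only a stand-in for a real transformation.

```d
import std.algorithm.comparison : min;

// 1. Unbuffered stream: raw I/O behind a class hierarchy.
abstract class RawStream
{
    // Reads up to dest.length bytes; returns the count read (0 at end of stream).
    abstract size_t read(ubyte[] dest);
}

// Hypothetical in-memory source standing in for a file or socket.
final class MemoryStream : RawStream
{
    private const(ubyte)[] data;

    this(const(ubyte)[] data) { this.data = data; }

    override size_t read(ubyte[] dest)
    {
        immutable n = min(dest.length, data.length);
        dest[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }
}

// 2. Buffer: a struct that collects data from the stream and exposes it
//    as a random-access window.
struct Buffer
{
    private RawStream stream;
    private ubyte[] store;
    private size_t start, end; // store[start .. end] holds unconsumed data

    this(RawStream stream, size_t capacity = 4096)
    {
        assert(capacity > 0);
        this.stream = stream;
        store = new ubyte[capacity];
    }

    ubyte[] window() { return store[start .. end]; }

    // Pulls more data from the stream; returns the number of bytes added.
    size_t fill()
    {
        if (start > 0)
        {
            // Slide unconsumed data to the front to make room.
            foreach (i; 0 .. end - start) store[i] = store[start + i];
            end -= start;
            start = 0;
        }
        else if (end == store.length)
            store.length = store.length * 2; // window is full: grow
        immutable n = stream.read(store[end .. $]);
        end += n;
        return n;
    }

    // Drops n bytes from the front of the window.
    void discard(size_t n)
    {
        assert(n <= end - start);
        start += n;
    }
}

// 3. Filter: transforms buffered data in place (a toy XOR transformation).
void xorFilter(ubyte[] data, ubyte key)
{
    foreach (ref b; data) b ^= key;
}

// 4. Range: presents the buffered, filtered data to consumers.
struct ByteRange
{
    private Buffer* buf;
    private ubyte key;

    this(Buffer* buf, ubyte key)
    {
        this.buf = buf;
        this.key = key;
        refill();
    }

    bool empty() { return buf.window().length == 0; }
    ubyte front() { return buf.window()[0]; }
    void popFront() { buf.discard(1); refill(); }

    // When the window runs dry, fetch more and filter only the new bytes.
    private void refill()
    {
        if (buf.window().length > 0) return;
        immutable n = buf.fill();
        auto w = buf.window();
        xorFilter(w[$ - n .. $], key);
    }
}
```

In this sketch only the range layer is visible to the consumer: something
like ByteRange(&buf, key) can be handed to any algorithm that accepts an
input range, while the buffer and filter stay implementation details.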

The problem I struggled with is the presentation of UTF data of any format  
as char[], wchar[], or dchar[]. Two things need to happen. First, the  
data needs to be post-processed to perform any necessary byte swapping.  
Second, the data needs to be transcoded into the correct width.

In this way, you can process UTF data of any type (I even have code to  
detect the encoding and automatically process it), and then use it in a  
way that makes sense for your code.
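
The two steps can be sketched for one concrete case, UTF-16 input presented
as UTF-8. This is only an illustration: toNativeUtf16 and transcodeToUtf8
are hypothetical helper names, the caller is assumed to already know the
byte order (no encoding detection is shown), and std.utf.toUTF8 is the
existing Phobos function doing the actual transcoding.

```d
import std.utf : toUTF8;

// Step 1: byte swapping. Assemble native-order UTF-16 code units from raw
// bytes; `bigEndian` describes the source data and is supplied by the caller.
wchar[] toNativeUtf16(const(ubyte)[] raw, bool bigEndian)
{
    assert(raw.length % 2 == 0, "UTF-16 data must have an even byte count");
    auto units = new wchar[raw.length / 2];
    foreach (i, ref u; units)
    {
        immutable hi = bigEndian ? raw[2 * i] : raw[2 * i + 1];
        immutable lo = bigEndian ? raw[2 * i + 1] : raw[2 * i];
        u = cast(wchar)((hi << 8) | lo);
    }
    return units;
}

// Step 2: transcoding into the width the consumer wants (here, char).
string transcodeToUtf8(const(wchar)[] units)
{
    return toUTF8(units);
}
```

Building each code unit from its two bytes, rather than swapping in place,
keeps the sketch independent of the host's own endianness.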

My solution was to paste in a "processing" delegate into the class  
hierarchy of buffered streams that allowed one read/write access to the  
buffer. But it's clunky, and difficult to deal with in a generalized  
fashion.

But the idea of using a buffer in between the stream and the range, and  
possibly bolting together multiple transformations in a clean way, makes  
this problem easy to solve, and I think it is closer to the vision  
Andrei/Walter have.

I also like the idea of "pinning" the data instead of my mechanism of  
using a delegate (which was similar but not as general). It also has  
better opportunities for optimization.

Other ideas that came to me that buffer filters could represent:

* compression/decompression
* encryption

I am going to study your code some more and see how I can update my code  
to use it. I still need to maintain the std.stdio.File interface, and  
Walter is insistent that the initial state of stdout/err/in must be  
synchronous with C (which kind of sucks, but I have plans on how to make  
it not be so bad).

There is still a lot of work left to do, but I think one of the hard parts  
is done, namely dealing with UTF transcoding. The remaining sticky part is  
dealing with shared, but using structs should make that much easier.

One question: is there a reason a buffer type has to be a range at all? I  
can see where it's easy to make it a range, but I don't see higher-level  
code using the range primitives when dealing with chunks of a stream.

-Steve

