iopipe alpha 0.0.1 version

Sat Oct 21 10:33:20 UTC 2017

On 10/19/2017 03:12 PM, Steven Schveighoffer wrote:
> On 10/19/17 7:13 AM, Martin Nowak wrote:
>> On 10/13/2017 08:39 PM, Steven Schveighoffer wrote:
>>> What would be nice is a mechanism to detect this situation, since the
>>> above is both un- at safe and incorrect code.
>>>
>>> Possibly you could instrument a window with a mechanism to check to see
>>> if it's still correct on every access, to be used when compiled in
>>> non-release mode for checking program correctness.
>>>
>>> But in terms of @safe code in release mode, I think the only option is
>>> really to rely on the GC or reference counting to allow the window to
>>> still exist.
>>
>> We should definitely find a @nogc solution to this, but it's a good
>> litmus test for the RC compiler support I'll work on.
>> Why do IOPipe have to hand over the window to the caller?
>> They could just implement the RandomAccessRange interface themselves.
>>
>> Instead of
>> ```d
>> auto w = f.window();
>> f.extend(random());
>> w[0];
>> ```
>> you could only do
>> ```d
>> f[0];
>> f.extend(random());
>> f[0]; // bug, but no memory corruption
>> ```
> 
> So the idea here (If I understand correctly) is to encapsulate the
> window into the pipe, such that you don't need to access the buffer
> separately? I'm not quite sure because of that last comment. If f[0] is
> equivalent to previous code f.window[0], then the second f[0] is not a
> bug, it's valid, and accessing the first element of the window (which
> may have moved).

The above sample with the window is a bug and memory corruption because
of iterator/window invalidation by extend.
If you didn't thought of the invalidation, then the latter example would
still be a bug to you, but not a memory corruption.

>> This problem seems to be very similar to the Range vs. Iterators
>> difference, the former can perform bounds checks on indexing, the later
>> are inherently unsafe (with expensive runtime debug checks e.g. in VC++).
> 
> But ranges have this same problem.
> 
> For instance:
> const(char[])[] lines = stdin.byLine.array;
> 
> Here, since byLine uses GC buffering, it's @safe (but wrong). If non-GC
> buffers are used, then it's not @safe.
> 
> I think as long as the windows are backed by GC data, it should be
> @safe. In this sense, your choice of buffering scheme can make something
> @safe or not @safe. I'm OK with that, as long as iopipes can be @safe in
> some way (and that happens to be the default).
> 
>> Similarly always accessing the buffer through IOPipe would allow cheap
>> bounds checking, and sure you could still offer IOPipe.ptr for unsafe
>> code.
> 
> It's an interesting idea to simply make the iopipe the window, not just
> for @safety reasons:
> 
> 1. this means the iopipe itself *is* a random access range, allowing it
> to automatically fit into existing algorithms.
> 2. Existing random-access ranges can be easily shoehorned into being
> ranges (I already did it with arrays, and it's not much harder with
> popFrontN). Alternatively, code that uses iopipes can simply check for
> the existence of iopipe-like methods, and use them if they are present.
> 3. Less verbose usage, and more uniform access. For instance if an
> iopipe defines opIndex, then iopipe.window[0] and iopipe[0] are possibly
> different things, which would be confusing.
> 
> Some downsides however:
> 
> 1. iopipes can be complex and windows are not. They were a fixed view of
> the current buffer. The idea that I can fetch a window of data from an
> iopipe and then deal simply with that part of the data was attractive.

You could still have a window internally and just forward to that.

> 2. The iopipe is generally not copyable once usage begins. In other
> words, the feature of ranges that you can copy them and they just work,
> would be difficult to replicate in iopipe.

That's a general problem. Unique ownership is really useful, but most
phobos range methods don't care, and assume copying is implicit saving.
Not too nice and I guess this will bite us again with RC/Unique/Weak.

The current workaround for this is `refRange`.

> A possible way forward could be:
> 
> * iopipe is a random-access range (not necessarily a forward range).
> * iopipe.window returns a non-extendable window of the buffer itself,
> which is a forward/random-access range. If backed by the GC or some form
> of RC, everything is @safe.
> * Functions which now take iopipes could be adjusted to take
> random-access ranges, and if they are also iopipes, could use the extend
> features to get more data.
> * iopipe.release(size_t) could be hooked by popFrontN. I don't like the
> idea of supporting slicing on iopipes, for the non-forward aspect of
> iopipe. Much better to have an internal hook that modifies the range
> in-place.
> 
> This would make iopipes fit right into the range hierarchy, and
> therefore could be integrated easily into Phobos.

I made an interesting experiment with buffered input ranges quite a
while ago.
https://gist.github.com/MartinNowak/1257196

This would use popFront to fetch new data and ref-counts a list of
buffers depending on older saved ranges still using earlier buffers.
With a bit of creative use, the existing Range primitives could be used
to implement infinite look-ahead.

auto beg = rng.save;
auto end = rng.find("bla");
auto window = beg[0 .. end]; // get a random access window

The main problem with this has been, that the many implicit copies (e.g.
in foreach) bump the reference-count, so the RC buffer release would
often not work.
Could be avoided by making them non-copyable, but again phobos and
foreach currently don't support this hybrid of input (consuming) and
forward (saveable) range.

-Martin