stream interfaces - with ranges

Fri May 18 10:30:32 PDT 2012

On 05/18/12 17:43, kenji hara wrote:
> 2012/5/19 Artur Skawina <art.08.09 at gmail.com>:
>> On 05/18/12 15:51, kenji hara wrote:
>>> OK. If reading bytes from the underlying device fails, your 'fronts' can
>>> return an empty slice. Understood.
>>> But it is still *not efficient*. The returned slice will refer to a
>>> buffer controlled by the underlying device. If you want to gather bytes
>>> into one chunk, you must copy them from the returned slice to your chunk.
>>> We should reduce memory copying as much as possible.
>>
>> Depends if your input range supports zero-copy or not. IOW you avoid
>> the copy iff the range can somehow write the data directly to the caller
>> provided buffer. This can be true eg for file reads, where you can tell
>> the read(2) syscall to write into the user buffer. But what if you need to
>> buffer the stream? An intermediate buffer can become necessary anyway.
>> But, as i said before, i agree that a caller-provided-buffer-interface
>> is useful.
>>
>>   E[] fronts();
>>   void fronts(ref E[]);
>>
>> And one can be implemented in terms of the other, ie:
>>
>>  E[] fronts() { E[] els; fronts(els); return els; }
>>  void fronts(ref E[] e) { e[] = fronts()[]; }
> 
> The flaw of your design is, the memory to store read bytes/elements is
> allocated by the lower layer.

It's a feature. :)

> E.g. if you want to construct a linked list of some elements, you
> must copy the elements from the returned slice to a newly allocated node.
> I think that is still inefficient.
> 
>> depending on which is more efficient. A range can provide
>>
>>  enum bool HasBuffer = true; // or false, depending on the range
>>
>> so that the user can pick the more suited alternative.
> 
> I think having as few primitives as possible is a better design than
> adding extra/optional primitives.

If you pick just one scheme, then you will end up with an unnecessary
copy sometimes. Or using non-std APIs. Again, I'm saying *both* caller-
owned-buffer *and* range-owned-buffer interfaces should be defined.
Otherwise, code that needs decent performance will not be able to use
the pure range API, and will not interoperate well with "std" code.
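
A minimal sketch of a range that offers both variants, so generic code can
pick the cheaper one at compile time. `MemSource`, `popFronts` and `sumAll`
are hypothetical names used only for illustration, and the sketch assumes
that an empty slice also signals end of data and that the caller-buffer
overload shrinks the slice to the number of elements actually delivered.

```d
import std.algorithm : min;

// Hypothetical in-memory source standing in for a real device.
struct MemSource
{
    enum bool HasBuffer = true; // the range owns its storage

    const(ubyte)[] data;

    // Range-owned buffer: a slice of the range's own storage, no copy.
    const(ubyte)[] fronts() { return data; }

    // Caller-owned buffer: fill `dst`, then shrink it to what was delivered.
    void fronts(ref ubyte[] dst)
    {
        auto n = min(dst.length, data.length);
        dst[0 .. n] = data[0 .. n];
        dst = dst[0 .. n];
    }

    // Multi-element popFront (assumed name).
    void popFronts(size_t n) { data = data[n .. $]; }
}

// Generic consumer: uses whichever variant the range says is cheaper.
size_t sumAll(R)(ref R r)
{
    size_t total;
    for (;;)
    {
        static if (R.HasBuffer)
            auto s = r.fronts();      // zero-copy view into the range
        else
        {
            ubyte[64] tmp;
            ubyte[] s = tmp[];
            r.fronts(s);              // one copy into our own buffer
        }
        if (s.length == 0) break;     // assumed: empty slice means no more data
        foreach (b; s) total += b;
        r.popFronts(s.length);
    }
    return total;
}

unittest
{
    ubyte[] bytes = [1, 2, 3];
    auto src = MemSource(bytes);
    assert(sumAll(src) == 6);
    assert(src.fronts().length == 0);
}
```

A range without its own buffer would set `HasBuffer` to `false`, and the same
consumer would then go through the caller-buffer overload instead.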

> How many primitives are in your interface design?

Multi-element versions of front, popFront and puts. I think this
was enough to get things working; this is the tested and proven part.
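
For the output side, a rough sketch of how a `puts` with the `atleast`
parameter from the quoted signature might behave. `BoundedSink` and its
`room` field are hypothetical; the sketch has no real device to wait on, so
the "blocking" case is only simulated by pretending the device drained.

```d
import std.algorithm : min;

// Hypothetical sink; `room` models how many elements the device would
// accept right now without blocking.
struct BoundedSink
{
    ubyte[] stored;
    size_t room;

    // Writes as much as fits and returns the number of elements consumed.
    // `atleast` is the count the caller insists on; 0 means "never block".
    size_t puts(const(ubyte)[] els, size_t atleast = size_t.max)
    {
        auto must = min(atleast, els.length); // what the caller insists on
        auto n = min(els.length, room);       // what fits without waiting
        if (n < must)
        {
            // A real sink would block here until the device drains;
            // the sketch just pretends that it did.
            room = must;
            n = must;
        }
        stored ~= els[0 .. n];
        room -= n;
        return n;
    }
}

unittest
{
    auto sink = BoundedSink(null, 2);
    ubyte[] data = [1, 2, 3];
    assert(sink.puts(data, 0) == 2);          // partial, non-blocking write
    assert(sink.puts(data[2 .. $], 0) == 0);  // no room: the EAGAIN case
    assert(sink.puts(data[2 .. $]) == 1);     // default insists on everything
    assert(sink.stored == data);
}
```

With `atleast` left at its default the call behaves like a plain `put` and
consumes everything; passing 0 gives the explicit non-blocking form.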

Then there's 'available' and 'free', so that it's possible to 
avoid blocking. And 'allocate' and 'release', for zero-copy output
streams. But i don't remember if i've actually used these parts, i
don't think i needed them.
This is all from memory, as the last time i worked on this was a while
ago, just before i ran into:

   http://www.digitalmars.com/d/archives/digitalmars/D/dtors_in_shared_structs_fail_to_compile_157978.html

...

>>> And the 'put' primitive in the output range concept doesn't support
>>> non-blocking writes. 'put' should consume *all* of the given data and
>>> write it to the underlying device, so it would block.
>>
>> True, a write-as-much-as-possible-but-not-more primitive is needed.
>>
>>   size_t puts(E[], size_t atleast=size_t.max);
>>
>> or something like that. (Doing it this way allows for explicit
>> non-blocking 'puts', ie '(written=puts(els, 0))==0' means EAGAIN.)
>>
>>> Therefore, the range concept as a whole doesn't cover non-blocking I/O.
> 
> I can agree with the signatures, but the names 'fronts' and 'puts' are
> a little too similar.

The names are bad, i know... If anybody has better suggestions... (and
almost any other names would be better :) )

artur