let's talk about output ranges
Adam D. Ruppe
destructionator at gmail.com
Thu Feb 6 10:48:22 PST 2014
On Thursday, 6 February 2014 at 18:10:53 UTC, H. S. Teoh wrote:
> Would it make sense to have a .full method/property analogous
> to input ranges' .empty? Perhaps something like:
That could be ok but I agree that having to check is a pain.
Besides, what are you going to do if it is full? Suppose I call:
    char[1] b;
    toUpper("lol", staticSink(b[]));
The best toUpper could do is truncate or throw.... and I think
that would be better encoded in the range itself so the caller
can decide.
    auto got = toUpper("lol", staticSinkTruncating(b[]));
    if(got.length < "lol".length) {
        // we could perhaps call it again to process the rest
    }
(put in the truncating one would just check its own length and
noop if it is full)
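
A minimal sketch of what such a truncating sink might look like. The names StaticSinkTruncating and toUpperInto are hypothetical stand-ins (nothing like them is claimed to exist in Phobos), and the upper-casing is ASCII-only to keep the example short:

```d
import std.ascii : toUpper;

// Hypothetical sink: writes into caller-owned storage and drops the overflow.
struct StaticSinkTruncating
{
    char[] buffer;   // caller-supplied storage
    size_t used;     // number of chars written so far

    void put(char c)
    {
        // Check our own length; silently do nothing once full.
        if (used < buffer.length)
            buffer[used++] = c;
    }

    // The part of the buffer that was actually written.
    char[] data() { return buffer[0 .. used]; }
}

// Hypothetical ASCII-only toUpper that writes to any sink with put(char).
char[] toUpperInto(Sink)(const(char)[] s, ref Sink sink)
{
    foreach (char c; s)
        sink.put(cast(char) toUpper(c));
    return sink.data;
}

void main()
{
    char[1] b;
    auto sink = StaticSinkTruncating(b[]);
    auto got = toUpperInto("lol", sink);
    assert(got == "L");
    assert(got.length < "lol".length); // caller sees the truncation and decides
}
```

The policy lives in the sink, so toUpperInto itself never needs to ask whether the sink is full.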
Otherwise, throwing range violations/out of memory exceptions are
what would most likely happen anyway.
> One thing that
> the current output range API doesn't do very well is chaining.
Indeed. In my other post, I just wrote about finish. Finish
serves to flush the buffer (digests or compression algorithms for
example might need to be padded to block size), could finalize
things (suppose an appender which just puts the pieces into a
static array, then calls join all at once at the end), and could
also just generally return the result.
There are some cases where returning from finish doesn't make
sense: if you sink to a file, you wouldn't keep an array
of the contents around... but finish is still potentially useful
in that it could close the file or release a lock. (Of course,
dtors could do that too. But destructors can never return data to
the user - that's where finish is special.)
Anyway, not all output ranges would offer finish and not all
would return T[]. But not all input ranges offer opSlice either
so we're still in analogous territory.
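
To make the finish idea concrete, here is a rough sketch of the "collect the pieces, join once at the end" case. JoiningSink and its finish method are assumptions made up for illustration; they are not an existing Phobos interface:

```d
import std.array : join;

// Hypothetical sink: put only records the pieces, finish does the real work.
struct JoiningSink
{
    string[] pieces;

    void put(string piece) { pieces ~= piece; }

    // Finalize and hand the result back to the caller.
    // A destructor could clean up, but it could not return this string.
    string finish()
    {
        string result = pieces.join();
        pieces = null;   // the sink is done after finish
        return result;
    }
}

void main()
{
    JoiningSink sink;
    sink.put("out");
    sink.put("put ");
    sink.put("range");
    assert(sink.finish() == "output range");
}
```

A file-backed sink could expose the same finish name but return void and just close the file.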
> This is a big usability hindrance. Ideally we'd want to write
> something
> like:
>
> auto result = "mystring".toUpper(ArcOutputRange!string())
> .translate("abc", "def");
>
> But I'm not sure how this can be made to work.
hmmm.... finish doesn't account for all that.... well, I guess it
could by returning a range.
tbh toUpper might be better as a higher-order input range. Like
alias toUpper = map!charToUpper(...). Those chain, they don't
allocate, and they are well-defined right now.
Then at the end we build the result lazily and just put it all at
once into the output range.
"mystring".toUpper.translate("abc","def").array(ArcOutputRange!string());
Yeah, I actually think that's the way to go. And calling .array
at the end is nothing new to Phobos anyway. It'd be a bit weird
doing it with toUpper but I think it really is the best fit.
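
A small sketch of that shape using what Phobos has today: map builds the lazy input range, and only the last step touches an output range. An Appender stands in for the ArcOutputRange from the quote, and upperInto is a hypothetical helper name; the per-character function is std.ascii.toUpper, so this is ASCII-only:

```d
import std.algorithm : map;
import std.array : appender;
import std.ascii : toUpper;

// Hypothetical helper: build lazily, then sink the whole thing at the end.
string upperInto(string input)
{
    // Lazy: nothing is computed or allocated on this line.
    auto lazyUpper = input.map!(c => toUpper(c));

    // The only place an output range is involved.
    auto sink = appender!string();
    sink.put(lazyUpper);
    return sink.data;
}

void main()
{
    assert(upperInto("mystring") == "MYSTRING");
}
```

Further stages would chain onto lazyUpper as more map calls before the final put.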
(BTW I would be PISSED if toUpper actually changed like this.
It'd break a bunch of code and I don't really care that toUpper
allocates. I want it to just work. But we could offer equivalent
functionality via per-character functions and map so we don't
have to break code to offer the new options.)
> So we should extend put() to take an index, then?
that would work.
> An allocator is definitely not an output range!
yup, and I don't think a static array is either. A static array
isn't an input range either, since you can't do a = a[1..$]. But
offering getters for such is easy and it rox.
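
The static array point can be checked directly with the std.range traits. In this sketch dropFirst is a hypothetical helper, and the "getter" is nothing more than slicing with []:

```d
import std.range : isInputRange;

// A static array has a fixed length, so it cannot be shrunk from the front.
static assert(!isInputRange!(char[4]));
// Its slice is an ordinary dynamic array, which is an input range.
static assert( isInputRange!(char[]));

// Hypothetical helper: the a = a[1..$] step, legal only on the slice.
char[] dropFirst(char[] r) { return r[1 .. $]; }

void main()
{
    char[4] a = "abcd";
    auto r = dropFirst(a[]);   // a[] is the easy getter
    assert(r == "bcd");
    assert(a[] == "abcd");     // the underlying storage is untouched
}
```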
> into a data sink should not care what an allocator is; they
> should take an output range.
Actually, I think they should generate lazy input ranges whenever
possible. Then only at the end do we send it to the output range.
The catch is that input ranges aren't allowed to allocate, since
that would kill their complexity guarantee, so we need an example
of a function which *must* allocate up front. Those are the ones
that want the random access output range. Otherwise we can just
put at the end.
> Let
> stdout do the buffering, and let toLower send the data to stdout
> directly. Calling an allocator from toLower essentially amounts
> to buffering the data twice.
yes
> They should probably be *always* passed by ref, otherwise you
> could end up with some pathological behaviour of data from
> multiple sources overwriting each other because they were
> operating on copies of output ranges instead of references to a
> single one.
That won't necessarily work though, you can't have a ref default
parameter. But we can use pimpl or something to force a regular
struct to be a ref item. Lazy initialization can be surprising,
but we deal with that already with array slices so I think it is
ok.
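
A rough sketch of the copying problem and of the pimpl workaround, including the lazy-initialization surprise. CountingSink, SharedSink, and fill are all hypothetical names invented for this example:

```d
// Value-type sink: every copy has its own counter.
struct CountingSink
{
    size_t count;
    void put(char c) { ++count; }
}

// Hypothetical pimpl wrapper: copies share one heap-allocated state.
struct SharedSink
{
    private CountingSink* impl;

    // Eagerly initialized handle, so copies made later share state.
    static SharedSink create() { return SharedSink(new CountingSink); }

    void put(char c)
    {
        if (impl is null) impl = new CountingSink; // lazy init, like slices
        impl.put(c);
    }

    size_t count() { return impl is null ? 0 : impl.count; }
}

// Takes the sink by value on purpose.
void fill(Sink)(Sink sink)
{
    foreach (char c; "abc")
        sink.put(c);
}

void main()
{
    CountingSink plain;
    fill(plain);
    assert(plain.count == 0);    // the writes went to a copy

    auto handle = SharedSink.create();
    fill(handle);
    assert(handle.count == 3);   // copies share the same state

    SharedSink lazySink;         // never initialized before being copied
    fill(lazySink);
    assert(lazySink.count == 0); // the surprise: the copy allocated its own state
}
```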
> Also, delegates and function pointers should be treated as
> output ranges as well (Phobos should define .put and whatever
> other needed methods for them via UFCS).
Yes, indeed.
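
For what it's worth, the free function put in std.range appears to cover this already in current Phobos: when the sink has no put member and is not an input range, it falls back to calling the sink with the element. A short sketch, where collectViaDelegate is a hypothetical helper name:

```d
import std.range : put;

// Hypothetical helper: use a plain delegate as the sink.
string collectViaDelegate(string[] parts)
{
    string collected;
    // The delegate captures `collected`; put ends up calling sink(part).
    auto sink = (const(char)[] s) { collected ~= s; };
    foreach (part; parts)
        put(sink, part);
    return collected;
}

void main()
{
    assert(collectViaDelegate(["hello, ", "world"]) == "hello, world");
}
```

Note that this goes through the free function put(sink, x), not the member-call syntax sink.put(x).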
> Doesn't solve the case where you call some library function
> that throws, though. :-(
at least there's nothrow if it is really that important to us.