Range Redesign: Empty Ranges

H. S. Teoh hsteoh at qfbox.info
Wed Mar 6 17:32:17 UTC 2024


On Wed, Mar 06, 2024 at 04:47:02PM +0000, Steven Schveighoffer via Digitalmars-d wrote:
[...]
> The only tricky aspect is ranges that are references
> (classes/pointers).  Neither of those to me should be supported IMO,
> you can always wrap such a thing in a range harness.
[...]

Every time this topic comes up, class-based ranges become the whipping
boy of range design woes.  Actually, they serve an extremely important
role: type erasure, which is critical when you have code like this:

	auto myRangeFunc(R,S)(R range1, S range2) {
		if (runtimeDecision()) {
			return range1;
		} else {
			return range2;
		}
	}

This generally will not compile because R and S are different types,
even if both conform to a common underlying range API, like an input
range. To remedy this situation, class-based range wrappers come to the
rescue:

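	import std.range : ElementType, InputRange, inputRangeObject;

	// The two wrapper classes are themselves different types, so the
	// return type is spelled out as their common interface rather than
	// left to `auto`.  Assumes R and S have the same element type.
	InputRange!(ElementType!R) myRangeFunc(R,S)(R range1, S range2) {
		if (runtimeDecision()) {
			return inputRangeObject(range1);
		} else {
			return inputRangeObject(range2);
		}
	}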

Note that you can't use a struct wrapper here, because R and S have
different ABIs; the only way to correctly forward range methods to R or
S is to use overridden base class methods. In other words, the type
erasure of R and S is unavoidable for this code to work.
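
For anyone who wants to try the type-erasure point above, here is a
minimal self-contained sketch. The helper name `pick` and the two sample
ranges are made up for illustration; only `inputRangeObject`,
`InputRange` and `ElementType` come from std.range. It assumes both
ranges have the same element type.

```d
import std.algorithm : equal, filter, map;
import std.range : ElementType, InputRange, inputRangeObject, iota;

// Hypothetical helper: returns one of two differently typed ranges,
// erased to the common InputRange interface.
InputRange!(ElementType!R) pick(R, S)(bool first, R range1, S range2)
    if (is(ElementType!R == ElementType!S))
{
    if (first)
        return inputRangeObject(range1);
    else
        return inputRangeObject(range2);
}

void main()
{
    // Two ranges of int with unrelated static types.
    auto evens   = iota(0, 10).filter!(x => x % 2 == 0);
    auto squares = iota(0, 5).map!(x => x * x);

    // The choice is made at runtime; the caller only sees InputRange!int.
    assert(pick(true,  evens, squares).equal([0, 2, 4, 6, 8]));
    assert(pick(false, evens, squares).equal([0, 1, 4, 9, 16]));
}
```

The price is a class allocation per wrapped range and a virtual call per
range primitive, which is usually acceptable when the alternative is
code that does not compile.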

//

Also, class-based ranges are sometimes necessary for practical reasons.
One program I had contained a UFCS chain consisting of hundreds of
components (not all written out explicitly in the same function, of
course, but that's what the underlying instantiation looked like).  It
generated ridiculously large symbols that crashed the
compiler. Eventually the bug I filed prodded Rainer to implement symbol
compression in DMD. But even then, the symbols were still ridiculously
huge -- because the outermost template instantiation had to encode the
types of every one of the hundreds of components.  As a result, symbol
compression or not, compile times were still ridiculously slow because
the compiler still had to copy those symbols around -- their length grew
quadratically with every component added to the chain.

Inserting a call to .inputRangeObject in the middle of the chain
dramatically cut down the size of the resulting symbols, because it
effectively erased all the types preceding it, resulting in much saner
codegen afterwards.
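
A toy version of that trick, as a sketch only: the chain here has three
components rather than hundreds, and the function names `plainChain` and
`erasedChain` are made up for illustration. The point is that everything
downstream of .inputRangeObject is instantiated over InputRange!int
instead of over the full nested type of the upstream chain.

```d
import std.algorithm : equal, filter, map;
import std.range : InputRange, inputRangeObject, iota;
import std.stdio : writeln;

// No erasure: the result type nests the type of every component.
auto plainChain()
{
    return iota(0, 10)
        .map!(x => x + 1)
        .filter!(x => x % 2 == 0)
        .map!(x => x * 10);
}

// Erasure in the middle: the final map only sees InputRange!int.
auto erasedChain()
{
    InputRange!int erased = iota(0, 10)
        .map!(x => x + 1)
        .filter!(x => x % 2 == 0)
        .inputRangeObject;
    return erased.map!(x => x * 10);
}

void main()
{
    // Both chains produce the same elements.
    assert(plainChain().equal([20, 40, 60, 80, 100]));
    assert(erasedChain().equal([20, 40, 60, 80, 100]));

    // Print the mangled-name lengths of the two result types to compare.
    writeln(typeof(plainChain()).mangleof.length, " vs ",
            typeof(erasedChain()).mangleof.length);
}
```

With a chain this short the difference is small; it only becomes
dramatic once the part before the erasure point is long.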


T

-- 
Leather is waterproof.  Ever see a cow with an umbrella?

