RFC on range design for D2
Andrei Alexandrescu
SeeWebsiteForEmail at erdani.org
Wed Sep 10 08:47:24 PDT 2008
superdan wrote:
> Andrei Alexandrescu Wrote:
>
>> bearophile wrote:
>>> A few benchmarks appending ints; note this is a worst-case situation (other benchmarks are less dramatic). Just many appends, followed by the "release":
>>>
>>> benchmark 10, N=10_000_000:
>>> Array append: 0.813 s
>>> ArrayBuilder: 0.159 s
>>> ArrayAppender: 1.056 s
>>>
>>> benchmark 10, N=100_000_000:
>>> Array append: 10.887 s
>>> ArrayBuilder: 1.477 s
>>> ArrayAppender: 13.305 s
>> That's odd. The array appender should never, by definition, do
>> significantly worse than the straight array append. I think some other
>> factor intervened (e.g., swapping). Also don't forget to compile with -O
>> -release -inline and to test several times after a warmup run. I adapted
>> your code, obtaining the numbers below:
>>
>> benchmark 10, N=10000000:
>> Array append: 0.69 s
>> ArrayAppender: 0.19 s
>>
>> benchmark 10, N=25000000:
>> Array append: 2.06 s
>> ArrayAppender: 0.82 s
>>
>> benchmark 10, N=50000000:
>> Array append: 4.28 s
>> ArrayAppender: 1.75 s
>>
>> benchmark 10, N=75000000:
>> Array append: 9.62 s
>> ArrayAppender: 5.8 s
>>
>> benchmark 10, N=100000000:
>> Array append: 11.35 s
>> ArrayAppender: 6.20 s
>>
>>
>> Andrei
>
> arrayappender is simple as dumb. compare and contrast with the ginormous arraybuilder. yet it works great. why? because it concentrates on the right thing: the common case. in the uncommon case it does whatever it takes to get the job done. doesn't matter, since it's rare.
>
> not sure anybody saw the irony. it was bearophile advocating simplicity, eh. allegedly andrei's the complexity guy. the code seems to tell another story.
>
> btw doods, i figured indexed access is slower than access via pointer, so i recoded arrayappender to use only pointers. there's about half a second of savings for the large case. small arrays don't feel it.
>
> struct ArrayAppender(T)
> {
>     private T* _begin; // start of the buffer
>     private T* _end;   // one past the last element written
>     private T* _eos;   // end of the allocated block
>
>     this(T[] array)
>     {
>         _begin = array.ptr;
>         _end = _begin + array.length;
>         // Derive the usable capacity from the runtime's block size.
>         if (_begin) _eos = _begin + .capacity(_begin) / T.sizeof;
>     }
>
>     size_t capacity() const { return _eos - _begin; }
>
>     void putNext(T item)
>     {
>         if (_end < _eos)
>         {
>             // Fast path: spare room left in the current block.
>             *_end++ = item;
>         }
>         else
>         {
>             // Slow path: let the built-in ~= reallocate, then re-cache the pointers.
>             auto tmp = _begin[0 .. _end - _begin];
>             tmp ~= item;
>             _begin = tmp.ptr;
>             _end = _begin + tmp.length;
>             _eos = _begin + .capacity(_begin) / T.sizeof;
>         }
>     }
>
>     T[] releaseArray()
>     {
>         auto result = _begin[0 .. _end - _begin];
>         _begin = _end = _eos = null;
>         return result;
>     }
> }
Thanks! Can't hurt. Guess I'll integrate your code if you don't mind.
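For anyone who wants to try it, a minimal usage sketch of the appender above
(illustrative only; constructing it with null starts from an empty buffer):

    auto app = ArrayAppender!(int)(null); // no preexisting array
    foreach (i; 0 .. 100)
        app.putNext(i);                   // fast path once a block is allocated
    int[] data = app.releaseArray();      // take ownership of the result
    assert(data.length == 100 && data[99] == 99);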
I gave it some more thought and I have a theory for the root of the
issue. My implementation assumes there's exponential (multiplicative)
increase of capacity in the built-in ~=. I hope Walter wouldn't do
anything else. If there are differences in growth strategies between D1
and D2, that could explain the difference between bearophile's
benchmarks and mine.
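To make the assumption concrete, here is a sketch (not the actual runtime
code) of an appender with explicit doubling growth: with multiplicative
growth the total number of element copies over n appends stays below 2n, so
each append is amortized O(1); growing by a fixed increment instead makes the
total quadratic.

    // Illustrative only: explicit doubling growth, independent of ~='s policy.
    struct DoublingAppender(T)
    {
        private T[] _data;      // allocated storage
        private size_t _length; // number of elements actually written

        void putNext(T item)
        {
            if (_length == _data.length)
            {
                // Double the capacity (start small but nonzero).
                auto bigger = new T[_data.length ? _data.length * 2 : 16];
                bigger[0 .. _length] = _data[0 .. _length];
                _data = bigger;
            }
            _data[_length++] = item;
        }

        T[] releaseArray()
        {
            auto result = _data[0 .. _length];
            _data = null;
            _length = 0;
            return result;
        }
    }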
Andrei