Splitting up large dirty file

Dennis dkorpel at gmail.com
Mon May 21 22:11:42 UTC 2018


On Monday, 21 May 2018 at 17:42:19 UTC, Jonathan M Davis wrote:
> On Monday, May 21, 2018 15:00:09 Dennis via Digitalmars-d-learn 
> wrote:
> drop is range-based, so if you give it a string, it's going to 
> decode because of the whole auto-decoding mess with 
> std.range.primitives.front and popFront.

In this case I used drop to drop lines, not characters. It turns 
out the exception was thrown by the joiner.
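
For illustration, a stripped-down pipeline of the same shape (a 
hypothetical toy, not the actual code from my script): drop skips 
whole lines, and it's joiner that walks the characters of each 
line and trips over the invalid UTF-8.

```
import std.algorithm.iteration : joiner, splitter;
import std.range : drop;
import std.stdio : writeln;

void main()
{
    string text = "header\nline 1\nline 2";

    // The elements here are lines (slices of the original string),
    // so drop skips lines, not characters.
    auto lines = text.splitter('\n').drop(1);

    // joiner walks the characters of each line; with invalid UTF-8
    // in the input, this is where a UTFException surfaces.
    writeln(lines.joiner("\n"));
}
```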

On Monday, 21 May 2018 at 17:42:19 UTC, Jonathan M Davis wrote:
>> I find Exceptions in range code hard to interpret.
>
> Well, if you just look at the stack trace, it should tell you. 
> I don't see why ranges would be any worse than any other code 
> except for maybe the fact that it's typical to chain a lot of 
> calls, and you frequently end up with wrapper types in the 
> stack trace that you're not necessarily familiar with.

Exactly that: the stack trace is full of weird mangled names of 
template functions, lambdas, etc. And because of lazy evaluation 
and chained range functions, the line number doesn't easily show 
who the culprit is.
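
A toy example of what I mean: with lazy ranges, the exception is 
thrown where the chain is consumed, not where it was built, so 
the reported line points at the consumer.

```
import std.algorithm.iteration : map;
import std.conv : to;
import std.stdio : writeln;

void main()
{
    auto r = ["1", "2", "oops"].map!(to!int); // nothing thrown here: lazy

    foreach (x; r)   // ConvException surfaces here, inside the loop
        writeln(x);  // (and inside map/to!int in the trace), not at
                     // the line where the chain was set up.
}
```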

On Monday, 21 May 2018 at 17:42:19 UTC, Jonathan M Davis wrote:
> In many cases, ranges will be pretty much the same as writing 
> loops, and in others, the abstraction is worth the cost.

From the benchmarking I did, I found that ranges can easily be an 
order of magnitude slower, even with compiler optimizations:

https://run.dlang.io/gist/5f243ca5ba80d958c0bc16d5b73f2934?compiler=ldc&args=-O3%20-release

```
LDC -O3 -release
              Range   Procedural
Stringtest: ["267ns", "11ns"]
Numbertest: ["393ns", "153ns"]


DMD -O -inline -release
              Range   Procedural
Stringtest: ["329ns", "8ns"]
Numbertest: ["1237ns", "282ns"]
```

The first range test is an opcode scanner I wrote for an 
assembler. The range code is very nice and it works, but it 
needlessly allocates a new string. So I switched to a procedural 
version, which runs (and compiles) faster, though that procedural 
version did have some bugs initially.
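
I can't paste the whole scanner here, but schematically the 
difference looks something like this (hypothetical toy code, not 
the real scanner):

```
import std.algorithm.searching : until;
import std.ascii : isWhite;
import std.conv : to;

// Range style: reads nicely, but materialising the lazy `until`
// range into a string allocates on every call.
string opcodeRange(string line)
{
    return line.until!isWhite.to!string;
}

// Procedural style: returns a slice of the original line,
// so nothing is allocated.
string opcodeProc(string line)
{
    size_t i = 0;
    while (i < line.length && line[i] != ' ' && line[i] != '\t')
        i++;
    return line[0 .. i];
}
```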

The second test is a simple number calculation. I thought the 
range code would inline to roughly the same code as the 
procedural version, so it could be optimized the same, but there 
remains a factor-of-2 gap. I don't know where the difficulty 
lies, but I did notice that switching the maximum number from an 
int to an enum makes the procedural version take 0 ns (calculated 
at compile time), while LDC can't deduce the outcome in the range 
version (which still runs for >300 ns).
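
To give an idea of the shape of that second test (the real 
benchmark is in the gist above; this is just a hypothetical 
reconstruction of the pattern):

```
import std.algorithm.iteration : map, sum;
import std.range : iota;
import std.stdio : writeln;

enum limit = 100_000; // with `enum` instead of `int`, the procedural
                      // loop below can be folded to a constant

long procedural()
{
    long total = 0;
    foreach (i; 0 .. limit)
        total += cast(long) i * i;
    return total;
}

long withRanges()
{
    // the same computation expressed as a range pipeline
    return iota(0, limit).map!(i => cast(long) i * i).sum;
}

void main()
{
    writeln(procedural(), " ", withRanges());
}
```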

