Text editing [Was: Re: #line decoder]

Sergey Gromov snake.scaly at gmail.com
Fri Sep 26 18:03:24 PDT 2008


In article <gbgem1$1qvo$1 at digitalmars.com>, bearophileHUGS at lycos.com 
says...
> Updated timings:
> Timings, data2.txt, warm timings, best of 3:
>   loader1:  23.05 s
>   loader2:   3.00 s
>   loader3:  44.79 s
>   loader4:  39.28 s
>   loader5:  21.31 s
>   loader6:   7.20 s
>   loader7:   7.51 s
>   loader8:   8.45 s
>   loader9:   5.46 s
>   loader10:  3.73 s
>   loader10b: 3.88 s
>   loader11: 82.54 s
>   loader12: 38.87 s

You've intrigued me, so I did some experimenting, too.  I won't post my 
failed attempts, only the best one.

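import std.stream: BufferedFile;
import std.string: split;
import std.gc: enable, disable;
import rope: Rope;
void main()
{
    // disable();
    auto fin = new BufferedFile("data2.txt");
    alias Rope!(string) SRope;
    // One rope per column of the input file.
    SRope[] result;
    // The first line determines how many columns there are.
    foreach (el; split(cast(string) fin.readLine()))
    {
        result ~= new SRope;
        result[$-1] ~= el;
    }
    // Note: std.stream may reuse the line buffer between iterations,
    // so the stored slices can alias it; copy the line first
    // (e.g. split(line.idup)) if the fields must stay valid.
    foreach (char[] line; fin)
        foreach (id, el; split(cast(string) line))
            result[id] ~= el;
    // Flatten each column into a plain array.
    auto month = result[0].get();
    auto day = result[1].get();
    auto num = result[2].get();
    // enable();
}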

The rope thing is my version of an ArrayBuilder ;) :

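module rope;
// Append-only builder: elements go into fixed-size chunks, so an
// append never copies the elements stored so far.
class Rope(T)
{
    // Append one element (rope ~= el).
    typeof(this) opCatAssign(T el)
    {
        if (pool >= current.length)
        {
            // Current chunk is full: retire it and allocate a new
            // one twice as large.
            chunks ~= current;
            current = new T[current.length * 2];
            pool = 0;
        }
        current[pool++] = el;
        return this;
    }
    // Concatenate all chunks into one contiguous array.
    T[] get()
    {
        T[] result;
        foreach (c; chunks)
            result ~= c;
        return result ~ current[0 .. pool];
    }
    this()
    {
        current = new T[16];
        pool = 0;
    }
    private
    {
        T[][] chunks;   // retired, completely filled chunks
        T[] current;    // chunk being filled
        size_t pool;    // number of used slots in current
    }
}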
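
A minimal sketch of the same chunk-doubling idea for a current D2 
compiler, not the code that was timed here.  The names ChunkedBuilder 
and count are made up for illustration, and get() sizes the result once 
instead of growing it chunk by chunk, which is my own variation.

```d
import std.stdio : writeln;

// Hypothetical struct version of the Rope class above.
struct ChunkedBuilder(T)
{
    private T[][] chunks;   // retired, completely filled blocks
    private T[] current;    // block being filled
    private size_t used;    // number of used slots in current

    // Append one element (builder ~= el).
    void opOpAssign(string op)(T el) if (op == "~")
    {
        if (current.length == 0)
            current = new T[16];            // first block
        else if (used >= current.length)
        {
            chunks ~= current;              // retire the full block
            current = new T[current.length * 2];
            used = 0;
        }
        current[used++] = el;
    }

    // Total number of stored elements.
    size_t count() const
    {
        size_t n = used;
        foreach (c; chunks)
            n += c.length;
        return n;
    }

    // Copy everything into one array allocated at its final size.
    T[] get()
    {
        auto result = new T[count()];
        size_t pos = 0;
        foreach (c; chunks)
        {
            result[pos .. pos + c.length] = c[];
            pos += c.length;
        }
        result[pos .. pos + used] = current[0 .. used];
        return result;
    }
}

void main()
{
    ChunkedBuilder!int b;
    foreach (i; 0 .. 100)
        b ~= i;
    auto all = b.get();
    writeln(b.count(), " elements stored, ", all.length, " copied out");
}
```

With a first block of 16 and doubling, 100 appends fill blocks of 16 
and 32 and part of a block of 64, so only three allocations happen 
before get() is called.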

The timings are:
2.77s  the loader as it is posted here
2.99s  with the garbage collector disabled (uncomment the disable()/enable() calls)
3.2s   Python version
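
For reference, a rough sketch of how a "best of 3" comparison with and 
without the collector could be measured on a current D2 compiler.  This 
is not the harness used for the numbers above: bestOf and appendMany 
are hypothetical names, the workload is a stand-in for the loader, and 
core.memory.GC is assumed as the present-day counterpart of std.gc.

```d
import core.memory : GC;
import core.time : Duration;
import std.algorithm.comparison : min;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

// Stand-in workload (assumption): grow an array by appending.
void appendMany()
{
    int[] a;
    foreach (i; 0 .. 1_000_000)
        a ~= i;
}

// Run work several times and keep the shortest wall-clock time.
Duration bestOf(size_t runs, void function() work)
{
    auto best = Duration.max;
    foreach (_; 0 .. runs)
    {
        auto sw = StopWatch(AutoStart.yes);
        work();
        sw.stop();
        best = min(best, sw.peek());
    }
    return best;
}

void main()
{
    auto withGc = bestOf(3, &appendMany);

    GC.disable();                 // no collection cycles from here on
    auto withoutGc = bestOf(3, &appendMany);
    GC.enable();

    writefln("GC enabled:  %s ms", withGc.total!"msecs");
    writefln("GC disabled: %s ms", withoutGc.total!"msecs");
}
```

Taking the minimum over a few runs filters out one-off disturbances, 
but it will not explain a run that is slow every time.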

On a side note, I got a lot of weird timings today.  It turned out 
that in many of the runs with very bad timings, most of the processing 
time was spent on the closing brace of a function, or in a 
std.file.read call that was obviously not reading the file but doing 
some memory handling.  Something definitely goes wrong sometimes.

