tolf and detab

bearophile bearophileHUGS at lycos.com
Sun Aug 8 12:32:50 PDT 2010


Walter Bright:
> bearophile wrote:
> > In the D code I have added an idup to make the comparison more fair, because
> > in the Python code the "line" is a true newly allocated line, you can safely
> > use it as dictionary key.
> 
> So it is with byLine, too. You've burdened D with double the amount of allocations.

I think you are wrong on two counts:

1) byLine() doesn't return a newly allocated line. You can see this with a small program:

import std.stdio: File, writeln;

void main(string[] args) {
    char[][] lines;
    auto file = File(args[1]);
    foreach (rawLine; file.byLine()) {
        writeln(rawLine.ptr);
        lines ~= rawLine;
    }
    file.close();
}


Its output shows that all "strings" (char[]) share the same pointer:

14E5E00
14E5E00
14E5E00
14E5E00
14E5E00
14E5E00
14E5E00
...
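
If the lines have to outlive the loop, each one needs its own copy. A minimal sketch of that, assuming an explicit idup per line is acceptable (readLines is a hypothetical helper name, not something from Phobos):

```d
import std.stdio : File, writeln;

// Hypothetical helper: returns every line of the file as an independent string.
string[] readLines(string path) {
    string[] lines;
    auto file = File(path);
    foreach (rawLine; file.byLine()) {
        // rawLine is a char[] that may point into a buffer byLine() reuses,
        // so idup makes an immutable copy that is safe to store.
        lines ~= rawLine.idup;
    }
    file.close();
    return lines;
}

void main(string[] args) {
    foreach (line; readLines(args[1]))
        writeln(line);
}
```

This performs one allocation per line, which is the cost the Python version also pays.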


2) You can't use rawLine directly as a string key for an associative array, whereas in Python, as I have said, you can. Currently you can, but according to Andrei this is a bug. And if it's not a bug, then I'll reopen the closed bug 4474:

http://d.puremagic.com/issues/show_bug.cgi?id=4474


> Also, I object in general to this method of making things "more fair". Using a
> less efficient approach in X because Y cannot use such an approach is not a
> legitimate comparison.

I generally agree, but this is not the case here.
In some situations you indeed don't need a newly allocated string for each iteration, for example when you just want to read and process the lines without changing or storing them. You can't do this in Python, but this is not what I want to test. As I have explained in bug 4474, this behaviour is useful, but it is acceptable only if explicitly requested by the programmer, not as the default. The language is safe, as Andrei explains there, because you are supposed to idup the char[] to use it as a key for an associative array. (If your associative array is declared as int[char[]] then it can accept rawLine values as keys, but you can clearly see those aren't strings.) This is why I have closed bug 4474.
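
To make the associative array case concrete, here is a small sketch, assuming string keys and an explicit idup on each line (countLines is a hypothetical helper name used only for illustration):

```d
import std.stdio : File, writeln;

// Hypothetical helper: counts how many times each line occurs in the file.
int[string] countLines(string path) {
    int[string] counts;
    auto file = File(path);
    foreach (rawLine; file.byLine()) {
        // Take an immutable copy before using the line as a key, so the key
        // cannot change when byLine() overwrites its buffer.
        string key = rawLine.idup;
        counts[key] = counts.get(key, 0) + 1;
    }
    file.close();
    return counts;
}

void main(string[] args) {
    foreach (line, n; countLines(args[1]))
        writeln(n, "\t", line);
}
```

This is the version that is comparable to the Python code, where every line is already a distinct object that can be used as a dictionary key.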

Bye,
bearophile

