tolf and detab

Jonathan M Davis jmdavisprog at gmail.com
Sat Aug 7 21:04:16 PDT 2010


On Friday 06 August 2010 18:50:52 Andrei Alexandrescu wrote:
> On 08/06/2010 08:34 PM, Walter Bright wrote:
> > I wrote these two trivial utilities for the purpose of canonicalizing
> > source code before checkins and to deal with FreeBSD's inability to deal
> > with CRLF line endings, and because I can never figure out the right
> > settings for git to make it do the canonicalization.
> > 
> > tolf - converts LF, CR, and CRLF line endings to LF.
> > 
> > detab - converts all tabs to the correct number of spaces. Assumes tabs
> > are 8 column tabs. Removes trailing whitespace from lines.
> > 
> > Posted here just in case someone wonders what they are.
> 
> [snip]
> 
> Nice, though they don't account for multiline string literals.
> 
> A good exercise would be rewriting these tools in idiomatic D2 and
> assess the differences.
> 
> 
> Andrei

I didn't try to handle multiline string literals, but here are my more 
idiomatic solutions:



detab:

/* Replace tabs with spaces, and remove trailing whitespace from lines.
  */

import std.array;
import std.conv;
import std.file;
import std.stdio;
import std.string;

void main(string[] args)
{
    // First argument is the tab size; the rest are the files to process.
    const int tabSize = to!int(args[1]);
    foreach(f; args[2 .. $])
        removeTabs(tabSize, f);
}


void removeTabs(int tabSize, string fileName)
{
    auto file = File(fileName);
    string[] output;

    foreach(line; file.byLine())
    {
        // Expand one tab at a time until the line has none left.
        while(true)
        {
            const tab = line.indexOf('\t');

            if(tab == -1)
                break;

            // Pad out to the next multiple of tabSize.
            const numSpaces = tabSize - tab % tabSize;

            line = line[0 .. tab] ~ replicate(" ", numSpaces) ~ line[tab + 1 .. $];
        }

        // byLine() reuses its buffer, so copy the line before storing it.
        output ~= line.stripRight().idup;
    }

    std.file.write(fileName, output.join("\n"));
}

-------------------------------------------

The three differences between mine and Walter's are that mine takes the tab size 
as the first argument, it doesn't put a newline at the end of the file, and it 
writes the file even if nothing changed (you could test for that, but when using 
byLine(), it's a bit harder). Interestingly enough, from the few tests that I 
ran, mine seems to be somewhat faster. I also happen to think that the code is 
clearer (it's certainly shorter), though that might be up for debate.
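
As a rough sketch of how the "only write if something changed" test could look, 
one option is to read the whole file with readText() instead of byLine() and 
compare the result against the original before writing. The helper names 
detabLine and detabText are made up for illustration. Unlike the code above, 
this version tracks the column itself and uses splitLines(), so it also 
normalizes line endings, and it counts columns in bytes, so it assumes mostly 
ASCII source.

```d
import std.array : appender, join, replicate;
import std.file : readText, write;
import std.string : splitLines, stripRight;

/// Expand tabs to the next multiple of tabSize and strip trailing whitespace.
/// (Hypothetical helper; columns are counted in bytes, not characters.)
string detabLine(const(char)[] line, size_t tabSize)
{
    auto result = appender!string();
    size_t column = 0;

    foreach(char c; line)
    {
        if(c == '\t')
        {
            const numSpaces = tabSize - column % tabSize;
            result.put(replicate(" ", numSpaces));
            column += numSpaces;
        }
        else
        {
            result.put(c);
            ++column;
        }
    }

    return result.data.stripRight();
}

/// Detab every line of text and join the lines with LF (no final newline).
string detabText(string text, size_t tabSize)
{
    string[] output;

    foreach(line; text.splitLines())
        output ~= detabLine(line, tabSize);

    return output.join("\n");
}

void removeTabs(size_t tabSize, string fileName)
{
    auto original = readText(fileName);
    auto result = detabText(original, tabSize);

    // Only touch the file if the contents actually changed.
    if(result != original)
        write(fileName, result);
}
```

Since the joined result has no trailing newline, a file that ends in one still 
counts as changed the first time through.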

-------------------------------------------



tolf:

/* Replace line endings with LF
  */

import std.file;
import std.string;

void main(string[] args)
{
    foreach(f; args[1 .. $])
        fixEndLines(f);
}

void fixEndLines(string fileName)
{
    auto fileStr = std.file.readText(fileName);
    auto result = fileStr.replace("\r\n", "\n").replace("\r", "\n");

    std.file.write(fileName, result);
}

-------------------------------------------

This version is ludicrously simple. And it was also faster than Walter's in the 
few tests that I ran. In either case, I think that it is definitely clearer code.
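
To see why the order of the two replace() calls matters, here is a small sketch 
with the conversion pulled out into its own function (toLF is just a name picked 
for illustration) plus a unittest. Replacing "\r\n" first keeps a CRLF from 
turning into two newlines.

```d
import std.array : replace;

/// Convert CRLF and lone CR line endings to LF.
/// (Hypothetical helper wrapping the same two replace() calls as above.)
string toLF(string text)
{
    // CRLF must go first; otherwise the "\r" pass would turn "\r\n" into "\n\n".
    return text.replace("\r\n", "\n").replace("\r", "\n");
}

unittest
{
    // Mixed endings all collapse to a single LF each.
    assert(toLF("a\r\nb\rc\nd") == "a\nb\nc\nd");
    assert(toLF("a\r\n\r\nb") == "a\n\nb");

    // The wrong order doubles every CRLF.
    assert("a\r\nb".replace("\r", "\n").replace("\r\n", "\n") == "a\n\nb");
}
```

Compiling with dmd -unittest -main should be enough to run the checks.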


I would have thought that being more idiomatic would have resulted in slower code 
than what Walter did, but interestingly enough, both programs are faster with my 
code. They might take more memory though. I'm not quite sure how to check that. 
In any case, you wanted some idiomatic D2 solutions, so there you go.

- Jonathan M Davis

