tolf and detab

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Sat Aug 7 21:59:50 PDT 2010


On 08/07/2010 11:04 PM, Jonathan M Davis wrote:
> On Friday 06 August 2010 18:50:52 Andrei Alexandrescu wrote:
>> A good exercise would be rewriting these tools in idiomatic D2 and
>> assess the differences.
>>
>>
>> Andrei
>
> I didn't try and worry about multiline string literals, but here are my more
> idiomatic solutions:
>
>
>
> detab:
>
> /* Replace tabs with spaces, and remove trailing whitespace from lines.
>    */
>
> import std.conv;
> import std.file;
> import std.stdio;
> import std.string;
>
> void main(string[] args)
> {
>      const int tabSize = to!int(args[1]);
>      foreach(f; args[2 .. $])
>          removeTabs(tabSize, f);
> }
>
>
> void removeTabs(int tabSize, string fileName)
> {
>      auto file = File(fileName);
>      string[] output;
>
>      foreach(line; file.byLine())
>      {
>          int lastTab = 0;
>
>          while(lastTab != -1)
>          {
>              const int tab = line.indexOf('\t');
>
>              if(tab == -1)
>                  break;
>
>              const int numSpaces = tabSize - tab % tabSize;
>
>              line = line[0 .. tab] ~ repeat(" ", numSpaces) ~ line[tab + 1 .. $];
>
>              lastTab = tab + numSpaces;
>          }
>
>          output ~= line.idup;
>      }
>
>      std.file.write(fileName, output.join("\n"));
> }

Very nice. Here's how I'd improve removeTabs:

#!/home/andrei/bin/rdmd
import std.conv;
import std.file;
import std.getopt;
import std.stdio;
import std.string;

void main(string[] args)
{
     uint tabSize = 8;
     getopt(args, "tabsize|t", &tabSize);
     foreach(f; args[1 .. $])
         removeTabs(tabSize, f);
}

void removeTabs(int tabSize, string fileName)
{
     auto file = File(fileName);
     string output;
     bool changed;

     foreach(line; file.byLine(File.KeepTerminator.yes))
     {
         int lastTab = 0;

         while(lastTab != -1)
         {
             const tab = line.indexOf('\t');
             if(tab == -1)
                 break;
             const numSpaces = tabSize - tab % tabSize;
             line = line[0 .. tab] ~ repeat(" ", numSpaces) ~ line[tab + 1 .. $];
             lastTab = tab + numSpaces;
             changed = true;
         }

         output ~= line;
     }

     file.close();
     if (changed)
         std.file.write(fileName, output);
}
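
The tab-stop arithmetic is easier to check in isolation. The following is 
a hedged sketch, not part of the program above: expandTabs is a 
hypothetical helper name, and it assumes a Phobos release in which 
std.array.replicate provides the string repetition that repeat(" ", n) 
performs above. Columns are counted in code units, as in the loop above.

```d
import std.array : replicate;
import std.string : indexOf;

/// Expand every tab in `line` to spaces, up to the next multiple of `tabSize`.
/// Hypothetical helper for illustration; columns are counted in code units.
string expandTabs(const(char)[] line, size_t tabSize)
{
    auto result = line.idup;

    // Replace the leftmost tab until none is left.
    for (auto tab = result.indexOf('\t'); tab != -1; tab = result.indexOf('\t'))
    {
        // Distance from the tab's column to the next tab stop (1 .. tabSize).
        const numSpaces = tabSize - tab % tabSize;
        result = result[0 .. tab] ~ replicate(" ", numSpaces) ~ result[tab + 1 .. $];
    }
    return result;
}

unittest
{
    assert(expandTabs("a\tb", 4) == "a   b");        // column 1: 3 spaces
    assert(expandTabs("abcd\te", 4) == "abcd    e"); // on a tab stop: 4 spaces
}
```

A tab that already sits on a tab stop expands to a full tabSize spaces, 
which is what tabSize - tab % tabSize yields when tab % tabSize is 0.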

> -------------------------------------------
>
> The three differences between mine and Walter's are that mine takes the tab size
> as the first argument, it doesn't put a newline at the end of the file, and it
> writes the file even if nothing changed (you could test for that, but when using
> byLine(), it's a bit harder). Interestingly enough, from the few tests that I
> ran, mine seems to be somewhat faster. I also happen to think that the code is
> clearer (it's certainly shorter), though that might be up for debate.
>
> -------------------------------------------
>
>
>
> tolf:
>
> /* Replace line endings with LF
>    */
>
> import std.file;
> import std.string;
>
> void main(string[] args)
> {
>      foreach(f; args[1 .. $])
>          fixEndLines(f);
> }
>
> void fixEndLines(string fileName)
> {
>      auto fileStr = std.file.readText(fileName);
>      auto result = fileStr.replace("\r\n", "\n").replace("\r", "\n");
>
>      std.file.write(fileName, result);
> }
>
> -------------------------------------------
>
> This version is ludicrously simple. And it was also faster than Walter's in the
> few tests that I ran. In either case, I think that it is definitely clearer code.

Very nice! You may as well guard the write with an if (result != 
fileStr). With source control etc. in the mix it's always polite to not 
touch files unless you are actually modifying them.
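
A minimal sketch of that guard, applied to the fixEndLines function 
quoted above. The bool return value is an addition for illustration 
only, and replace is imported from std.array here, which is an 
assumption about the Phobos release in use.

```d
import std.array : replace;
import std.file : readText, write;

/// Normalize "\r\n" and lone "\r" to "\n".
/// The file is rewritten only if normalization changed its contents.
/// Returns true when the file was rewritten (illustrative addition).
bool fixEndLines(string fileName)
{
    auto fileStr = readText(fileName);
    auto result = fileStr.replace("\r\n", "\n").replace("\r", "\n");

    // Already LF-only: leave the file, and its timestamp, untouched.
    if (result == fileStr)
        return false;

    write(fileName, result);
    return true;
}
```

It can be called from the same foreach(f; args[1 .. $]) loop as before; 
the return value can simply be ignored.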

This makes me think we should have a range that detects and replaces 
patterns lazily and on the fly. I've always thought that loading entire 
files in memory and working on them is "cheating" in some sense, and a 
range would help with replacing patterns in streams.
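
For the specific case of line endings, such a range might look roughly 
like the sketch below. ToLF and toLF are hypothetical names, not 
existing Phobos symbols, and this handles only the fixed "\r\n" and 
"\r" patterns rather than arbitrary ones. It assumes the range 
primitives live in std.range.primitives.

```d
import std.range.primitives : empty, front, popFront, isInputRange;

/// Lazily maps "\r\n" and lone "\r" to "\n" over any input range of characters.
/// Hypothetical sketch; nothing is buffered beyond the current element.
struct ToLF(R) if (isInputRange!R)
{
    private R source;

    @property bool empty() { return source.empty; }

    @property dchar front()
    {
        // A '\r' is always reported as '\n', whether or not a '\n' follows.
        return source.front == '\r' ? '\n' : source.front;
    }

    void popFront()
    {
        const wasCR = source.front == '\r';
        source.popFront();
        // Swallow the '\n' of a "\r\n" pair so the pair yields one '\n'.
        if (wasCR && !source.empty && source.front == '\n')
            source.popFront();
    }
}

/// Convenience constructor so the range can be used in a call chain.
auto toLF(R)(R source) if (isInputRange!R)
{
    return ToLF!R(source);
}

unittest
{
    import std.algorithm.comparison : equal;
    assert("a\r\nb\rc\n".toLF.equal("a\nb\nc\n"));
}
```

Because the range looks at one element at a time, the source never has 
to be loaded whole; a general pattern replacer would need some 
lookahead buffer instead of the single wasCR flag.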

> I would have thought that being more idiomatic would have resulted in slower code
> than what Walter did, but interestingly enough, both programs are faster with my
> code. They might take more memory though. I'm not quite sure how to check that.
> In any case, you wanted some idiomatic D2 solutions, so there you go.

Looking very good, thanks. I think we should feature these and a 
few others as examples on the website.


Andrei

