tolf and detab
Jonathan M Davis
jmdavisprog at gmail.com
Sat Aug 7 22:18:07 PDT 2010
On Saturday 07 August 2010 21:59:50 Andrei Alexandrescu wrote:
> Very nice. Here's how I'd improve removeTabs:
>
> #!/home/andrei/bin/rdmd
> import std.conv;
> import std.file;
> import std.getopt;
> import std.stdio;
> import std.string;
>
> void main(string[] args)
> {
>     uint tabSize = 8;
>     getopt(args, "tabsize|t", &tabSize);
>     foreach(f; args[1 .. $])
>         removeTabs(tabSize, f);
> }
>
> void removeTabs(int tabSize, string fileName)
> {
>     auto file = File(fileName);
>     string output;
>     bool changed;
>
>     foreach(line; file.byLine(File.KeepTerminator.yes))
>     {
>         int lastTab = 0;
>
>         while(lastTab != -1)
>         {
>             const tab = line.indexOf('\t');
>             if(tab == -1)
>                 break;
>             const numSpaces = tabSize - tab % tabSize;
>             line = line[0 .. tab] ~ repeat(" ", numSpaces) ~ line[tab + 1 .. $];
>             lastTab = tab + numSpaces;
>             changed = true;
>         }
>
>         output ~= line;
>     }
>
>     file.close();
>     if (changed)
>         std.file.write(fileName, output);
> }
Ah. I needed to close the file. I pretty much always just use readText(), so I
didn't catch that. Also, it does look like detecting whether the file changed was
a bit simpler than I thought that it would be. Quite simple really. Thanks.
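
For reference, the tab-stop arithmetic in the quoted loop can be isolated into a small function. This is only a sketch: expandTabs is a hypothetical name, it assumes ASCII text (columns are counted in code units), and it uses std.array.replicate, the current Phobos counterpart of the repeat call in the quoted code.

```d
import std.array : replicate;
import std.string : indexOf;

/// Expands every tab in `line` to the next multiple of `tabSize` columns.
/// `expandTabs` is a hypothetical helper name, not part of the quoted program.
string expandTabs(string line, size_t tabSize = 8)
{
    for (auto tab = line.indexOf('\t'); tab != -1; tab = line.indexOf('\t'))
    {
        // A tab at column `tab` advances to the next tab stop, so it is
        // worth between 1 and `tabSize` spaces.
        const numSpaces = tabSize - tab % tabSize;
        line = line[0 .. tab] ~ replicate(" ", numSpaces) ~ line[tab + 1 .. $];
    }
    return line;
}

unittest
{
    // Tab at column 3 with a tab size of 8: 8 - 3 % 8 = 5 spaces.
    assert(expandTabs("abc\tx") == "abc" ~ replicate(" ", 5) ~ "x");
    // Tab at column 0 with a tab size of 4: a full 4 spaces.
    assert(expandTabs("\tx", 4) == replicate(" ", 4) ~ "x");
    // Lines without tabs come back unchanged.
    assert(expandTabs("no tabs") == "no tabs");
}
```

Because each replacement shifts the rest of the line, the search restarts from the beginning after every tab, just as the quoted loop does.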
> Very nice! You may as well guard the write with an if (result !=
> fileStr). With source control etc. in the mix it's always polite to not
> touch files unless you are actually modifying them.
Yes. That would be good. It's the kind of thing that I forget - probably because
most of the code that I write generates new files rather than updating pre-
existing ones.
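
A minimal sketch of that guard, assuming the end-of-line program reads the whole file with readText() and converts CRLF to LF. The function name convertLineEndings is hypothetical; result and fileStr follow the names used in the quoted suggestion.

```d
import std.array : replace;
import std.file : readText, write;

/// Converts CRLF line endings in `fileName` to LF, but only writes the
/// file back if the conversion actually changed something.
/// Returns true if the file was rewritten.
/// (Hypothetical helper; the original program is not reproduced here.)
bool convertLineEndings(string fileName)
{
    string fileStr = readText(fileName);
    string result = fileStr.replace("\r\n", "\n");

    // Leave the file (and its timestamp) alone when nothing changed.
    if (result == fileStr)
        return false;

    write(fileName, result);
    return true;
}

unittest
{
    import std.file : remove, tempDir;
    import std.path : buildPath;

    auto name = buildPath(tempDir(), "tolf_guard_sketch.txt");
    write(name, "a\r\nb\r\n");
    scope(exit) remove(name);

    assert(convertLineEndings(name));       // CRLF present: file rewritten
    assert(readText(name) == "a\nb\n");
    assert(!convertLineEndings(name));      // already LF: file not touched
}
```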
>
> This makes me think we should have a range that detects and replaces
> patterns lazily and on the fly. I've always thought that loading entire
> files in memory and working on them is "cheating" in some sense, and a
> range would help with replacing patterns in streams.
It would certainly be nice to have a reasonable way to process files with ranges
without having to load the whole thing into memory at once. Most of the time, I
wouldn't care too much, but if you start processing large files, having the whole
thing in memory could be a problem (especially if you have multiple versions of
it which were created along the way as you were manipulating it). Haskell does
lazy loading of files by default and doesn't load the data until you read the
appropriate part of the string. It shouldn't be all that hard to do something
similar with D and ranges. The hard part would be doing all of it in such a
way that processing the file's data never requires loading it all into memory
(let alone loading it multiple times). I'm not sure that
you could do that without explicitly processing a file line by line, writing it
to disk after each line is processed, since you could be doing an arbitrary set
of operations on the data. It could be interesting to try and find a solution for
that though.
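
The explicit line-by-line approach mentioned above might look roughly like the following. It is a sketch under a few assumptions: the per-line operation is a CRLF-to-LF conversion, the output goes to a temporary file next to the original (the ".tmp" suffix is a made-up naming scheme), and the temporary file then replaces the original. Only one line is held in memory at a time.

```d
import std.array : replace;
import std.file : rename;
import std.stdio : File;

/// Converts CRLF to LF in `fileName` one line at a time, so the whole
/// file is never held in memory.
/// (Hypothetical helper; the name and the temp-file scheme are assumptions.)
void convertLineEndingsStreaming(string fileName)
{
    auto tmpName = fileName ~ ".tmp";
    {
        // Binary mode keeps the C runtime from translating line endings.
        auto input = File(fileName, "rb");
        auto output = File(tmpName, "wb");

        // byLine reuses its buffer, so each line is written out before
        // the next one is read.
        foreach (line; input.byLine(File.KeepTerminator.yes))
            output.write(line.replace("\r\n", "\n"));
    }   // both files are closed here, before the rename

    rename(tmpName, fileName);
}

unittest
{
    import std.file : readText, remove, tempDir, write;
    import std.path : buildPath;

    auto name = buildPath(tempDir(), "tolf_stream_sketch.txt");
    write(name, "a\r\nb\r\nc");
    scope(exit) remove(name);

    convertLineEndingsStreaming(name);
    assert(readText(name) == "a\nb\nc");
}
```

Unlike the whole-file version, this sketch always rewrites the file; tracking a changed flag inside the loop and deleting the temporary file when nothing changed would restore that courtesy.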
>
> Looking very good, thanks. I think we should feature these and a
> few others as examples on the website.
Well, I, for one, much prefer the ability to program in a manner that's closer to
telling the computer to do what I want rather than having to tell it how to do
what I want (the replace end-of-line character program being a prime example).
It makes life much simpler. Ranges certainly help a lot in that regard too. And
having good example code of how to program that way could help encourage people
to program that way and use std.range and std.algorithm and their ilk rather
than trying more low-level solutions which aren't as easy to understand.
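
As one small illustration of that style, current Phobos provides std.algorithm.iteration.substitute, which replaces patterns lazily as a range is consumed. The sketch below applies it to the end-of-line case; toLf is a hypothetical name, and the example works on an in-memory string rather than a file.

```d
import std.algorithm.comparison : equal;
import std.algorithm.iteration : substitute;

/// Lazily yields `text` with every CRLF pair replaced by a single LF.
/// `toLf` is a hypothetical name chosen for this sketch.
auto toLf(string text)
{
    // Says what should happen (CRLF becomes LF), not how to scan for it.
    return text.substitute("\r\n", "\n");
}

unittest
{
    // No replacement work is done until the range is iterated.
    assert(toLf("a\r\nb\r\n").equal("a\nb\n"));
    assert(toLf("no change").equal("no change"));
}
```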
- Jonathan M Davis