Search a file skipping whitespace

Willy Martinez wmartinez at thisisnotmyrealemail.com
Sat Jul 16 13:41:11 PDT 2011


== Quote from Dmitry Olshansky (dmitry.olsh at gmail.com)'s article
> If you wish to avoid storing all of this in an array by using e.g.
> filter _and_ use Boyer-Moore search on it then: No, you can't do that.
> The reason is that filter is a ForwardRange, with the important consequence
> that you can't look at an arbitrary Nth element in O(1). And Boyer-Moore
> requires such access to be efficient anywhere.
> Why doesn't filter provide O(1) random access? Because to get the Nth
> element you'd need to check at least N (and potentially unlimited)
> number of elements before it, in case they get filtered out.
> > Any help?
> If I had this sort of problem I'd use something along these lines:
> auto file = File("yourfile");
> foreach (line; file.byLine)
> {
>      // this copies all digits to a new array
>      auto onlyDigits = array(filter!((x){ return !isWhite(x); })(line));
>      auto result = find(onlyDigits, ... ); // your query here
>      // ...
> }
> > Thanks

I don't mind storing it in memory. Each .txt file is around 20MB so the filtered
string should be even smaller.

Still, calling array gives this error:

..\..\src\phobos\std\algorithm.d(3252): Error: function
std.algorithm.BoyerMooreFinder!(result,string).BoyerMooreFinder.beFound (string
haystack) is not callable using argument types (dchar[])
..\..\src\phobos\std\algorithm.d(3252): Error: cannot implicitly convert
expression (haystack) of type dchar[] to string
..\..\src\phobos\std\algorithm.d(3252): Error: cannot implicitly convert
expression (needle.beFound((__error))) of type string to dchar[]
search_seq.d(13): Error: template instance
std.algorithm.find!(dchar[],result,string) error instantiating



From this code:

import std.algorithm;
import std.array;
import std.file;
import std.stdio;

void main(string[] args) {
	auto needle = boyerMooreFinder(args[1]);
	foreach (string name; dirEntries(".", SpanMode.shallow)) {
		if (name[$-3 .. $] == "txt") {
			writeln(name);
			string text = readText(name);
			auto haystack = array(filter!("a >= '0' && a <= '9'")(text));
			auto result = find(haystack, needle);
			writeln(result);
		}
	}
}
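
Reading the messages above, the mismatch seems to be that filter treats a
string as a range of dchar, so array() hands back a dchar[], while the finder
built from args[1] only accepts a string. A minimal sketch of one possible
workaround, assuming it is acceptable to re-encode the filtered digits back
into a string with std.conv.to (digitsOnly is a hypothetical helper name and
the sample input is made up for illustration):

```d
import std.algorithm : boyerMooreFinder, filter, find;
import std.array : array;
import std.conv : to;
import std.stdio : writeln;

// Hypothetical helper, not part of Phobos: keeps only the ASCII digits of
// `text` and returns them as a string instead of a dchar[].
string digitsOnly(string text)
{
    // filter iterates the string as dchar, so array() yields dchar[];
    // to!string re-encodes that array as UTF-8.
    return to!string(array(filter!("a >= '0' && a <= '9'")(text)));
}

void main()
{
    // Made-up sample input: digits separated by assorted whitespace.
    string text = "12 3\n45\t678 9";
    auto needle = boyerMooreFinder("4567");

    // Haystack and needle are both string now, so find can call beFound.
    string haystack = digitsOnly(text);
    auto result = find(haystack, needle);
    writeln(result);
}
```

This keeps a second copy of the digits in memory, which should be fine for
files of around 20MB.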


I'm using DMD 2.054 on Windows, if that helps.

