Fuzzy string matching?

dsmith ds at nomail.com
Fri Jul 15 22:07:38 PDT 2011


Until recently, you could simply use std.regexp.search(target_string, find_string), but regexp is apparently no longer in Phobos.  I am looking for a simple substitute.  std.algorithm.canFind might work, since it
returns a bool.

Maybe try something like:

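import std.algorithm : canFind;

foreach(ref str; strings)                    // ref, so the append below changes the array element
    foreach(fls; system_files)
        if(canFind(fls, str))                // exact substring test, not a fuzzy match; usage needs verification
            str ~= ".ext";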
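
Since canFind only tests for an exact substring, it would miss cases like "lmbrghinione" vs. "lamborghini". A rough alternative is sketched below, assuming std.algorithm.levenshteinDistance is available in your Phobos version. The helper closestName, the length-normalized score, and the 0.6 cutoff are my own assumptions for illustration, not anything Phobos provides, and the cutoff would need tuning on real file names.

```d
import std.algorithm : levenshteinDistance, max;
import std.path : baseName, stripExtension;
import std.stdio : writeln;

// Hypothetical helper (not part of Phobos): returns the known name whose
// length-normalized edit distance to the file's stem is smallest, or an
// empty string if no candidate scores at or below `cutoff`.
string closestName(string fileName, string[] knownNames, double cutoff)
{
    string stem = fileName.baseName.stripExtension;
    string best = "";
    double bestScore = double.infinity;

    foreach (name; knownNames)
    {
        size_t longest = max(stem.length, name.length);
        if (longest == 0)
            continue; // nothing to compare

        // 0.0 means identical, 1.0 means nothing in common.
        double score = cast(double) levenshteinDistance(stem, name) / longest;
        if (score < bestScore)
        {
            bestScore = score;
            best = name;
        }
    }
    return bestScore <= cutoff ? best : "";
}

void main()
{
    string[] strings = ["food", "lamborghini", "architecture"];
    string[] systemFiles = ["foo.ext", "lmbrghinione.ext", "archtwo.ext"];

    foreach (file; systemFiles)
    {
        string match = closestName(file, strings, 0.6); // 0.6 is an assumed cutoff
        if (match.length)
            writeln(file, " -> ", match, ".ext");
        else
            writeln(file, " -> no match");
    }
}
```

The score is divided by the longer length because the raw distance alone ties "archtwo" between "food" and "architecture". This only prints the proposed names; the actual renaming (e.g. with std.file.rename) is left out on purpose so you can check the matches first.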


== Repost of the article by Jonathan M Davis (jmdavisProg at gmx.com)
== Posted at 2011/07/15 22:03 to digitalmars.D.learn

On Saturday 16 July 2011 01:17:36 Andrej Mitrovic wrote:
> Is there any such method in Phobos?
>
> I have to rename some files based on a string array of known names
> which need to be fuzzy-matched to file names and then rename the files
> to the matches.
>
> E.g.:
>
> string[] strings = ["food", "lamborghini", "architecture"];
>
> files on system:
> .\foo.ext
> .\lmbrghinione.ext
> .\archtwo.ext
>
> and if there's a fuzzy match then the matched files would be renamed to:
> .\food.ext
> .\lamborghini.ext
> .\architecture.ext
>
> Perhaps there's a C library I can use for this?

You can pass a comparator function to cmp to change how comparison is done,
but it's by character, so it'll only work in the case where the number of
characters is identical. Other than that, I'd be tempted to say that there
must be a function in std.range or std.algorithm that you could get to do it,
but I'd have to go over the list and really think about it. The fact that
you're effectively comparing the whole range at once instead of just
characters makes that a lot harder though.

- Jonathan M Davis


