find regex in backward direction ?

Виталий Фадеев vital.fadeev at gmail.com
Sun Dec 20 04:33:21 UTC 2020


On Saturday, 19 December 2020 at 23:16:18 UTC, kdevel wrote:
> On Saturday, 19 December 2020 at 12:52:54 UTC, Виталий Фадеев 
> wrote:
>> Goal:
>>     size_t pos = findRegexBackward( r"abc"d );
>>     assert( pos == 4 );
>
>
> module LastOccurrence;
>
> size_t findRegexBackward_1 (dstring s, dstring pattern)
> {
>    import std.regex : matchAll;
>    auto results = matchAll (s, pattern);
>    if (results.empty)
>       throw new Exception ("could not match");
>    size_t siz;
>    foreach (rm; results)
>       siz = rm.pre.length;
>    return siz;
> }
>
> size_t findRegexBackward_2 (dstring s, dstring pattern)
> // this does not work with irreversible patterns ...
> {
>    import std.regex : matchFirst;
>    import std.array : array;
>    import std.range: retro;
>    auto result = matchFirst (s.retro.array, 
> pattern.retro.array);
>    if (result.empty)
>       throw new Exception ("could not match");
>    return result.post.length;
> }
>
> unittest {
>    import std.exception : assertThrown;
>    static foreach (f; [&findRegexBackward_1, 
> &findRegexBackward_2]) {
>       assert (f ("abc3abc7", r""d) == 8);
>       assert (f ("abc3abc7", r"abc"d) == 4);
>       assertThrown (f ("abc3abc7", r"abx"d));
>       assert (f ("abababababab", r"ab"d) == 10);
>    }
> }

Thanks, but it is not perfect.

We can't use reversal, because "ab\w" becomes "w\ba" when reversed 
(it is expected to match "abc", which is "cba" in the reversed text).
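To make the problem concrete, here is a small sketch of what happens to the pattern. The helper name `reversed` is made up for this example; it does the same character-wise reversal as `retro.array` in findRegexBackward_2:

```d
import std.array : array;
import std.range : retro;
import std.regex : matchFirst;

// Hypothetical helper: character-wise reversal of a dstring.
dstring reversed (dstring s)
{
    return s.retro.array.idup;
}

unittest
{
    dstring text    = "abc3abc7";
    dstring pattern = r"ab\w";

    // The forward pattern matches as expected.
    assert (!matchFirst (text, pattern).empty);

    // Reversing the characters turns the escape `\w` into `w\`,
    // so the pattern becomes a literal 'w', a word boundary, and 'a'.
    assert (reversed (pattern) == r"w\ba"d);

    // That pattern finds nothing in the reversed text "7cba3cba".
    assert (matchFirst (reversed (text), reversed (pattern)).empty);
}
```

So reversal only works for patterns that still mean the same thing when read backwards, such as plain literals.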

> size_t findRegexBackward_2 (dstring s, dstring pattern)
> ...
>    assert (f ("abc3abc7", r"ab\w"d) == 4);
> ...

Of course, I am using matchAll, but it scans the whole text in the 
forward direction.

>   size_t findRegexBackward_1 (dstring s, dstring pattern)

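     import std.regex : matchAll;

     /**
      * Returns the start index of the last match of `needle` in `s` and
      * stores the length of that match in `matchedLength`.
      * Returns size_t.max (the value that -1 converts to) if nothing matches.
      */
     size_t findRegexBackwardMatchCase( dstring s, dstring needle,
                                        out size_t matchedLength )
     {
         auto matches = matchAll( s, needle );
         if ( matches.empty )
         {
             return size_t.max; // not found
         }
         else
         {
             // matchAll scans forward; keep only the final match.
             auto last = matches.front;
             foreach ( m; matches )
             {
                 last = m;
             }
             matchedLength = last.hit.length;
             return last.pre.length;
         }
     }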

Thanks!
I am looking for the fastest solution.
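One possible direction, only a sketch and not measured: try the pattern anchored at each start position, beginning at the end of the text and moving left, and stop at the first hit. The function name `findRegexBackwardScan` and the `^(?:...)` wrapping are my own assumptions, not something std.regex provides for backward search. Two caveats: each attempt only sees the slice `s[i .. $]`, so patterns that look at preceding text (lookbehind, `\b` at the start) can behave differently, and with overlapping matches the result can differ from the last matchAll hit.

```d
import std.regex : matchFirst, regex;

// Hypothetical helper: returns the rightmost start index at which `pattern`
// matches, or size_t.max if it matches nowhere.
size_t findRegexBackwardScan (dstring s, dstring pattern,
                              out size_t matchedLength)
{
    // Anchor the pattern so it can only match at the start of a slice.
    auto re = regex ("^(?:"d ~ pattern ~ ")"d);

    // Visit start positions s.length, s.length - 1, ..., 0.
    for (size_t i = s.length + 1; i-- > 0; )
    {
        auto m = matchFirst (s[i .. $], re);
        if (!m.empty)
        {
            matchedLength = m.hit.length;
            return i;
        }
    }
    return size_t.max; // not found
}

unittest
{
    size_t len;
    assert (findRegexBackwardScan ("abc3abc7", r"ab\w"d, len) == 4);
    assert (len == 3);
    assert (findRegexBackwardScan ("abababababab", r"ab"d, len) == 10);
    assert (findRegexBackwardScan ("abc3abc7", r"abx"d, len) == size_t.max);
}
```

This is quick when the last match sits near the end of the text, but it makes one match attempt per position, so whether it beats a single forward matchAll pass depends on the input.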

Maybe something like the "RightToLeft" option in the .NET regex API:

https://docs.microsoft.com/en-us/dotnet/api/system.text.regularexpressions.regexoptions?view=net-5.0#System_Text_RegularExpressions_RegexOptions_RightToLeft

But how would that work on Linux? Are the Microsoft and Linux regex 
implementations identical?



More information about the Digitalmars-d-learn mailing list