[Issue 9634] std.regex.ctRegex chokes on (?:a+)

d-bugmail at puremagic.com
Wed Apr 17 13:18:13 PDT 2013


http://d.puremagic.com/issues/show_bug.cgi?id=9634



--- Comment #2 from Dmitry Olshansky <dmitry.olsh at gmail.com> 2013-04-17 13:18:11 PDT ---
(In reply to comment #1)
> Created an attachment (id=1200) [details]
> Stripped down regex parser that shows the bug
> 

I think I've pinpointed the issue in an even smaller test case (~90 lines of code).
It comes down to how CTFE handles arrays of structs.

Interestingly, the assertion in main in the code below passes if you replace the
'Bytecode' struct, which is nothing more than a single int, with a plain int. Hence
my suspicion that structs are the trigger.


struct Bytecode
{
    int raw;
}

struct Parser
{
    dchar _current;
    bool empty;
    string pat;      
    Bytecode[] ir;     

    this(string pattern)
    {
        pat = pattern;
        next();
        uint fix; // fixup pointer
        for(;;)
        {
            switch(current)
            {
            case '(':
                next();
                fix = cast(uint)ir.length;
                assert(current == '?');
                next();
                assert(current == ':');                    
                ir ~= Bytecode(-1);
                next();                    
                break;
            case ')': // CRITICAL POINT: the last closing paren
                //return; // up to this point the generated bytecode is the same
                next();
                //return; // still OK
                { // CRITICAL POINT
                    size_t cnt = ir.length-fix-1;
                    // even a simple write loop fails, with awful results
                    for(size_t i = 0; i < cnt; i++)
                    {
                        // explicit cast so this also compiles where size_t is 64-bit
                        auto bc = Bytecode(cast(int)(i+10));
                        ir[fix+i] = bc;
                    }
                }
                return; // and here it differs
            default:
                uint start = cast(uint)ir.length;
                ir ~= Bytecode(10*current);
                next();
                uint len = cast(uint)ir.length - start;
                next();
                ir ~= Bytecode(-4);
                ir ~= ir[start .. start+len];
                ir ~= Bytecode(-1);
            }
        }
    }

    @property dchar current(){ return _current; }

    bool next()
    {
        if(pat.length == 0)
        {
            empty = true;
            return false;
        }
        _current = pat[0];
        pat = pat[1..$];
        return true;
    }
}

public auto getIr(string pattern)
{
    auto ir = Parser(pattern).ir;    
    return ir;
}

void main()
{
    auto re = getIr("(?:a+)");       // evaluated at run time
    static re2 = getIr("(?:a+)");    // static initializer, evaluated via CTFE
// uncomment to see that it's the 3rd element of the two arrays that differs
/*
    import std.stdio;
    writeln("RT version");
    writeln(re);
    writeln("\n\nCT version");
    writeln(re2);
*/
    assert(re == re2);
}
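
For reference, here is a minimal sketch of the plain-int variant mentioned above. It is not part of the attached test case: buildIr is a hypothetical helper that hard-codes the same append/overwrite sequence the parser performs for "(?:a+)", and the expected array comes from tracing that sequence by hand. With int elements the run-time and compile-time results should agree.

```d
// Sketch only: the same sequence of appends and overwrites that Parser
// performs for "(?:a+)", with plain int in place of Bytecode.
// buildIr is a hypothetical name used for illustration.
int[] buildIr()
{
    int[] ir;
    size_t fix = ir.length;          // '(' : remember the fixup position (0)
    ir ~= -1;                        // placeholder written for "(?:"
    size_t start = ir.length;
    ir ~= 10 * 'a';                  // 970
    size_t len = ir.length - start;  // 1
    ir ~= -4;
    ir ~= ir[start .. start + len];  // append a copy of a slice of itself
    ir ~= -1;
    size_t cnt = ir.length - fix - 1; // ')' : overwrite all but the last element
    foreach (i; 0 .. cnt)
        ir[fix + i] = cast(int)(i + 10);
    return ir;
}

void main()
{
    auto rt = buildIr();       // evaluated at run time
    static ct = buildIr();     // static initializer, evaluated via CTFE
    assert(rt == [10, 11, 12, 13, -1]);
    assert(rt == ct);
}
```

The same trace applies to the struct version at run time, so the element values to compare against in the CTFE output are 10, 11, 12, 13, -1.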
