Issues with std.regex

MrAppleseed email at email.com
Sat Feb 16 13:33:27 PST 2013


On Saturday, 16 February 2013 at 20:35:48 UTC, H. S. Teoh wrote:
> On Sat, Feb 16, 2013 at 09:22:07PM +0100, MrAppleseed wrote:
>> Hey all,
>> 
>> I'm currently trying to port a small toy language I wrote in Java a
>> while back to D. However, a main part of my lexical analyzer was
>> regular expression matching, which I've been having issues with in
>> D. The regular expression in question is as follows:
>> 
>> [ 0-9a-zA-Z.*=+-;()\"\'\[\]<>,{}^#/\\]
>> 
>> This worked well enough in Java to produce a series of tokens that I
>> could then pass to my parser. But when I tried to port this into D,
>> I almost always got an error when using brackets, braces, or
>> parentheses. I've tried several different combinations, looked
>> through the std.regex library reference, Googled this issue, tested
>> my regular expression in several online regex testers (primarily
>> http://regexpal.com/ and http://regexhelper.com/), and even looked
>> it up in the book "The D Programming Language" (good book, by the
>> way), yet I still can't get it working right. Here's the code I've
>> been using:
>> 
>> ...
>> auto tempCont = cast(char[])read(location, fileSize);
>> string contents = cast(string)tempCont;
>> auto reg = regex("[ 0-9a-zA-Z.*=+-;()\"\'\[\]<>,{}^#/\\]");
>
> The problem is that you're using D's double-quoted string literal,
> which adds another level of interpretation to the \'s. What you
> should do is use the backtick string literal, which does *not*
> interpret backslashes:
>
> auto reg = regex(`[ 0-9a-zA-Z.*=+-;()\"\'\[\]<>,{}^#/\\]`);
>
> If you have trouble typing `, you can also use r"...", which means
> the same thing.
>
> Hope this helps.
>
>
> --T
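
For anyone following along, here is a minimal sketch of how the three literal forms mentioned above treat a backslash. The variable names are made up for illustration and are not from the original code.

```d
import std.stdio;

// Double-quoted literal: the compiler processes escapes, so \\ becomes one backslash.
immutable string cooked = "\\";
// Backtick (WYSIWYG) literal: nothing is interpreted, both backslashes survive.
immutable string backtick = `\\`;
// r"..." literal: behaves the same as the backtick form.
immutable string rawQuoted = r"\\";

void main()
{
    writeln(cooked.length);         // number of characters the compiler kept
    writeln(backtick.length);
    writeln(backtick == rawQuoted); // both raw forms hold identical text
}
```

As far as I know, a double-quoted literal also refuses escapes the compiler does not recognize, such as \[, which is another reason to prefer the raw forms for regex patterns.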

Thanks for the quick reply!

I replaced the double-quotes with backticks, compiled it with no 
problems, but on the first run I got a similar error:

std.regex.RegexException@/usr/include/dmd/phobos/std/regex.d(1942): invalid escape sequence
Pattern with error: `[ 0-9a-zA-Z.*=+-;()\"` <--HERE-- `\'[]<>,{}^#/\\]`

After removing the invalid escape sequence, I compiled it, once 
again with no problems, and ran it, but this time I got a 
different RegexException:

std.regex.RegexException@/usr/include/dmd/phobos/std/regex.d(1942): wrong CodepointSet
Pattern with error: `[ 0-9a-zA-Z.*=+-;()"'[]` <--HERE-- `<>,{}^#/\\]`

(Entire error here: http://pastebin.com/Su9XzbXW)
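
A possible explanation for the second error, offered as an assumption rather than a confirmed diagnosis: std.regex supports set operations inside character classes, so an unescaped [ within a class may be read as the start of a nested set, and [] would then be an empty one. The sketch below keeps the backslashes on the brackets and on the backslash itself, and writes the quotes unescaped. The helper name hasTokenChar is hypothetical and exists only for this illustration.

```d
import std.regex;
import std.stdio;

// Hypothetical helper: true if `s` contains at least one character from the class.
bool hasTokenChar(string s)
{
    // Backtick literal, so every backslash reaches std.regex untouched.
    // Only [ ] and \ are escaped; " and ' are written as-is.
    auto tokenChars = regex(`[ 0-9a-zA-Z.*=+-;()"'\[\]<>,{}^#/\\]`);
    return !matchFirst(s, tokenChars).empty;
}

void main()
{
    writeln(hasTokenChar("[")); // a bracket is part of the class
    writeln(hasTokenChar("@")); // '@' is not listed in the class
}
```

Note that +-; in the class is still a range (from + to ;), just as it was in the Java version.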

