Issues with std.regex
MrAppleseed
email at email.com
Sat Feb 16 13:33:27 PST 2013
On Saturday, 16 February 2013 at 20:35:48 UTC, H. S. Teoh wrote:
> On Sat, Feb 16, 2013 at 09:22:07PM +0100, MrAppleseed wrote:
>> Hey all,
>>
>> I'm currently trying to port my small toy language I invented awhile
>> back in Java to D. However, a main part of my lexical analyzer was
>> regular expression matching, which I've been having issues with in
>> D. The regex expression in question is as follows:
>>
>> [ 0-9a-zA-Z.*=+-;()\"\'\[\]<>,{}^#/\\]
>>
>> This works well enough in Java to produce a series of tokens that I
>> could then pass to my parser. But when I tried to port this into D,
>> I almost always get an error when using brackets, braces, or
>> parentheses. I've tried several different combinations, have looked
>> through the std.regex library reference, have Googled this issue,
>> have tested my regular expression in several online-regex testers
>> (primarily http://regexpal.com/, and http://regexhelper.com/), and
>> have even looked it up in the book, "The D Programming Language"
>> (good book, by the way), yet I still can't get it working right.
>> Here's the code I've been using:
>>
>> ...
>> auto tempCont = cast(char[])read(location, fileSize);
>> string contents = cast(string)tempCont;
>> auto reg = regex("[ 0-9a-zA-Z.*=+-;()\"\'\[\]<>,{}^#/\\]");
>
> The problem is that you're using D's double-quoted string literal,
> which adds another level of interpretation to the \'s. What you
> should do is to use the backtick string literal, which does *not*
> interpret backslashes:
>
> auto reg = regex(`[ 0-9a-zA-Z.*=+-;()\"\'\[\]<>,{}^#/\\]`);
>
> If you have trouble typing `, you can also use r"...", which means
> the same thing.
>
> Hope this helps.
>
>
> --T
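
To make the difference between the literal kinds concrete, here is a minimal sketch. It is not from the thread: the helper firstDigits, the pattern \d+ and the sample input are made up for illustration, and it assumes a Phobos version that provides std.regex.matchFirst. It only shows how many backslashes reach std.regex from each kind of literal; it does not attempt to fix the character class discussed above.

```d
import std.regex : matchFirst, regex;
import std.stdio : writeln;

// Hypothetical helper (not from the thread): returns the first match of
// `pattern` in `text`, or an empty string if nothing matches.
string firstDigits(string text, string pattern)
{
    auto m = matchFirst(text, regex(pattern));
    return m.empty ? "" : m.hit;
}

void main()
{
    // Double-quoted literal: the compiler processes escapes first, so the
    // two characters \\ become ONE backslash before std.regex sees them.
    string quoted = "\\d+";

    // Backtick and r"..." literals are WYSIWYG: backslashes pass through
    // to std.regex untouched.
    string backtick = `\d+`;
    string rawQuoted = r"\d+";

    // All three spell the same three-character pattern: '\', 'd', '+'.
    assert(quoted == backtick && backtick == rawQuoted);
    assert(quoted.length == 3);

    // The same source text yields different strings per literal kind.
    assert("\\".length == 1); // one backslash after escape processing
    assert(`\\`.length == 2); // two literal backslashes

    writeln(firstDigits("abc 123 def", backtick)); // prints 123
}
```

With a WYSIWYG literal, what you type is exactly what the regex engine parses, so any remaining error comes from the pattern itself rather than from D's string escapes.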
Thanks for the quick reply!
I replaced the double-quotes with backticks, compiled it with no
problems, but on the first run I got a similar error:
std.regex.RegexException@/usr/include/dmd/phobos/std/regex.d(1942): invalid escape sequence
Pattern with error: `[ 0-9a-zA-Z.*=+-;()\"` <--HERE-- `\'[]<>,{}^#/\\]`
After removing the invalid escape sequence, I compiled it, once
again with no problems, and attempted to run it, but this time I
got a different error:
std.regex.RegexException@/usr/include/dmd/phobos/std/regex.d(1942): wrong CodepointSet
Pattern with error: `[ 0-9a-zA-Z.*=+-;()"'[]` <--HERE-- `<>,{}^#/\\]`
(Entire error here: http://pastebin.com/Su9XzbXW)