DMD 1.021 and 2.004 releases

Kirk McDonald kirklin.mcdonald at gmail.com
Tue Sep 11 12:09:58 PDT 2007


Jascha Wetzel wrote:
> Jari-Matti Mäkelä wrote:
> 
>> Kirk McDonald wrote:
>>
>>> Walter Bright wrote:
>>>
>>>> Stewart Gordon wrote:
>>>>
>>>>> Maybe.  But still, nested comments are probably likely to be supported
>>>>> by more code editors than such an unusual feature as delimited strings.
>>>>
>>>>
>>>> Delimited strings are standard practice in Perl. C++0x is getting
>>>> delimited strings. Code editors that can't handle them are going to
>>>> become rapidly obsolete.
>>>>
>>>> The more unusual feature is the token delimited strings.
>>>
>>> Which, since there's no nesting going on, are actually very easy to
>>> match. The Pygments lexer matches them with the following regex:
>>>
>>> q"([a-zA-Z_]\w*)\n.*?\n\1"
>>
>>
>> It's great to see Pygments handles so many possible syntaxes.
>> Unfortunately backreferences are not part of regular expressions. I've
>> noticed two kinds of problems in tools:
>>
>> a) some can't handle backreferences, but provide support for nested
>> comments as a special case. So comments are no problem then, but all
>> delimited strings are.
>>
>> b) some lexers handle both nested comments and delimited strings, but
>> all delimiters must be enumerated in the language definition. Even
>> worse, some highlighters only handle delimited comments, not strings.
>>
>> Maybe the new features (= one saves on average < 5 characters of
>> typing per string) are more important than tool support? Maybe all
>> tools should be rewritten in Python & Pygments?
> 
> 
> D's delimited strings can (luckily) be scanned with regular languages, 
> because the enclosing double quotes are required. else the lexical 
> structure wouldn't even be context free and a nightmare for 
> automatically generated lexers.
> therefore you can match q"[^"]*" and check the delimiters during 
> (context sensitive) semantic analysis.

Is the following a valid string?

q"/foo " bar/"

The grammar does not make it clear. The Pygments lexer treats it as 
though it is, under the assumption that the string continues until the 
first matching /" is found.
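
As a rough illustration of that assumption, here is a sketch in Python. It 
is not the actual Pygments rule: the names DELIMITED and QUOTE_ONLY and 
the delimiter character class are made up for this example, and nesting 
delimiters such as ( and [ are deliberately left out. A lazy match with a 
backreference takes the whole literal, while the simpler q"[^"]*" pattern 
suggested above stops at the embedded quote:

```python
import re

# Hypothetical pattern: one non-word, non-space, non-bracket character is
# the delimiter, and the literal runs to the first delimiter followed by '"'.
DELIMITED = re.compile(r'q"([^\w\s"()\[\]{}<>])(.*?)\1"', re.DOTALL)

# The quote-only pattern from the quoted message above.
QUOTE_ONLY = re.compile(r'q"[^"]*"')

source = 'q"/foo " bar/"'

whole = DELIMITED.match(source)
partial = QUOTE_ONLY.match(source)

print(whole.group(0))    # the entire literal
print(whole.group(2))    # the contents between the delimiters
print(partial.group(0))  # stops at the quote inside the literal
```

Whether the first result is what the language intends is exactly the open 
question; the sketch only shows what each pattern does with the input.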

Walter also said, in another branch of the thread, that this is not valid:

q"/foo/bar/"

Since it isn't all /that/ hard to match these examples, I wonder why 
they are disallowed. Just to simplify the lexer that much more?
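
To make the comparison concrete, here is a sketch of two hand-written 
scanners in Python. Both function names are hypothetical, and the "strict" 
rule is only my guess at what a simpler lexer might do (treat the first 
occurrence of the delimiter as the end of the string); it is not taken 
from the grammar.

```python
def scan_lenient(src, start=0):
    """End index of the literal, closing at the first delimiter that is
    immediately followed by a double quote. Returns None if unterminated."""
    if not src.startswith('q"', start) or start + 2 >= len(src):
        return None
    delim = src[start + 2]
    end = src.find(delim + '"', start + 3)
    return None if end == -1 else end + 2


def scan_strict(src, start=0):
    """Assumed stricter rule: the first occurrence of the delimiter closes
    the string and must be followed by a double quote, else it is an error
    (signalled here by returning None)."""
    if not src.startswith('q"', start) or start + 2 >= len(src):
        return None
    delim = src[start + 2]
    end = src.find(delim, start + 3)
    if end == -1 or src[end + 1:end + 2] != '"':
        return None
    return end + 2


for literal in ('q"/foo " bar/"', 'q"/foo/bar/"'):
    print(literal, scan_lenient(literal), scan_strict(literal))
```

Under these assumptions the lenient scanner accepts both literals, while 
the strict one accepts the first and rejects the second. The two differ 
by only a couple of lines, so the saving in lexer complexity looks small.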

And, ah! I have found a bug in the Pygments lexer already:

auto a = q"/foo/";
auto b = q"/bar/";

Everything from the opening of the first string literal to the end of 
the second is highlighted. Oops. I have a fix for the lexer; dsource 
will be updated at some point.
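
For illustration only, one way to get this symptom is a greedy quantifier 
in the pattern. That is an assumption made for this sketch, not a 
description of the actual bug or of the fix. GREEDY and LAZY are 
hypothetical names, and the delimiter class is simplified:

```python
import re

source = 'auto a = q"/foo/";\nauto b = q"/bar/";'

# Greedy: '.*' runs to the last delimiter-plus-quote in the input.
GREEDY = re.compile(r'q"([^\w\s"()\[\]{}<>])(.*)\1"', re.DOTALL)

# Lazy: '.*?' stops at the first delimiter-plus-quote.
LAZY = re.compile(r'q"([^\w\s"()\[\]{}<>])(.*?)\1"', re.DOTALL)

greedy_matches = [m.group(0) for m in GREEDY.finditer(source)]
lazy_matches = [m.group(0) for m in LAZY.finditer(source)]

print(len(greedy_matches))  # 1: a single span covering both literals
print(lazy_matches)         # ['q"/foo/"', 'q"/bar/"']
```

The greedy version swallows everything between the two literals, which 
matches the highlighting described above; the lazy version finds each 
literal separately.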

-- 
Kirk McDonald
http://kirkmcdonald.blogspot.com
Pyd: Connecting D and Python
http://pyd.dsource.org


