Looking for champion - std.lang.d.lex

Walter Bright newshound2 at digitalmars.com
Mon Oct 25 00:36:42 PDT 2010


Nick Sabalausky wrote:
> "Walter Bright" <newshound2 at digitalmars.com> wrote in message 
> news:ia34up$ldb$1 at digitalmars.com...
>> In the regexp code, I provided special regexes for email addresses and 
>> URLs. Those are hard to get right, so it's a large convenience to provide 
>> them.
>>
>> Also, many literals can be fairly complex, and evaluating them can produce 
>> errors (such as integer overflow in the numeric literals). Having canned 
>> ones makes it much quicker for a user to get going.
>>
> 
> I'm not sure what exactly you're suggesting in these two paragraphs? (Or 
> are you just commenting?)

Does Goldie's lexer not convert numeric literals to integer values?
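
To make the overflow point concrete, here is a minimal sketch of the kind of conversion a canned numeric-literal routine has to do. The name parseDecimal and its signature are hypothetical, made up for illustration; this is not Goldie's API or anything in Phobos. It only handles plain decimal digits.

```d
import std.stdio;

/// Hypothetical helper: converts a decimal integer literal to a ulong.
/// Returns false on empty input, a non-digit character, or overflow.
bool parseDecimal(const(char)[] text, out ulong value)
{
    if (text.length == 0)
        return false;
    ulong result = 0;
    foreach (char c; text)
    {
        if (c < '0' || c > '9')
            return false;
        uint digit = c - '0';
        // Check before multiplying so the arithmetic never wraps around.
        if (result > (ulong.max - digit) / 10)
            return false;
        result = result * 10 + digit;
    }
    value = result;
    return true;
}

void main()
{
    ulong v;
    assert(parseDecimal("18446744073709551615", v) && v == ulong.max);
    assert(!parseDecimal("18446744073709551616", v)); // one past ulong.max
    writeln("ok");
}
```

A real lexer would also need hex, octal, binary, suffixes and embedded underscores, which is part of why having this canned saves users time.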


>> I'm guessing that a numeric literal is returned as a string. Is this 
>> string allocated on the heap? If so, it's a performance problem. Storage 
>> allocation costs figure large when trying to lex millions of lines.
>>
> 
> Good point. I've just checked and there is allocation going on for each 
> terminal lexed. But thanks to D's awesomeness, I can easily fix that to just 
> use a slice of the original source string. I'll do that...

Are all tokens returned as strings?
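
For reference, the slice approach mentioned above can be sketched roughly as follows. Token, isIdentChar and lexWord are hypothetical names invented for this example, not Goldie's actual types; the sketch only assumes the whole source is held in one array.

```d
import std.stdio;

/// Hypothetical token type: `text` is a slice of the source, not a copy.
struct Token
{
    const(char)[] text;
}

bool isIdentChar(char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
        || (c >= '0' && c <= '9') || c == '_';
}

/// Lexes one identifier-like token starting at `pos` and advances `pos`.
Token lexWord(const(char)[] source, ref size_t pos)
{
    immutable start = pos;
    while (pos < source.length && isIdentChar(source[pos]))
        ++pos;
    // Slicing yields a (pointer, length) view; nothing is allocated.
    return Token(source[start .. pos]);
}

void main()
{
    string source = "foo bar";
    size_t pos = 0;
    auto tok = lexWord(source, pos);
    assert(tok.text == "foo");
    // The token points into the original buffer rather than at a copy.
    assert(tok.text.ptr == source.ptr);
    writeln(tok.text);
}
```

The catch is that every token keeps the whole source buffer alive, which is usually an acceptable trade when lexing large files.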


>> Long files aren't a problem. That's why we have .di files! I worry more 
>> about clutter.
> 
> I really find long files to be a pain to read and edit. It would be nice if 
> #5005 ( http://d.puremagic.com/issues/show_bug.cgi?id=5005 ) could get done. 
> Then, modules with a lot of code could be broken down as appropriate for 
> their maintainers without having to bother the users with the "module 
> blah.all" workaround (which Goldie currently uses, but I realize isn't 
> normal Phobos style). AIUI, .di files don't really solve that.
> 
> There is one other minor related issue, though. One of my big 
> principles for Goldie is flexibility. So in addition to the basic API that 
> most people would use, I like to expose lower-level APIs for people who 
> might want to sidestep certain parts of Goldie, or provide other 
> less-typical but potentially useful things. But such things shouldn't be 
> automatically imported for typical users, so that sort of stuff would be 
> best left to a separate-but-related module.

If I may suggest, leave the low-level stuff out of the API until demand for it 
justifies it. It's hard to predict just what will be useful, so I suggest 
conservatism rather than the kitchen sink. It can always be added later, but 
it's really hard to remove.


> Maybe it's just too late over here for me, but can you be more specific on 
> "clutter"? Do you mean like API clutter?

That too, but I meant a clutter of files. Long files aren't a problem with D.

