Let's stop parser Hell
Roman D. Boiko
rb at d-coding.com
Thu Jul 5 13:02:15 PDT 2012
On Thursday, 5 July 2012 at 19:54:39 UTC, Philippe Sigaud wrote:
> On Thu, Jul 5, 2012 at 8:28 PM, Andrei Alexandrescu
> <SeeWebsiteForEmail at erdani.org> wrote:
>
>> I'll be glad to buy for you any book you might feel you need for this.
>> Again, there are few things more important for D right now than
>> exploiting its unmatched-by-competition features to great ends. I don't
>> want the lack of educational material to hold you down. Please continue
>> working on this and let me know of what you need.
>
> That's nice of you, if a bit extreme for a public mailing list :)
> Andrei, money is no problem :)
> Interest in the field of parsing is no problem.
> Interest in D future is no problem.
> Having a demanding job, and three children, is a problem. No, scratch
> that, you know what I mean.
I have four, from 1 to 7 years old... Wouldn't call them a
problem, though :)))
> But hey, Roman is doing interesting things on keyword parsing right
> now, and changing the parser generated by Pegged is not difficult. We
> will see where this thread leads. (Roman, you should send your results
> here, because I'm still taken aback by the built-in AA speed compared
> to linear array look-up for 100 keywords).
Well, I wouldn't call those "results" yet. Just some benchmarks.
Here they are:
isKeyword_Dummy (baseline):            427 [microsec] total,   28 [nanosec / lookup].
isKeyword_Dictionary:                 1190 [microsec] total,  214 [nanosec / lookup].
isKeyword_SwitchByLengthThenByChar:    466 [microsec] total,   83 [nanosec / lookup].
isKeyword_BinaryArrayLookup:          2722 [microsec] total,  490 [nanosec / lookup].
isKeyword_LinearArrayLookup:         13822 [microsec] total, 2490 [nanosec / lookup].
isKeyword_UnicodeTrie:                1317 [microsec] total,  237 [nanosec / lookup].
isKeyword_UnicodeTrieBoolLookup:      1072 [microsec] total,  193 [nanosec / lookup].
Total: 22949 identifiers + 5551 keywords.

isKeyword_Dummy (baseline):           2738 [microsec] total,   50 [nanosec / lookup].
isKeyword_Dictionary:                 4247 [microsec] total,  242 [nanosec / lookup].
isKeyword_SwitchByLengthThenByChar:   1593 [microsec] total,   91 [nanosec / lookup].
isKeyword_BinaryArrayLookup:         14351 [microsec] total,  820 [nanosec / lookup].
isKeyword_LinearArrayLookup:         59564 [microsec] total, 3405 [nanosec / lookup].
isKeyword_UnicodeTrie:                4167 [microsec] total,  238 [nanosec / lookup].
isKeyword_UnicodeTrieBoolLookup:      3466 [microsec] total,  198 [nanosec / lookup].
Total: 104183 identifiers + 17488 keywords.
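
To make the function names in the benchmark concrete, here is a minimal sketch of four of the lookup strategies. It is written in C++ purely for illustration and is not the benchmarked code: the six-keyword table, the function names, and the use of std::unordered_set as a stand-in for a built-in associative array are all assumptions. The trie variants are not shown.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical six-keyword subset, kept sorted so binary search is valid.
static const std::vector<std::string> kKeywords = {
    "else", "for", "if", "int", "return", "while"};

// Linear array lookup: compare against every keyword in turn.
bool isKeywordLinear(const std::string& s) {
    return std::find(kKeywords.begin(), kKeywords.end(), s) != kKeywords.end();
}

// Binary array lookup: relies on kKeywords being sorted.
bool isKeywordBinary(const std::string& s) {
    return std::binary_search(kKeywords.begin(), kKeywords.end(), s);
}

// Dictionary lookup: a hash set stands in for an associative array.
bool isKeywordHashed(const std::string& s) {
    static const std::unordered_set<std::string> table(kKeywords.begin(),
                                                       kKeywords.end());
    return table.count(s) != 0;
}

// Switch by length, then by character: narrow the candidates with cheap
// integer comparisons so that at most one string comparison remains.
bool isKeywordSwitch(const std::string& s) {
    switch (s.size()) {
    case 2: return s == "if";
    case 3:
        switch (s[0]) {
        case 'f': return s == "for";
        case 'i': return s == "int";
        default:  return false;
        }
    case 4: return s == "else";
    case 5: return s == "while";
    case 6: return s == "return";
    default: return false;
    }
}

// Sanity check: all strategies must agree on every probe string.
void checkStrategiesAgree() {
    const std::vector<std::string> probes = {
        "if", "int", "for", "else", "while", "return",
        "foo", "iff", "", "returns", "In"};
    for (const std::string& p : probes) {
        const bool expected = isKeywordLinear(p);
        assert(isKeywordBinary(p) == expected);
        assert(isKeywordHashed(p) == expected);
        assert(isKeywordSwitch(p) == expected);
    }
}
```

In the switch version, most non-keywords are rejected on length or first character alone, without a full string comparison. That would be consistent with its low per-lookup time in the numbers above, though this sketch does not reproduce those measurements.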
> As Dmitry said, we might hit a CTFE wall: memory consumption,
> computation speed, ...
> (*channelling Andrei*: then we will correct whatever makes CTFE a
> problem. Right)
>
> Philippe
>
> (Hesitating between 'The Art of the Metaobject Protocol' and
> 'Compilers, Techniques and Tools', right now)