Lexer in D
Dmitry Olshansky
dmitry.olsh at gmail.com
Sat Mar 2 09:33:08 PST 2013
02-Mar-2013 18:53, Namespace пишет:
>> That would be slower as canFind is linear search.
>> Simpler way is use first letter and length to select a bucket of
>> keywords.
>
> I think I get it.
> Did you mean something like this?
>
No-no and no.
Just translate list of keywords/types to a bunch of switch-cases that
will give you near optimal performance. Use CTFE to construct that
string of D code.
In fact I proposed to do just 2 levels of switches: first switch by
length then by first char.
Even simple if you don't want to go for CTFE code-generation is to make
plain array of array. The length of array is exactly 256 i.e.
Even this should have decent speed:
string[][256] keywords = [
null,
null,
...
//at index 'i'
['int', 'if', 'invariant', 'idouble', 'ifloat' ...]
//array of strings starting with char for each
...
];
bool isKeyword(string value)
{
string[] toSearch = keywords[value[0]];
return toSearch.canFind(value);
}
Better yet only store [1..$] parts of keywords and search for value[1..$].
Even faster is to make the same for each possible length of keyword.
> [code]
> immutable ubyte[char] typeMap;
> immutable ubyte[char] keywordMap;
>
> static this() {
> typeMap = [
> 'd' : 7, 'l' : 16, 'i' : 12, 'u' : 20, 'b' : 0, 'f' : 10, 'r' :
> 17, 'v' : 25, 'c' : 2, 's' : 18, 'w' : 26
> ];
>
> keywordMap = [
> '_' : 0, 'a' : 6, 'b' : 12, 'c' : 14, 'd' : 20, 'e' : 26, 'f' :
> 30, 'g' : 36, 'i' : 37, 'l' : 45,
> 'm' : 46, 'n' : 49, 'o' : 52, 'p' : 54, 'r' : 60, 's' : 62, 't'
> : 69, 'u' : 77, 'v' : 79, 'w' : 81
> ];
> }
>
> bool isKeyword(string value) pure nothrow {
> if (value[0] !in keywordMap) return false;
>
> foreach (string kword; keywords[keywordMap[value[0]] .. $]) {
> if (kword == value) return true;
> if (kword[0] != value[0]) return false;
> }
>
> return false;
> }
>
> bool isType(string value) pure nothrow {
> if (value[0] !in typeMap) return false;
>
> foreach (string type; types[typeMap[value[0]] .. $]) {
> if (type == value) return true;
> if (type[0] != value[0]) return false;
> }
>
> return false;
> }
> [/code]
>
> I'm now by ~230 msecs.
--
Dmitry Olshansky
More information about the Digitalmars-d-learn
mailing list