Request for comments: std.d.lexer
Brian Schott
briancschott at gmail.com
Sun Jan 27 16:53:02 PST 2013
On Sunday, 27 January 2013 at 23:49:11 UTC, Walter Bright wrote:
> On 1/27/2013 1:39 PM, Brian Schott wrote:
>> The file name is accepted for eventual error reporting
>> purposes.
>
> Use an OutputRange for that.
I think you misunderstand. The file name is so that if you pass
in "foo.d" the lexer can say "Error: unterminated string literal
beginning on line 123 of foo.d". It's not so that error messages
will be written to a file of that name.
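To be concrete, here's the sort of thing having the file name enables. This is just a sketch with made-up names, not the lexer's actual API:

import std.string : format;

// Hypothetical helper, not part of std.d.lexer: builds the kind of
// diagnostic message the file name parameter is there for.
string lexerError(string fileName, size_t line, string message)
{
    return format("Error: %s beginning on line %s of %s", message, line, fileName);
}

// lexerError("foo.d", 123, "unterminated string literal")
// => "Error: unterminated string literal beginning on line 123 of foo.d"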
On the topic of performance, I realized that the numbers posted
previously were actually for a debug build. Fail.
For whatever reason, the current version of the lexer code isn't
triggering my heisenbug[1] and I was able to build with -release
-inline -O.
Here's what avgtime has to say:
$ avgtime -q -h -r 200 dscanner --tokenCount
../phobos/std/datetime.d
------------------------
Total time (ms): 51409.8
Repetitions : 200
Sample mode : 250 (169 ocurrences)
Median time : 255.57
Avg time : 257.049
Std dev. : 4.39338
Minimum : 252.931
Maximum : 278.658
95% conf.int. : [248.438, 265.66] e = 8.61087
99% conf.int. : [245.733, 268.366] e = 11.3166
EstimatedAvg95%: [256.44, 257.658] e = 0.608881
EstimatedAvg99%: [256.249, 257.849] e = 0.800205
Histogram :
msecs: count normalized bar
250: 169 ########################################
260: 22 #####
270: 9 ##
Which works out to 1,327,784 tokens per second on my Ivy Bridge
i7.
I created a small program that demangles the output of valgrind
so that tools like KCachegrind can display profiling information
more clearly. It's now on the wiki[2].
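The idea is nothing more than running each line of the callgrind output through a demangler before handing it to KCachegrind. A rough sketch (not the actual wiki program; the regex and overall structure here are just assumptions):

import std.stdio;
import std.regex;
import std.demangle : demangle;

// Sketch only: read valgrind/callgrind output on stdin and rewrite
// mangled D symbols (which start with "_D") into readable names.
void main()
{
    auto mangledName = regex(r"_D[0-9a-zA-Z_]+");
    foreach (line; stdin.byLine())
    {
        // std.demangle returns its input unchanged if it isn't a
        // valid D mangle, so this is safe to apply to every match.
        writeln(replaceAll!(m => demangle(m.hit.idup))(line.idup, mangledName));
    }
}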
The bottleneck in std.d.lexer as it stands is the appender
instances that assemble Token.value during iteration, and front()
on the array of char[]. (As I'm sure everyone expected.)
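To make that concrete, the difference is roughly this (a sketch with made-up names, not the real lexer code): the first version copies every character of a token's value into a freshly allocated buffer through an Appender, while the second just slices the input and allocates nothing per token.

import std.array : appender;
import std.ascii : isAlphaNum;

struct Token { string value; }

// Appender-based: each token value is copied, character by character,
// into a newly allocated buffer.
Token lexIdentifierAppender(string source, ref size_t i)
{
    auto value = appender!string();
    while (i < source.length && (isAlphaNum(source[i]) || source[i] == '_'))
        value.put(source[i++]);
    return Token(value.data);
}

// Slice-based: the token value is just a view into the original source,
// so no per-token allocation or copying happens.
Token lexIdentifierSlice(string source, ref size_t i)
{
    immutable start = i;
    while (i < source.length && (isAlphaNum(source[i]) || source[i] == '_'))
        ++i;
    return Token(source[start .. i]);
}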
[1] http://forum.dlang.org/thread/bug-9353-3@http.d.puremagic.com%2Fissues%2F
[2] http://wiki.dlang.org/Other_Dev_Tools