Proposal for SentinelInputRange

Thu Feb 28 06:47:42 PST 2013

On 02/28/13 15:20, Jacob Carlborg wrote:
> On 2013-02-28 15:08, Artur Skawina wrote:
> 
>> Having said that, I've used this approach in a D lexer, and it does not really
>> matter in practice - avoiding the length (or '\0' sentinel) check makes a
>> <~1ms difference when lexing "datetime.d" sized objects (1.5Mbytes+, 460k+ tokens).
>> Which is practically irrelevant both in an IDE context and a compiler context
>> - other processing will be orders of magnitude more expensive. An IDE doesn't
>> need to re-lex the whole file after every key press and 1ms won't make any
>> difference for a compiler run.
> 
> It's not about lexing a single file like std.datetime. We're talking about being able to quickly lex, I don't know, 100 or 1000 files like std.datetime.

Define "fast". Lexing std.datetime takes at most ~10-20ms (possibly a single-digit
ms number, but I'd need to write some code to check the actual number). Smaller
objects take proportionally less. Meaning you'll be I/O bound; even /one/ (disk)
cache miss will have more impact than this kind of optimization.
Lexing a hundred small files or one file 100x as big is basically the same operation;
the difference will be in I/O + setup/teardown costs, which will be /outside/ the
lexer, so they aren't affected by how it accesses input.
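
For reference, the two input-access styles being compared look roughly like
the sketch below. It is a minimal illustration in C, not the D lexer I measured:
the function names, the is_ident helper and the toy token rules (identifier runs
plus single-character tokens) are all made up for this example. The only point
is where the end-of-input check sits in each loop.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: characters that may appear in an identifier. */
static int is_ident(unsigned char c)
{
    return isalnum(c) || c == '_';
}

/* Length-checked access: every advance is guarded by an explicit i < len. */
static size_t count_tokens_bounded(const char *s, size_t len)
{
    size_t i = 0, n = 0;
    while (i < len) {
        unsigned char c = (unsigned char)s[i];
        if (isspace(c)) { i++; continue; }   /* skip whitespace */
        n++;
        if (is_ident(c)) {
            /* consume the identifier, re-checking the bound each step */
            while (i < len && is_ident((unsigned char)s[i]))
                i++;
        } else {
            i++;                             /* single-character token */
        }
    }
    return n;
}

/* Sentinel access: assumes the buffer ends in '\0' and that '\0' never
 * occurs inside the input. The sentinel is neither whitespace nor an
 * identifier character, so the inner loop stops on it without a
 * separate bounds test. */
static size_t count_tokens_sentinel(const char *s)
{
    size_t n = 0;
    for (;;) {
        unsigned char c = (unsigned char)*s;
        if (c == '\0')
            return n;                        /* end of input */
        if (isspace(c)) { s++; continue; }   /* skip whitespace */
        n++;
        if (is_ident(c)) {
            while (is_ident((unsigned char)*s))
                s++;                         /* no bound check needed */
        } else {
            s++;                             /* single-character token */
        }
    }
}
```

Both functions return the same count for the same NUL-terminated input; the
only difference is the extra comparison per character in the bounded version,
which is the cost I'm saying gets lost in the noise next to I/O.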

artur