WordCount performance

Georg Wrede georg at nospam.org
Thu Mar 27 18:13:59 PDT 2008


Walter Bright wrote:
> Georg Wrede wrote:
> 
>> Hmm. What kind of situations would need multiple threads 
>> simultaneously reading from stdin? And if there aren't any, wouldn't 
>> locking be like wearing a life vest on land?
> 
> If it was unsynchronized, if you had two threads reading lines from the 
> input, the contents of the lines would be interleaved. With syncing, 
> each thread will get complete lines.

Yes. Now, my problem was that it should be pretty uncommon to need two 
(or, of course, more) threads reading the same input within the same 
program. It should be hugely more uncommon for stdin, since the _entry_ 
to stdin is implemented in a _single_ process anyway in the operating 
system, even if several simultaneous processes were to feed it data. 
Therefore there is no way a process can acquire data at a rate faster 
than that of a single processor. So, locking would only be relevant to 
an application that

  - is disturbed by having the input mistakenly be split into smaller 
than one-line pieces

  - is happy with a line at a time (i.e. lines are self-sufficient) vs. 
needing to split it into logical entities potentially larger than a line 
(like a parser), any of which may be given to one of the threads.

A programmer even dreaming of implementing this will most likely 
implement the input itself as a single thread, dispatching the data 
between multiple threads. The assumptions being

  - reading stdin is conceptually ill suited to multithreading

  - consuming the data is slower than reading [and dispatching] it

----

Anyhow, I've never heard even rumors of multiple threads reading _stdin_.

No programmer contemplating such would be newbie enough to actually try 
it. Besides, he'd never _expect_ reading stdio to be thread-safe, and 
therefore he'd not even check the documentation.

All in all, I still think it's wearing a life vest on land. And crapping 
up throughput.

>> And isspace, is that really the right place to check for ascii?
> 
> It has to be, because its argument is an int, not a char.

Ah, ok.

(Not to be a perennial PITA,) but the original context of your statement 
was a comparison between D and C++. Now, this might have given the 
casual reader the impression that C++ doesn't have this int thing here, 
making C++ faster. You may want to elaborate on that for the regular reader.
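
For reference, std::isspace in C++'s <cctype> also takes an int, so the 
"int thing" is not unique to D. As a rough illustration of why an int 
argument forces a range check, here is a sketch in C++. The name 
is_ascii_space is made up, and this is not the actual Phobos or C 
library source.

```cpp
// Hypothetical ASCII-only whitespace test taking an int, as isspace does.
// Because the argument is wider than a 7-bit char, it may be negative
// (e.g. EOF) or a code point above 127, so the range is checked first.
inline bool is_ascii_space(int c) {
    if (c < 0 || c > 127) return false;  // outside ASCII: never a space here
    // ' ' plus the contiguous run \t \n \v \f \r (9..13 in ASCII)
    return c == ' ' || (c >= '\t' && c <= '\r');
}
```

The extra comparison is the cost in question: a version that could 
assume its argument was already ASCII would not need the first line.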




More information about the Digitalmars-d mailing list