RFC: naming for FrontTransversal and Transversal ranges
Robert Jacques
sandford at jhu.edu
Sat May 2 10:19:09 PDT 2009
On Sat, 02 May 2009 04:17:29 -0400, Don <nospam at nospam.com> wrote:
> Bill Baxter wrote:
>> On Fri, May 1, 2009 at 5:36 PM, bearophile <bearophileHUGS at lycos.com>
>> wrote:
>>> Bill Baxter:
>>>> Much more often the discussion on the numpy list takes the form of
>>>> "how do I make this loop faster" because loops are slow in Python so
>>>> you have to come up with clever transformations to turn your loop into
>>>> array ops. This is thankfully a problem that D array libs do not
>>>> have. If you think of it as a loop, go ahead and implement it as a
>>>> loop.
>>> Sigh! Already today, and even more tomorrow, this is often false for D
>>> too. In my computer I have a cheap GPU that is sleeping while my D
>>> code runs. Even my other core sleeps. And I am using one core at 32
>>> bits only.
>>> You will need ways to data-parallelize and other forms of parallel
>>> processing. So maybe normal loops will not cut it.
>> Yeh. If you want to use multiple cores you've got a whole 'nother can
>> o' worms. But at least I find that today most apps seem to get by just
>> fine using a single core. Strange though, aren't you the guy always
>> telling us how being able to express your algorithm clearly is often
>> more important than raw performance?
>> --bb
>
> I confess to being mighty skeptical about the whole multi-threaded,
> multi-core thing. I think we're going to find that there are only two
> practical uses of multi-core:
> (1) embarrassingly-parallel operations; and
> (2) process-level concurrency.
> I just don't believe that apps have as much opportunity for parallelism
> as people seem to think. There are just too many dependencies.
> Sure, you can (say) with a game, split your AI onto a separate core from
> your graphics stuff, but that's only applicable for 2-4 cores. It
> doesn't work for 100+ cores.
>
> (Which is why I think that broadening the opportunity for case (1) is
> the most promising avenue for actually using a host of cores).
Actually, AI is mostly embarrassingly-parallel. The issue is the "mostly"
part of that statement, which is why optimized reader/writer locks and STM
are showing up in game engines. And really, >90% of the CPU time has been
physics, which is both embarrassingly-parallel and being off-loaded onto
the GPU/multi-cores.