Are Gigantic Associative Arrays Now Possible?
Laeeth Isharc via Digitalmars-d
digitalmars-d at puremagic.com
Thu Mar 23 00:12:29 PDT 2017
On Wednesday, 22 March 2017 at 20:00:56 UTC, dlangPupil wrote:
> Hello to all! As a newbie (to D, to coding, and to forums) I'm
> learning about D's associative arrays (AAs) and their tiny
> latency regardless of AA size. Cool!
>
> One practical limit on the maximum AA size is memory/disk
> paging. But maybe this limit could be overcome
> with the latest SSDs, whose nonvolatile memory can be addressed
> like RAM.
>
> The article below says that Intel Optane SSDs:
> -allow reads and writes on individual bytes.
> -have roughly 10x the latency of DRAM (but AAs' latency is so
> low that this might not matter in many cases).
> -currently offer 375GB of "RAM" for $1,500.
> -will support up to 3 TB on 2 socket Xeon systems (48TB on
> 4-socket).
> -will be supplemented with Optane DIMMs in the future.
>
> Some questions that arise are...
>
> 1) Wouldn't using such "RAM" eliminate any paging issue for
> super-gigantic AAs?
> 2) What other bottlenecks could arise for gigantic AAs, e.g.,
> garbage collection?
> 3) Would an append-only data design mitigate GC or other
> bottlenecks?
> 4) Has anyone tried this out?
>
> What a coup if D could "be the first" lang to make this
> practical. Thanks.
>
> https://arstechnica.com/information-technology/2017/03/intels-first-optane-ssd-375gb-that-you-can-also-use-as-ram/
Hi.
I am very interested in this topic, although I don't really have
answers for your questions.
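On your question (3), though, here is the rough shape of an
append-only design in D - a very rough sketch, with made-up keys
and a made-up layout: the AA itself holds only small fixed-size
indices, while the values accumulate in one contiguous array,
which gives the GC far fewer pointers to chase than a table full
of heap references.

    import std.stdio;

    void main()
    {
        // Values accumulate in one contiguous, append-only
        // array; the AA maps each key to an index into it.
        double[] store;
        size_t[string] index;

        void put(string key, double value)
        {
            index[key] = store.length;
            store ~= value;
        }

        double get(string key)
        {
            return store[index[key]];
        }

        put("answer", 42.0);
        writeln(get("answer"));  // prints 42
    }

Whether that actually helps at the hundreds-of-gigabytes scale
you describe, I can't say - the AA's own buckets are still
GC-allocated - but it's the kind of reshaping I would try first.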
See the ACM article that I think I mentioned on the forum last
year:
https://queue.acm.org/detail.cfm?id=2874238
"For the entire careers of most practicing computer scientists, a
fundamental observation has consistently held true: CPUs are
significantly more performant and more expensive than I/O
devices. The fact that CPUs can process data at extremely high
rates, while simultaneously servicing multiple I/O devices, has
had a sweeping impact on the design of both hardware and software
for systems of all sizes, for pretty much as long as we've been
building them.
This assumption, however, is in the process of being completely
invalidated.
The arrival of high-speed, non-volatile storage devices,
typically referred to as Storage Class Memories (SCM), is likely
the most significant architectural change that datacenter and
software designers will face in the foreseeable future. SCMs are
increasingly part of server systems, and they constitute a
massive change: the cost of an SCM, at $3-5k, easily exceeds that
of a many-core CPU ($1-2k), and the performance of an SCM
(hundreds of thousands of I/O operations per second) is such that
one or more entire many-core CPUs are required to saturate it.
This change has profound effects:
1. The age-old assumption that I/O is slow and computation is
fast is no longer true: this invalidates decades of design
decisions that are deeply embedded in today's systems.
2. The relative performance of layers in systems has changed by a
factor of a thousand times over a very short time: this requires
rapid adaptation throughout the systems software stack.
3. Piles of existing enterprise datacenter
infrastructure—hardware and software—are about to become useless
(or, at least, very inefficient): SCMs require rethinking the
compute/storage balance and architecture from the ground up.
"
It's a massive relative price shock - storage vs CPU - and the
hardware and software structures built around the old ratio will
need to be utterly transformed. Intel say it's the biggest
technological breakthrough since the internet.
I'm good at recognising moments of exhaustion in old trends and
the beginnings of new ones, and I've thought for some time now
that people were complacently squandering CPU performance just
as conditions were shifting to make efficiency matter again.
https://www.quora.com/Why-is-Python-so-popular-despite-being-so-slow/answer/Laeeth-Isharc?srid=35gE
I don't know how data structures and file systems should adapt to
this. But I do think the prospective return on efficient code
just went up a lot - as far as I can see, this shift is very good
for D.
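One concrete thing you can already do in D is map a file on such
a device straight into the address space with std.mmfile and
treat it as ordinary memory. A minimal sketch - the file name
and the 1 GiB size here are just placeholders:

    import std.mmfile;

    void main()
    {
        // Map a 1 GiB file (created if it doesn't exist) into
        // the address space; access goes through ordinary
        // slices instead of read()/write() calls.
        auto mm = new MmFile("aa-backing.bin",
                             MmFile.Mode.readWrite,
                             1024UL ^^ 3, null);
        auto bytes = cast(ubyte[]) mm[];
        bytes[0] = 42;  // byte-granularity writes, persisted
                        // by the OS
    }

An AA whose buckets lived in a region like that, rather than in
GC memory, would sidestep both the paging question and most of
the GC pressure - though as far as I know nothing in Phobos
offers that out of the box today.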