Are Gigantic Associative Arrays Now Possible?

dlangPupil via Digitalmars-d digitalmars-d at puremagic.com
Fri Mar 24 20:41:35 PDT 2017


On Friday, 24 March 2017 at 17:48:35 UTC, H. S. Teoh wrote:

> (In my case, though, B-trees may not represent much of an 
> improvement, because I'm dealing with high-dimensional data 
> that cannot be easily linearized to take maximum advantage of 
> B-tree locality. So at some level I still need some kind of 
> hash-like structure to work with my data. But it will probably 
> have some tree-like structure to it, because of the 
> (high-dimensional) locality it exhibits.)
>
>
> T

Hi T,

Your problem is intriguing and definitely stretching my mind!  
I'll be factoring your ideas into my app design as I go along.

Some techniques that might be relevant to your app, if only as 
relative performance comparison points:
	Using metadata in lieu of actual data, to maximize the number 
of rows "represented" in the caches.
	Using one or more columnstores, both in- and out-of-cache, so 
that one or more fields of many rows can be transformed with 
extremely small read, computation and write costs (there's a 
small D sketch of this after the list).
	Scaling the app horizontally, if possible.
	Using stored procedures on a SQL, NoSQL or NewSQL DBMS to 
harness the DBMS's bulk-processing and high-throughput 
capabilities.
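
To make the columnstore point concrete, here's a minimal D sketch 
of a struct-of-arrays layout next to the usual array-of-structs. 
The field names (key/value/flags) are invented, since I don't 
know your actual schema; the point is only that transforming one 
field streams through that field's contiguous array and nothing 
else:

// Row-oriented: every record carries all of its fields, so a scan
// of one field drags the others through the cache as well.
struct RecordRows
{
    ulong  key;
    double value;
    uint   flags;
}

// Column-oriented: one parallel array per field.
struct RecordColumns
{
    ulong[]  keys;
    double[] values;
    uint[]   flags;

    // Transform a single field across all rows; this reads and
    // writes only the contiguous `values` array, so cache lines
    // carry no unrelated data.
    void scaleValues(double factor)
    {
        foreach (ref v; values)
            v *= factor;
    }
}

void main()
{
    RecordColumns cols;
    cols.keys   = [1, 2, 3];
    cols.values = [10.0, 20.0, 30.0];
    cols.flags  = [0, 0, 1];

    cols.scaleValues(1.5);
    assert(cols.values == [15.0, 30.0, 45.0]);
}

Whether this actually helps with your high-dimensional lookups I 
can't say, but it should at least make for a cheap comparison 
point.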

I'd love to hear whatever details you can share about your app.  
Alternatively, I've made a list of a dozen or so questions that 
would help me think about how to approach your problem. If you're 
interested in pursuing either avenue, let me know!  Thanks again.



