Optimizing Java using D

Sat Jul 5 08:08:08 PDT 2014

On Saturday, 5 July 2014 at 14:20:33 UTC, Dmitry Olshansky wrote:
> Provision some extra space in each record. DBs do this all the 
> time, regardless of layout.

Which means waste of space you complained about just below.

Besides, you understand this is not a solution: one byte more 
than that reserve can provide, and the reordering should be done. 
Stuff like FAT was created exactly because linear storages are 
terribly inefficient unless they are read-only.

> It's funny to see how an _ability_ to avoid indirection can be 
> framed as a problem. Nothing prevents having an array of 
> pointers to structs or just using classes if desired.

Pointers to structs? Manually add indirection for sorting? That's 
reinventing the wheel. Indeed better (and simpler) using classes 
instead. They provide the same functionality and without using 
unsafe features.

> On the contrary sorting an array of pairs of integers in say 
> Java is shamefully inefficient. No wonder packed objects 
> (=value types)  are coming to Java, to address these sorts of 
> need.

For pair of integers, you can use long and sort an array of 
longs. But if your structure is any more complex (or has 
nontrivial comparison algorithm), one should really consider 
using classes first.

> Ehm. One can always redesign struct layout to push stuff out 
> through an indirection. And it's not like structs grow 
> organically on their own ;)

You mean structs with pointers? That violates the whole principle 
of structs IMHO. Ideally, when you assign/copy a struct, you 
assign/copy a *value*. Usage of pointers would add undue aliasing 
to the copy process, leading to bugs.

> In the same vane design of ref-to-instance is not scalable with 
> millions of items scattered across the memory causing awful 
> cache miss rate (as the indirection got to be crossed to 
> compare things), not to mention the memory wasted per each 
> object.

It depends. With classes, you have to read and write only small 
portions of memory - read only fields to compare, write only 
references to swap. Even with cache misses, this is still very 
small and efficient operations. With structs, you have to read 
and write much bigger regions of memory - to swap two items, you 
have to read both of them fully, then write both of them fully 
into new locations - which would make caching totally useless, 
whether structs are located sequentially or not. It's like 
copying a large file - the cache gets exhausted instantly, and 
doesn't speed things up at all.