Chains

Mon Sep 16 08:00:25 UTC 2019

Anyone have a better way to accomplish the following task:

An algorithm computes various values for a graph structure. The 
graph contains many loops and nested loops. Each node has a chain 
associated with it, which is how it is connected to the rest of 
the graph.

The problem is to save memory because many nodes will have 
overlapping chains, there is no reason to store them all 
separately as it will waste several orders of magnitude of space.

The idea is to create arrays of arrays and to reuse parts of the 
arrays that will be common and to slice parts of the arrays and 
append them when necessary. Each Chain acts like a normal array 
in that it can be indexed as if it were a single array, but, in 
fact, it is a sequence of arrays.

Two chains may have parts that refer to the same data and so 
"connect" but there will never be more than 1 part in memory, so 
no wasted space.

It's starting to get a little more complicated than just wasting 
memory(which will limit the size of the input space and usually 
the input space will be quite large).

normally one would have something like

A - [ArrayA]
B - [ArrayB]
C - [ArrayC]

where the arrays may have sub-arrays in common. With a chain we 
have

A - [ChainA] = [Array1, ArrayQ, ArrayN]
B - [ChainB] = [Array2, ArrayQ, Array3]
C - [ChainC] = [Array1, ArrayN, Array2]

And the chain's can refer to common data to reduce the overlap 
from N to 1. (any duplicate arrays in the chain will be 
singletons. E.g., ArrayQ used in ChainA and ChainB are references 
to the same array. In the first case they will be duplicated. In 
my algorithms I always know when a array will be duplicated but 
the order of creation is chaotic which makes it more difficult 
than I'd like(and I'm not sure how correct the code is)

The problem is not so much the structure as it is dealing with 
corner cases in overlap detection.

Ideally all this would be handled under the hood and the user 
would not know any difference and would just see a chain as an 
array. The problem with that approach, as best I can tell, is 
that it would be a big performance hit to have to find duplicates 
because it would require searching the arrays. Currently I deal 
with the overlapping of the chains outside the structure tailored 
to the algorithm and it seems to work ok but is not very general.

One can think of this as sort of linearized a graph per node. 
Each node in the graph has a potential path to any other node. 
One simply lists all paths(it can even deal with loops by 
referencing the chain itself(which requires mimicking an array or 
to have some type of recursive structure).. but in doing this 
there are lots of paths with lots of redundancy that needs to be 
dealt with.

struct sChain(T)
{
	T[][] Parts;

	// Indexes parts like one large array
	auto opIndex(int i)
	{
		int c = 0;
		for(int k = 0; k < Parts.length; k++)
		{
			if (c <= i && i < c + Parts[k].length)
			{
				return Parts[k][i - c];

			}
			c += Parts[k].length;
		}
		assert(0, "Element not found");
	}

	auto FindIndex(alias D)(T v)
	{
		for(int i = 0; i < Length; i++)
		{
			if (D(v, this[i])) return i;
		}
		return -1;
	}

	auto FindPartOnIndex(int i)
	{
		int c = 0;
		for(int k = 0; k < Parts.length; k++)
		{
			if (c <= i && i < c + Parts[k].length)
			{
				return k;
			}
		}
		return -1;
	}

	auto Length(int m = -1)
	{
		if (m == -1) m = Parts.length;
		int c = 0;
		for(int k = 0; k < m; k++)
		{
			c += Parts[k].length;
		}
		return c;
	}

	auto NumParts()
	{
		return Parts.length;
	}
	auto AddPart(T[] p = null)
	{
		if (p)
			Parts ~= p;
		else
			Parts ~= new T[0];
	}

	auto opCast(T : bool)()
	{
		return !!Parts;
	}

	// Appends element to last part
	auto opOpAssign(string op : "~")(T r)
	{
		if (!Parts) Parts ~= new T[0];
		Parts[$-1] ~= r;	
	}

	// Appends part to parts
	auto opOpAssign(string op : "~")(T[] r)
	{
		//if (!Parts) Parts ~= new T[0];
		Parts ~= r;
	}

	// Appends array parts to parts
	auto opOpAssign(string op : "~")(typeof(this) r)
	{
		for(int k = 0; k < r.Parts.length; k++)
		{
			Parts ~= r.Parts[k];
		}

	}

}