standard ranges

Thu Jun 28 03:22:05 PDT 2012

On Thursday, 28 June 2012 at 10:02:59 UTC, Roman D. Boiko wrote:
> On Thursday, 28 June 2012 at 09:58:02 UTC, Roman D. Boiko wrote:
>> Pedantically speaking, it is possible to index a string with 
>> about 50-51% memory overhead to get random access in 0(1) 
>> time. Best-performing algorithms can do random access in about 
>> 35-50 nanoseconds per operation for strings up to tens of 
>> megabytes. For bigger strings (tested up to 1GB) or when some 
>> other memory-intensive calculations are performed 
>> simultaneously, random access takes up to 200 nanoseconds due 
>> to memory-access resolution process.
>
> Just a remark, indexing would take O(N) operations and N/B 
> memory transfers where N = str.length and B is size of cache 
> buffer.

That being said, I would be against switching from string 
representation as arrays. Such switch would hardly help us solve 
any problems of practical importance better (by a significant 
degree) than they have to be solved with current design.

However, a struct could be created for indexing which I mentioned 
in two previous posts to give efficient random access for narrow 
strings (and arbitrary variable-length data stored consequently 
in arrays) without any significant overhead.

Respective algorithms are called Rank and Select, and there exist 
many variations of them (with different trade-offs, but some of 
them are arguably better than others).

I have investigated this question quite deeply in the last two 
weeks, because similar algorithms would be useful in my DCT 
project. If nobody else will implement them before me, I will 
eventually do that myself. It is just a matter of finding some 
free time, likely a week or two.