standard ranges

Thu Jun 28 04:38:21 PDT 2012

On Thursday, 28 June 2012 at 09:58:02 UTC, Roman D. Boiko wrote:
> Pedantically speaking, it is possible to index a string with 
> about 50-51% memory overhead to get random access in O(1) time. 
> Best-performing algorithms can do random access in about 35-50 
> nanoseconds per operation for strings up to tens of megabytes. 
> For bigger strings (tested up to 1GB) or when some other 
> memory-intensive calculations are performed simultaneously, 
> random access takes up to 200 nanoseconds due to memory-access 
> resolution process.
This would support both random access to characters by their code 
point index in a string and determining code point index by code 
unit index.
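
To make the two lookups concrete, here is a rough sketch in Python. It is 
not the succinct structure measured above: it is a much simpler sampled 
index that stores the byte offset of every `step`-th code point and scans 
forward from there, so a lookup costs O(step) rather than true O(1). The 
class name `CodePointIndex`, its methods and the `step` parameter are all 
hypothetical, and the input is assumed to be valid UTF-8.

```python
from bisect import bisect_right


class CodePointIndex:
    """Sampled index over UTF-8 code units (illustrative, hypothetical helper)."""

    def __init__(self, data: bytes, step: int = 64):
        self.data = data
        self.step = step
        # samples[k] holds the byte offset of code point number k * step.
        self.samples = []
        count = 0
        for offset, byte in enumerate(data):
            if byte & 0xC0 != 0x80:  # not a continuation byte: a code point starts here
                if count % step == 0:
                    self.samples.append(offset)
                count += 1
        self.length = count  # total number of code points

    def unit_offset(self, cp_index: int) -> int:
        """Code unit (byte) offset of the code point with the given index."""
        if not 0 <= cp_index < self.length:
            raise IndexError(cp_index)
        offset = self.samples[cp_index // self.step]
        # Walk forward over at most step - 1 code points.
        for _ in range(cp_index % self.step):
            offset += 1
            while self.data[offset] & 0xC0 == 0x80:
                offset += 1
        return offset

    def char_at(self, cp_index: int) -> str:
        """Random access to a character by its code point index."""
        start = self.unit_offset(cp_index)
        end = start + 1
        while end < len(self.data) and self.data[end] & 0xC0 == 0x80:
            end += 1
        return self.data[start:end].decode("utf-8")

    def cp_index_of(self, unit_offset: int) -> int:
        """Code point index of the code point that contains the given byte."""
        if not 0 <= unit_offset < len(self.data):
            raise IndexError(unit_offset)
        k = bisect_right(self.samples, unit_offset) - 1
        cp = k * self.step
        # Count code point starts strictly after the sample, up to the target byte.
        for pos in range(self.samples[k] + 1, unit_offset + 1):
            if self.data[pos] & 0xC0 != 0x80:
                cp += 1
        return cp
```

A smaller `step` buys faster lookups at the cost of more memory, which is 
the same trade-off as in the overhead figures quoted here, only in a far 
cruder form.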

If only the former is needed, space overhead decreases to 25% for 
1K and under 15% for 16K-1G string sizes (measured in number of 
code units, which is half the number of bytes for wstring). 
Strings of up to 2^64 code units would be supported.

Supporting only the former would also improve access speed 
significantly (by about 10% for small strings and roughly a factor 
of two for large ones).