[review] new string type (take 2)

Steven Schveighoffer schveiguy at yahoo.com
Sat Jan 15 12:32:47 PST 2011


On Sat, 15 Jan 2011 13:19:34 -0500, Andrei Alexandrescu  
<SeeWebsiteForEmail at erdani.org> wrote:

> On 1/15/11 11:15 CST, Steven Schveighoffer wrote:
>> On Fri, 14 Jan 2011 12:03:28 -0500, Andrei Alexandrescu
>> <SeeWebsiteForEmail at erdani.org> wrote:
>>
>>> On 1/14/11 5:06 AM, Steven Schveighoffer wrote:
>>>> On Thu, 13 Jan 2011 23:03:35 -0500, Steven Wawryk  
>>>> <stevenw at acres.com.au>
>>>> wrote:
>>>>
>>>>> Because at the code-point level it *isn't* a random-access range and
>>>>> the index makes no sense at the code-point level, only at the
>>>>> code-unit level. It's encouraging the confusion of 2 distinctly
>>>>> different abstractions or "views" of the same data. All the slicing
>>>>> and indexing you're artificially putting in the code-point range is
>>>>> already available in the code-unit range, and its only benefit is to
>>>>> allow the user to save typing ".data".
>>>>
>>>> I respectfully disagree. A stream built on fixed-sized units, but with
>>>> variable length elements, where you can determine the start of an
>>>> element in O(1) time given a random index absolutely provides
>>>> random-access. It just doesn't provide length.
>>>
>>> I equally respectfully disagree. I think random access is defined as
>>> accessing the ith element in O(1) time. That's not the case here.
>>
>> I'd actually go as far as saying that any container you can do a quick
>> lookup given an index (O(lgn) or better) is random-access. For example,
>> a dictionary file of english words sorted alphabetically can look up any
>> word definition in O(lgn) time, but you can't look up the nth word in
>> O(lgn) time.
>
> Here you are trying to equate a dictionary (map/hashtable) with an  
> otherwise unstructured variable length encoded array. I don't think  
> there is much similarity.

I see a lot of similarity.  Both have sparsely populated indexes.  Both  
cannot determine their length unless you save it separately.  Both are  
encoded on an array of fixed-length elements.  Both store their data  
sorted by index.

The difference is that the index is a size_t for one and a string for the  
other.

>
>> What do you call this type of container if not random access? Indexed
>> maybe? Whatever term you call it, I'd call it that, and say opIndex
>> belongs for that container.
>>
>> -Steve
>
> I'm not sure how you can call it, but it's definitely not random access.

The former complaint was "because this != random access, opIndex doesn't  
belong."

If we can get past the terminology of what this is, I think at least  
opSlice belongs and I still think opIndex belongs.  Can you not see the  
value of 'give me the element at index X' giving you back a full element  
and not just a code-unit?

If opIndex doesn't belong unless it's a random access container (i.e.  
array), then we should get rid of the operator for hashes and sorted trees.

-Steve


More information about the Digitalmars-d mailing list