Why Strings as Classes?

Tue Aug 26 19:23:47 PDT 2008

Nick Sabalausky Wrote:

> "superdan" <super at dan.org> wrote in message 
> news:g921nb$2qqq$1 at digitalmars.com...
> > Nick Sabalausky Wrote:
> >
> >> "Denis Koroskin" <2korden at gmail.com> wrote in message
> >> news:op.ugie28dxo7cclz at proton.creatstudio.intranet...
> >> > On Tue, 26 Aug 2008 23:29:33 +0400, superdan <super at dan.org> wrote:
> >> > [snip]
> >> >>> The D spec certainly doesn't make any guarantees about
> >> >>> the time/memory complexity of opIndex; it's up to the implementing 
> >> >>> class
> >> >>> to do so.
> >> >>
> >> >> it don't indeed. it should. that's a problem with the spec.
> >> >>
> >> >
> >> > I agree. You can't rely on function invokation, i.e. the following 
> >> > might
> >> > be slow as death:
> >> >
> >> > auto n = collection.at(i);
> >> > auto len = collection.length();
> >> >
> >> > but index operations and properties getters should be real-time and 
> >> > have
> >> > O(1) complexity by design.
> >> >
> >> > auto n = collection[i];
> >> > auto len = collection.length;
> >>
> >> I disagree. That strategy strikes me as a very clear example of breaking
> >> encapsulation by having implementation details dictate certain aspects of
> >> the API. At the very least, that will make the API overly rigid, 
> >> hindering
> >> future changes that could otherwise have been non-breaking,
> >> behind-the-scenes changes.
> >
> > take this:
> >
> > auto c = new Customer;
> > c.loadCustomerInfo("123-12-1234");
> >
> > that's all cool. there's no performance guarantee other than some best 
> > effort kinda thing `we won't sabotage things'. if you come with faster or 
> > slower methods to load customers, no problem. coz noone assumes any.
> >
> > now take sort. sort says. my input is a range that supports indexing and 
> > swapping independent of the range size. if you don't have that just let me 
> > know and i'll use a totally different method. just don't pretend.
> >
> 
> Choosing a sort method is a separate task from the actual sorting.

thot you were the one big on abstraction and encapsulation and all those good things. as a user i want to sort stuff. i let the library choose what's best for the collection at hand.

sort(stuff)
{
    1) figure out best algo for stuff
    2) have at it
}

i don't want to make that decision outside. for example i like one sort routine. not quicksort, heapsort, quicksort_with_median_of_5, or god forbid bubblesort.

> Any sort 
> that expects to be able to index will still work correctly if given a 
> collection that has O(n) or worse indexing, just as it will still work 
> correctly if given an O(n) or worse comparison delegate.

i disagree but now that christopher gave me a black eye guess i have to shut up.

> It's up to the 
> caller of the sort function to know that an O(n log n) sort (for instance) 
> is only O(n log n) if the indexing and comparison are both O(1). And then 
> it's up to them to decide if they still want to send it a linked list or a 
> complex comparison. The sort shouldn't crap out at compile-time (or runtime) 
> just because some novice might not know that doing a generalized bubble sort 
> on a linked list scales poorly.

it should coz there's an obvious good choice. there's no good tradeoff in that. there never will be a case to call the bad sort on the bad range.

> If you want automatic choosing of an appropriate sort algorithm (which is 
> certainly a good thing to have), then that can be done at a separate, 
> optional, level of abstraction using function overloading, template 
> specialization, RTTI, etc. That way you're not imposing arbitrary 
> restrictions on anyone.

i think u missed a lil point i was making.

All collections: implement nth()
Indexable collections: implement opIndex

is all. there is no restriction. just use nth.

> > with that indexing and swapping complexity ain't implementation detail. 
> > they're part of the spec. guess stepanov's main contribution was to 
> > clarify that.
> >
> 
> When I called indexing an implementation detail, I was referring to the 
> collection itself. The method of indexing *is* an implementation detail of 
> the collection.

not when it gets composed in higher level ops.

> It should not be considered an implementation detail of the 
> sort algorithm since it's encapsulated in the collection and thus hidden 
> away from the sort algorithm. If you make a rule that collections with cheap 
> indexing are indexed via opIndex and collections with expensive indexing are 
> indexed via a function, then you've just defined the API in terms of the 
> collection's implementation (and introduced an unnecessary inconsistency 
> into the API).

no. there is consistency. nth() is consistent across - o(n) or better indexing.

i think u have it wrong when u think of "cheap" as if it were "fewer machine instructions". no. it's about asymptotic complexity and that does matter. 

> If a sort function is desired that only accepts collections with O(1) 
> indexing, then that can be accomplished at a higher level of abstraction 
> (using function overloading, RTTI, etc.) without getting in the way when 
> such a guarantee is not needed.

exactly. nth() and opIndex() fit the bill. what's there not to love?

> >> For realtime code, I can see the benefit to what you're saying. Although 
> >> in
> >> many cases only part of a program needs to be realtime, and for the rest 
> >> of
> >> the program's code I'd hate to have to sacrifice the encapsulation 
> >> benefits.
> >
> > realtime has nothin' to do with it.
> 
> For code that needs to run in realtime, I agree with Denis Koroskin that it 
> could be helpful to be able to look at a piece of code and have some sort of 
> guarantee that there is no behind-the-scenes overloading going on that is 
> any more complex than the operators' default behaviors. But for code that 
> doesn't need to finish within a maximum amount of time, that becomes less 
> important and the encapsulation/syntactic-consistency gained from the use of 
> such things becomes a more worthy pursuit. That's what I was saying about 
> realtime.

i disagree but am in a rush now. guess i can't convince u.

> > encapsulation ain't broken by making complexity part of the reqs. any more 
> > than any req ain't breakin' encapsulation. if it looks like a problem then 
> > encapsulation was misdesigned and needs change.
> >
> > case in point. all containers should provide 'nth' say is it's o(n) or 
> > better. then there's a subclass of container that is indexed_container. 
> > that provides opIndex and says it's o(log n) or better. it also provides 
> > 'nth' by just forwarding to opIndex. faster than o(n) ain't a problem.
> >
> > but forcing a list to blurt something for opIndex - that's just bad 
> > design.
> 
> I agree that not all collections should implement an opIndex. Anything 
> without a natural sequence or mapping should lack opIndex (such as a tree or 
> graph). But forcing the user of a collection that *does* have a natural 
> sequence (like a linked list) to use function-call-syntax instead of 
> standard indexing-syntax just because the collection is implemented in a way 
> that causes indexing to be less scalable than other collections - that's bad 
> design.

no. it's great design. because it's not lyin'. you want o(1) indexing you say a[n]. you are ok with o(n) indexing you say a.nth(n). this is how generic code works, with consistent notation. not with lyin'.

> The way I see it, "group[n]" means "Get the nth element of group". Not "Get 
> the element at location group.ptr + (n * sizeof(group_base_type)) or 
> something else that's just as scalable."

no need to get that low. just say o(1) and understand o(1) has nothing to do with the count of assembler ops.

> In plain C, those are one and the 
> same. But when you start talking about generic collections, encapsulation 
> and interface versus implementation, they are very different: the former is 
> interface, the latter is implementation. 

so now would u say stl has a poor design? because it's all about stuff that you consider badly designed.