A case for opImplicitCast: making string search work better

Steven Schveighoffer schveiguy at yahoo.com
Fri May 15 11:33:45 PDT 2009


On Fri, 15 May 2009 10:30:17 -0400, grauzone <none at example.net> wrote:

>> to return a pair struct, but still, what could be simpler than  
>> returning an index?  It's easy to construct the value you want (before  
>> or after), and if you need both values, that is also possible (and  
>> probably results in simpler code).
>
> All you can do with the index is
> 1. compare it against the length of the searched string to test if the  
> search was successful
> 2. slice the searched string
> 3. do something rather special
>
> What else would you do? You'd just have to store the searched string as  
> a temporary, and then you'd slice the searched string (for 2.), or  
> compare it against the length of the searched string. You always have to  
> keep the searched string in a temporary. That's rather impractical. Oh  
> sure, if you _really_ need the index (for 3.), then directly returning  
> an index is of course the best way.
>
> With my approach, you don't need to grab the passed searched string  
> again. All of these can be done in a single, trivial expression (for 3.  
> getting the index only). Actually, compared to your approach, this would  
> just eliminate the trivial but annoying slicing code after the search  
> call, that you'd type in... what, 90% of all cases?

I hadn't thought of the case where you are calling *on* a temporary; I
always had in mind that the source string was already declared.  This is a
good point.  The only drawback in this case is that you are constructing
information you sometimes do not need or care about.  If all you want to
know is whether the search succeeded, then you don't need two ranges
constructed and returned.  But therein lies a fundamental tradeoff that
cannot be avoided.  The most basic information you get is the index, and
from it you can construct any of the larger pieces, but not always easily,
and not without repeating identifiers.
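
To make the "repeating identifiers" point concrete, here is a minimal
sketch of the index-based style.  It assumes a current Phobos, where
std.string.indexOf reports failure with -1 (the discussion above instead
compares the index against the length); tailFrom is a hypothetical helper
name used only for illustration.

```d
import std.string : indexOf;

// Hypothetical helper: the part of `s` starting at `needle`,
// or an empty slice if `needle` does not occur.
// Note that `s` has to be named three times.
string tailFrom(string s, string needle)
{
    auto i = s.indexOf(needle);            // ptrdiff_t; -1 when not found
    return i < 0 ? s[$ .. $] : s[i .. $];  // slice the same string again
}

void main()
{
    string s = "say hi there";
    assert(tailFrom(s, "hi") == "hi there");
    assert(tailFrom(s, "xyz") == "");
}
```

If the searched string were the result of an expression rather than a
named variable, it would first have to be stored in a temporary before it
could be sliced like this.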

I like your approach, but with the single return type, not out  
parameters.  Having out parameters would be a deal breaker.

I'd prefer not to have two strings but a string that has an identified  
pivot point.  You could generate the desired left and right hand sides  
dynamically, and it would work without any changes to the current syntax.

for example:

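struct partition(R)
{
    R range;       // the searched range
    size_t pivot;  // index of the match; range.length if not found

    R lhs() {return range[0..pivot];}   // everything before the match
    R rhs() {return range[pivot..$];}   // the match and everything after
    bool found() {return pivot < range.length;}
}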

partition!string find(string haystack, string needle);

usage:

string s = str.find("hi").rhs; // or .lhs or .found or .pivot
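
A runnable sketch of the idea, for anyone who wants to try it.  The body
of find is an assumption, not part of the proposal: it wraps
std.string.indexOf from a current Phobos (which returns -1 on failure) and
maps "not found" to pivot == range.length.  The pivot is a size_t here so
it can hold any valid index.

```d
import std.string : indexOf;

struct partition(R)
{
    R range;       // the searched range
    size_t pivot;  // index of the match; range.length if not found

    R lhs() { return range[0 .. pivot]; }   // everything before the match
    R rhs() { return range[pivot .. $]; }   // the match and everything after
    bool found() { return pivot < range.length; }
}

// Hypothetical implementation, assumed for illustration only.
partition!string find(string haystack, string needle)
{
    auto i = haystack.indexOf(needle);  // -1 when needle is absent
    size_t pivot = i < 0 ? haystack.length : cast(size_t) i;
    return partition!string(haystack, pivot);
}

void main()
{
    // The searched string is a temporary and is never named again.
    auto p = "say hi there".find("hi");
    assert(p.found);
    assert(p.pivot == 4);
    assert(p.lhs == "say ");
    assert(p.rhs == "hi there");

    auto q = "say hi there".find("xyz");
    assert(!q.found);
    assert(q.lhs == "say hi there");
    assert(q.rhs == "");
}
```

Nothing beyond the slice and the index is stored, so a caller that only
wants .found pays for no extra range construction.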

> Maybe a struct would work fine too. But I don't like it, because the  
> programmer had to look up the struct members first. He had to memorize  
> the struct members, and couldn't tell what the function returns just by  
> looking at the function signature.

If this were implemented, the return type would be very common.  At some  
point you have to look up everything (what's a "range"?).

-Steve


