RFC: Case-Insensitive Strings (And usually they really do *have* case)

Nick Sabalausky a at a.a
Mon Jan 10 12:22:23 PST 2011


"Jonathan M Davis" <jmdavisProg at gmx.com> wrote in message 
news:mailman.538.1294690510.4748.digitalmars-d at puremagic.com...
> On Monday, January 10, 2011 10:46:55 Nick Sabalausky wrote:
>> "Jim" <bitcirkel at yahoo.com> wrote in message
>> news:igfado$11g3$1 at digitalmars.com...
>>
>> >> While writing and dealing with all that code I realized something:
>> >> While programmers are usually heavily conditioned to think of
>> >> case-sensitivity as an attribute of the comparison, it's very
>> >> frequent that the deciding factor in which comparison to use is
>> >> *not* the comparison itself but *what* gets compared. And in those
>> >> cases, you have to use the awful strategy of "relying on convention"
>> >> to make sure you get it right in *every* place that particular data
>> >> gets compared.
>> >
>> > You have a point. Your case-sensitivity-aware string types will
>> > guarantee correctness in a large and complex program. I like that.
>> > Ideally though, they would only be compile-time constraints (i.e. not
>> > carrying any other data).
>>
>> Not carrying any other data means not caching the lowercase version,
>> which means recreating the lowercase version more than necessary. So
>> it's the classic speed vs. space tradeoff. I would think there would be
>> cases where they get compared enough for that to make a difference,
>> although I suppose we'd really need benchmarks to see. OTOH, there are
>> certainly cases (such as my original motivating case) where the extra
>> space is not an issue at all.
>
> Why is caching necessary? Shouldn't you just be able to use
> std.string.icmp() for comparisons internally, avoiding any copying or
> caching? That shouldn't need to duplicate anything. Or do you need to
> cache the lower-case version for something other than comparison?
>

Anything involving toHash (such as using Insensitive as an AA key) requires 
the use of a lower-case version, since keys that compare equal must also 
hash equally. For anything else, you're probably right, icmp should be fine 
(although I'd like to benchmark icmp against regular string comparison).
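
To make the toHash point concrete, here is a minimal sketch, not the actual 
implementation. The field names and layout are assumptions made up for 
illustration; only the name Insensitive comes from this thread. It imports 
icmp and toLower from std.uni, which is where they live in current Phobos. 
Equality and hashing both go through the cached lower-case copy so the two 
can never disagree, while icmp is shown separately as the cache-free way to 
compare.

```d
import std.uni : icmp, toLower;

// Hypothetical wrapper type; the fields are illustrative assumptions.
struct Insensitive
{
    string original;        // the text as written, case preserved
    private string lowered; // cached lower-case copy (the extra space)

    this(string s)
    {
        original = s;
        lowered = toLower(s);
    }

    // Compare the cached copies so equality always agrees with toHash.
    bool opEquals(ref const Insensitive rhs) const
    {
        return lowered == rhs.lowered;
    }

    // Hash the normalized form: "Hello" and "HELLO" land in the same slot.
    size_t toHash() const nothrow @safe
    {
        return hashOf(lowered);
    }
}

void main()
{
    int[Insensitive] counts;
    counts[Insensitive("Hello")] = 1;

    // Lookup with different casing finds the same entry.
    assert((Insensitive("HELLO") in counts) !is null);

    auto a = Insensitive("Hello");
    auto b = Insensitive("hELLO");
    assert(a == b);

    // Cache-free comparison, as suggested: no copy of either string is made.
    assert(icmp(a.original, b.original) == 0);
}
```

If the cache were dropped, toHash would have to lower-case the string (or 
fold case character by character) on every call, which is the speed vs. 
space tradeoff mentioned above. Whichever normalization is used, opEquals 
and toHash need to use the same one.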



