string comparison

Jonathan M Davis jmdavisProg at gmx.com
Mon Dec 20 17:50:45 PST 2010


On Monday, December 20, 2010 16:45:20 doubleagent wrote:
> > Okay. I don't know what the actual code looks like
> 
> Here.
> 
> import std.stdio, std.string;
> 
> void main() {
>         uint[string] dictionary; // v[k], so string->uint
>         foreach (line; stdin.byLine()) {
>                 // break sentence into words
>                 // Add each word in the sentence to the vocabulary
>                 foreach (word; splitter(strip(line))) {
>                         if (word in dictionary) continue; // nothing to do
>                         auto newId = dictionary.length;
>                         dictionary[word] = newId;
>                         writefln("%s\t%s", newId, word);
>                 }
>         }
> }
> 
> > ...
> 
> Okay, suppose you're right.  The behavior is still incorrect because the
> associative array has allowed two identical keys...identical because the
> only difference between two strings which I care about are the contents of
> their character arrays.

Array comparison cares about the contents of the array. It may short-circuit the 
comparison if the lengths differ, or if both arrays point to the same place in 
memory and have the same length, but array comparison is all about comparing 
their elements.
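
To illustrate the difference between comparing contents and comparing 
identity, here is a minimal sketch. The variable names are made up for the 
example, and it only relies on the built-in `==` and `is` operators for 
arrays.

```d
void main() {
    char[] a = "hello".dup;   // mutable copy in its own block of memory
    char[] b = "hello".dup;   // a second, separate copy

    assert(a == b);           // == compares the elements: equal
    assert(a !is b);          // `is` compares pointer and length: not the same array
    assert(a.ptr != b.ptr);   // the two copies live at different addresses

    char[] c = a;             // c refers to the same memory and length as a
    assert(c is a);
    assert(c == a);

    b[0] = 'j';               // change one element of b only
    assert(a != b);           // contents now differ, so the arrays are unequal
}
```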

In this case, you'd have two arrays/strings which point to the same place in 
memory but have different lengths. Because their lengths differ, they'd be deemed 
unequal. If you tried to put a string in the associative array which has the same 
length as one that you already inserted, then the two would be considered equal, 
since their lengths are identical and they point to the same place in memory, so 
in that case I would expect the original value to be replaced with the new one. 
Otherwise, the keys will be considered unequal in spite of the fact that they 
point to the same place in memory.
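
A small sketch of that situation, assuming a single mutable buffer that 
stands in for the one byLine reuses (the names `buffer`, `shortKey` and 
`longKey` are hypothetical):

```d
void main() {
    char[] buffer = "hello world".dup;  // stands in for a reused line buffer
    char[] shortKey = buffer[0 .. 5];   // "hello"
    char[] longKey  = buffer[0 .. 11];  // "hello world"

    assert(shortKey.ptr == longKey.ptr); // both slices start at the same address
    assert(shortKey != longKey);         // but their lengths differ, so they are unequal

    // Overwriting the buffer silently changes what the earlier slice contains.
    buffer[0 .. 5] = "HELLO";
    assert(shortKey == "HELLO");
}
```

The last assertion is why a mutable slice makes a poor key: its contents can 
change after it has been stored.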

The real problem here is that associative arrays currently allow non-immutable 
keys. Once that's fixed, then it won't be a problem anymore.
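
In the meantime, one way to sidestep it is to give the associative array its 
own immutable copy of each key with .idup. The following is only a sketch 
under some assumptions: `addWords` is a hypothetical helper, the values are 
size_t rather than uint to match dictionary.length, and a hand-written buffer 
stands in for stdin.byLine() so that the example is self-contained.

```d
import std.algorithm : splitter;
import std.string : strip;

/// Hypothetical helper: gives each new word in `line` the next free id.
void addWords(ref size_t[string] dictionary, char[] line) {
    foreach (word; splitter(strip(line))) {
        if (word in dictionary) continue;  // already known, nothing to do
        immutable id = dictionary.length;
        dictionary[word.idup] = id;        // store an immutable copy as the key
    }
}

void main() {
    size_t[string] dictionary;
    char[] buffer = "the cat".dup;  // stands in for byLine's reused buffer
    addWords(dictionary, buffer);
    buffer[] = "the dog";           // overwrite in place: same memory, same length
    addWords(dictionary, buffer);

    assert(dictionary.length == 3);
    assert(dictionary["the"] == 0);
    assert(dictionary["cat"] == 1);
    assert(dictionary["dog"] == 2);
}
```

Because every stored key is a separate immutable copy, overwriting the buffer 
afterwards cannot change the keys that are already in the array.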

> > Also, it
> > would be _really_ annoying to have to mark variables mutable all over the
> > place as you would inevitably have to do.
> 
> Obviously your other points are valid, but I haven't found this to be true
> (Clojure is pure joy).  Maybe you're right because D is a systems language
> and mutability needs to be preferred, however after only a day or two of
> exposure to this language that assumption also appears to be wrong.  Take
> a look at Walter's first attempted patch to bug 2954: 13 lines altered to
> explicitly include immutable, and 4 altered to treat variables as const:
> http://www.dsource.org/projects/dmd/changeset/749
> 
> But I'm willing to admit that my exposure is limited, and that particular
> example is a little biased.

Most programmers don't use const even in languages that have it. And with many 
programmers programming primarily in languages like Java or C# which don't 
really have const (IIRC, C# has more of a const than Java, but it's still pretty 
limited), many, many programmers never use const and see no value in it. So, for 
most programmers, mutable variables will be the norm, and they'll likely only 
use const or immutable if they have to. There are plenty of C++ programmers who 
will seek to use const (and possibly immutable) heavily, but they're definitely 
not the norm. And, of course, there are plenty of other languages out there with 
const or immutable types of one sort or another (particularly most functional 
languages), but those aren't the types of languages that most programmers use. 
The result is that most beginning D programmers will expect mutable to 
be the norm, and forcing const and/or immutable on them could be seriously 
off-putting.

Now, most code which is going to actually use const and immutable is likely to 
be a fair mix of mutable, const, and immutable - especially if you don't try to 
make everything immutable at the cost of efficiency like you'd typically get in a 
functional language. That being the case, regardless of whether mutable, const, 
or immutable is the default, you're going to have to mark a fair number of 
variables as something other than the default. So, making const or immutable the 
default would likely not save any typing, and it would annoy a _lot_ of 
programmers.
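
As a rough illustration of that kind of mix, here is a short sketch with a 
made-up function, `countSpaces`. Whichever qualifier were the default, at 
least two of the three kinds of declarations below would still need an 
explicit annotation.

```d
// const parameter: accepts mutable, const, and immutable character arrays alike.
size_t countSpaces(const(char)[] text) {
    size_t n = 0;                  // mutable local counter
    foreach (c; text)
        if (c == ' ') ++n;
    return n;
}

void main() {
    string greeting = "a b c";     // string is immutable(char)[]: the characters never change
    char[] scratch = greeting.dup; // mutable working copy
    scratch[0] = 'x';              // fine for the copy, would not compile for greeting

    assert(countSpaces(greeting) == 2);
    assert(countSpaces(scratch) == 2);
}
```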

So, the overall gain of making const or immutable the default is pretty minimal 
if not outright negative.

Personally, I use const and immutable a lot, but I still wouldn't want const or 
immutable to be the default.

- Jonathan M Davis

