string comparison
Jonathan M Davis
jmdavisProg at gmx.com
Mon Dec 20 17:50:45 PST 2010
On Monday, December 20, 2010 16:45:20 doubleagent wrote:
> > Okay. I don't know what the actual code looks like
>
> Here.
>
> import std.stdio, std.string;
>
> void main() {
>     uint[string] dictionary; // v[k], so string->uint
>     foreach (line; stdin.byLine()) {
>         // break sentence into words
>         // Add each word in the sentence to the vocabulary
>         foreach (word; splitter(strip(line))) {
>             if (word in dictionary) continue; // nothing to do
>             auto newId = dictionary.length;
>             dictionary[word] = newId;
>             writefln("%s\t%s", newId, word);
>         }
>     }
> }
>
> > ...
>
> Okay, suppose you're right. The behavior is still incorrect because the
> associative array has allowed two identical keys...identical because the
> only difference between two strings which I care about are the contents of
> their character arrays.
Array comparison cares about the contents of the array. It may short-circuit the
comparison if the lengths differ, or if both arrays point to the same place in
memory and have the same length, but array comparison is all about comparing
their elements.
In this case, you'd have two arrays/strings which point to the same place in
memory but have different lengths. Because their lengths differ, they're deemed
unequal. If you tried to insert a string with the same length as one that you'd
already inserted, the two would be considered equal, since their lengths are
identical and they point to the same place in memory, so in that case I would
expect the original value to be replaced with the new one. Otherwise, the keys
will be considered unequal in spite of the fact that they point to the same
place in memory.
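
To make the comparison rules above concrete, here is a minimal sketch. The
`buffer` variable is a hypothetical stand-in for a line buffer that gets reused
between reads; it is not taken from the original program, and the slice names
are made up for illustration.

```d
void main()
{
    // Hypothetical stand-in for a reused line buffer.
    char[] buffer = "hello world".dup;

    // A slice is a pointer plus a length, so these two share memory.
    char[] shortSlice = buffer[0 .. 5];   // "hello"
    char[] longSlice  = buffer[0 .. 11];  // "hello world"

    // Same starting address, different lengths: unequal.
    assert(shortSlice.ptr is longSlice.ptr);
    assert(shortSlice != longSlice);

    // Different memory, same contents: equal, because comparison
    // is element-wise.
    char[] copy = "hello".dup;
    assert(copy.ptr !is shortSlice.ptr);
    assert(copy == shortSlice);

    // Overwriting the buffer changes every slice that points into it,
    // but not the independent copy.
    buffer[0 .. 5] = "jello";
    assert(shortSlice == "jello");
    assert(copy == "hello");
}
```

The last three lines are the part that matters for associative arrays: a key
that is merely a slice of a reused buffer changes whenever the buffer does.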
The real problem here is that associative arrays currently allow non-immutable
keys. Once that's fixed, then it won't be a problem anymore.
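
Until then, one way to sidestep the problem is to copy each word into an
immutable string before using it as a key. The sketch below assumes a
hypothetical helper named `idFor` (not part of the original program or of
Phobos) and simulates the reused buffer by hand rather than reading stdin. For
simplicity it copies every word, including ones already in the dictionary.

```d
// Hypothetical helper: returns the id for `word`, assigning the next
// free id if the word has not been seen before.
uint idFor(ref uint[string] dictionary, const(char)[] word)
{
    // idup makes an immutable copy, so later writes to the caller's
    // buffer cannot change the stored key.
    string key = word.idup;
    if (auto existing = key in dictionary)
        return *existing;
    auto newId = cast(uint) dictionary.length;
    dictionary[key] = newId;
    return newId;
}

void main()
{
    uint[string] dictionary;

    // Simulate a buffer that is overwritten between words.
    char[] buffer = "alpha".dup;
    assert(idFor(dictionary, buffer) == 0);

    buffer[] = "gamma";
    assert(idFor(dictionary, buffer) == 1);

    // The first key survived the overwrite.
    assert(idFor(dictionary, "alpha") == 0);
    assert(dictionary.length == 2);
}
```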
> > Also, it
> > would be _really_ annoying to have to mark variables mutable all over the
> > place as you would inevitably have to do.
>
> Obviously your other points are valid, but I haven't found this to be true
> (Clojure is pure joy). Maybe you're right because D is a systems language
> and mutability needs to be preferred, however after only a day or two of
> exposure to this language that assumption also appears to be wrong. Take
> a look at Walter's first attempted patch to bug 2954: 13 lines altered to
> explicitly include immutable, and 4 altered to treat variables as const:
> http://www.dsource.org/projects/dmd/changeset/749
>
> But I'm willing to admit that my exposure is limited, and that particular
> example is a little biased.
Most programmers don't use const even in languages that have it. And with many
programmers programming primarily in languages like Java or C# which don't
really have const (IIRC, C# has more of a const than Java, but it's still pretty
limited), many, many programmers never use const and see no value in it. So, for
most programmers, mutable variables will be the norm, and they'll likely only
use const or immutable if they have to. There are plenty of C++ programmers who
will seek to use const (and possibly immutable) heavily, but they're definitely
not the norm. And, of course, there are plenty of other languages out there with
const or immutable types of one sort or another (particularly most functional
languages), but those aren't the types of languages that most programmers use.
The result is that most beginning D programmers will expect mutable to be the
norm, and forcing const and/or immutable on them could be seriously
off-putting.
Now, most code which is going to actually use const and immutable is likely to
be a fair mix of mutable, const, and immutable - especially if you don't try to
make everything immutable at the cost of efficiency like you'd typically get in a
functional language. That being the case, regardless of whether mutable, const,
or immutable is the default, you're going to have to mark a fair number of
variables as something other than the default. So, making const or immutable the
default would likely not save any typing, and it would annoy a _lot_ of
programmers.
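
As a small illustration of that kind of mix (a sketch with made-up names, not
code from any real project), even a few lines can end up using all three
qualifiers:

```d
// Hypothetical helper: const(int)[] accepts mutable, const, and
// immutable arrays alike.
int sumOf(const(int)[] values)
{
    int total = 0;              // mutable accumulator
    foreach (v; values)
        total += v;
    return total;
}

void main()
{
    int[] scratch = [1, 2, 3];  // mutable: changed in place below
    scratch[0] = 4;

    const view = scratch;       // const: read-only view of mutable data
    static assert(!__traits(compiles, view[0] = 1));
    assert(sumOf(view) == 9);

    immutable int[] fixed = [1, 2, 3];  // immutable: never changes
    assert(sumOf(fixed) == 6);
}
```

Whichever qualifier were the default, the other two would still have to be
spelled out somewhere in code like this.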
So, the overall gain of making const or immutable the default is pretty minimal
if not outright negative.
Personally, I use const and immutable a lot, but I still wouldn't want const or
immutable to be the default.
- Jonathan M Davis