pu$ᅵle
Jonathan M Davis
jmdavisprog at gmail.com
Sun Jul 18 16:02:10 PDT 2010
On Sunday 18 July 2010 10:59:21 strtr wrote:
> I totally agree that putting a cast there is probably not really a solution
> (or legal).
> Warnings for all non-dchar types.
> Is there anybody using foreach(c;chars) || foreach(char c;chars) correctly
> (which couldn't be done with ubytes)?
As soon as someone wants to process code units (for whatever reason) instead of
code points, then using char and wchar makes sense. Now, I suppose that you
could use ubyte and ushort in such circumstances, but I'm sure that _someone_
will be looking to do it with char and wchar (there's a decent chance that
phobos does it), and I don't think that it would go over very well to give them
lots of warnings.
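To make the code unit vs. code point difference concrete, here is a minimal sketch. The helper name countAs is made up for illustration and is not anything from phobos. It runs the same foreach over one string with each of the three character types and counts the iterations.

```d
import std.stdio;

// Hypothetical helper: counts how many times a foreach over `s` iterates
// when the loop variable is declared with element type T.
size_t countAs(T)(string s)
{
    size_t n = 0;
    foreach (T c; s)
        ++n;
    return n;
}

void main()
{
    // U+00E9 (e with acute accent) is one code point, but it takes
    // two code units in UTF-8 and one code unit in UTF-16.
    string s = "h\u00E9llo";

    writeln(countAs!char(s));  // UTF-8 code units: 6
    writeln(countAs!wchar(s)); // UTF-16 code units: 5
    writeln(countAs!dchar(s)); // code points: 5
}
```

With char, the loop body sees the two halves of the accented letter as separate values, which is exactly the logic error in the common case, and exactly what you want if you really are processing code units.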
The issue, of course, is that the common case is that anything other than dchar
in a foreach over string types would be a logic error in your code. D does a lot
to make things safer, but I don't think that there are very many cases where
things like this are special-cased in order to stop errors. The programmer is
expected to have some clue as to what they're doing, and the general trend in D
from what I can tell is to not use a type unless you have to, so it would be
perfectly normal to expect the programmer to have really meant char or wchar if
they put it explicitly.
I don't know. The truth is that on the one hand, programmers _need_ to
understand how D deals with strings and unicode, or they _will_ have bugs.
There's no getting around that. So, cases where someone who knows what they're
doing is likely to screw up (like forgetting the type on the foreach) should
have warnings associated with them if it's reasonable. However, expecting the
compiler to catch each and every instance where a programmer is likely to shoot
themselves in the foot with unicode and strings is not particularly reasonable.
The compiler can't always save the programmer from their own ignorance or
stupidity. If anything, that would indicate that making it _easier_ to hit
errors in the kind of code which someone who doesn't understand how D deals
with unicode would write would be a good idea.
It should be the case that competent D programmers will be able to use strings
easily. But it's likely better if the ones who don't know what they're doing
shoot themselves in the foot sooner rather than later so that they learn what
they need to learn about unicode and _become_ competent D programmers.
A competent D programmer will not put an explicit char in a foreach loop unless
that's what they really mean. The only issue there is that char could be a typo
for dchar. But that sort of typo would be rather hard to defend against in
general. So, certainly on the surface, it would seem overkill to effectively
disallow char and wchar in foreach loops and force ubyte and ushort.
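For comparison, this is roughly what forcing ubyte would look like. It is only a sketch: the function name sumBytes is invented for the example, and the cast is just one way to view a string's UTF-8 code units as raw bytes.

```d
import std.stdio;

// Hypothetical example: sum the raw bytes of a string by viewing its
// UTF-8 code units as ubytes instead of iterating over char.
uint sumBytes(string s)
{
    // Reinterpret the immutable(char)[] as immutable(ubyte)[];
    // no data is copied and nothing is decoded.
    auto bytes = cast(immutable(ubyte)[]) s;

    uint total = 0;
    foreach (ubyte b; bytes)
        total += b;
    return total;
}

void main()
{
    // 'a' is 0x61 and 'b' is 0x62 in UTF-8.
    writeln(sumBytes("ab")); // 195
}
```

It works, but the loop reads the same values that foreach(char c; s) would give, so the only thing gained is that the code-unit intent is spelled out in the type.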
Still, this is an area which isn't all that hard to screw up on, so I don't know
what the best solution is. When it comes down to it, you can't always hold the
programmer's hand. They need to be informed and responsible. But on the other
hand, you do want to make it harder for them to make stupid mistakes, since even
competent programmers do make stupid mistakes at least some of the time.
A warning for a foreach loop over strings where the element type is not specified
is a start. If you have a solid suggestion which would reduce errors in the
common case without unduly restraining folks who really know what they're doing,
then create a bug report for it with the severity of enhancement. Walter and
company will decide what works best with what they intend for D. Your suggestion
may or may not be implemented, but it's worth a try.
- Jonathan M Davis