pu$ï¿œle

Jonathan M Davis jmdavisprog at gmail.com
Sun Jul 18 16:02:10 PDT 2010


On Sunday 18 July 2010 10:59:21 strtr wrote:
> I totally agree that putting a cast there is probably not really a solution
> (or legal).
> Warnings for all non-dchar types.
> Is there anybody using foreach(c;chars) || foreach(char c;chars) correctly
> (which couldn't be done with ubytes)?

As soon as someone wants to process code units (for whatever reason) instead of
code points, then using char and wchar makes sense. Now, I suppose that you
could use ubyte and ushort in such circumstances, but I'm sure that _someone_
will be looking to do it with char and wchar (and there's a decent chance that
Phobos does it), and I don't think that it would go over very well to give them
lots of warnings.
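
To make the code unit versus code point distinction concrete, here is a minimal
sketch. The helper names countCodeUnits and countCodePoints are made up for
illustration; the only thing being shown is how the declared element type
changes what foreach iterates over.

```d
import std.stdio : writeln;

// Hypothetical helper: counts UTF-8 code units, since char is a code unit.
size_t countCodeUnits(string s)
{
    size_t n = 0;
    foreach (char c; s)
        ++n;
    return n;
}

// Hypothetical helper: counts code points, since foreach decodes to dchar.
size_t countCodePoints(string s)
{
    size_t n = 0;
    foreach (dchar c; s)
        ++n;
    return n;
}

void main()
{
    // \u00E9 (e with acute accent) takes two code units in UTF-8.
    string s = "h\u00E9llo";
    writeln(countCodeUnits(s));  // prints 6
    writeln(countCodePoints(s)); // prints 5
}
```

Both loops are legal; which one is correct depends entirely on whether you
meant code units or code points.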

The issue, of course, is that the common case is that anything other than dchar 
in a foreach over string types would be a logic error in your code. D does a lot 
to make things safer, but I don't think that there are very many cases where 
things like this are special-cased in order to stop errors. The programmer is 
expected to have some clue as to what they're doing, and the general trend in D 
from what I can tell is to not use a type unless you have to, so it would be 
perfectly normal to expect the programmer to have really meant char or wchar if 
they put it explicitly.
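
As a rough illustration of that common-case logic error, the sketch below
searches a string for a code point. The function names containsNaive and
containsDecoded are invented for this example. With the element type left off,
the loop variable is a UTF-8 code unit, so the comparison silently does the
wrong thing for anything outside ASCII.

```d
import std.stdio : writeln;

// Hypothetical buggy version: the element type is inferred, so c is a
// single UTF-8 code unit and gets compared against a whole code point.
bool containsNaive(string s, dchar target)
{
    foreach (c; s)
    {
        static assert(typeof(c).sizeof == 1); // c is a code unit, not a dchar
        if (c == target)
            return true;
    }
    return false;
}

// Hypothetical fixed version: foreach decodes each code point into a dchar.
bool containsDecoded(string s, dchar target)
{
    foreach (dchar c; s)
        if (c == target)
            return true;
    return false;
}

void main()
{
    string s = "caf\u00E9"; // the last code point is encoded as 0xC3 0xA9

    writeln(containsNaive(s, '\u00E9'));   // prints false (missed)
    writeln(containsDecoded(s, '\u00E9')); // prints true
    writeln(containsNaive(s, '\u00C3'));   // prints true (false positive)
}
```

The naive version compiles without complaint, which is exactly why this sort of
bug is easy to write and hard to spot.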

I don't know. The truth is that on the one hand, programmers _need_ to
understand how D deals with strings and Unicode, or they _will_ have bugs.
There's no getting around that. So, cases where someone who knows what they're
doing is likely to screw up (like forgetting the type on the foreach) should
have warnings associated with them if it's reasonable. However, expecting the
compiler to catch each and every instance where a programmer is likely to shoot
themselves in the foot with Unicode and strings is not particularly reasonable.
The compiler can't always save the programmer from their own ignorance or
stupidity. If anything, that would indicate that it would be a good idea to
make errors surface _more easily_ in code written by someone who doesn't
understand how D deals with Unicode.

It should be the case that competent D programmers will be able to use strings
easily. But it's likely better if the ones who don't know what they're doing
shoot themselves in the foot sooner rather than later so that they learn what
they need to learn about Unicode and _become_ competent D programmers.

A competent D programmer will not put an explicit char in a foreach loop unless
that's what they really mean. The only issue there is that char could be a typo
for dchar. But that sort of typo would be rather hard to defend against in
general. So, certainly on the surface, it would seem overkill to effectively
disallow char and wchar in foreach loops and force ubyte and ushort.
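
For comparison, here is a sketch of what the ubyte route could look like. It
assumes std.string.representation is available in your version of Phobos (it
returns the string's code units as immutable(ubyte)[]); the function names are
invented for the example. Both versions count UTF-8 continuation bytes, i.e.
bytes of the form 10xxxxxx.

```d
import std.stdio : writeln;
import std.string : representation;

// Hypothetical helper: deliberate code unit processing with an explicit char.
size_t continuationUnitsChar(string s)
{
    size_t n = 0;
    foreach (char c; s)
        if ((c & 0xC0) == 0x80)
            ++n;
    return n;
}

// Hypothetical helper: the same loop over the raw bytes as ubyte.
size_t continuationUnitsUbyte(string s)
{
    size_t n = 0;
    foreach (ubyte b; s.representation)
        if ((b & 0xC0) == 0x80)
            ++n;
    return n;
}

void main()
{
    string s = "h\u00E9llo"; // one two-byte sequence, so one continuation byte
    writeln(continuationUnitsChar(s));  // prints 1
    writeln(continuationUnitsUbyte(s)); // prints 1
}
```

The two are equivalent here. The ubyte version just makes it more obvious that
the code is working on bytes rather than characters, at the cost of an extra
conversion.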

Still, this is an area which isn't all that hard to screw up in, so I don't know
what the best solution is. When it comes down to it, you can't always hold the
programmer's hand. They need to be informed and responsible. But on the other
hand, you do want to make it harder for them to make stupid mistakes, since even
competent programmers do make stupid mistakes at least some of the time.

A warning for a foreach loop over strings where the element type is not specified
is a start. If you have a solid suggestion which would reduce errors in the
common case without unduly restraining folks who really know what they're doing,
then create a bug report for it with the severity of enhancement. Walter and
company will decide what works best with what they intend for D. Your suggestion
may or may not be implemented, but it's worth a try.

- Jonathan M Davis

