pu$�le

strtr strtr at sp.am
Sun Jul 18 17:15:15 PDT 2010


== Quote from Jonathan M Davis (jmdavisprog at gmail.com)'s article
> On Sunday 18 July 2010 10:59:21 strtr wrote:
> > I totally agree that putting a cast there is probably not really a solution
> > (or legal).
> > Warnings for all non-dchar types.
> > Is there anybody using foreach(c;chars) || foreach(char c;chars) correctly
> > (which couldn't be done with ubytes)?
> As soon as someone wants to process code units (for whatever reason) instead of
> code points, then using char and wchar makes sense. Now, I suppose that you
> could use ubyte and ushort in such circumstances, but I'm sure that _someone_
> will be looking to do it, and (there's a decent chance that phobos does it) I
> don't think that it would go over very well to give them lots of warnings.
> The issue, of course, is that the common case is that anything other than dchar
> in a foreach over string types would be a logic error in your code. D does a lot
> to make things safer, but I don't think that there are very many cases where
> things like this are special-cased in order to stop errors. The programmer is
> expected to have some clue as to what they're doing, and the general trend in D
> from what I can tell is to not use a type unless you have to, so it would be
> perfectly normal to expect the programmer to have really meant char or wchar if
> they put it explicitly.
> I don't know. The truth is that on the one hand, programmers _need_ to
> understand how D deals with strings and unicode, or they _will_ have bugs.
> There's no getting around that. So, cases where someone who knows what they're
> doing is likely to screw up on (like forgetting the type on the foreach) should
> have warnings associated with them if it's reasonable. However, expecting the
> compiler to catch each and every instance that a programmer is likely to shoot
> themself in the foot with unicode and strings is not particularly reasonable.
> The compiler can't always save the programmer from their own ignorance or
> stupidity. If anything, that would indicate that making errors _easier_ in code
> which someone who doesn't understand how D deals with unicode would write would
> be a good idea.
> It should be the case that competent D programmers will be able to use strings
> easily. But it's likely better if the ones who don't know what they're doing
> shoot themselves in the foot sooner rather than later so that they learn what
> they need to learn about unicode and _become_ competent D programmers.

I actually knew about unicode, but I mistakenly thought a char to be a code point
(thus variable in size).
Somehow I missed any documentation telling me otherwise.
Now that I look for it, it actually says:
char | unsigned 8 bit UTF-8

Maybe some stronger pointers in the documentation would help.
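
For anyone else who trips over this, here is a minimal sketch of the difference, assuming a string with one non-ASCII character. The helper name countAs is made up for illustration. With char the loop body runs once per UTF-8 code unit; with dchar the foreach decodes the string and runs once per code point.

```d
import std.stdio;

// Hypothetical helper: counts how many times the foreach body runs
// when the loop variable has element type T.
size_t countAs(T)(string s)
{
    size_t n = 0;
    foreach (T c; s)
        ++n;
    return n;
}

void main()
{
    string s = "h\u00E9llo";    // 'é' is encoded as two UTF-8 code units

    writeln(countAs!char(s));   // 6: one iteration per code unit
    writeln(countAs!dchar(s));  // 5: one iteration per code point
    writeln(s.length);          // 6: length is counted in code units
}
```

So a char is always a single fixed-size code unit, and only a dchar loop variable gives you whole code points.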

> A competent D programmer will not put an explicit char in a foreach loop unless
> that's what they really mean. The only issue there is that char could be a typo
> for dchar. But that sort of typo would be rather hard to defend against in
> general. So, certainly on the surface, it would seem overkill to effectively
> disallow char and wchar in foreach loops and force ubyte and ushort.
> Still, this is an area which isn't all that hard to screw up on, so I don't know
> what the best solution is. When it comes down to it, you can't always hold the
> programmer's hand. They need to be informed and responsible. But on the other
> hand, you do want to make it harder for them to make stupid mistakes, since even
> competent programmers do make stupid mistakes at least some of the time.
> A warning for a foreach loop over strings where the element type is not specified
> is a start. If you have a solid suggestion which would reduce errors in the
> common case without unduly restraining folks who really know what they're doing,
> then create a bug report for it with the severity of enhancement. Walter and
> company will decide what works best with what they intend for D. Your suggestion
> may or may not be implemented, but it's worth a try.
> - Jonathan M Davis

I agree with your bug-report.
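
As for my earlier question about ubytes: as far as I can tell, code-unit processing can also be written over raw bytes. This is only a sketch, assuming std.string.representation is available in your Phobos version; countNonAscii is a made-up name for illustration.

```d
import std.stdio;
import std.string : representation;

// Hypothetical helper: counts bytes with the high bit set, i.e. bytes
// that belong to a multi-byte UTF-8 sequence.
size_t countNonAscii(string s)
{
    size_t n = 0;
    // representation exposes the string's code units as ubytes
    foreach (ubyte b; s.representation)
        if (b >= 0x80)
            ++n;
    return n;
}

void main()
{
    writeln(countNonAscii("h\u00E9llo")); // 2: both bytes of 'é'
    writeln(countNonAscii("hello"));      // 0: pure ASCII
}
```

Whether that reads better than an explicit foreach(char c; chars) is a matter of taste; it only shows that the ubyte form is possible.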

