Why foreach(c; someString) must yield dchar
Jonathan M Davis
jmdavisprog at gmail.com
Thu Aug 19 12:19:12 PDT 2010
On Thursday, August 19, 2010 07:15:30 dsimcha wrote:
> == Quote from dsimcha (dsimcha at yahoo.com)'s article
>
> > I've been hacking in Phobos and parallelfuture and I've come to the
> > conclusion that having typeof(c) in the expression foreach(c;
> > string.init) not be a dchar is simply ridiculous. I don't care how much
> > existing code gets broken, this needs to be fixed.
>
> Here's another good one. This one uses Lockstep, which is in the SVN
> version of std.range and is designed to provide syntactic sugar for
> iterating over multiple ranges in lockstep via foreach.
>
> string str1, str2;
> foreach(c1, c2; lockstep(str1, str2)) {}
>
> // c1, c2 are dchars since Lockstep relies on range primitives.
>
> foreach(c; str1) {}
> // c is a char since the regular foreach loop doesn't use range
> // primitives.
>
> I'm starting to think the inconsistency between ranges and foreach is
> really the worst part. When viewed in isolation, Andrei's changes to
> std.range to make ElementType!string == dchar, etc. were definitely the
> right thing to do. However, if we can't fix foreach, it might be a good
> idea to undo them because in this case I think such a ridiculous, bug
> producing inconsistency is worse than doing The Wrong Thing consistently.
Okay. Maybe this is what we do:
1. Make it a warning, if not an outright error, to use foreach with any char or
wchar array (be it mutable, const, or immutable) without indicating the type. So,
    foreach(c; mystring)
    {
        //...
    }
would become illegal. You'd have to give the type for c. This would solve the
problem where someone forgets to put the type. Since odds are that they wanted
dchar anyway, the extra characters aren't really extra for most people. And the
few who actually wanted char or wchar can just put the type. It shouldn't be a
big deal. A programmer can still foolishly put char or wchar when what they
actually need is a dchar for what they're doing, but at least then it's a
deliberate error due to ignorance rather than someone who knows what they're
doing making a simple mistake. This will also catch errors in generic algorithms
that end up trying to use foreach without giving the type.
2. Ditch ElementType in favor of something more like ExactElemType and
ConceptElemType where ExactElemType is the actual type in the array/range and
ConceptElemType is the type that is conceptually in the array/range. So, for
most types, those two will be the same, but for string types, ExactElemType will
be char, wchar, or dchar, while ConceptElemType will always be dchar. So, the
algorithms that don't care about what the elements mean can just use
ExactElemType while those that do care about what the elements mean use
ConceptElemType.
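To make the two points concrete, here is a minimal sketch. ExactElemType and ConceptElemType are only the names proposed above, not anything that exists in Phobos, and this version assumes they only need to handle built-in arrays. The main function shows the foreach behavior that point 1 is meant to guard against: with no type given, the loop variable is a code unit, while an explicit dchar makes foreach decode.

```d
import std.traits : isSomeChar, Unqual;

// Hypothetical (name taken from point 2): the type actually stored in the array.
template ExactElemType(T : E[], E)
{
    alias ExactElemType = E;
}

// Hypothetical (name taken from point 2): the type the array conceptually holds.
// Every character array conceptually holds code points, so it maps to dchar.
template ConceptElemType(T : E[], E)
{
    static if (isSomeChar!E)
        alias ConceptElemType = dchar;
    else
        alias ConceptElemType = E;
}

static assert(is(ExactElemType!string == immutable(char)));
static assert(is(ConceptElemType!string == dchar));
static assert(is(ConceptElemType!wstring == dchar));
static assert(is(ExactElemType!(int[]) == int));
static assert(is(ConceptElemType!(int[]) == int));

void main()
{
    string s = "h\u00E9llo"; // U+00E9 takes two UTF-8 code units

    size_t units;
    foreach (c; s) // no type given: c is a (qualified) char, one per code unit
    {
        static assert(is(Unqual!(typeof(c)) == char));
        ++units;
    }

    size_t points;
    foreach (dchar c; s) // explicit dchar: foreach decodes the UTF-8
        ++points;

    assert(units == 6);
    assert(points == 5);
}
```

An algorithm that just shuffles elements around would use ExactElemType, while one that cares about what the elements mean would use ConceptElemType.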
I'm not sure that this is the best solution. However, the fact that string and
wstring are arrays but can't always be treated as arrays is pretty much
inescapable as long as they're arrays. It seems like no matter what we do, you
either lose the ability to treat strings as arrays or you have to special case
them all over the place. If they were structs that gave access to their
underlying array for array operations and gave range operations for normal use
(possibly along with a function for giving you the nth element, though it
couldn't truly be random access unless it were a dstring), then maybe we could
get this to work better. But we're dealing with the inherent problem that the
container holds one type conceptually and a completely different type in reality.
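As a rough illustration of the struct idea, something like the following might work. Text, units, and countCodePoints are made-up names for this sketch, and it assumes std.utf.decode is used for the decoding. The struct exposes the underlying array for array operations and gives range primitives that yield dchar for normal use.

```d
import std.utf : decode;

// Hypothetical string wrapper: array access via .units, range access via
// empty/front/popFront, which always work in terms of code points.
struct Text
{
    string units; // the underlying array of UTF-8 code units

    @property bool empty() { return units.length == 0; }

    @property dchar front()
    {
        size_t i = 0;
        return decode(units, i); // decode the first code point
    }

    void popFront()
    {
        size_t i = 0;
        decode(units, i);      // i now holds the width of the first code point
        units = units[i .. $]; // drop exactly that many code units
    }
}

// foreach uses the range primitives here, so c is a dchar.
size_t countCodePoints(Text t)
{
    size_t n;
    foreach (c; t)
    {
        static assert(is(typeof(c) == dchar));
        ++n;
    }
    return n;
}

void main()
{
    auto t = Text("h\u00E9llo");
    assert(t.units.length == 6);     // array view: six code units
    assert(countCodePoints(t) == 5); // range view: five code points
}
```

With a type like this, plain foreach gives dchar because it goes through the range primitives, and anyone who wants the code units has to ask for .units explicitly.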
- Jonathan M Davis