Why foreach(c; someString) must yield dchar

Jonathan M Davis jmdavisprog at gmail.com
Thu Aug 19 12:19:12 PDT 2010


On Thursday, August 19, 2010 07:15:30 dsimcha wrote:
> == Quote from dsimcha (dsimcha at yahoo.com)'s article
> 
> > I've been hacking in Phobos and parallelfuture and I've come to the
> > conclusion that having typeof(c) in the expression foreach(c;
> > string.init) not be a dchar is simply ridiculous.  I don't care how much
> > existing code gets broken, this needs to be fixed.
> 
> Here's another good one.  This one uses Lockstep, which is in the SVN
> version of std.range and is designed to provide syntactic sugar for
> iterating over multiple ranges in lockstep via foreach.
> 
> string str1, str2;
> foreach(c1, c2; lockstep(str1, str2)) {}
> 
> // c1, c2 are dchars since Lockstep relies on range primitives.
> 
> foreach(c; str1) {}
> // c is a char since the regular foreach loop doesn't use range
> // primitives.
> 
> I'm starting to think the inconsistency between ranges and foreach is
> really the worst part.  When viewed in isolation, Andrei's changes to
> std.range to make ElementType!string == dchar, etc. were definitely the
> right thing to do.  However, if we can't fix foreach, it might be a good
> idea to undo them because in this case I think such a ridiculous, bug
> producing inconsistency is worse than doing The Wrong Thing consistently.

Okay. Maybe this is what we do:

1. Make it a warning, if not an outright error, to use foreach with any char or 
wchar array (whether mutable, const, or immutable) without indicating the type. So,

foreach(c; mystring)
{
    //...
}

would become illegal. You'd have to give the type for c. This would solve the 
problem where someone forgets to put the type. Since odds are that they wanted 
dchar anyway, the extra characters aren't really extra for most people. And the 
few who actually wanted char or wchar can just put the type. It shouldn't be a 
big deal. A programmer can still foolishly put char or wchar when what they 
actually need is a dchar, but at least then it's a 
deliberate error due to ignorance rather than someone who knows what they're 
doing making a simple mistake. This will also catch errors in generic algorithms 
that end up trying to use foreach without giving the type.
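
As a rough illustration of what spelling out the type buys you, here is a small 
sketch. The helper names countUnits and countPoints are made up for this 
example, and the proposed warning itself is not implemented; the code only 
shows how the declared type of the loop variable changes what foreach iterates 
over.

```d
import std.stdio;

// Hypothetical helper: an explicit char iterates raw UTF-8 code units.
size_t countUnits(string s)
{
    size_t n;
    foreach (char c; s)
        ++n;
    return n;
}

// Hypothetical helper: an explicit dchar makes foreach decode code points.
size_t countPoints(string s)
{
    size_t n;
    foreach (dchar c; s)
        ++n;
    return n;
}

void main()
{
    // U+00E9 takes two code units in UTF-8, so the two counts differ.
    string s = "h\u00E9llo";
    assert(countUnits(s) == 6);
    assert(countPoints(s) == 5);
    writeln(countUnits(s), " code units, ", countPoints(s), " code points");
}
```

With the type omitted, foreach(c; s) behaves like countUnits, which is exactly 
the case the warning would flag.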

2. Ditch ElementType in favor of something more like ExactElemType and 
ConceptElemType where ExactElemType is the actual type in the array/range and 
ConceptElemType is the type that is conceptually in the array/range. So, for 
most types, those two will be the same, but for string types, ExactElemType will 
be char, wchar, or dchar, while ConceptElemType will always be dchar. So, the 
algorithms that don't care about what the elements mean can just use 
ExactElemType while those that do care about what the elements mean use 
ConceptElemType.
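
A minimal sketch of how the two templates might look, assuming they are built 
on top of the existing std.range.ElementType and std.traits.isSomeString. 
ExactElemType and ConceptElemType are only the names proposed above, not 
anything in Phobos, and this version keeps the qualifiers of the stored type 
(so string gives immutable(char) rather than plain char).

```d
import std.range : ElementType;
import std.traits : isSomeString;

// Hypothetical: the type physically stored in the array or range.
template ExactElemType(R)
{
    static if (is(R : E[], E))
        alias ExactElemType = E;              // arrays: the stored element type
    else
        alias ExactElemType = ElementType!R;  // other ranges: whatever front yields
}

// Hypothetical: the type the container conceptually holds.
template ConceptElemType(R)
{
    static if (isSomeString!R)
        alias ConceptElemType = dchar;        // every string type is a sequence of code points
    else
        alias ConceptElemType = ExactElemType!R;
}

// For string types the two differ; for everything else they agree.
static assert(is(ExactElemType!string == immutable(char)));
static assert(is(ExactElemType!wstring == immutable(wchar)));
static assert(is(ConceptElemType!string == dchar));
static assert(is(ConceptElemType!wstring == dchar));
static assert(is(ExactElemType!(int[]) == int));
static assert(is(ConceptElemType!(int[]) == int));

void main() {}
```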

I'm not sure that this is the best solution. However, the fact that string and 
wstring are arrays but can't always be treated as arrays is pretty much 
inescapable as long as they're arrays. It seems like no matter what we do, you 
either lose the ability to treat strings as arrays or you have to special case 
them all over the place. If they were structs that gave access to their 
underlying array for array operations and gave range operations for normal use 
(possibly along with a function for giving you the nth element, though it 
couldn't truly be random access unless it were a dstring), then maybe we could 
get this to work better. But we're dealing with the inherent problem that the 
container holds one type conceptually and a completely different type in reality.
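
To make the struct idea a bit more concrete, here is a rough sketch of what 
such a wrapper could look like for UTF-8. The type Str and its members units 
and nth are invented for this example; only std.utf.decode and std.utf.stride 
are existing Phobos functions. It is not a proposal for an actual library 
type, just an illustration of keeping the array view and the range view 
separate.

```d
import std.utf : decode, stride;

// Hypothetical string wrapper; not a Phobos type.
struct Str
{
    private string data;

    // Array view: the raw UTF-8 code units, with O(1) indexing and slicing.
    @property string units() const { return data; }

    // Range view: the elements are decoded code points.
    @property bool empty() const { return data.length == 0; }

    @property dchar front() const
    {
        string s = data;
        size_t i = 0;
        return decode(s, i);
    }

    void popFront()
    {
        // Skip however many code units the first code point occupies.
        data = data[stride(data, 0) .. $];
    }

    // Walks to the nth code point in O(n); not true random access.
    dchar nth(size_t n) const
    {
        string s = data;
        size_t i = 0;
        foreach (_; 0 .. n)
            i += stride(s, i);
        return decode(s, i);
    }
}

void main()
{
    auto s = Str("h\u00E9llo");
    assert(s.units.length == 6);    // six UTF-8 code units
    assert(s.nth(1) == '\u00E9');   // second code point

    size_t points;
    foreach (c; s)                  // goes through the range primitives, so c is a dchar
        ++points;
    assert(points == 5);
}
```

Because foreach on a struct uses empty, front, and popFront, the loop variable 
is a dchar without the type being written out, while code that wants the raw 
array has to ask for units explicitly.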

- Jonathan M Davis

