Ranges
spir
denis.spir at gmail.com
Fri Mar 18 03:32:35 PDT 2011
On 03/18/2011 10:29 AM, Peter Alexander wrote:
> On 13/03/11 12:05 AM, Jonathan M Davis wrote:
>> So, when you're using a range of char[] or wchar[], you're really using a range
>> of dchar. These ranges are bi-directional. They can't be sliced, and they can't
>> be indexed (since doing so would likely be invalid). This generally works very
>> well. It's exactly what you want in most cases. The problem is that that means
>> that the range that you're iterating over is effectively of a different type
>> than the original char[] or wchar[].
>
> This has to be the worst language design decision /ever/.
>
> You can't just mess around with fundamental principles like "the first element
> in an array of T has type T" for the sake of a minor convenience. How are we
> supposed to do generic programming if common sense reasoning about types
> doesn't hold?
>
> This is just std::vector<bool> from C++ all over again. Can we not learn from
> mistakes of the past?
I partially agree, but compare with a simple ASCII text: you could iterate
over its chars (= codes = bytes), words, lines... Or according to schemes
specific to your app (e.g. reverse order, every number in it, every word at
the start of a line...). A piece of text is not only a stream of codes.
The problem is that there is no good decision in the case of char[] or
wchar[]. We would have to choose a kind of "natural" sense of what it means to
iterate over a text, but there is no such thing. What does it *mean*? What is
the natural unit of a text?
Bytes or words are code units, which mean nothing on their own. Code points
(<-> dchars) are not guaranteed to mean anything either (as shown by past
discussion: one code point may be the base 'a' and the following one the
combining '^', both together forming "â"). Code points do not represent
"characters" in the common sense. So it is very clear that implicitly
iterating over dchars is a wrong choice. But what else?
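To make the three levels concrete, here is a rough sketch that counts the same
"â" as code units, code points and "human" characters. The helper names
(codeUnits, codePoints, graphemes) are made up for illustration, and
std.uni.byGrapheme is an assumption about the Phobos version in use: it only
exists in releases newer than this message.

```d
import std.range : ElementType, walkLength;
import std.uni : byGrapheme; // assumption: a Phobos release that ships byGrapheme

// "â" written as base letter 'a' followed by U+0302 COMBINING CIRCUMFLEX ACCENT.
enum string decomposed = "a\u0302";

// Indexing the array yields a code unit, but the range primitives decode,
// so the range element type is dchar.
static assert(is(typeof(decomposed[0]) == immutable(char)));
static assert(is(ElementType!string == dchar));

// Hypothetical helpers, one per level of "unit".
size_t codeUnits(string s)  { return s.length; }                // UTF-8 bytes
size_t codePoints(string s) { return s.walkLength; }            // decoded dchars
size_t graphemes(string s)  { return s.byGrapheme.walkLength; } // user-perceived characters

void main()
{
    assert(codeUnits(decomposed) == 3);  // 'a' takes 1 byte, U+0302 takes 2
    assert(codePoints(decomposed) == 2); // base letter + combining mark
    assert(graphemes(decomposed) == 1);  // what a reader would call one character
}
```

Neither of the first two counts matches what a reader sees, which is the whole
point: only the last level corresponds to a "character" in the common sense.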
I would rather get rid of wchar and dchar and deal with a plain stream of
bytes supposed to represent UTF-8, until we get a good solution for operating
at the level of "human" characters.
Denis
--
_________________
vita es estrany
spir.wikidot.com