string is rarely useful as a function argument

Chad J chadjoan at __spam.is.bad__gmail.com
Sun Jan 1 11:01:35 PST 2012


On 01/01/2012 10:39 AM, Timon Gehr wrote:
> On 01/01/2012 04:13 PM, Chad J wrote:
>> On 01/01/2012 07:59 AM, Timon Gehr wrote:
>>> On 01/01/2012 05:53 AM, Chad J wrote:
>>>>
>>>> If you haven't been educated about unicode or how D handles it, you
>>>> might write this:
>>>>
>>>> char[] str;
>>>> ... load str ...
>>>> for ( int i = 0; i < str.length; i++ )
>>>> {
>>>>       font.render(str[i]); // Ewww.
>>>>       ...
>>>> }
>>>>
>>>
>>> That actually looks like a bug that might happen in real world code.
>>> What is the signature of font.render?
>>
>> In my mind it's defined something like this:
>>
>> class Font
>> {
>>   ...
>>
>>      /** Render the given code point at
>>          the current (x,y) cursor position. */
>>      void render( dchar c )
>>      {
>>          ...
>>      }
>> }
>>
>> (Of course I don't know minute details like where the "cursor position"
>> comes from, but I figure it doesn't matter.)
>>
>> I probably wrote some code like that loop a very long time ago, but I
>> probably don't have that code around anymore, or at least not easily
>> findable.
> 
> I think the main issue here is that char implicitly converts to dchar:
> this is an implicit reinterpret-cast that is nonsensical if the
> character is outside the ASCII range.

I agree.
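
For illustration, a minimal sketch of the reinterpretation you describe
(the 'é' literal and the variable names are just my own example):

char c = "é"[0]; // first code unit of the two-byte UTF-8 sequence for 'é' (0xC3)
dchar d = c;     // accepted implicitly, but d is now U+00C3 ('Ã'), not 'é'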

Perhaps the compiler should insert a check on the 8th bit in cases like
these?
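
Roughly what I have in mind, written as a library helper rather than as
compiler magic (the name checkedWiden is hypothetical, just to sketch
the idea):

dchar checkedWiden( char c )
{
    // A set 8th bit means c is one code unit of a multi-byte UTF-8
    // sequence, so widening it to a code point by itself is a bug.
    assert( (c & 0x80) == 0, "non-ASCII char widened to dchar" );
    return c;
}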

I suppose it's possible someone could declare a bunch of individual
chars and then start manipulating code units that way, and such an
8th-bit check could thwart those manipulations, but I would counter
that such low-level manipulation should be done on ubytes instead.
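
Something like this, say (just a sketch of what I mean by working on
ubytes; the counting loop is only an example):

char[] str = "héllo".dup;
ubyte[] raw = cast(ubyte[]) str; // explicit: we want the raw code units
size_t codePoints = 0;
foreach ( b; raw )
    if ( (b & 0xC0) != 0x80 )    // skip UTF-8 continuation bytes
        ++codePoints;            // count only bytes that start a code point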

I don't know how much this would help though.  Seems like too little,
too late.

The bigger problem is that a char taken from a char[] loses its
context: by itself it is (potentially) just one code unit of a larger
code point.
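
Which is why the loop from the top of the thread really wants to be
written something like this instead (a sketch; a foreach with a dchar
loop variable decodes the UTF-8 as it iterates):

char[] str;
// ... load str ...
foreach ( dchar c; str )   // decodes one code point per iteration
{
    font.render(c);        // same Font.render( dchar ) as above
    // ...
}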

