Safer casts

Yigal Chripun yigal100 at gmail.com
Sat May 10 10:53:37 PDT 2008


Yigal Chripun wrote:
> IMHO, your reply makes perfect sense for C/C++ but not for D,
> specifically because D has other facilities to handle those cases.
> A dchar (or [w]char) _must_ always contain valid data. If you need to
> store other encodings, you can use ubyte instead, which does not limit you
> to a specific bit pattern (this is why D has it in the first place...).
> The above example of slices can be easily dealt with since (unlike in
> C/C++) D arrays know their length. This is similar to the fact that D
> checks bounds on arrays and throws on error (in debug mode) whereas
> C/C++ does not. IMO, the D implementation itself (both the compiler and
> the runtime) needs to make sure chars are always valid. This should not
> be something optional added via a library.
> 
> I agree with your notion of levels; I just think D provides much
> better facilities for low-level coding than unsafe C/C++
> conventions.
> 
>     int n;
>     dchar c = cast(dchar)n;
>     dchar d = cast!(dchar)n;
> 
> In the above code, the second one should be used and it might throw. The
> first simply does not make any sense and should produce a compiler error
> because you cannot convert an int value to a dchar (unless it's a
> one-digit int).
> 
> <off topic rant>
> What worries me most about D is that it is becoming an extension of C++.
> The whole idea behind D was to create a new language without all the
> baggage and backward-compatibility issues of C++.
> I don't want a slightly more readable version of C++, since I'll get that
> with C++0x.
> C++ programmers want D to have a D "STL" and a D Boost. That's wrong!
> STL is badly designed and employs massive amounts of black magic that
> ordinary people do not understand (I suffer at work while writing in
> C++). In what world does it make sense to mark an abstract method with
> "=0;" at the end, especially when the method declaration is long and
> that marker ends up off screen?
> D should be written with a D mindset, which should be the best
> ingredients extracted from all the languages D got its influences
> from: Java, C#, Python, Ruby, C/C++, etc. Tango is a good example of
> designing such a new D mindset, IMO. Phobos is not, since it's merely C
> code written with D syntax, plus all the shiny new code Andrei added,
> which is C++ code written with D syntax. I appreciate his great
> expertise in C++, but I can already use C++ libraries in C++ without
> learning a new language. D needs to be better. *Much* better.
> </rant>
> 
> --Yigal

I think I misread your example, so I want to clarify:
chars should contain only valid UTF data and no other bit pattern.
Since the code units of a multi-unit sequence must appear in a specific
order, it makes sense for the D standard library to provide functions
that validate and/or fix UTF strings.
Any other encoding, however, must use ubyte arrays instead.
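
A minimal sketch of that split, assuming a present-day Phobos where
std.utf.validate throws UTFException on malformed input. The helper name
isWellFormedUtf8 is hypothetical and not part of any library: raw data
stays in a ubyte array until it has been checked.

```d
import std.utf : validate, UTFException;

/// Hypothetical helper (not in Phobos): reports whether the raw bytes
/// form well-formed UTF-8, without ever treating them as a string
/// outside this function.
bool isWellFormedUtf8(const(ubyte)[] raw)
{
    try
    {
        // Reinterpret the bytes as UTF-8 code units only for the check.
        validate(cast(const(char)[]) raw);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}
```

Only after such a check passes would the bytes be handed on as a char
array; anything in another encoding stays ubyte[].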

What if:

   int num = ...;
   dchar ch = cast(dchar)num;
   dchar ch1 = cast!(dchar)num;

ch would contain the bit pattern of the num-th code point in the Unicode
standard (throwing for numbers that are not in the code-point table),
and the second cast would operate on the bit level (like a
reinterpret_cast) and throw if the resulting dchar bit pattern is not valid.
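
The proposed cast!(dchar) syntax does not exist, so here is only a rough
library-level sketch of the checked behaviour, assuming std.utf.isValidDchar
from present-day Phobos. The name toCodePoint and the use of a plain
Exception are my own placeholders. Note that for dchar (UTF-32) the number
of a code point and its bit pattern are the same value, so both variants
end up performing the same validity check.

```d
import std.utf : isValidDchar;

/// Hypothetical helper (not in Phobos): converts an int to a dchar and
/// throws if the value is not a valid Unicode code point.
dchar toCodePoint(int n)
{
    // Rejects negative values, surrogates (U+D800..U+DFFF) and
    // anything above U+10FFFF.
    if (n < 0 || !isValidDchar(cast(dchar) n))
        throw new Exception("not a valid Unicode code point");
    return cast(dchar) n;
}
```

With this, toCodePoint(0x41) yields 'A', while toCodePoint(0xD800) or
toCodePoint(-1) throws instead of producing an invalid dchar.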

Also, I'm leaning towards reporting all run-time cast errors with
exceptions (it's more consistent, since you cannot return null for
primitives). There is no need for that special return-null case. (If D
had attributes, I would have suggested a suppress-exceptions attribute
for that purpose.)
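
For class references, where cast(T) currently yields null on failure, the
throwing behaviour could be approximated today with a small template. This
is just a sketch: checkedCast and the Animal/Dog/Cat classes are
hypothetical names made up for illustration.

```d
/// Hypothetical helper: like cast(T) on a class reference, but throws
/// instead of yielding null. A null input also throws.
T checkedCast(T)(Object o) if (is(T == class))
{
    auto result = cast(T) o;
    if (result is null)
        throw new Exception("cannot cast to " ~ T.stringof);
    return result;
}

// Illustration-only class hierarchy.
class Animal {}
class Dog : Animal {}
class Cat : Animal {}
```

Given Animal a = new Dog, checkedCast!Dog(a) returns the Dog, while
checkedCast!Cat(a) throws rather than handing back null.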

--Yigal


