Implicit enum conversions are a stupid PITA

Walter Bright newshound1 at digitalmars.com
Thu Mar 25 13:10:14 PDT 2010


yigal chripun wrote:
> Walter Bright Wrote:
> 
>> Yigal Chripun wrote:
>>> Walter Bright Wrote:
>>>> Pascal has explicit casts. The integer to character one is CHR(i), the 
>>>> character to integer is ORD(c).
>>> I meant implicit, sorry about that. The Pascal way is definitely the
>>> correct way. What's the semantics in your opinion of ('f' + 3)? What
>>> about ('?' + 4)? Making such arithmetic valid is wrong.
>> Yes, that is exactly the opinion of Pascal. As I said, I've programmed in 
>> Pascal, suffered as it blasted my kingdom, and I don't wish to do that
>> again. I see no use in pretending '?' does not have a numerical value that
>> is very useful to manipulate.
>> 
> 
> '?' indeed does *not* have a single numerical value that identifies it in a
> unique manner. You can map it to different numeric values based on encoding
> and even within the same encoding this doesn't always hold. See normalization
> in Unicode for different encodings for the same character.


That's true, '?' can have different encodings, such as for EBCDIC and RADIX50. 
Those formats are dead, however, and ASCII has won. D is specifically a Unicode 
language (a superset of ASCII) and '?' has a single defined value for it.

Yes, Unicode has some oddities, and the poor programmer using those 
characters will have to deal with them, but that does not change the fact that 
quoted character literals always have the same numerical value. '?' is not going to 
change to another one tomorrow or in any conceivable future incarnation of Unicode.

>> Naturally, either everyone invents their own aliases (like they do in C
>> with its indeterminate int sizes), or they are standardized, in which case
>> we're back to pretty much exactly the same point we are at now. I don't see
>> where anything was accomplished.
>> 
> Not true. say I'm using my own proprietary hardware and I want to have
> bits!24. How would I do that in current D?

You'd be on your own with that. I had a discussion recently with a person who 
defended C's notion of compiler-defined integer sizes, pointing out that this 
enabled compliant C compilers to be written for DSPs with 32-bit bytes. That is 
pedantically correct; compliant C compilers were written for them. Unfortunately, 
practically no C applications could be ported to them without extensive modification!

For your 24-bit machine, you will be forced to write all your own custom 
software, even if the D specification supported it.


> what if new hardware adds support
> for larger vector ops and 512bit registers, will we now need to extend the
> language with another type?

D will do something to accommodate it; obviously we don't know what that will be 
until we see what those types are and what they do. What I don't see is using 
512-bit ints for normal use.



>>> char and relatives should be for text only per Unicode, (perhaps a better
>>>  name is code-point).
>> There have been many proposals to try and hide the fact that UTF-8 is
>> really a multibyte encoding, but that makes for some pretty inefficient
>> code in too many cases.
> 
> I'm not saying we should hide that, on the contrary, the compiler should
> enforce unicode and other encodings should use a bits type instead. a
> [w|d]char must always contain a valid unicode value. calling char[] a string
> is wrong since it is actually an array of code units, which is not always a
> valid encoding. a dchar[] is however a valid string since each individual
> dchar contains a full code point.

Conceptually, I agree, it's wrong, but it's not practical to force the issue.


>>> enum should be an enumeration type. You can find an excellent
>>> strongly-typed design in Java 5.0
>> Those enums are far more heavyweight - they are a syntactic sugar around a
>> class type complete with methods, interfaces, constructors, etc. They
>> aren't even compile time constants! If you need those in D, it wouldn't be
>> hard at all to make a library class template that does the same thing.
>> 
> 
> They aren't that heavyweight. Instead of assigning an int to each symbol you
> assign a pointer address, which is the same size.

No, it's not the same. A compile time constant has many advantages over a 
runtime one. Java creates an inner class for each enum member, not just the enum 
itself! It's heavyweight.

> Regarding the compile-time
> property: for an int type, "const int a = 5;" is compile time. The same
> should apply to enums as well.

Having a runtime pointer to an inner class for each enum value falls far short of 
the advantages of a compile-time constant.


> The problem with the library solution is that it can't provide the syntax
> sugar for this.

It can get pretty close. Java has poor abstraction facilities, and so building 
it into the language was the only solution.


