Implicit enum conversions are a stupid PITA
yigal chripun
yigal100 at gmail.com
Thu Mar 25 12:31:54 PDT 2010
Walter Bright Wrote:
> Yigal Chripun wrote:
> > Walter Bright Wrote:
> >> Pascal has explicit casts. The integer to character one is CHR(i), the
> >> character to integer is ORD(c).
> > I meant implicit, sorry about that. The Pascal way is definitely the correct
> > way. What's the semantics in your opinion of ('f' + 3)? What about ('?' +
> > 4)? Making such arithmetic valid is wrong.
>
> Yes, that is exactly the opinion of Pascal. As I said, I've programmed in
> Pascal, suffered as it blasted my kingdom, and I don't wish to do that again. I
> see no use in pretending '?' does not have a numerical value that is very useful
> to manipulate.
>
'?' indeed does *not* have a single numerical value that identifies it in a unique manner. You can map it to different numeric values depending on the encoding, and even within the same encoding this doesn't always hold; see Unicode normalization, where the same character can be represented by different code-point sequences.
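To make the normalization point concrete, here's a minimal sketch; it assumes the normalize routine from Phobos's std.uni, and the code-point values are the standard Unicode ones:

    import std.stdio;
    import std.uni : normalize, NFC;

    void main()
    {
        dstring precomposed = "\u00E9"d;  // 'é' as one code point, U+00E9
        dstring decomposed  = "e\u0301"d; // 'e' plus combining acute accent
        // Different numeric sequences for the same visible character:
        writeln(precomposed == decomposed);                                // false
        writeln(normalize!NFC(precomposed) == normalize!NFC(decomposed));  // true
    }

The two sequences compare unequal numerically, yet denote the same character once normalized.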
> > I'm sure that the first Pascal
> > versions had problems which caused you to ditch that language (they were
> > fixed later).
>
> They weren't compiler bugs I was wrestling with. They were fundamental design
> decisions of the language.
>
> > I doubt it though that this had a large impact on Pascal's
> > problems.
>
> I don't agree. Pascal was a useless language as designed. This meant that every
> vendor added many incompatible extensions. Anyone who used Pascal got locked
> into a particular vendor. That killed it.
>
>
> >>> The fact that D has 12 integral types is a bad design, why do we need so
> >>> many built in types? to me this clearly shows a need to refactor this
> >>> aspect of D.
> >> Which would you get rid of? (13, I forgot bool!)
> >>
> >> bool byte ubyte short ushort int uint long ulong char wchar dchar enum
> >
> > You forgot the cent and ucent types, and what about 256-bit types?
>
> They are reserved, not implemented, so I left them out. In or out, they don't
> change the point.
>
>
> > Here's How I'd want it designed: First of, a Boolean type should not belong
> > to this list at all and shouldn't be treated as a numeric type. Second, there
> > really only few use-cases that are relevant
> >
> > signed types for representing numbers:
> > 1) unlimited integral type - int
> > 2) limited integral type - int!(bits), e.g. int!16, int!8, etc.
> > 3) user-defined range: e.g. [0, infinity) for positive numbers, etc.
> >
> > unsigned bit-packs: 4) bits!(size), e.g. bits!8, bits!32, etc..
> >
> > Of course you can define useful aliases, e.g. alias bits!8 Byte; alias
> > bits!16 Word; or you can define the aliases per architecture, so that
> > Word above will be defined for the current arch (I don't know what the
> > native word size is on, say, ARM and other platforms).
>
> People are going to quickly tire of writing:
>
> bits!8 b;
> bits!16 s;
>
> and are going to use aliases:
>
> alias bits!8 ubyte;
> alias bits!16 ushort;
>
> Naturally, either everyone invents their own aliases (like they do in C with its
> indeterminate int sizes), or they are standardized, in which case we're back to
> pretty much exactly the same point we are at now. I don't see where anything was
> accomplished.
>
Not true. Say I'm using my own proprietary hardware and I want bits!24. How would I do that in current D?
And what if new hardware adds support for larger vector ops and 512-bit registers: will we need to extend the language with yet another type?
On the flip side, programmers almost always need just an int, since what they want is the mathematical notion of an integral type.
It's pretty rare for programmers to want something other than int, and in those cases they'll define their own types anyway, since they know what their requirements are. A library version of bits!(n) is sketched below.
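Here's a hedged sketch of what such a bits!(n) type could look like as a library template; Bits, Byte, and Word are hypothetical names, and each instance is backed by the smallest native unsigned integer that fits:

    struct Bits(uint n) if (n >= 1 && n <= 64)
    {
        static if (n <= 8)       ubyte  store;
        else static if (n <= 16) ushort store;
        else static if (n <= 32) uint   store;
        else                     ulong  store;

        enum ulong mask = (n == 64) ? ulong.max : (1UL << n) - 1;

        // Keep only the declared number of bits on assignment.
        void opAssign(ulong v) { store = cast(typeof(store))(v & mask); }
    }

    alias Bits!8  Byte;
    alias Bits!16 Word;

    unittest
    {
        Bits!24 color;        // a 24-bit pack, e.g. for proprietary hardware
        color = 0xABCDEF;
        assert(color.store == 0xABCDEF);
    }

A native-width bits!24 on real 24-bit hardware would of course need compiler support; the sketch only shows the interface.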
>
> > char and relatives should be for text only per Unicode, (perhaps a better
> > name is code-point).
>
> There have been many proposals to try and hide the fact that UTF-8 is really a
> multibyte encoding, but that makes for some pretty inefficient code in too many
> cases.
I'm not saying we should hide that. On the contrary, the compiler should enforce Unicode, and other encodings should use a bits type instead. A [w|d]char must always contain a valid Unicode value.
Calling char[] a string is wrong, since it is actually an array of UTF-8 code units, which is not always a valid sequence when sliced or indexed. A dchar[], however, is a valid string, since each individual dchar contains a full code point.
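A small example of the difference; nothing beyond std.stdio and std.conv is assumed:

    import std.stdio;
    import std.conv : to;

    void main()
    {
        string s = "é";            // UTF-8: two code units for one character
        writeln(s.length);         // 2
        writeln(cast(ubyte) s[0]); // 195 (0xC3), half of a multibyte sequence

        dstring d = to!dstring(s); // UTF-32: one code point per element
        writeln(d.length);         // 1
    }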
>
> > for other encodings use the above bit packs, e.g. alias
> > bits!7 Ascii; alias bits!8 ExtendedAscii; etc..
> >
> > enum should be an enumeration type. You can find an excellent strongly-typed
> > design in Java 5.0
>
> Those enums are far more heavyweight - they are a syntactic sugar around a class
> type complete with methods, interfaces, constructors, etc. They aren't even
> compile time constants! If you need those in D, it wouldn't be hard at all to
> make a library class template that does the same thing.
>
They aren't that heavyweight. Instead of assigning an int to each symbol, you assign a pointer address, which is the same size.
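A minimal sketch of that idea in D, assuming nothing beyond basic classes (Color and its members are illustrative names):

    final class Color
    {
        string name;
        private this(string n) { name = n; }

        static Color Red;
        static Color Green;

        static this() // members are distinct objects, not ints
        {
            Red   = new Color("Red");
            Green = new Color("Green");
        }
    }

    unittest
    {
        assert(Color.Red !is Color.Green); // identity comparison, no int involved
        assert(Color.Red.name == "Red");
    }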
Regarding the compile-time property: for an int type,

    const int a = 5; // compile time

and the same should apply to enums as well. The problem with the library solution is that it can't provide the syntactic sugar for this.
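For contrast, a built-in D enum member is a compile-time constant, which the class sketch above (whose members are built in a static constructor) cannot offer:

    enum Flag { On, Off }
    static assert(Flag.On != Flag.Off); // evaluated at compile time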