Is this a bug or a VERY sneaky case?

rempas rempas at tutanota.com
Thu Dec 30 18:18:53 UTC 2021


On Thursday, 30 December 2021 at 16:17:28 UTC, WebFreak001 wrote:
> No actually I meant the u8, u16, etc. - if you stay consistent 
> it's fine, but most of the D ecosystem uses just the D native 
> types (ubyte, ushort, etc.) which have [guaranteed bit 
> widths](https://dlang.org/spec/type.html#basic-data-types) as 
> well.

Oh, these types are actually aliases, so they will interact nicely 
with the language. So it's like "alias u8 = ubyte", "alias i8 = 
byte", "alias u16 = ushort" etc. A lot of other programming 
languages use these names for their types because:
1. They are shorter
2. They look nicer (once you get used to them)
3. For beginners (and for everyone, actually, as you notice it 
instantly), it is easier to know the size of each type (u8 is an 
unsigned 8-bit (1-byte) integer, i32 is a signed 32-bit (4-byte) 
integer etc.)
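
A minimal sketch of what such an alias set could look like. The 
full list of names and the `twice` helper are assumptions for 
illustration; only u8, i8 and u16 are spelled out above.

```d
// Hypothetical alias set; the names mirror the ones mentioned above.
alias u8  = ubyte;
alias i8  = byte;
alias u16 = ushort;
alias i16 = short;
alias u32 = uint;
alias i32 = int;
alias u64 = ulong;
alias i64 = long;

// An alias is just another name for the same type, not a new type.
static assert(is(u8 == ubyte));
static assert(is(i32 == int));

// The D spec fixes these widths, so the sizes can be checked at compile time.
static assert(u8.sizeof == 1);
static assert(u16.sizeof == 2);
static assert(i32.sizeof == 4);
static assert(u64.sizeof == 8);

// Code written against the native names accepts the aliases unchanged.
uint twice(uint x) { return x * 2; }

void main()
{
    u32 a = 21;
    assert(twice(a) == 42);
}
```

Because these are plain aliases, error messages and `typeof` 
results will still show the native names (ubyte, uint, ...).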

> Once working with other code it's possible they could also have 
> custom definitions and then there are 3 or more different 
> aliases for something meaning the same thing.

My library will have no dependencies, so it will not have to work 
with other code. These types will be the "official names" used 
for library development. People who use the library don't have 
to use them, of course (and that's the awesome thing).

> If you stay consistent it's fine - once you work with other 
> code which also has a template like this it starts to be 
> possible that there is gonna be more than one definition to do 
> the same simple thing.
>
> Also the name `is_same` is a little confusing because there is 
> also a `__traits(isSame, a, b)` which returns true if both 
> arguments are the same symbol. It does not do the `typeof(a) == 
> b` you are doing. (which I would btw not think of when I read 
> "is_same")

Again, we will not work with other code. HOWEVER, these definitions 
are intended (but of course not forced) to be used by library 
users. So yeah, "is_same" is indeed not good. Do you suggest any 
other name (that is not too long)? I'm thinking about "same_type" 
and "is_type". The first one makes a lot of sense and the second 
one doesn't make so much sense but it is small and cool ;)
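
For reference, a rough sketch of the kind of check being discussed 
(the `typeof(a) == b` comparison from the quote), written under 
the candidate name "same_type". The exact signature is an 
assumption; the original template is not shown in this thread.

```d
// Hypothetical helper; "same_type" is only one of the candidate names.
// True when the type of the symbol `a` is exactly T.
enum same_type(alias a, T) = is(typeof(a) == T);

void main()
{
    ubyte b = 7;
    int n = 7;

    static assert(same_type!(b, ubyte));
    static assert(!same_type!(b, int));
    static assert(same_type!(n, int));

    // By contrast, __traits(isSame, ...) compares symbols, not their types.
    static assert(__traits(isSame, b, b));
    static assert(!__traits(isSame, b, n));
}
```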

> No slices are basically `struct T[] { T* ptr; size_t length; }` 
> - it's returning pointer + length in a 16 byte struct (on 64 
> bit, possibly by returning via 2 registers) and does not 
> introduce any indirections. I think it's the best way to handle 
> more (or less) than one element pointers anywhere in D.

Yeah, this is exactly what I'm saying! You get back a struct 
(which I suppose is heavier than a simple variable) and sometimes 
you don't need it (which is the case in the places where I'm 
using pointers). Slices are amazing, even in the cases I 
mentioned, when you need to take a specific part of a string, 
because you do two things in one place. In that case, OK, slices 
just make things easier (with less chance of introducing bugs)!
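
A small sketch of the "pointer + length" layout and of taking part 
of a string with a slice. The "key=value" input is made up for 
illustration.

```d
void main()
{
    string line = "key=value";

    // A slice is a pointer plus a length: two machine words.
    static assert((char[]).sizeof == 2 * size_t.sizeof);

    // Slicing takes a specific part of a string without copying it.
    string key = line[0 .. 3];
    string value = line[4 .. $];
    assert(key == "key");
    assert(value == "value");

    // Both slices point into the original buffer.
    assert(key.ptr == line.ptr);
    assert(value.ptr == line.ptr + 4);
    assert(value.length == 5);
}
```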

> Slices do bounds checking and with that add more safety to your 
> program. You can disable it globally (unless in @safe code) but 
> I would recommend not doing so. It can be a performance issue 
> for algorithms that are on a very hot code path, like in big 
> loops. In these cases I would recommend using some kind of 
> `assert(maxValue < slice.length);` before your loop and to 
> disable single bounds checks inside the loop use `slice.ptr[i]` 
> instead of `slice[i]`.

Yeah, I know. Tbh, to me (and adding to what we already said), 
slices just seem like a more "beginner friendly" way to do the 
same thing you would do with a pointer + a variable that holds 
its length. And the bounds checking can actually be unnecessary, 
since we will probably do that ourselves anyway, because you will 
go out of bounds for one of two reasons:
1. You don't know what you are doing. This means that you don't 
pay attention to what you are writing (don't code drunk please) 
or you don't understand exactly what your code does (which may 
also be the case when you copy-paste code from online). In this 
case, there is a general problem that you should fix, and having 
the bounds automatically checked for you is not gonna fix the 
problem (probably).
2. A user input value (I'm talking about reading from the 
standard input) was out of bounds. In that case you would 
probably want to tell the user that they gave a wrong input 
rather than stop the execution of the program. This is the same 
way `to!byte(val)` throws an exception if the conversion fails 
rather than giving you a value which you can check against your 
original value to see if the conversion failed (and why). So 
yeah...
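
A sketch of the second case, assuming we want to report bad input 
instead of terminating. `parseByte` and `pick` are hypothetical 
helpers written for this example; `to` and `ConvException` are 
from std.conv.

```d
import std.conv : to, ConvException;

// Hypothetical helper: returns true and stores the result on success,
// false when `text` is not a valid byte (-128 .. 127).
bool parseByte(string text, out byte result)
{
    try
    {
        result = to!byte(text);
        return true;
    }
    catch (ConvException e)
    {
        // ConvOverflowException derives from ConvException, so overflow
        // and malformed input both end up here.
        return false;
    }
}

// Hypothetical helper: checks a user-supplied index ourselves so the
// caller can report the mistake instead of hitting a range error.
bool pick(const int[] items, size_t index, out int value)
{
    if (index >= items.length)
        return false;
    value = items[index];
    return true;
}

void main()
{
    byte b;
    assert(parseByte("42", b) && b == 42);
    assert(!parseByte("300", b));  // out of range for byte
    assert(!parseByte("abc", b));  // not a number

    int v;
    assert(pick([10, 20, 30], 1, v) && v == 20);
    assert(!pick([10, 20, 30], 3, v));  // index past the end
}
```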

> Usually it is not necessary to do this unless you are working 
> on some low-level algorithms like sorting, unicode processing, 
> parsing, etc.

Yeah, makes sense.
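
For a hot loop, the pattern from the quoted advice (one check 
before the loop, then `slice.ptr[i]` inside it) could look roughly 
like this. `sumFirst` is a hypothetical function for illustration.

```d
// Sums the first `count` elements: one check before the loop,
// unchecked pointer indexing inside it.
long sumFirst(const int[] data, size_t count)
{
    assert(count <= data.length, "count exceeds slice length");

    long total = 0;
    foreach (i; 0 .. count)
        total += data.ptr[i];  // indexing through .ptr skips the bounds check
    return total;
}

void main()
{
    int[] values = [1, 2, 3, 4, 5];
    assert(sumFirst(values, 3) == 6);
    assert(sumFirst(values, values.length) == 15);
}
```

Note that `assert` is compiled out with -release, and `.ptr` 
indexing is not allowed in @safe code, so this only fits code that 
is deliberately @system or @trusted.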

> Additionally having the length with your array can often give 
> you performance improvements as you can just use a simple loop 
> and don't need to check every item in your array to be the null 
> terminator, which x86 processors can greatly speed up and 
> parallelize!

Yeah, I thought the same. Well, don't worry tho, I decided (even 
before having this conversation) to replace every place in my 
code that returns "char*" so that it returns my custom "str" 
instead. So we will eliminate the pointers anyway, and strings 
will have a ".length" property (just like D's strings do).
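
A rough sketch of the idea of carrying the length along with the 
pointer. This `str` struct and both functions are hypothetical 
stand-ins; the real "str" in the library may look different.

```d
import core.stdc.string : strlen;

// Hypothetical stand-in for the custom "str" type.
struct str
{
    const(char)* ptr;
    size_t length;
}

// Pays for the terminator scan once, at the boundary with C-style code.
str fromCString(const(char)* s)
{
    return str(s, strlen(s));
}

// With a known length this is a plain counted loop, with no
// per-element test for the '\0' terminator.
size_t countChar(str s, char c)
{
    size_t n = 0;
    foreach (i; 0 .. s.length)
        if (s.ptr[i] == c)
            ++n;
    return n;
}

void main()
{
    // D string literals are null-terminated and convert to const(char)*.
    str s = fromCString("hello world");
    assert(s.length == 11);
    assert(countChar(s, 'o') == 2);
}
```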

Anyway, the thing is that I haven't written a lot of code (in 
general) and mostly I think in C ways of doing things, except for 
times when I feel limited and think about something that I need. 
I will probably upload the code in some weeks (maybe days if I'm 
not lazy???), so it would be nice if you guys actually read it 
and gave me some review. Thanks a lot for your time replying to 
me and I wish you a happy new year!!! Of course I'm not implying 
that we should stop talking, reply to me if you need more ;)


More information about the Digitalmars-d mailing list