Is this a bug or a VERY sneaky case?

Thu Dec 30 16:17:28 UTC 2021

On Thursday, 30 December 2021 at 08:26:17 UTC, rempas wrote:
> [...]
>
>> - use D's datatypes, not your own ones (if you want others to 
>> look/work on your code too it's better to use the common names 
>> for stuff) - but ofc staying consistent across your code is 
>> more important
>
> I suppose you mean about "str" right? In this case, I would 
> love using D's "string" if it wasn't immutable by default. It 
> pisses me A LOT when a language tries to "protect" me from 
> myself. There are a lot of other stuff that I would like 
> "string" to have but I wouldn't mind them so much if string was 
> mutable by default (or even better if string literals were 
> "char*" like C and could get automatically casted so I could 
> use char[] without the need of an ".dup"). Another thing is 
> that I'm making a library so people will probably read the 
> definition of "str" out of interest anyway and learn it if they 
> want to use the library as users. And even for people that want 
> to only contribute to a specific place in the code and not use 
> the library (which why would you do that anyway?), the way 
> "str" is used in the code, is similar to how "string" is used 
> (check how they both have a ".ptr" property to get the actual 
> pointer for example) so I don't really think that there is a 
> problem with that.

No, actually I meant the u8, u16, etc. - if you stay consistent 
it's fine, but most of the D ecosystem just uses the native D 
types (ubyte, ushort, etc.), which have [guaranteed bit 
widths](https://dlang.org/spec/type.html#basic-data-types) as 
well. Once you work with other code, it may bring its own custom 
definitions too, and then there are 3 or more different aliases 
that all mean the same thing.
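
To illustrate the fixed widths, here is a minimal sketch. The `u8`/`u16` aliases are my guess at how such names would be declared, not the library's actual definitions:

```d
// D's built-in integer types have the same size on every platform.
static assert(ubyte.sizeof == 1);
static assert(ushort.sizeof == 2);
static assert(uint.sizeof == 4);
static assert(ulong.sizeof == 8);

// Hypothetical aliases in the style discussed above (assumed declarations).
alias u8 = ubyte;
alias u16 = ushort;

// An alias does not create a new type, it is only another name.
static assert(is(u8 == ubyte));
static assert(is(u16 == ushort));

void main() {}
```

Since an alias is only a second name, mixing `u8` with another project's `uint8` or plain `ubyte` compiles fine; the cost is purely in readability.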

>> - use `is(T == ubyte)` etc. instead of your custom 
>> `is_same!(val, ubyte)` (same reason as above, people need to 
>> read the definition of is_same first)
>
> This is a simple definition actually so why is it such of a big 
> deal? Also we shouldn't use "is(T == ubyte)" but instead 
> "is(typeof(val) == ubyte)" just like I'm doing it. This is 
> because in variadic functions, "T" will have different type for 
> each argument so it will not work (I made this mistake and 
> people told me so that's how I know). So why do we have to type 
> this much when we can automate this with a simple definition? 
> There is also one for checking if a type is a number (integer), 
> a floating point, a string (including my "str") etc. I don't 
> find these hard to learn and memorize so I don't find a reason 
> to not make my (our) life easier and just use "macros". This is 
> the main reason I use D and not 
> [Vox](https://github.com/MrSmith33/vox) in the first place (and 
> the fact that in Vox you cannot fully work with Variadic 
> functions yet).

If you stay consistent it's fine - but once you work with other 
code that also has a template like this, you can end up with more 
than one definition doing the same simple thing.

Also the name `is_same` is a little confusing because there is 
also a `__traits(isSame, a, b)` which returns true if both 
arguments are the same symbol. It does not do the `typeof(a) == 
b` you are doing. (which I would btw not think of when I read 
"is_same")
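
A small sketch of the difference. The `is_same` template here is my own re-creation based on the description in this thread, so treat its definition as an assumption:

```d
// Assumed shape of the helper: compares the type of a value against a type.
enum is_same(alias val, T) = is(typeof(val) == T);

void main()
{
    ubyte a = 1;
    ubyte b = 2;

    // Type comparison: both spellings ask "is the type of a ubyte?"
    static assert(is(typeof(a) == ubyte));
    static assert(is_same!(a, ubyte));

    // __traits(isSame) compares symbols, not types:
    static assert(__traits(isSame, a, a));  // same symbol
    static assert(!__traits(isSame, a, b)); // same type, different symbols
}
```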

>> - work with slices, not with pointers (if you plan to use your 
>> code from D, it's much cleaner and avoids bugs! does not need 
>> a trailing null terminator and works with @safe code)
>
> Slices are objects tho and this means paying a runtime cost 
> versus just using a variable. Also the only place I used 
> pointers are with C-type string (char* or u8* in my custom 
> "str") and one member (_count) in my custom "str" struct. And 
> all of this cases were checked very carefully and they are very 
> specific. People that will use the library should rarely need 
> to use pointers and this is what I try to do with my library 
> and why I don't use "libc". However! When it comes to the 
> actual library itself, I want to go as low level as possible 
> and have a library that is as performant as possible.

No, slices are basically `struct T[] { T* ptr; size_t length; }` - 
a slice is just a pointer + length in a 16 byte struct (on 64 bit, 
possibly returned via 2 registers) and does not introduce any 
extra indirection. I think it's the best way to handle pointers to 
more (or fewer) than one element anywhere in D.
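
A quick sketch to check this yourself (the variable names are just for illustration):

```d
void main()
{
    int[4] storage = [10, 20, 30, 40];

    // A slice is a view into existing memory; nothing is copied or allocated.
    int[] slice = storage[1 .. 3];

    // Two machine words: a pointer and a length (16 bytes on 64 bit).
    static assert((int[]).sizeof == 2 * size_t.sizeof);

    assert(slice.length == 2);
    assert(slice.ptr == &storage[1]);

    // Writing through the slice writes to the underlying array.
    slice[0] = 99;
    assert(storage[1] == 99);
}
```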

Slices do bounds checking and with that add more safety to your 
program. You can disable it globally (unless in @safe code) but I 
would recommend not doing so. It can be a performance issue for 
algorithms that are on a very hot code path, like in big loops. 
In these cases I would recommend using some kind of 
`assert(maxValue < slice.length);` before your loop and to 
disable single bounds checks inside the loop use `slice.ptr[i]` 
instead of `slice[i]`.
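
Roughly like this, as a sketch: `sumFirst` and `count` are made-up names, and I check `count <= slice.length` because the loop index stays below `count`. Keep in mind that `.ptr` indexing is not allowed in @safe code and that `assert` is compiled out in `-release` builds.

```d
// Sums the first `count` elements of `slice` (hypothetical example function).
int sumFirst(const(int)[] slice, size_t count)
{
    // One check up front instead of one per iteration.
    assert(count <= slice.length);

    int total = 0;
    foreach (i; 0 .. count)
        total += slice.ptr[i]; // raw pointer indexing, no bounds check
    return total;
}

void main()
{
    assert(sumFirst([1, 2, 3, 4], 3) == 6);
    assert(sumFirst([1, 2, 3, 4], 0) == 0);
}
```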

Usually it is not necessary to do this unless you are working on 
some low-level algorithms like sorting, unicode processing, 
parsing, etc. Additionally, having the length with your array can 
often give you performance improvements: you can use a simple 
counted loop, which x86 processors can greatly speed up and 
parallelize, and don't need to check every item in your array for 
the null terminator!
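
To make the comparison concrete, here is a sketch with two made-up functions doing the same job. Whether the compiler actually vectorizes the slice version depends on the compiler and flags, so measure before relying on it:

```d
// Length is known up front: a plain counted loop.
size_t countSpacesSlice(const(char)[] text)
{
    size_t n = 0;
    foreach (c; text)
        if (c == ' ')
            ++n;
    return n;
}

// Null-terminated: every iteration must also test for the terminator.
size_t countSpacesCString(const(char)* text)
{
    size_t n = 0;
    for (; *text != '\0'; ++text)
        if (*text == ' ')
            ++n;
    return n;
}

void main()
{
    assert(countSpacesSlice("a b c") == 2);
    // D string literals are null-terminated and convert to const(char)*.
    assert(countSpacesCString("a b c") == 2);
}
```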