Proposal for new compiler built-ins: __CTX__ and __CONTEXT__

Andrej Mitrovic andrej.mitrovich at gmail.com
Thu Jun 22 17:18:25 UTC 2023


This isn't a DIP but I'd like to start this conversation to see 
what people think.

One pattern I often see in D libraries is the capturing of 
context of the call site, for example the file and line number 
and sometimes the function and module names.

There are many real-world examples of this. Here's a quick 
search for D modules containing `__FILE__` on GitHub: 
https://github.com/search?q=%22__FILE__%22+lang%3AD+&type=code

And here are similar results for `__FUNCTION__`: 
https://github.com/search?q=%22__FUNCTION__%22+lang%3AD+&type=code

They're often passed in pairs (file and line), or sometimes in 
groups of four.

I have a few things in mind:
- It's repetitive having to list out all of these context 
keywords and assign each one of them to their own free parameter.
- It's visually noisy seeing them and it can distract the reader.
- It's possible to accidentally pass the wrong argument to a 
parameter that is default-initialized to `__FILE__` or 
`__LINE__`. Maybe your logging function takes a format string 
and some arguments, but one of those strings accidentally ends up 
bound to the `string file = __FILE__` parameter.
- Once these parameters are received it's cumbersome to pass them 
around to other routines. Passing these downwards is a pretty 
common use-case of many libraries.
- It consumes stack space because the context parameters have to 
be passed on the stack, and one of them might even take up the 
EAX register (according to the ABI: 
https://dlang.org/spec/abi.html#parameters). This may or may not 
be a problem for you, depending on your use-case and on how many 
parameters you actually pass around.
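
To make the points above concrete, here is a minimal sketch of the 
pattern as it is usually written today. The names `info` and 
`report` are hypothetical and not taken from any particular 
library; explicit file and line values are passed so the results 
are predictable.

```D
import std.format : format;

// Typical present-day signature: each piece of context is its own
// defaulted parameter.
string info(string fmt, string file = __FILE__, size_t line = __LINE__) {
    return format("%s(%s): %s", file, line, fmt);
}

// Forwarding the context downwards means repeating every parameter.
string report(string msg, string file = __FILE__, size_t line = __LINE__) {
    return info("report: " ~ msg, file, line);
}

void main() {
    // Normally file and line default to the call site; they are
    // spelled out here only to keep the expected strings fixed.
    assert(report("disk full", "app.d", 10) == "app.d(10): report: disk full");

    // Accidental misuse: the second string silently binds to `file`
    // instead of being treated as a format argument.
    assert(info("value: %s", "oops", 7) == "oops(7): value: %s");
}
```

The second call compiles without complaint, which is the kind of 
mistake a dedicated context type would turn into a type error.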

We could simplify this and maybe even make it more flexible.

I'm envisioning a new type `__CTX__` which bundles all of the 
context values that are currently separate keywords, listed at: 
https://dlang.org/spec/expression.html#specialkeywords.

Here it is:

```D
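struct __CTX__ {
     string file;            // __FILE__
     string file_full_path;  // __FILE_FULL_PATH__
     string mod;             // __MODULE__
     int line;               // __LINE__
     string func;            // __FUNCTION__
     string pretty_func;     // __PRETTY_FUNCTION__
}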
```

Having it defined as a struct serves a few objectives:
- It makes it easy to declare and use this type in user code.
- All library code will have one single compatible type they can 
easily pass around to each other.
- Makes it harder to confuse parameters. For example it's 
currently easy to pass a random string to a function expecting a 
`__FILE__`, because the parameter is a string.
- It makes it possible to choose whether to receive and pass this 
structure around by value or by reference.
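
As a rough illustration of the type-safety point, a user-level 
struct can already approximate this today, with the bundle 
assembled by hand at the entry point. `CallSite`, `render` and 
`info` below are hypothetical names chosen for this sketch, not 
part of the proposal.

```D
import std.format : format;

// Hypothetical user-level stand-in for the proposed built-in type.
struct CallSite {
    string file;
    size_t line;
}

// Downstream helpers take one typed value instead of loose strings.
string render(string msg, CallSite site) {
    return format("%s(%s): %s", site.file, site.line, msg);
}

// Today the bundle still has to be assembled manually from the
// individual keywords.
string info(string msg, string file = __FILE__, size_t line = __LINE__) {
    return render(msg, CallSite(file, line));
}

// A plain string is rejected where a CallSite is expected.
static assert(!__traits(compiles, render("msg", "not-a-call-site")));

void main() {
    assert(info("started", "app.d", 42) == "app.d(42): started");
}
```

What this cannot do is remove the `file`/`line` parameters from the 
outermost function, which is the part that would need compiler 
support.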

We also need to initialize it. So perhaps we'd call this 
initialization keyword `__CONTEXT__`.

Here's what the client code might look like:

```D
import std.stdio : writefln;

// ctx passed by value (the whole struct is copied for each call)
void infoStack(string msg, __CTX__ ctx = __CONTEXT__) {
     writefln("%s(%s): %s", ctx.file, ctx.line, msg);
}

// ctx passed by reference (only a pointer is passed)
void infoRef(string msg, ref __CTX__ ctx = __CONTEXT__) {
     writefln("%s(%s): %s", ctx.file, ctx.line, msg);
}

void main() {
     infoStack("Hello world");
     infoRef("Hello world");
}
```

Notice that the calls to `infoStack` and `infoRef` will generate 
different assembly code, as `__CTX__` is passed by value when 
calling `infoStack` and by reference when calling `infoRef`.

Here's a full example with some fake context just to give a 
clearer picture:

```D
import std.stdio : writefln;

// Stand-in for the proposed built-in: returns a fixed, fake context.
ref __CTX__ __CONTEXT__() {
     static __CTX__ ctx = __CTX__(
         "mymod.d", "/project/src/mymod.d", "mymod", 123,
         "mymod.func", "void mymod.func()");
     return ctx;
}

struct __CTX__ {
     string file;
     string file_full_path;
     string mod;
     int line;
     string func;
     string pretty_func;
}

// ctx passed by value (the whole struct is copied for each call)
void infoStack(string msg, __CTX__ ctx = __CONTEXT__) {
     writefln("%s(%s): %s", ctx.file, ctx.line, msg);
}

// ctx passed by reference (only a pointer is passed)
void infoRef(string msg, ref __CTX__ ctx = __CONTEXT__) {
     writefln("%s(%s): %s", ctx.file, ctx.line, msg);
}

void main() {
     infoStack("Hello world");  // mymod.d(123): Hello world
     infoRef("Hello world");    // mymod.d(123): Hello world
}
```

I think it should be possible to receive `__CTX__` by reference 
from the compiler, as all of the default arguments must be known 
at compile-time. So the context could be stored in `RODATA` or 
somewhere similar and its address taken (but I'm a bit out of my 
depth here).
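
The idea can be approximated in today's D with a module-level 
`immutable` value, which has a compile-time initializer and a 
stable address. This is only a sketch: `Ctx`, `site`, `describe` 
and `isSite` are hypothetical names, and where the linker actually 
places the data is up to the toolchain. Note that the parameter is 
`ref const` here, since immutable data cannot bind to a mutable 
`ref`.

```D
import std.format : format;

struct Ctx {
    string file;
    int line;
}

// The initializer is a compile-time constant, so the value lives in
// the program's static data rather than being built at run time.
immutable Ctx site = Ctx("mymod.d", 123);

// Receives the context by reference; nothing is copied.
string describe(ref const Ctx ctx) {
    return format("%s(%s)", ctx.file, ctx.line);
}

// True when the reference points at the static `site` object itself.
bool isSite(ref const Ctx ctx) {
    return &ctx is &site;
}

void main() {
    assert(describe(site) == "mymod.d(123)");
    assert(isSite(site));
}
```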

I can think of some downsides:
- Passing the `__CTX__` by value can make you capture more than 
you're interested in. Perhaps you're only interested in file + 
line and don't care about module name, function name, etc. That 
means you're suddenly consuming a lot more stack space.
- Counter-point: You can always continue to use `__FILE__` and 
`__LINE__`. I'm not suggesting to deprecate or remove these.
- It's possible `__CTX__` could later grow if the compiler 
developers decide to add a new context field to it. This would 
somewhat negatively impact all user-code which takes `__CTX__` by 
value as it would grow the stack usage.
- Therefore people might opt to use `ref __CTX__`, but that adds 
pointer indirection.
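
To put rough numbers on the by-value cost, the sizes can be checked 
at compile time. `Ctx` below is a hypothetical mirror of the 
proposed struct, used only for measuring; the actual layout of a 
built-in `__CTX__` would be up to the compiler.

```D
import std.stdio : writefln;

// Hypothetical mirror of the proposed struct, used only to measure it.
struct Ctx {
    string file;
    string fileFullPath;
    string mod;
    int line;
    string func;
    string prettyFunc;
}

alias CtxPtr = Ctx*;

// A D string is a (length, pointer) pair, i.e. two machine words.
static assert(string.sizeof == 2 * size_t.sizeof);

// Five strings plus one int, which occupies a full word once padded.
static assert(Ctx.sizeof == 5 * string.sizeof + size_t.sizeof);

// Taking the context by `ref` passes a single pointer instead.
static assert(CtxPtr.sizeof == size_t.sizeof);

void main() {
    writefln("by value: %s bytes, by ref: %s bytes, file+line only: %s bytes",
             Ctx.sizeof, CtxPtr.sizeof, string.sizeof + size_t.sizeof);
}
```

On a 64-bit target this works out to eleven machine words by value 
versus one by reference, and three for a separate `string file` 
plus `size_t line` pair.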

I believe context is often used in more expensive situations so 
passing by reference might be okay in many cases, for example:
- File and line contexts are used when throwing exceptions. 
Passing the file+line through a pointer indirection (`ref 
__CTX__`) isn't that expensive compared to throwing the exception 
itself.
- File and line contexts are used when logging messages. If this 
involves file I/O, as it often does, then the pointer indirection 
is a fraction of the total cost of the operation.
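
As a sketch of the exception case, a context received by reference 
can be forwarded straight into the existing `Exception` 
constructor, which already accepts a file and a line. `Ctx` and 
`requirePositive` are hypothetical names, and the context is built 
by hand because the proposed `__CONTEXT__` does not exist yet.

```D
// Hypothetical, reduced context type for this sketch.
struct Ctx {
    string file;
    size_t line;
}

// Reads the context through one pointer indirection, and only on
// the failure path.
void requirePositive(int value, ref const Ctx ctx) {
    if (value <= 0)
        throw new Exception("value must be positive", ctx.file, ctx.line);
}

void main() {
    const site = Ctx("caller.d", 17);
    requirePositive(5, site);  // no exception, context never read

    try {
        requirePositive(-1, site);
        assert(false, "expected an exception");
    } catch (Exception e) {
        // The exception reports the forwarded location.
        assert(e.file == "caller.d");
        assert(e.line == 17);
    }
}
```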

-----

Any other benefits / drawbacks, or unforeseen complications you 
can think of?


More information about the Digitalmars-d mailing list